External Debugging Tools 4: Debug the JVM Using jhsdb

Need to diagnose an app, possibly in production? Weird crashes? GC tuning? A core dump from the JVM? jhsdb is the tool you've been searching for!

Java 9 was all about modules. It was the big change and also the most problematic one, and it sucked the air out of every other feature that shipped as part of that release. Unfortunately, one victim of this vacuum was jhsdb, which is a complex tool to begin with. That left this amazingly powerful tool in relative obscurity, and that's a shame.

So what is jhsdb? Oracle's documentation defines it as:

“jhsdb is a Serviceability Agent (SA) tool. Serviceability Agent (SA) is a JDK component used to provide snapshot debugging, performance analysis and to get an in-depth understanding of the Hotspot JVM and the Java application executed by the Hotspot JVM.”

This doesn’t really say much about what it is and what it can do. Here’s my simplified take: it’s a tool to debug the JVM itself, and understand core dumps. It unifies multiple simpler tools to get deep insight into the native JVM internals. You can debug JVM failures and native library failures with it.

Basics

To get started we can run:

jhsdb --help

This produces the following output:

clhsdb           command line debugger
hsdb             ui debugger
debugd --help    to get more information
jstack --help    to get more information
jmap   --help    to get more information
jinfo  --help    to get more information
jsnap  --help    to get more information

The gist of this is that jhsdb is really seven different tools (most of them attach to a running process via --pid, as shown right after this list):

  • debugd - acts as a remote debug server we can connect to
  • jstack - stack and lock information
  • jmap - heap memory information
  • jinfo - basic JVM information
  • jsnap - performance information
  • Command Line Debugger - I won’t discuss that since I prefer the GUI
  • GUI Debugging tool
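
Before using any of them you need the target process ID. The examples below use --pid 72640; to find the ID of your own process you can use the jps tool that ships with the JDK (the -l flag prints the fully qualified name of the main class):

jps -l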

debugd

This isn't useful in most cases, since you probably don't want to run a debug server in production. However, if you're debugging a container locally and need to get information from within it, debugd might make sense. Unfortunately, because of a bug in the UI, you can't currently connect to a remote server via the GUI debugger. I could only use this with command-line tools such as jstack (discussed below).

With the command:

jhsdb debugd --pid 72640

I can connect to a process ID and expose it externally. I can then use the command:

jhsdb jstack --connect localhost

To connect to the server and run the jstack operation (discussed next). Notice that the --connect argument applies globally and should work for all commands.
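
Since --connect is global, the same debugd server should be able to back the other commands as well. As a sketch (assuming the debugd server started above is still running on the same machine), this runs the jmap heap report through that connection:

jhsdb jmap --connect localhost --heap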

jstack

Normally we would run the command directly against the process ID instead of going through a server connection. This command produces a thread dump that will look very familiar, and it's just as helpful as ever. If you're looking at a stuck process on a user machine or in a production environment, this is an invaluable tool. Just run:

jhsdb jstack --pid 72640

Then review the status of the JVM… Here are a few lines of the output:

Attaching to process ID 72640, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.13+8-LTS
Deadlock Detection:

No deadlocks found.

"Keep-Alive-Timer" #189 daemon prio=8 tid=0x000000011d81f000 nid=0x881f waiting on condition [0x0000000172442000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   JavaThread state: _thread_blocked
 - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
 - sun.net.www.http.KeepAliveCache.run() @bci=3, line=168 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=829 (Interpreted frame)
 - jdk.internal.misc.InnocuousThread.run() @bci=20, line=134 (Interpreted frame)


"DestroyJavaVM" #171 prio=5 tid=0x000000011f809000 nid=0x2703 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_blocked

Notice a few things of interest:

  • We get information about the JVM running
  • It detects deadlocks automatically for us!
  • All threads are printed with full stack and compilation status

This snapshot can help us infer many details about how the application acts in production. Is our code compiled? Is it waiting on a monitor? What other threads are running, and what are they doing?
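
jstack has a couple of additional flags worth knowing. As a sketch (the exact flags can vary between JDK versions, check jhsdb jstack --help), the first adds java.util.concurrent lock information and the second prints mixed Java and native frames:

jhsdb jstack --pid 72640 --locks
jhsdb jstack --pid 72640 --mixed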

jmap

If you want to understand what’s happening under the hood in RAM, then this is the command you need to use. It prints everything you need to know about the heap, memory, etc. Here’s a simple use case:

jhsdb jmap --pid 72640 --heap

Which produces this output:

Attaching to process ID 72640, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.13+8-LTS

using thread-local object allocation.
Garbage-First (G1) GC with 9 thread(s)

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 17179869184 (16384.0MB)
   NewSize                  = 1363144 (1.2999954223632812MB)
   MaxNewSize               = 10305404928 (9828.0MB)
   OldSize                  = 5452592 (5.1999969482421875MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 4194304 (4.0MB)

Heap Usage:
G1 Heap:
   regions  = 4096
   capacity = 17179869184 (16384.0MB)
   used     = 323663048 (308.6691360473633MB)
   free     = 16856206136 (16075.330863952637MB)
   1.8839668948203325% used
G1 Young Generation:
Eden Space:
   regions  = 66
   capacity = 780140544 (744.0MB)
   used     = 276824064 (264.0MB)
   free     = 503316480 (480.0MB)
   35.483870967741936% used
Survivor Space:
   regions  = 8
   capacity = 33554432 (32.0MB)
   used     = 33554432 (32.0MB)
   free     = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 4
   capacity = 478150656 (456.0MB)
   used     = 13284552 (12.669136047363281MB)
   free     = 464866104 (443.3308639526367MB)
   2.7783193086322986% used

When you tune GC flags to get better performance and reduce GC thrashing (especially when collecting very large heaps), the values you see here can give you the right hints about what's actually being used. The true benefit is the ability to do this on a “real process” without running in a debug environment. This way you can tune your GC in production based on real environment conditions.
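
Most of the configuration values above correspond to standard HotSpot flags, so once you know what you want to change you can apply it on the next launch. A hypothetical launch line (MyApp and the values are placeholders, not recommendations) might look like:

java -Xmx16g -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:MaxMetaspaceSize=256m MyApp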

If you can reproduce a memory leak but don't have a debugger attached, you can use:

jhsdb jmap --pid 72640 --histo

This prints a histogram of the objects in RAM, snipped below for clarity. In it you can find the objects that might be triggering the leak:

Attaching to process ID 72640, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.13+8-LTS
Iterating over heap. This may take a while...
Object Histogram:

num       #instances    #bytes    Class description
--------------------------------------------------------------------------
1:        225689    204096416    int[]
2:        485992    59393024    byte[]
3:        17221    23558328    sun.security.ssl.CipherSuite[]
4:        341376    10924032    java.util.HashMap$Node
5:        117706    9549752    java.util.HashMap$Node[]
6:        306720    7361280    java.lang.String
7:        12718    6713944    char[]
8:        113884    5466432    java.util.HashMap
9:        64683    4657176    java.util.regex.Matcher
10:        95612    4615720    java.lang.Object[]
11:        106233    4249320    java.util.HashMap$KeyIterator
12:        16166    4090488    long[]
13:        126977    4063264    java.util.concurrent.ConcurrentHashMap$Node
14:        150789    3618936    java.util.ArrayList
15:        130167    3546016    java.lang.String[]
16:        156237    3227152    java.lang.Class[]
17:        33145    2916760    java.lang.reflect.Method
18:        32193    2575440    nonapi.io.github.classgraph.fastzipfilereader.FastZipEntry
19:        17314    2051672    java.lang.Class
20:        32043    1794408    io.github.classgraph.ClasspathElementZip$1
21:        107918    1726688    java.util.HashSet
22:        105970    1695520    java.util.HashMap$KeySet

There are a few other capabilities in the jmap command, but they aren't as useful as these two for day-to-day work.
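
One of them is still worth a mention: if you want a full heap dump that you can open later in a tool such as VisualVM or Eclipse MAT, jmap can write one. This is a sketch, so verify the exact flags with jhsdb jmap --help on your JDK; heap.hprof is just a placeholder file name:

jhsdb jmap --pid 72640 --binaryheap --dumpfile heap.hprof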

jinfo

jinfo prints the system properties and the JVM flags. This isn't as useful as the other commands, since we probably know those already. However, if you're debugging on a machine that isn't yours, it can be a helpful starting point:

jhsdb jinfo --pid 72640
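
You can also narrow the output. As a sketch (the exact options are listed by jhsdb jinfo --help), the first command prints only the VM flags and the second only the system properties:

jhsdb jinfo --pid 72640 --flags
jhsdb jinfo --pid 72640 --sysprops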

jsnap

jsnap prints a lot of internal metrics and statistics, including the number of threads started since JVM launch, the peak number of threads, and so on. These are very useful if you want to tune elements such as thread pool sizes.

jhsdb jsnap --pid 72640

This is a snipped version of the output of that command:

Attaching to process ID 72640, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0.13+8-LTS
java.threads.started=418 event(s)
java.threads.live=12
java.threads.livePeak=30
java.threads.daemon=8
java.cls.loadedClasses=16108 event(s)
java.cls.unloadedClasses=0 event(s)
java.cls.sharedLoadedClasses=0 event(s)
java.cls.sharedUnloadedClasses=0 event(s)
java.ci.totalTime=23090159603 tick(s)
java.property.java.vm.specification.version=11
java.property.java.vm.specification.name=Java Virtual Machine Specification
java.property.java.vm.specification.vendor=Oracle Corporation
java.property.java.vm.version=11.0.13+8-LTS
java.property.java.vm.name=OpenJDK 64-Bit Server VM
java.property.java.vm.vendor=Azul Systems, Inc.
java.property.java.vm.info=mixed mode
java.property.jdk.debug=release
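
If the default set of counters isn't enough, jsnap can dump everything it has. As a sketch (check jhsdb jsnap --help on your JDK for the exact option), the --all flag prints all of the performance counters:

jhsdb jsnap --all --pid 72640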

GUI Debugger

I'll skip the CLI debugger. You can launch the GUI debugger with no arguments and use the File menu to connect to a core file, a debug server, or a process ID. Or you can use the standard command-line arguments that work for every other command. For the command-line case we can just use:

jhsdb hsdb --pid 72640

Which launches the debug UI:

[Image: image02-hsdb-gui.png, the hsdb debugger UI]

The amount of capability and information exposed in this tool is absolutely fantastic. A lot of it might be too low-level if we're debugging from the Java side. But if you're working with JNI native code, you can see the addresses matching the JVM threads/methods in the stack. This can provide context for native-code debugging sessions where the symbols aren't exposed coherently.

You can get the stack information for a specific thread (similar to jstack). The Inspector tool exposes deep internal JVM state on a per-thread basis. Most of that information isn't useful for debugging typical application-level bugs, but if you're trying to understand the root cause of a JVM crash it might come in handy, especially if you're somewhat familiar with the VM's internal naming.

There are many other features in the “Tools” menu, such as the Class Browser, Code Viewer, Memory Viewer, etc. All of these are very low-level tools that might come in handy for tracking crashes or niche edge-case issues.
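
The GUI (like most of the other commands) can also work post-mortem against a core dump instead of a live process. As a sketch, assuming a core file named core.72640 and the path to the java binary that produced it (both names are placeholders):

jhsdb hsdb --exe /path/to/java --core core.72640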

Finally

jhsdb is a remarkably powerful and useful tool that I hope you will add to your toolbox. Its biggest fault is that it's too powerful and too low-level, and as a result most Java developers might find it intimidating. That's fair when dealing with some of its more esoteric features, which only make sense when interacting with native code.

However, jhsdb also includes many high-level capabilities, from deadlock detection to JVM information. Those are easily accessible and easy to use. They're performant enough for production environments, and since they're part of the JDK they're available out of the box on the command line. When you need to explore a JVM crash, often after the fact when all you have is a core file, this is the best tool for the job. It's also a pretty great tool if you need to tune your GC, thread pools, or JVM configuration. And if you're debugging misbehaving JNI code, this can be the tool that makes a difference.