External Debugging Tools 4: Debug the JVM Using jhsdb
Need to diagnose an app, possibly in production? Weird crashes? Gc tuning? Core dump from the JVM? jhsdb is the tool you've been searching for!
8 min read
Java 9 was all about modules. It was the big change and also the most problematic change. It sucked the air out of every other feature that shipped as part of that release. Unfortunately, one victim of this vacuum was
jhsdb which is a complex tool to begin with. This left this amazingly powerful tool in relative obscurity. That’s a shame.
So what is
jhsdb? Oracles documentation defines it as:
“jhsdb is a Serviceability Agent (SA) tool. Serviceability Agent (SA) is a JDK component used to provide snapshot debugging, performance analysis and to get an in-depth understanding of the Hotspot JVM and the Java application executed by the Hotspot JVM.”
This doesn’t really say much about what it is and what it can do. Here’s my simplified take: it’s a tool to debug the JVM itself, and understand core dumps. It unifies multiple simpler tools to get deep insight into the native JVM internals. You can debug JVM failures and native library failures with it.
To get started we can run:
This produces the following output:
clhsdb command line debugger hsdb ui debugger debugd --help to get more information jstack --help to get more information jmap --help to get more information jinfo --help to get more information jsnap --help to get more information
The gist of this is that jhsdb is really 6 different tools:
- debugd - acts as a remote debug server we can connect to
- jstack - stack and lock information
- jmap - heap memory information
- jinfo - basic JVM information
- jsnap - performance information
- Command Line Debugger - I won’t discuss that since I prefer the GUI
- GUI Debugging tool
This isn’t as useful for most cases as you probably don’t want to run this in production. However, if you’re debugging a container locally and need to get information from within the container it might make sense to use
debugd. Unfortunately, because of a bug in the UI you can’t currently connect to a remote server via the GUI debugger. I could only use this with command-line tools such as
jstack (discussed below).
With the command:
jhsdb debugd --pid 72640
I can connect to a process ID and expose it externally. I can then use the command:
jhsdb jstack --connect localhost
To connect to the server with the
jstack operation. Notice that the
--connect argument applies globally and should work for all commands.
Normally we would run the command directly with the process ID instead of going through a server connection. This command produces a thread dump that you might find very familiar and yet, helpful. If you’re looking at a stack process in a user machine or production environment, this is an invaluable tool. Just run:
jhsdb jstack --pid 72640
Then review the status of the JVM… Here are a few lines of the output:
Attaching to process ID 72640, please wait... Debugger attached successfully. Server compiler detected. JVM version is 11.0.13+8-LTS Deadlock Detection: No deadlocks found. "Keep-Alive-Timer" #189 daemon prio=8 tid=0x000000011d81f000 nid=0x881f waiting on condition [0x0000000172442000] java.lang.Thread.State: TIMED_WAITING (sleeping) JavaThread state: _thread_blocked - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame) - sun.net.www.http.KeepAliveCache.run() @bci=3, line=168 (Interpreted frame) - java.lang.Thread.run() @bci=11, line=829 (Interpreted frame) - jdk.internal.misc.InnocuousThread.run() @bci=20, line=134 (Interpreted frame) "DestroyJavaVM" #171 prio=5 tid=0x000000011f809000 nid=0x2703 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE JavaThread state: _thread_blocked
Notice a few things of interest:
- We get information about the JVM running
- It detects deadlocks automatically for us!
- All threads are printed with full stack and compilation status
This snapshot can help us infer many details about how the application acts in production. Is our code compiled? Is it waiting on a monitor? What other threads are running and what are they doing.
If you want to understand what’s happening under the hood in RAM, then this is the command you need to use. It prints everything you need to know about the heap, memory, etc. Here’s a simple use case:
jhsdb jmap --pid 72640 --heap
Which produces this output:
Attaching to process ID 72640, please wait... Debugger attached successfully. Server compiler detected. JVM version is 11.0.13+8-LTS using thread-local object allocation. Garbage-First (G1) GC with 9 thread(s) Heap Configuration: MinHeapFreeRatio = 40 MaxHeapFreeRatio = 70 MaxHeapSize = 17179869184 (16384.0MB) NewSize = 1363144 (1.2999954223632812MB) MaxNewSize = 10305404928 (9828.0MB) OldSize = 5452592 (5.1999969482421875MB) NewRatio = 2 SurvivorRatio = 8 MetaspaceSize = 21807104 (20.796875MB) CompressedClassSpaceSize = 1073741824 (1024.0MB) MaxMetaspaceSize = 17592186044415 MB G1HeapRegionSize = 4194304 (4.0MB) Heap Usage: G1 Heap: regions = 4096 capacity = 17179869184 (16384.0MB) used = 323663048 (308.6691360473633MB) free = 16856206136 (16075.330863952637MB) 1.8839668948203325% used G1 Young Generation: Eden Space: regions = 66 capacity = 780140544 (744.0MB) used = 276824064 (264.0MB) free = 503316480 (480.0MB) 35.483870967741936% used Survivor Space: regions = 8 capacity = 33554432 (32.0MB) used = 33554432 (32.0MB) free = 0 (0.0MB) 100.0% used G1 Old Generation: regions = 4 capacity = 478150656 (456.0MB) used = 13284552 (12.669136047363281MB) free = 464866104 (443.3308639526367MB) 2.7783193086322986% used
When you tune GC flags to get better performance and reduce GC thrashing (especially when collecting very large heaps). The values you see here can give you the right hints on whether something is used. The true benefit is the ability to do this on a “real process” without running in a debug environment. This way you can tune your GC in production based on real environment conditions.
If you could reproduce a memory leak but you don’t have a debugger attached, you can use:
jhsdb jmap --pid 72640 --histo
This prints a histogram of the RAM, snipped below for clarity. You can find the object that might be the trigger of this leak:
Attaching to process ID 72640, please wait... Debugger attached successfully. Server compiler detected. JVM version is 11.0.13+8-LTS Iterating over heap. This may take a while... Object Histogram: num #instances #bytes Class description -------------------------------------------------------------------------- 1: 225689 204096416 int 2: 485992 59393024 byte 3: 17221 23558328 sun.security.ssl.CipherSuite 4: 341376 10924032 java.util.HashMap$Node 5: 117706 9549752 java.util.HashMap$Node 6: 306720 7361280 java.lang.String 7: 12718 6713944 char 8: 113884 5466432 java.util.HashMap 9: 64683 4657176 java.util.regex.Matcher 10: 95612 4615720 java.lang.Object 11: 106233 4249320 java.util.HashMap$KeyIterator 12: 16166 4090488 long 13: 126977 4063264 java.util.concurrent.ConcurrentHashMap$Node 14: 150789 3618936 java.util.ArrayList 15: 130167 3546016 java.lang.String 16: 156237 3227152 java.lang.Class 17: 33145 2916760 java.lang.reflect.Method 18: 32193 2575440 nonapi.io.github.classgraph.fastzipfilereader.FastZipEntry 19: 17314 2051672 java.lang.Class 20: 32043 1794408 io.github.classgraph.ClasspathElementZip$1 21: 107918 1726688 java.util.HashSet 22: 105970 1695520 java.util.HashMap$KeySet
There are a few other capabilities in the
jmap command but they aren’t as useful as those two for day to day work.
Prints the system properties and the JVM flags. This isn’t as useful as the other commands since we probably know those already. However, if you’re debugging on a machine that isn’t yours this can be a helpful starting point:
jhsdb jinfo --pid 72640
jsnap prints a lot of internal metrics and statistics including the number of threads started since JVM launch, peak number of threads, etc. These are very useful if you want to tune elements such as thread pool sizes, etc.
jhsdb jsnap --pid 72640
This is the snipped version of the output for such a command:
Attaching to process ID 72640, please wait... Debugger attached successfully. Server compiler detected. JVM version is 11.0.13+8-LTS java.threads.started=418 event(s) java.threads.live=12 java.threads.livePeak=30 java.threads.daemon=8 java.cls.loadedClasses=16108 event(s) java.cls.unloadedClasses=0 event(s) java.cls.sharedLoadedClasses=0 event(s) java.cls.sharedUnloadedClasses=0 event(s) java.ci.totalTime=23090159603 tick(s) java.property.java.vm.specification.version=11 java.property.java.vm.specification.name=Java Virtual Machine Specification java.property.java.vm.specification.vendor=Oracle Corporation java.property.java.vm.version=11.0.13+8-LTS java.property.java.vm.name=OpenJDK 64-Bit Server VM java.property.java.vm.vendor=Azul Systems, Inc. java.property.java.vm.info=mixed mode java.property.jdk.debug=release
I’ll skip the CLI debugger. You can launch the GUI debugger with no argument and use the file menu to connect to a core file, server or process ID. Or you can use the standard command line that works for every other command. For the command line case we can just use:
jhsdb hsdb --pid 72640
Which launches the debug UI:
The amount of capabilities and information exposed in this tool is absolutely fantastic. A lot of it might be too low level if we’re debugging from the Java side. But if you’re working with JNI native code, you can see the addresses matching the JVM thread/methods in the stack. This can put in context some native code debugging where the symbols might not be exposed coherently.
You can get the stack information for a specific thread (similar to
jstack). The inspector tool contains deep internal JVM state information per thread that can provide internal status. Most of that information isn’t as useful for debugging typical application level bugs but if you’re trying to understand the root cause of a JVM crash it might come in handy. Especially if you’re somewhat familiar with VM internal monikers.
There are many other features in the “Tools” menu such as Class Browser, Code Viewer, Memory viewer, etc. All of those are very low level access tools that might come in handy for tracking crashes or niche edge case issues.
jhsdb is a remarkably powerful and useful tool that I hope you will add into your toolbox. Its biggest fault is that it’s too powerful and too low level. As a result, most Java developers might find it intimidating. This is fair when dealing with some of its more esoteric features that only make sense when interacting with native code.
jhsdb also includes many high-level capabilities from deadlock detection to JVM information. Those are easily accessible and easy to use. They’re performant enough for production environments and since they’re a part of the JDK, they should be available out of the box and in the command line. When you need to explore a JVM crash, often after the fact where all you have is a core file… This is the best tool for the job. It’s also a pretty great tool if you need to tune your GC, thread pools or JVM configurations. If you’re debugging JNI code and misbehavior, this can be the tool that makes a difference.