This tool can be used to analyze the high-cpu.out and high-cpu-tdump.out files generated by the scripts on How do I identify high CPU utilization by Java threads on Linux/Solaris from the Red Hat Customer Portal.
NOTE: This tool is similar to Red Hat Labs' JVMPeg and javahighcpu, but provides more useful features (I think) and does not require Node.js.
Running the tool with the directory containing the high-cpu.out and high-cpu-tdump.out will generate output like the following:
$ ./high-cpu.py /tmp
================================================================================
DATE/TIME: 2022-12-13 15:10:53
HIGH-CPU : /tmp/high-cpu.out
TDUMP : /tmp/high-cpu-tdump.out
CPU : 2.5 us, 0.8 sy, 95.9 id, load average: 0.92 / 0.53 / 0.45
CPU% : 0.00%
TASKS : 83 total
--------------------------------------------------------------------------------
47953 0xbb51 0.00 2.30 java
47954 0xbb52 0.00 2.30 DestroyJavaVM
47955 0xbb53 0.00 2.30 GC task thread#0 (ParallelGC)
47956 0xbb54 0.00 2.30 GC task thread#1 (ParallelGC)
47957 0xbb55 0.00 2.30 GC task thread#2 (ParallelGC)
...
================================================================================
DATE/TIME: 2022-12-13 15:11:13
HIGH-CPU : /tmp/high-cpu.out
TDUMP : /tmp/high-cpu-tdump.out
CPU : 0.8 us, 0.8 sy, 97.6 id, load average: 0.95 / 0.56 / 0.46
CPU% : 0.00%
TASKS : 83 total
--------------------------------------------------------------------------------
47953 0xbb51 0.00 2.30 java
47954 0xbb52 0.00 2.30 DestroyJavaVM
47955 0xbb53 0.00 2.30 GC task thread#0 (ParallelGC)
47956 0xbb54 0.00 2.30 GC task thread#1 (ParallelGC)
47957 0xbb55 0.00 2.30 GC task thread#2 (ParallelGC)
...
The top of each output will contain the date, the files used, as well as some info from top
(high-cpu.out).
After that, the tool will print out a line for each thread using the thread name from high-cpu-tdump.out. Each line contains the PID (decimal), Native ID (hex), CPU usage, Memory usage, and the thread name.
PID NID CPU MEM Name
47955 0xbb53 0.00 2.20 GC task thread#0 (ParallelGC)
Use the -S
option to include the thread dump in the output.
$ ./high-cpu.py -S /tmp
...
47964 0xbb5c 0.00 2.20 Reference Handler
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000b06fbca8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000b06fbca8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers:
- None
47965 0xbb5d 0.00 2.20 Finalizer
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000b0b0cd98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000b0b0cd98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)
Locked ownable synchronizers:
- None
47966 0xbb5e 0.00 2.20 Signal Dispatcher
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
You can combine the -S
and -I
options to get output that more closely resembles the high-cpu-tdump.out file.
$ ./high-cpu.py -SI /tmp
...
47964 0xbb5c 0.00 2.20 "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x000055ed1c515000 nid=0xbb5c in Object.wait() [0x00007f49fb26f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000b06fbca8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000b06fbca8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers:
- None
47965 0xbb5d 0.00 2.20 "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x000055ed1c51f800 nid=0xbb5d in Object.wait() [0x00007f49fb16f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000b0b0cd98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000b0b0cd98> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)
Locked ownable synchronizers:
- None
47966 0xbb5e 0.00 2.20 "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x000055ed1c54c000 nid=0xbb5e runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
You can use the -T
(thread name) and -t
options to filter the results by thread name or PID or NID.
This is useful for tracking one or multiple threads over multiple timestamps.
NOTE: you can use these options multiple times.
NOTE: regular expressions are allowed!
$ ./high-cpu.py -T 'default I/O-[0-9]+' /tmp
...
48060 0xbbbc 0.00 2.20 default I/O-1
48061 0xbbbd 0.00 2.20 default I/O-2
48062 0xbbbe 0.00 2.20 default I/O-3
48063 0xbbbf 0.00 2.20 default I/O-4
$ ./high-cpu.py -t 48060 -t 0xbbbd /tmp
...
48060 0xbbbc 0.00 2.20 default I/O-1
48061 0xbbbd 0.00 2.20 default I/O-2
Use the -l
option to limit the number of threads printed for each timestamp.
-l 10
would limit the result to the first 10 threads.
You can also only print threads over a certain CPU percentage with the --cpu
option.
Specifying --cpu 4.8
would only show threads that are using more than 4.8% of the CPU.
This tool should allow you to handle dates in different locales.
You can use the --datetime-format
and --datetime-regex
options to specify custom date time formats.
For example, to handle files with dates in Japanese (such as 2022年 12月 13日 火曜日 15:10:53 JST
), you can run the tool like so:
LC_TIME=ja_JP.UTF-8 ./high-cpu.py --datetime-regex '[0-9]{4}年 +1?[0-9]月 +1?[0-9]日' --datetime-format '%Y年 %m月 %d日 %A %H:%M:%S %Z' ...
NOTE: if you need any locale specific items in the datetime format (ex: %A
for the weekday name), you need to make sure the tool is running in the right locale.
On Linux, you can specify the locale in the $LC_TIME
environment variable.
NOTE: the datetime-regex
does not need to match the full datetime format; just enough to distinguish it from other lines in the data files.
This tool should still work even if you have multiple high-cpu.out and high-cpu-tdump.out files in multiple directories.
For example, given the following directory structure:
/data
/20221201
high-cpu.out
high-cpu-tdump.out
/20221202
high-cpu.out
high-cpu-tdump.out
Running the tool with /data
as the data directory will find both high-cpu.out and high-cpu-tdump.out in the subdirectories and print out the data in the correct order.
If you save the data files in a high-cpu-PID.out and high-cpu-tdump-PID.out, you can specify the PID file using the -p
option.
For example, given the following:
/data
/20221201
high-cpu-100.out
high-cpu-tdump-100.out
high-cpu-200.out
high-cpu-tdump-200.out
/20221202
high-cpu-100.out
high-cpu-tdump-100.out
high-cpu-200.out
high-cpu-tdump-200.out
Running ./high-cpu.py -p 100 /data
will print out a report for process ID 100 files (high-cpu-100.out and high-cpu-tdump-100.out).