To investigate per-thread CPU usage on Linux, run 'top' with the -H option, which adds per-thread information that the default 'top' output does not show.
The output of 'top -H' breaks down the CPU usage on the machine by individual thread. The header section of the output looks like this:
top - 16:15:45 up 21 days, 2:27, 3 users, load average: 17.94, 12.30, 5.52
Tasks: 150 total, 26 running, 124 sleeping, 0 stopped, 0 zombie
Cpu(s): 87.3% us, 1.2% sy, 0.0% ni, 27.6% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 4039848k total, 3999776k used, 40072k free, 92824k buffers
Swap: 2097144k total, 224k used, 2096920k free, 1131652k cached
The Cpu(s) row in this header section breaks down the CPU usage as follows:
us - Percentage of CPU time spent in user space.
sy - Percentage of CPU time spent in kernel space.
ni - Percentage of CPU time spent running niced (lowered-priority) user processes.
id - Percentage of CPU time spent idle.
wa - Percentage of CPU time spent waiting for I/O (for example, disk) to complete.
hi - Percentage of CPU time spent handling hardware interrupts.
si - Percentage of CPU time spent handling software interrupts.
The "us", "sy" and "id" values are the most useful: they give the user, system (kernel) and idle CPU time respectively.
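For scripted monitoring, the Cpu(s) row can be split into its fields. The following Python sketch assumes the header format shown above, where each value is followed by a percent sign and its label; the function name parse_cpu_row is made up for illustration, and other versions of top may print the row slightly differently.

```python
import re

def parse_cpu_row(row):
    """Return {label: percentage} for a 'Cpu(s): 87.3% us, ...' header row."""
    # Each field looks like '87.3% us': a number, a percent sign, then a label.
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s*([a-z]+)", row)
    return {label: float(value) for value, label in pairs}

row = "Cpu(s): 87.3% us, 1.2% sy, 0.0% ni, 27.6% id, 0.0% wa, 0.0% hi, 0.0% si"
cpu = parse_cpu_row(row)
print(cpu["us"], cpu["sy"], cpu["id"])
```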
The next section shows the per-thread breakdown of the CPU usage.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31253 user1 16 0 2112m 2.1g 1764 R 37.0 53.2 0:39.89 java
31249 user1 16 0 2112m 2.1g 1764 R 15.5 53.2 0:38.29 java
31244 user1 16 0 2112m 2.1g 1764 R 13.6 53.2 0:40.05 java
31250 user1 16 0 2112m 2.1g 1764 R 13.6 53.2 0:41.23 java
31242 user1 16 0 2112m 2.1g 1764 R 12.9 53.2 0:40.56 java
31238 user1 16 0 2112m 2.1g 1764 S 12.6 53.2 1:22.21 java
31246 user1 16 0 2112m 2.1g 1764 R 12.6 53.2 0:39.62 java
31248 user1 16 0 2112m 2.1g 1764 R 12.6 53.2 0:39.40 java
31258 user1 16 0 2112m 2.1g 1764 R 12.6 53.2 0:39.98 java
31264 user1 17 0 2112m 2.1g 1764 R 12.6 53.2 0:39.54 java
31243 user1 16 0 2112m 2.1g 1764 R 12.2 53.2 0:37.43 java
31245 user1 16 0 2112m 2.1g 1764 R 12.2 53.2 0:37.53 java ...
This provides the following information for each thread:
PID - The thread ID. Convert it to hexadecimal to correlate it with the "native ID" in a Java thread dump file.
USER - The user name of the user that started the process.
PR - The priority of the thread.
NI - The "nice" value for the process.
VIRT - The virtual memory (allocated) usage of the process.
RES - The resident memory (committed) usage of the process.
SHR - The shared memory usage of the process.
S - The state of the thread. This can be one of the following:
R - Running
S - Sleeping
D - Uninterruptible sleep
T - Traced or stopped
Z - Zombie
%CPU - The percentage of a single CPU used by the thread.
%MEM - The percentage of the memory used by the process.
TIME+ - The amount of CPU time used by the thread.
COMMAND - The name of the process executable.
Note that the "Cpu(s)" line in the header of the output shows the percentage usage across all of the available CPUs, whereas the %CPU column above shows usage as a percentage of a single CPU. For example, on a four-CPU machine the Cpu(s) row totals 100%, while the %CPU column can total up to 400%. To see the per-CPU usage in the header section, press 1.
What to look for in the top -H output?
In the per-thread breakdown of the CPU usage shown above, the Java process is taking approximately 75% of the CPU. This value is found by totaling the %CPU column for all the Java threads (not all threads are shown above) and dividing by the number of CPUs. The Java process is not being starved by other processes, because approximately 25% of the CPU is still idle.
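The calculation can be sketched in Python as follows. This is an illustration only: the function name process_cpu_share is made up, the CPU count of 4 is an assumption (the machine's CPU count is not stated here), and the sample contains only the first three rows of the listing, so the result is lower than the figure for the whole process.

```python
def process_cpu_share(top_rows, command, cpu_count):
    """Sum %CPU over the threads of `command`, then divide by the CPU count."""
    total = 0.0
    for row in top_rows.splitlines():
        fields = row.split()
        # Thread rows have 12 columns:
        # PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip the column header and any blank or partial lines
        if fields[11] == command:
            total += float(fields[8])
    return total / cpu_count

SAMPLE = """\
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31253 user1 16 0 2112m 2.1g 1764 R 37.0 53.2 0:39.89 java
31249 user1 16 0 2112m 2.1g 1764 R 15.5 53.2 0:38.29 java
31244 user1 16 0 2112m 2.1g 1764 R 13.6 53.2 0:40.05 java
"""

# cpu_count=4 is an assumed value for this example.
share = process_cpu_share(SAMPLE, "java", cpu_count=4)
print(f"{share:.1f}% of total CPU")
```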
You can also see that the CPU usage of the Java process is spread reasonably evenly over all of its threads. This spread implies that no one thread has a particular problem. However, although the application is allowed to use most of the available CPU, approximately 25% of the total CPU is idle, which suggests that there are points of contention or delay in the Java process that can be identified.
If the report shows that active processes are using only a small percentage of CPU while the machine is otherwise idle, the performance of the application is probably limited by points of contention or process delay, which prevent it from scaling to use all of the available CPU.
If a deadlock is present, the reported CPU usage for the Java process is low or zero.
If threads are looping, the Java CPU usage approaches 100%, but a small number of the threads account for all of that CPU time.
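One way to tell an even spread from a looping pattern is to measure how much of the process's CPU time its busiest threads account for. The Python sketch below is only an illustration: the function name hottest_share and the figures in the looping example are made up, and no particular cut-off value is implied.

```python
def hottest_share(thread_cpu, top_n=2):
    """Fraction of the process's CPU time used by its top_n busiest threads."""
    total = sum(thread_cpu.values())
    if total == 0:
        return 0.0  # nothing is running, as when the process is deadlocked
    hottest = sorted(thread_cpu.values(), reverse=True)[:top_n]
    return sum(hottest) / total

# %CPU per thread ID, taken from the twelve rows of the listing above.
even_spread = {
    31253: 37.0, 31249: 15.5, 31244: 13.6, 31250: 13.6,
    31242: 12.9, 31238: 12.6, 31246: 12.6, 31248: 12.6,
    31258: 12.6, 31264: 12.6, 31243: 12.2, 31245: 12.2,
}
# Hypothetical figures for a process in which one thread is looping.
looping = {101: 99.0, 102: 0.6, 103: 0.4}

print(f"{hottest_share(even_spread):.2f}")
print(f"{hottest_share(looping, top_n=1):.2f}")
```

A value close to 1 for a small top_n matches the looping pattern, while a low value matches the even spread described earlier.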
Whenever you have threads of interest, note their PID values, convert them to hexadecimal, and look up the threads in the thread dump file to discover the names of the application threads. Then look at each thread's stack trace to understand the kind of work it is doing.
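The conversion and lookup can be sketched in Python. The dump excerpt below is hypothetical and much shorter than a real thread dump, and the function names are made up; the sketch assumes only that the dump prints the native ID in hexadecimal with a 0x prefix, which may not hold for every JVM.

```python
import re

def native_id(thread_pid):
    """Hexadecimal form of a thread ID reported by top -H."""
    return format(thread_pid, "#x")

def find_in_dump(dump_text, thread_pid):
    """Return the dump lines that mention the thread's hexadecimal native ID."""
    pattern = re.compile(
        r"\b" + re.escape(native_id(thread_pid)) + r"\b", re.IGNORECASE
    )
    return [line for line in dump_text.splitlines() if pattern.search(line)]

# Hypothetical dump excerpt; real thread dumps carry more detail per thread.
DUMP = '''\
"worker-1" prio=5 tid=0x00007f0c1c0a1000 nid=0x7a15 runnable
"worker-2" prio=5 tid=0x00007f0c1c0a2800 nid=0x7a11 waiting on condition
'''

print(native_id(31253))  # 0x7a15
for line in find_in_dump(DUMP, 31253):
    print(line)
```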
How to generate a thread dump for a Java process?
A thread dump for a Java process can be generated by sending the process signal 3 (SIGQUIT), where <pid> is the ID of the Java process:
/bin/kill -3 <pid>
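The same signal can be sent from a monitoring script. The Python sketch below is a minimal equivalent of the command above; the function name request_thread_dump is made up, and where the dump ends up (the JVM's console output or a separate file) depends on the JVM in use.

```python
import os
import signal

def request_thread_dump(java_pid):
    """Send SIGQUIT (signal 3) to the process, as /bin/kill -3 does."""
    os.kill(java_pid, signal.SIGQUIT)
```

Call request_thread_dump with the process ID of the Java process, ideally at about the same time as the top -H output is captured so that the thread IDs in both can be matched.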