Tuesday, December 21, 2010

Identify the Java process hogging all the CPU

1- Identify the Java process hogging all the CPU

To do that, use the top or prstat command to get the process ID:

$ top
last pid: 25837; load averages: 0.06, 0.15, 0.36 16:14:18
73 processes: 72 sleeping, 1 on cpu
CPU states: 0.0% idle, 99.1% user, 0.9% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1191M free, 2154M swap in use, 1306M swap free

PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
1864 rdadmin 53 59 0 184M 88M sleep 7:38 98.10% java
27794 iwui 39 59 0 222M 186M sleep 882:38 0.42% java
27708 iwui 71 59 0 52M 34M sleep 82:40 0.06% java
24025 root 29 29 10 232M 146M sleep 716:03 0.04% java
23449 rdadmin 1 59 0 2616K 2112K sleep 0:00 0.02% bash
287 root 13 59 0 6736K 5560K sleep 110:25 0.01% picld

$ prstat
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1864 rdadmin 184M 88M sleep 59 0 0:07:39 98.10% java/53
27794 iwui 222M 186M sleep 59 0 14:42:46 0.9% java/39
26030 rdadmin 4592K 4200K cpu2 49 0 0:00:00 0.1% prstat/1
27708 iwui 52M 34M sleep 59 0 1:22:41 0.1% java/71
24025 root 232M 146M sleep 29 10 11:57:18 0.1% java/29
287 root 6736K 5560K sleep 59 0 1:50:25 0.0% picld/13
15686 oemadmin 107M 93M sleep 29 10 1:13:19 0.0% emagent/5
15675 oemadmin 8096K 7432K sleep 29 10 0:12:34 0.0% perl/1

Note that the percentage reported depends on the number of CPUs in the server and the number of CPU-hogging threads. For example, if only one thread is hogging a CPU on a 2-processor server, the process associated with that thread will show around 50% CPU.
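
To interpret those percentages, it helps to know how many processors the box actually has; on Solaris, psrinfo can tell you (a quick side check, not strictly part of the procedure):

$ psrinfo -p          # number of physical processors
$ psrinfo | wc -l     # number of virtual processors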

2- Identify the thread IDs that are hogging all the CPU

To do that, we will use the prstat command with a special switch:

bash-2.03# prstat -L
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1864 rdadmin 985M 577M cpu2 0 2 0:58.24 22% java/84
1864 rdadmin 985M 577M cpu0 0 2 0:51.38 22% java/138
1864 rdadmin 985M 577M cpu3 0 2 0:50.23 22% java/122
1864 rdadmin 985M 577M run 0 2 0:58.41 21% java/145
26430 tomcat 877M 456M sleep 52 2 0:09.27 1.2% java/9
26399 tomcat 985M 577M sleep 52 2 0:11.09 0.9% java/9
26461 tomcat 481M 324M sleep 52 2 0:06.36 0.7% java/9
26430 tomcat 877M 456M sleep 47 2 0:00.11 0.5% java/174
26430 tomcat 877M 456M sleep 28 2 0:00.09 0.3% java/93
26492 tomcat 460M 300M sleep 52 2 0:06.01 0.2% java/9
26430 tomcat 877M 456M sleep 22 2 0:00.14 0.2% java/129
26461 tomcat 481M 324M sleep 32 2 0:00.04 0.2% java/104

Here you can see 4 threads that are using all of the machine's CPU.
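
If the full prstat -L output is too noisy, you can limit it to the suspect process with the -p switch, something like:

$ prstat -L -p 1864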

3- Get the real thread ID

For a reason unknown to me, prstat shows only the LWP ID (lightweight process ID), which appears at the end of each prstat output line. We now need to find the real thread ID. They are generally the same, but they can differ. To do that, use the pstack command. Be sure to be logged in as the process owner or root. Grep the pstack output to get only the significant lines:

$ pstack 1864 | grep "lwp#" | grep "84"
----------------- lwp# 84 / thread# 83 --------------------

The real thread ID is 83.

Now convert that number to hex: 83 decimal is 0x53.
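
If you don't want to do the conversion in your head, printf can do it for you:

$ printf "0x%x\n" 83
0x53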

4- Get the Java process stack in real time

Now comes the interesting part: getting the Java process stack for all threads. To do that, you need to send a special signal to the process using the kill command (don't worry, this will not kill the process, it only forces the JVM to dump its thread stacks to the default output log).

We will run the two commands together, to be sure to catch the output in the log. So first change your current directory to the Tomcat log directory (for a Tomcat process).

$ cd /applications/tomcat/servers/tomcat1_rd1/logs

Then send signal 3 (SIGQUIT) to the Java process to force it to dump its thread stacks, and tail the log:

$ kill -3 1864 ; tail -f catalina.out

Wait maybe 1-3 seconds and you will see a large output appended to the log.

Look at the dump and locate nid=0x###, which is your thread ID in hex (0x53 in this example):


TP-Processor24" daemon prio=10 tid=0x00a0b180 nid=0x53 runnable [0xb3e80000..0xb3e81a28]
at java.util.ArrayList.indexOf(ArrayList.java:221)
at java.util.ArrayList.contains(ArrayList.java:202)
at com.company.content.business.ContentHandler.getRandomContent(ContentHandler.java:248)
at com.company.rd.action.CycleContentTypeAction.execute(CycleContentTypeAction.java:100)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:480)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:274)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1420)
at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:502)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.content.filter.MemberContentBundleStatusFilter.doFilter(MemberContentBundleStatusFilter.java:319)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.rd.filter.GenricRequestValuesFilter.doFilter(GenricRequestValuesFilter.java:94)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.rd.filter.BreadcrumbFilter.doFilter(BreadcrumbFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.rd.filter.RdAutoLoginFilter.doFilter(RdAutoLoginFilter.java:71)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.servlet.filters.GrabTrackingParametersFilter.doFilter(GrabTrackingParametersFilter.java:81)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.servlet.filters.HibernateFilter.doFilter(HibernateFilter.java:78)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at com.readersdigest.servlet.filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:134)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:307)
at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:385)
at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:748)
at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:678)
at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:871)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
at java.lang.Thread.run(Thread.java:595)

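By the way, if the dump is long, you can jump straight to the thread you care about with a small awk one-liner (it prints from the nid line down to the next blank line, which is where a thread's stack ends in this dump format):

$ awk '/nid=0x53/,/^$/' catalina.out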

You now have a running thread's stack to investigate a possible infinite loop or whatever else is hogging all the CPU.
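
Finally, if you do this often, the steps can be strung together in a small script. This is only a rough sketch under the assumptions used in this post (Solaris, pstack headers formatted as shown above, logged in as the process owner or root); the script name and its arguments are made up for the example:

#!/bin/sh
# Rough sketch -- hotthread.sh <pid> <lwp id> <catalina.out path>
# All three arguments are supplied by you; nothing is auto-detected.
PID=$1
LWP=$2
LOG=$3

# Step 3: map the LWP ID to the real thread ID from the pstack header line,
# e.g. "----------------- lwp# 84 / thread# 83 --------------------".
TID=`pstack $PID | grep "lwp# $LWP " | awk '{print $6}'`
HEX=`printf "0x%x" $TID`
echo "lwp# $LWP is thread# $TID (nid=$HEX)"

# Step 4: force a thread dump and print the matching stack from the log.
# Note: older dumps already present in the log will match as well.
kill -3 $PID
sleep 3
awk "/nid=$HEX/,/^\$/" $LOG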