Wednesday, August 13, 2008

One reason I like to take thread dumps during performance testing

I've written before about thread dumps and the value of taking them during a performance test. The other day we finished our baseline testing, so we started a duplicate test purely to take thread dumps. Even though I wasn't seeing any anomalies in the baseline, this is something I do to make sure I dot all my i's and cross all my t's. Plus, if there are any anomalies, they will show up in the thread dumps.

Lo and behold, in the thread dumps (remember, we take at least 3 thread dumps spaced at least 2 minutes apart) I found a number of threads sitting on

at java/net/Inet6AddressImpl.lookupAllHostAddr(Native Method)

which seemed odd to me. I live by the rules of mathematics and its definition of randomness. A thread dump taken at a random point in time should catch the threads doing different things, and each dump in the series should look different from the others. If one thread dump in the series shows a couple of threads doing the same thing, that is odd. If more than one thread dump in the series shows more than one thread doing the same thing, then we have a bottleneck! Bottlenecks limit an application's ability to use the CPU and keep throughput down. If you can't fix the bottleneck, you'll need more hardware to scale up, which means spending more money. If you can afford that, then stop here and call your finance guy.
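
The rule above can be sketched in a few lines of Java. This is only an illustration, not the tooling used in this test: `DumpComparer` and `suspectFrames` are hypothetical names, and the sketch assumes each thread dump has already been reduced to a list holding the top stack frame of each thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DumpComparer {

    /**
     * Returns the frames that are the top frame of at least minThreads threads
     * in at least minDumps of the given dumps. Each inner list is one dump,
     * holding the top stack frame of every thread in that dump (assumed input).
     */
    static Set<String> suspectFrames(List<List<String>> dumps, int minThreads, int minDumps) {
        // frame -> number of dumps in which enough threads shared that frame
        Map<String, Integer> dumpsWithCluster = new HashMap<>();
        for (List<String> topFrames : dumps) {
            // frame -> number of threads sitting on it in this one dump
            Map<String, Integer> perDump = new HashMap<>();
            for (String frame : topFrames) {
                perDump.merge(frame, 1, Integer::sum);
            }
            for (Map.Entry<String, Integer> e : perDump.entrySet()) {
                if (e.getValue() >= minThreads) {
                    dumpsWithCluster.merge(e.getKey(), 1, Integer::sum);
                }
            }
        }
        Set<String> suspects = new TreeSet<>();
        for (Map.Entry<String, Integer> e : dumpsWithCluster.entrySet()) {
            if (e.getValue() >= minDumps) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Made-up sample data: three dumps, three threads each.
        String lookup = "java/net/Inet6AddressImpl.lookupAllHostAddr";
        List<List<String>> dumps = Arrays.asList(
                Arrays.asList(lookup, lookup, "frameB"),
                Arrays.asList(lookup, lookup, "frameC"),
                Arrays.asList("frameD", "frameE", "frameC"));
        // "More than one thread" in "more than one dump" -> thresholds of 2 and 2.
        System.out.println(suspectFrames(dumps, 2, 2));
    }
}
```

Idle pool threads parked on the same wait call will also be flagged by a check like this, so the output is a list of candidates to read through, not a verdict.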

I searched the PMR database and found that there is indeed an interesting side effect of IPv6, and it was affecting the throughput of the application I was testing! Fortunately the PMR referenced a technote on the subject, and I'm hoping we can eliminate this issue. The good thing that will come out of this is a throughput improvement in the application once we apply the proper configuration. The improved throughput will mean an immediate cost savings on the additional hardware we would otherwise have had to purchase to make up the difference. A win-win for everyone (well, except for the IBM hardware sales folks, but c'est la vie!).

In this particular multi-tiered environment (Process Server talking to WebSphere Application Server talking to CICS), though, I still have to go back and collect thread dumps on Process Server once I'm satisfied I am not seeing any other issues in the WAS tier. Who knows what anomalies that will uncover (hopefully none, so this testing can wrap up soon).

IBM - HostName lookup causes a JVM hang or slow response
If the DNS is not set up to handle IPv6 queries properly, the application must wait for the IPv6 query to time out.
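
The post does not say which configuration the technote recommends. For anyone chasing the same symptom, a minimal sketch like the one below times a hostname lookup so the delay can be measured before and after a change. `LookupTimer` and `timeLookupMillis` are hypothetical names. `java.net.preferIPv4Stack` is a standard JDK networking property, but treating it as the fix here is an assumption, not something confirmed in this post.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LookupTimer {

    /**
     * Resolves the host and returns the elapsed time in milliseconds,
     * or -1 if the name could not be resolved.
     */
    static long timeLookupMillis(String host) {
        long start = System.nanoTime();
        try {
            InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            return -1;
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Host to test; defaults to localhost when no argument is given.
        String host = args.length > 0 ? args[0] : "localhost";
        // Prints null unless the property was set on the command line.
        System.out.println("java.net.preferIPv4Stack="
                + System.getProperty("java.net.preferIPv4Stack"));
        System.out.println(host + " resolved in " + timeLookupMillis(host) + " ms");
    }
}
```

Running it once as `java LookupTimer somehost` and once as `java -Djava.net.preferIPv4Stack=true LookupTimer somehost` gives a rough comparison. The JVM caches successful lookups, so each measurement should come from a fresh JVM.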

2 comments:

Leroy said...

did you ever apply a PMR and see this fixed???

We are seeing something similar.
