Solving OutOfMemoryErrors (part 7) – APM tools as a solution?

April 30, 2012 by Nikita Salnikov-Tarnovski

Our series of posts about solving the OutOfMemoryError in our hypothetical production system is nearing its conclusion. We have covered a number of different ways to attack OutOfMemoryErrors so far – profilers, JDK bundled tools and heap dump analyzers. Today's mercenaries are Application Performance Management tools, or APMs.

APM solutions are positioned as the Holy Grail of the quest to solve your production environment's performance problems. You just set up an APM tool of your choice, let it monitor your whole cluster from the front-end load balancer – or even the end user's browser – down to your Oracle or Neo4j database, and then relax in that Aeron chair of yours. The APM provides all the information your operations staff or developers need in order to achieve the desired level of customer satisfaction.

If you thought there was a grain of sarcasm in what I have just said, you are right. I have always been suspicious of jacks of all trades. But let us cast all doubts aside and just try them out.

Our first list of APMs to try consisted of five names: AppDynamics, CA APM (formerly Introscope), dynaTrace, HPjmeter and New Relic.

HPjmeter fell away at once, as it is available for HP-UX platforms only. New Relic has no memory leak tracking capabilities yet, so there we lost a second contestant. For CA APM and dynaTrace we were unable to obtain a free evaluation version. This left us with AppDynamics alone. Kudos to the AppDynamics team, who gladly provided us the opportunity to try their solution!

The AppDynamics installation provides the AppDynamics Controller, a central web-based dashboard that collects and processes data from Java application server agents. The agents are JVM agents, just like Plumbr, and can be attached to different servers and applications. Attached agents send runtime information to the Controller. With our sample application the AppDynamics Controller dashboard looks like this:
AppDynamics in action
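As a side note for readers unfamiliar with how such agents hook in: a JVM agent is just a JAR attached at startup via the -javaagent flag, whose premain method runs before the application's own main. The sketch below is a minimal, hypothetical agent skeleton – the class name and behaviour are ours for illustration, not AppDynamics' or Plumbr's actual internals:

    import java.lang.instrument.Instrumentation;

    // Minimal, hypothetical sketch of a JVM agent entry point. Real APM agents
    // register bytecode transformers and reporting threads here; this one only
    // logs that the hook was reached.
    public class MonitoringAgent {

        // Invoked by the JVM before the application's main() when the process
        // is started with -javaagent:/path/to/agent.jar and the agent JAR's
        // manifest declares Premain-Class: MonitoringAgent
        public static void premain(String args, Instrumentation inst) {
            System.out.println("Agent attached, instrumentation available: " + (inst != null));
        }
    }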

If we dive a little deeper, we can see memory related data per node:
AppDynamics in action - vol 2

The screenshot above shows only a subset of the information provided on that screen – it scrolls all the way down to many more graphs, trends and figures. Which is all nice, but not exactly what we were looking for.

As our goal was to let AppDynamics help us find the root cause of the memory leak we have suffered from for so long, we switched to the “Automatic Leak Detection” tab and activated it (it is switched off by default). Then we started an “On Demand Capture Session” and let the application run under the load of our stress tests. The results were… a bit disappointing. Even after several retries with different values for the parameters AppDynamics allows you to configure, the result was always the same.

This all happened while my application was crashing with the OOM. No luck here today.
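For context, this is the kind of pattern an automatic leak detector is supposed to flag. The snippet below is a hypothetical illustration of a textbook Java leak – an ever-growing static collection – and not the actual code of our demo application:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical example of a classic memory leak: a static "cache" that is
    // only ever written to and never evicted.
    public class LeakingCache {

        private static final Map<String, byte[]> CACHE = new HashMap<String, byte[]>();

        public static void handleRequest(String sessionId) {
            // Each request pins another megabyte into the map. Under sustained
            // load the heap fills up and the JVM eventually dies with
            // java.lang.OutOfMemoryError: Java heap space.
            CACHE.put(sessionId, new byte[1024 * 1024]);
        }
    }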

Conclusion

It was very difficult for me to write this post. First of all, I had only one tool to test, and getting even that one was not an easy task. Secondly, I wasted several hours before I saw the first bit of information in the AppDynamics dashboard (do not ask, I will not write about it). And thirdly, I have no results to show to my readers. But here we are – what can I conclude from this experience?

  1. APM tools require planning. If your house is already on fire, it is too late to go fetch them. As with life insurance, you should have thought about them well before the moment you really need them.
  2. APM tools give you tons of information about the inner workings of your application, performance-wise. But I was hoping for something that answers my question “Why do I have a memory leak and what should I do about it?” more directly.
  3. All in all, APM solutions can be great tools for monitoring and for proactive planning of performance-related maintenance of your application. I cannot recommend them for problem solving, though – for that, see the note right after this list.
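One inexpensive precaution worth repeating here, independent of any APM and assuming nothing beyond a standard HotSpot JVM: ask the JVM itself to produce a heap dump at the moment the error strikes, and analyze it later with the heap dump analyzers covered earlier in this series. The jar name and dump path below are placeholders:

    java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar myapp.jar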

P.S. If you can point to any other tool for solving memory problems in Java applications that we have missed, or can provide us with evaluation licenses for aforementioned products, we will be glad to hear from you. Just contact us at support@plumbr.eu or @JavaPlumbr.
