Throughout those years we have kept a close eye on the problems packaged in the different flavours of the OutOfMemoryError message. Daily digests of new questions on specific keywords, via services such as Google Alerts, have given us a good overview of the situations where applications fail with java.lang.OutOfMemoryError in their logs.
The people facing the problem tend to fall into pretty well-segmented buckets, so I decided to describe some of the more interesting personas a bit.
Self-taught surgeons. These guys are truly creative; I definitely have to give them credit for that. When faced with an unexpected error message, they come up with a sea of explanations for why this particular error might have occurred, and jump right into fixing the problem.
I have seen all of the following so many times that I have lost count. I can only warrant that the examples are both real and scary:
The list could go on forever. What makes me curious is: when software developers behave like this, should I be more careful the next time I approach my doctor as well?
Configuration H2x0rz. If there is a parameter in the JVM configuration, it has to be tweaked. This seems to be the only truth for this particular group. Indeed, what is the chance that the defaults recommended by Oracle's JVM engineers would make any sense? The result? Applications launching with way too large minimum heaps, mangled thread priorities, vastly underutilized tenured spaces, or unsuitable and/or experimental GC algorithms.
Do not get me wrong: if you know what you are doing and are basing this fine-tuning on actual measurements, go ahead. More often than not, though, this type of user has inherited “the right set of configuration options” from somewhere and is now applying the same set of -XX parameters to each and every JVM they see. Please, don't. The highly transactional webapp you last faced is a completely different beast from the data-hungry batch job you have at your fingertips at the moment.
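Measurement starts with knowing what your JVM is actually running with, not what the inherited start script claims. A minimal sketch using the standard java.lang.management API to print the effective heap limits (the class name is my own; the API is part of the JDK):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    public static void main(String[] args) {
        // Effective heap sizing of *this* JVM, regardless of whatever
        // -Xms/-Xmx values were copy-pasted into the launch script.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        System.out.printf("heap init: %d MB, used: %d MB, max: %d MB%n",
                heap.getInit() / (1024 * 1024),
                heap.getUsed() / (1024 * 1024),
                heap.getMax() / (1024 * 1024));
    }
}
```

Comparing these numbers against actual usage under realistic load is a far better starting point for tuning than any borrowed -XX incantation.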
Victims of Data Surge. These guys have been building and running a business app for years without major performance issues. And then lightning strikes and the app is lying dead on the ground with OutOfMemoryErrors in the log files. Some of the users are suddenly able to run the whole app into the ground by executing an operation that loads too much data into memory at once.
Whether it is caused by business being good and the number of customers growing beyond a certain magic point, or by the company acquiring and merging with a competitor, doubling the amount of data, the effect is the same.
Situations like this, once identified, can be resolved by applying a number of well-known tools and techniques. You can defer data loading, process the operation in smaller batches, or change the data structure responsible for storing the data – it is up to you. Many of these solutions could be a good fit.
But what we see instead in such cases is the problem being hidden under an increased heap size. You can indeed escape the OutOfMemoryError by just increasing the -Xmx in your configuration, but more often than not you are still doing your users a disservice. Large operations still take a long time to complete, annoying users with increased latency. Worse yet, by increasing the heap you often cause GC pauses to stretch to intolerable lengths.
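The batching alternative mentioned above can be sketched in a few lines. The repository interface and class names here are hypothetical stand-ins; in practice any paged JDBC or JPA query (LIMIT/OFFSET or keyset pagination) plays the same role:

```java
import java.util.List;

public class BatchedExport {
    // Hypothetical data-access interface; fetchPage returns an empty
    // list once the offset runs past the end of the data set.
    interface OrderRepository {
        List<String> fetchPage(int offset, int limit);
    }

    // Walk the full data set one page at a time: only `pageSize` records
    // are ever resident on the heap, so a 10x growth in data volume no
    // longer demands a 10x larger -Xmx.
    static int export(OrderRepository repo, int pageSize) {
        int processed = 0;
        List<String> page;
        while (!(page = repo.fetchPage(processed, pageSize)).isEmpty()) {
            processed += page.size(); // placeholder for the real per-record work
        }
        return processed;
    }
}
```

The per-page memory footprint stays constant no matter how much the business grows, and the operation degrades gracefully in latency instead of dying with an OutOfMemoryError.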
Minecrafters. If I had to pick a single application responsible for memory leaks, it would be Minecraft. Over the years I have most likely seen thousands of frustrated nine-year-olds forced to deal with heap configuration.
A quick Google search surfaces the extent of the problem, which I guess is a good enough case study for anyone considering shipping desktop software built on Java.
If you did not recognize yourself in any of the groups listed, good. You are among the pragmatic engineers who take pride in their craft by carefully investigating cause-and-effect relations before jumping to conclusions. And I can only recommend subscribing to our Twitter feed.