The term ‘Cobra effect’ stems from an anecdote set during British colonial rule in India. The British government was concerned about the number of venomous cobras, so it offered a reward for every dead snake. Initially the strategy succeeded, as large numbers of snakes were killed for the reward. Eventually, however, people began to breed cobras for the income.
When this was realized, the reward was canceled, but the cobra breeders set their snakes free and the wild cobra population consequently multiplied. The apparent solution to the problem made the situation even worse.
So how is Java heap size related to colonial India and venomous snakes? Bear with me and I’ll guide you through the analogy, using a story from real life as a reference.
You have created an amazing application. So amazing that it becomes truly popular, and the sheer amount of traffic to your new service starts to bring your application to its knees. Digging through the performance metrics, you decide that the amount of heap available to your application will soon become a bottleneck.
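If you want a quick look at heap headroom from inside the application, the standard java.lang.management API exposes it. The following is only a minimal sketch: the class name HeapSnapshot and the helper usedFraction are made up for this illustration, and a single snapshot says little on its own, so in practice you would sample it over time or rely on your monitoring tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class for this illustration only.
public class HeapSnapshot {

    /** Fraction of the maximum heap in use, or -1 when the JVM reports no maximum. */
    static double usedFraction(long usedBytes, long maxBytes) {
        return maxBytes > 0 ? (double) usedBytes / maxBytes : -1.0;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;

        // getMax() returns -1 if the maximum is undefined, hence the guard in usedFraction.
        System.out.printf("used=%d MB, committed=%d MB, max=%d bytes%n",
                heap.getUsed() / mb, heap.getCommitted() / mb, heap.getMax());
        System.out.printf("used fraction of max: %.2f%n",
                usedFraction(heap.getUsed(), heap.getMax()));
    }
}
```

A used fraction that stays close to 1.0 even right after collections is the kind of signal that would lead you to the conclusion in the story.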
So you take the time to launch new infrastructure with six times the original heap. You test your application to verify that it works. You then launch it on the new infrastructure. And immediately complaints start flowing in: your application has become less responsive than it was with your original tiny 2 GB heap. Some of your users face delays of several minutes while waiting for your application to respond. What has just happened?
There can be numerous reasons, of course, but let’s focus on the most likely suspect: the heap size change. A larger heap can have several side effects, such as longer cache warm-up times and problems with fragmentation. But judging from the symptoms, you are most likely facing latency problems during full GC runs.
What does this mean? Java is a garbage-collected language, so the JVM regularly runs internal processes that reclaim unused objects from your heap. And as one might expect, the larger the room, the longer it tends to take the janitor to clean it. The very same applies to cleaning unused objects from memory.
When running applications on small heaps (below 4 GB), you often do not need to think about GC internals. But when you increase the heap to tens of gigabytes, you should definitely be aware of the stop-the-world pauses a full GC can induce. The very same pauses existed with the small heap, but they were significantly shorter: pauses that now last more than a minute might originally have spanned only a few hundred milliseconds.
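To get a rough feel for how much time the JVM spends collecting, you can read the counters exposed by the standard GarbageCollectorMXBean interface. This is a sketch, not a replacement for GC logs: the class name GcTimeReport is made up for this example, and getCollectionTime() reports approximate accumulated time per collector, which for concurrent collectors is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for this illustration only.
public class GcTimeReport {

    /** Sums the accumulated collection time in milliseconds, skipping collectors that report -1. */
    static long totalGcMillis(List<GarbageCollectorMXBean> beans) {
        long total = 0;
        for (GarbageCollectorMXBean gc : beans) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not support it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : beans) {
            System.out.printf("%s: %d collections, %d ms accumulated%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("Total GC time: " + totalGcMillis(beans) + " ms");
    }
}
```

Dividing the accumulated time of the old-generation collector by its collection count gives only an average. A single minute-long pause can hide behind it, so treat these numbers as a first hint and turn to GC logging for individual pause lengths.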
So what can you do in cases when you really need more heap for your application?
One option is to switch to a mostly concurrent collector such as G1 or CMS, which do much of their work while the application keeps running. But be aware of their limitations as well: both of those collectors impose a throughput overhead on your application, and G1 in particular tends to show worse throughput numbers than the stop-the-world alternatives. And when the CMS garbage collector is not fast enough to finish its work before the tenured generation is full, it falls back to the standard stop-the-world GC. So you can still face pauses of 30 seconds or more on heaps of 16 GB and beyond.
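On HotSpot, G1 is selected with -XX:+UseG1GC and CMS with -XX:+UseConcMarkSweepGC (CMS was removed in JDK 14, so the latter only works on older JVMs). After changing flags it is worth confirming which collector the JVM actually picked. The sketch below does that by looking at the names of the registered GarbageCollectorMXBeans. The class ActiveCollectors is hypothetical, and the name-to-family mapping is an assumption based on the names HotSpot commonly reports, so it may not hold for other JVMs or versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for this illustration only.
public class ActiveCollectors {

    /**
     * Maps a GC bean name to a collector family.
     * The names are assumptions based on what HotSpot commonly reports.
     */
    static String family(String beanName) {
        if (beanName.startsWith("G1")) {
            return "G1";
        }
        if (beanName.equals("ConcurrentMarkSweep") || beanName.equals("ParNew")) {
            return "CMS";
        }
        if (beanName.startsWith("PS ")) {
            return "Parallel";
        }
        if (beanName.equals("Copy") || beanName.equals("MarkSweepCompact")) {
            return "Serial";
        }
        return "other";
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + family(gc.getName()));
        }
    }
}
```

If the output shows the Parallel or Serial family after you asked for a concurrent collector, the flag did not take effect and the long full GC pauses described above are still to be expected.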
To conclude: even when a change is backed by good intentions, be aware of both the alternatives and the consequences. Otherwise you may end up like the colonial government that offered rewards for dead cobras.