What is a memory leak?

February 15, 2012 by Nikita Salnikov-Tarnovski

When we talk to people about our solution for discovering memory leaks we immediately get positive feedback. But when we add Java into the equation, the initial excitement is often complemented with questions: “Are there memory leaks in Java? Isn’t Java a garbage-collected language?”

In this post I will explain why memory leaks are in fact a common problem for Java applications.

What is a “Memory leak” in Java?

Let us start by outlining how memory management in Java differs from that in, for example, C. When a C programmer wants to use a variable, he has to manually allocate a region of memory where the value will reside. After the application finishes using that value, the region of memory must be manually freed, i.e. the code freeing the memory has to be written by the developer. In Java, when a developer wants to create and use a new object using, e.g., new Integer(5), he doesn’t have to allocate memory; this is taken care of by the Java Virtual Machine (JVM). During the life of the application, the JVM periodically checks which objects in memory are still being used and which are not. Unused objects can be discarded and their memory reclaimed and reused. This process is called garbage collection, and the corresponding part of the JVM is called the Garbage Collector, or GC.

Java’s automatic memory management relies on the GC, which periodically looks for unused objects and removes them. And here hides the dragon. Simplifying a bit, we can say that a memory leak in Java is a situation where some objects are no longer used by the application, but the GC fails to recognize them as unused. As a result, these objects remain in memory indefinitely, reducing the amount of memory available to the application.

Here I would like to stress one very important point: the notion of “object is not used by the application any more” is totally, absolutely, 100% application specific! Apart from some specific cases where the lifespan of an object can be logically determined (such as a local variable of a method that never escapes that method), object usage can be understood only by the application developer, taking into account all usage patterns of the application.

How can GC distinguish between the unused objects and the ones the application will use at some point in time in the future? The basic algorithm can be described as follows:

  1. There are some objects which are considered “important” by the GC. These are called GC roots and are (almost) never discarded. They are, for example, the local variables and input parameters of currently executing methods, application threads, references from native code and similar “global” objects.
  2. Any object referenced from those GC roots is assumed to be in use and is not discarded. One object can reference another in different ways in Java, the most common being when object A is stored in a field of object B. In that case, we say “B references A”.
  3. The above process is repeated until all objects that can be transitively reached from the GC roots are visited and marked as “in use”.
  4. Everything else is unused and can be thrown away.
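
The four steps above can be sketched as a small graph traversal. This is only a toy model under stated assumptions: the Node class, the "heap" list and the method names are hypothetical and invented for illustration, and a real JVM collector works very differently internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of reachability marking; not how a real JVM collector is implemented.
class ReachabilitySketch {

    // Hypothetical stand-in for a heap object: a name plus its outgoing references.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Steps 1-3: start from the GC roots and follow references transitively.
    static Set<Node> mark(List<Node> gcRoots) {
        Set<Node> inUse = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (inUse.add(current)) {               // first visit: mark as "in use"
                pending.addAll(current.references);
            }
        }
        return inUse;
    }

    // Step 4: everything that was not marked is considered unused.
    static List<Node> unused(List<Node> heap, Set<Node> inUse) {
        List<Node> garbage = new ArrayList<>();
        for (Node n : heap) {
            if (!inUse.contains(n)) garbage.add(n);
        }
        return garbage;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node b = new Node("B");
        Node a = new Node("A");
        Node orphan = new Node("orphan");
        root.references.add(b);
        b.references.add(a);        // "B references A"
        orphan.references.add(a);   // pointing at a live object does not keep orphan alive
        List<Node> heap = Arrays.asList(root, a, b, orphan);
        System.out.println(unused(heap, mark(Arrays.asList(root)))); // prints [orphan]
    }
}
```

Note that reachability is the only criterion here: the sketch has no way of knowing whether the application will ever touch A again, which is exactly the gap a leak falls into.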

Now, it is fairly easy to construct a Java program that satisfies the above definition of a memory leak:

import java.util.HashMap;
import java.util.Map;

public class Calc {
   // Results are stored here but never read back or removed: this is the leak.
   private Map<Integer, Integer> cache = new HashMap<>();

   public int square(int i) {
      int result = i * i;
      cache.put(i, result);
      return result;
   }

   public static void main(String[] args) throws Exception {
      Calc calc = new Calc();
      while (true) {
         System.out.println("Enter a number between 1 and 100");
         int i = readUserInput(); //not shown
         System.out.println("Answer " + calc.square(i));
      }
   }
}

This program reads one number at a time from its user and calculates its square. The implementation uses a primitive “cache” for storing the results of the calculation. But since these results are never read from the cache, the code represents a memory leak according to our definition above. Every distinct input adds an entry that is never removed (nothing in the code enforces the range mentioned in the prompt), so if we let this program run and interact with users long enough, the “cached” results consume a lot of memory.
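
For contrast, here is a minimal sketch of one way such a cache could be kept from growing without bound: the results are actually read back, and the number of entries is capped. The class name BoundedCalc and the MAX_ENTRIES limit are assumptions made for illustration, not part of the original program, and a size cap is only one of several possible policies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative variant of Calc with a size-capped cache (names are hypothetical).
class BoundedCalc {
    // Arbitrary limit chosen for the example, not a recommendation.
    static final int MAX_ENTRIES = 100;

    // An access-ordered LinkedHashMap evicts the least recently used entry
    // whenever removeEldestEntry returns true after an insertion.
    private final Map<Integer, Integer> cache =
        new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    int square(int i) {
        Integer cached = cache.get(i);   // the cache is actually consulted
        if (cached != null) {
            return cached;
        }
        int result = i * i;
        cache.put(i, result);
        return result;
    }

    int cacheSize() {
        return cache.size();
    }

    public static void main(String[] args) {
        BoundedCalc calc = new BoundedCalc();
        for (int i = 0; i < 10_000; i++) {
            calc.square(i);
        }
        System.out.println(calc.cacheSize()); // prints 100
    }
}
```

However many distinct numbers are squared, the map never holds more than MAX_ENTRIES results, so the memory it retains stays bounded.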

This brings us to another important aspect of memory leaks: how big should the leak be to justify the trouble of investigating and fixing it? Technically, whenever you leave an object that you don’t use anymore lying around, you create waste. Practically, a couple of kilobytes here and there don’t really constitute a real problem for modern applications, especially the “enterprise” ones … But a leak is a leak, even if it’s just 200 bytes.

Which leads us to a simple corollary: a memory leak is like good wine – it needs aging. If you want to demonstrate the leak or, more importantly, fix it, you really should let it grow. Tiny memory leaks are lost among all the objects that are present in an application at any given point in time. Regardless of the tool you use to identify memory leaks - be it a profiler, a memory dump analyzer, an APM, or a special-purpose leak finder tool like Plumbr - there should be a lot of objects that have outlived their usefulness. This means that your application should run for a significant period of time AND as many different parts of your application as possible should be executed. Otherwise you will be looking for a needle in a haystack.
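
To make the "let it grow" point concrete, the following rough sketch retains more objects on every round and samples the used heap after each one. Everything in it (the class name, the 1 KB allocations, the number of rounds) is an assumption chosen for illustration; the Runtime-based reading is approximate and fluctuates with GC activity, so only the trend over many rounds is meaningful.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a deliberately leaking list plus a crude heap sample.
class LeakGrowthSketch {
    // Never cleared: stands in for objects that have outlived their usefulness.
    static final List<byte[]> leaked = new ArrayList<>();

    // Retain `count` more 1 KB arrays.
    static void leak(int count) {
        for (int i = 0; i < count; i++) {
            leaked.add(new byte[1024]);
        }
    }

    // Approximate number of heap bytes currently in use.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            leak(10_000); // roughly 10 MB of retained data per round
            System.out.printf("round %d: %d objects retained, ~%d KB used%n",
                    round, leaked.size(), usedHeapBytes() / 1024);
        }
    }
}
```

After a single round the retained arrays are easy to miss among everything else on the heap; after many rounds they dominate it, which is when a tool has a realistic chance of pointing at them.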

If you would like to know more about Java memory leaks, especially about different ways to hunt them down and fix them in your applications, check out our series of blog posts titled “Solving OutOfMemoryError”. And stay tuned to our Twitter @JavaPlumbr. Till next time!

Can't figure out what causes your OutOfMemoryError? Read more


COMMENTS

you said: GC root objects are, for example, currently executing method’s local variables and input parameters, application threads, references from native code and similar “global” objects.
So If I have the following code snippet:
methodA(){
 List list = new linkedList();
}
you mean the list wud never be discarded even after it goes out of scope?Could you pls clarify this point with some examples.

Dsf

It will be discarded. Because when this local variable, “list”, goes out of the scope, it will not qualify as “local variable” anymore. It will be popped out of the execution stack and so GC can reclaim it.

iNikem

A memory leak is when no pointers/references to an allocated block of memory exist. Secondly, in C and C++ you don’t need to allocate every variable on the heap. There is a stack, you know?

The most important thing to note is, that when you want caching in Java, you don’t use a direct reference to an object, that’s what is wrong with your example. Java has WeakReference, which tells the GC to keep the object alive, but to reclaim it when there is a lack of memory.

A532477

Errr, that’s not what a weak reference is. A weak reference tells the GC that you want to keep a reference to the object but that it should be collected as soon as all of the strong references to the object are removed.

What you are describing is a soft reference, which is like a weak reference except that it also indicates that you would like the GC to be pessimistic about discarding the object and only do so if absolutely necessary.

Guest

You are right. We shouldn’t do such a simplistic “caching”. We should use Weak/SoftReferences, we should define expiration policy, we should restrict the size of your cache. But we also should write bug-free code :)

iNikem

This is more an example of a space leak rather than a memory leak because you still have the reference calc pointing to the object in the heap and the object size is growing with every loop…

mikematic

Yes, but that is a memory leak in Java world. Or can you give your definition and an example of memory leak, which is not “space leak”?

iNikem

Look below for definition of space leak. Memory leak is when you lose a pointer to the object in the heap which is not the case here but can happen in languages like C and C++

http://encyclopedia2.thefreedictionary.com/space+leak

mikematic

It seems to me that the difference between a memory leak and a space leak is entirely a notional difference that has to do with the specific boundaries that Java places on application code versus runtime code.

From the outside looking in, it makes little difference whether or not there is a Java reference pointing to some memory, or if it is actually “unreachable,” a term that is only defined in terms of certain programming constructs. Even a classic malloc()/free() memory leak in a C program leaves memory that is “reachable” via libc memory allocator internal structures.

So a memory leak and a space leak exhibit the exact same behavior, and are only different when considered in terms of abstract implementation specifics. Calling unused reachable memory in Java a memory leak sounds totally reasonable to me.

Guest

The JRockit memory leak detector, which will eventually make it into HotSpot is an invaluable tool in situations like this. Here is a free chapter from the book “Oracle JRockit – the definitive guide” by Marcus Lagergren & Marcus Hirt that goes into detail about this powerful tool. The chapter also talks about memory leaks in greater depth and is recommended further reading if you liked this blog post.

http://www.packtpub.com/sites/default/files/8068-chapter-10-the-memory-leak-detector.pdf

Guest
