How is ThreadLocal implemented?

October 29, 2013 by Nikita Salnikov-Tarnovski

This is a follow-up to last week's post, where I explained the motivation behind ThreadLocal usage. As that post showed, ThreadLocal is a handy concept when you want an independently initialized copy of a variable for each thread. Now, the curious ones might have already started asking: “How could I implement such a concept in Java?”

Or you might feel that this will not be an interesting topic. After all, all you need here is a Map, isn’t it? When dealing with a ThreadLocal<T>, it seems to make all the sense in the world to implement the solution as a HashMap<Thread, T> with Thread.currentThread() as the key. Actually, it is not that simple. So if you have five minutes, bear with me and I will guide you through a beautiful design concept.

The first obvious problem with the simple HashMap solution is thread safety. As HashMap is not built to support concurrent usage, we cannot safely use it in a multi-threaded environment. Fortunately, we do not need to look far for the fix: ConcurrentHashMap<Thread, T> looks like a match made in heaven. Full concurrency of retrievals and adjustable expected concurrency for updates are exactly what we need in the first place.
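A minimal sketch of this first attempt, assuming Java 8 or later. The names NaiveThreadLocal and NaiveThreadLocalDemo are hypothetical and exist only for illustration; this is not how the JDK implements ThreadLocal.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical class for illustration only.
class NaiveThreadLocal<T> {
    // One map shared by all threads, keyed by the thread itself.
    private final ConcurrentMap<Thread, T> values = new ConcurrentHashMap<>();
    private final Supplier<T> initialValue;

    NaiveThreadLocal(Supplier<T> initialValue) {
        this.initialValue = initialValue;
    }

    T get() {
        // Lazily create the calling thread's copy on first access.
        return values.computeIfAbsent(Thread.currentThread(), t -> initialValue.get());
    }

    void set(T value) {
        // Note: ConcurrentHashMap rejects null values, so set(null) would throw.
        values.put(Thread.currentThread(), value);
    }
}

class NaiveThreadLocalDemo {
    // The calling thread overrides its copy; a worker thread still sees the initial value.
    static String run() {
        NaiveThreadLocal<String> local = new NaiveThreadLocal<>(() -> "initial");
        local.set("main");
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = local.get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] + "/" + local.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "initial/main"
    }
}
```

Functionally this behaves like a per-thread variable, which is why the approach looks so tempting at first.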

Now, if you based the JDK's ThreadLocal implementation on a ConcurrentHashMap, you would introduce two serious problems.

  • First and foremost, you are using Threads as keys in the Map. As long as the map itself is reachable, it keeps a strong reference to every Thread that ever used it, preventing those threads and their values from being garbage collected. Unwittingly, you have built a massive memory leak into the design.
  • The second problem might take longer to surface. Even with the clever segmentation under the hood reducing the chance of lock contention, ConcurrentHashMap still carries synchronization overhead. With the synchronization requirement still in place, you have a structure that is a potential bottleneck.
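The first of these problems is easy to observe. The sketch below (the class ThreadKeyLeakDemo is a hypothetical name, and the 1 KB payload is an arbitrary choice) starts a few short-lived threads that each store a value in a shared map keyed by Thread, then counts the entries left once every thread has terminated.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class for illustration only.
class ThreadKeyLeakDemo {
    // Shared map keyed by Thread, as in the naive design.
    static final ConcurrentMap<Thread, byte[]> VALUES = new ConcurrentHashMap<>();

    // Runs `count` short-lived threads and returns how many entries
    // remain in the map after all of them have finished.
    static int entriesLeftAfter(int count) {
        VALUES.clear();
        Thread[] workers = new Thread[count];
        for (int i = 0; i < count; i++) {
            workers[i] = new Thread(() -> VALUES.put(Thread.currentThread(), new byte[1024]));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // The threads are dead, but the map still references each of them.
        return VALUES.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesLeftAfter(5)); // prints 5
    }
}
```

Nothing ever removes the entries, so every terminated thread and its value stay reachable for as long as the map does.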

But let us solve the biggest issue first. Our data structure needs to allow a thread to be garbage collected when our reference is the last one pointing to it. Again, the first possible solution is staring right at us: instead of ordinary strong references, why not use WeakReferences? The implementation would then look similar to the following:

Collections.synchronizedMap(new WeakHashMap<Thread, T>())
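Wrapped into a small class, that line might be used roughly as follows. This is only a sketch: WeakKeyThreadLocal and its demo class are hypothetical names, and the behavior once a thread becomes unreachable depends on when the garbage collector runs, so it is not demonstrated here.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical class for illustration only.
class WeakKeyThreadLocal<T> {
    // Keys are held weakly, so an otherwise unreachable Thread can be collected.
    // Values are still held strongly until their entry is expunged.
    // The synchronized wrapper guards every call with a single lock.
    private final Map<Thread, T> values =
            Collections.synchronizedMap(new WeakHashMap<Thread, T>());

    T get() {
        return values.get(Thread.currentThread());
    }

    void set(T value) {
        values.put(Thread.currentThread(), value);
    }
}

class WeakKeyThreadLocalDemo {
    // The calling thread sets a value; a fresh worker thread has none yet.
    static String run() {
        WeakKeyThreadLocal<String> local = new WeakKeyThreadLocal<>();
        local.set("main");
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = local.get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return local.get() + "/" + seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "main/null"
    }
}
```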

Now we have gotten rid of the leak: if nobody besides us refers to the Thread, it can be garbage collected. But we still have not sorted out the concurrency issues; if anything, Collections.synchronizedMap makes them worse by guarding every call with a single lock. The solution is a fine example of thinking outside the box. So far we have thought of ThreadLocal variables as Threads mapping to values. But what if we reverse the thinking and instead envision a mapping of ThreadLocal objects to values inside each Thread? If each thread stores its own mapping, and ThreadLocal is just an interface into that mapping, we avoid the synchronization issues. Better yet, we also sidestep the problems with GC!
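A toy model of this inversion, under stated assumptions: since we cannot add a field to java.lang.Thread ourselves, a hypothetical CarrierThread subclass carries the map, and a hypothetical PerThread class plays the role of the interface into it. The sketch uses a plain HashMap with strong keys for simplicity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical thread subclass that carries its own variable map.
class CarrierThread extends Thread {
    // Only ever touched by this thread itself, so no locking is needed.
    final Map<PerThread<?>, Object> locals = new HashMap<>();

    CarrierThread(Runnable task) {
        super(task);
    }
}

// Hypothetical handle: holds no data itself and acts as a key into the current thread's map.
class PerThread<T> {
    @SuppressWarnings("unchecked")
    T get() {
        return (T) currentLocals().get(this);
    }

    void set(T value) {
        currentLocals().put(this, value);
    }

    private static Map<PerThread<?>, Object> currentLocals() {
        Thread t = Thread.currentThread();
        if (!(t instanceof CarrierThread)) {
            throw new IllegalStateException("toy model only works on a CarrierThread");
        }
        return ((CarrierThread) t).locals;
    }
}

class PerThreadDemo {
    // Thread a sets a value; thread b never does and therefore sees null.
    static String run() {
        PerThread<String> name = new PerThread<>();
        String[] seen = new String[2];
        CarrierThread a = new CarrierThread(() -> { name.set("a"); seen[0] = name.get(); });
        CarrierThread b = new CarrierThread(() -> seen[1] = name.get());
        try {
            a.start();
            a.join();
            b.start();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] + "," + seen[1];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "a,null"
    }
}
```

Because each map is reachable only through its owning thread, it becomes garbage together with that thread, and no two threads ever contend for the same map.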

And indeed, when we open up the source code of the ThreadLocal and Thread classes, we see that this is exactly how the solution is implemented in the JDK:

public class Thread implements Runnable {
	ThreadLocal.ThreadLocalMap threadLocals = null;
	// cut for brevity
}
public class ThreadLocal<T> {
	static class ThreadLocalMap {
		// cut for brevity
	}

	ThreadLocalMap getMap(Thread t) {
		return t.threadLocals;
	}

	public T get() {
		Thread t = Thread.currentThread();
		ThreadLocalMap map = getMap(t);
		if (map != null) {
			ThreadLocalMap.Entry e = map.getEntry(this);
			if (e != null)
				return (T) e.value;
		}
		return setInitialValue();
	}

	private T setInitialValue() {
		T value = initialValue();
		Thread t = Thread.currentThread();
		ThreadLocalMap map = getMap(t);
		if (map != null)
			map.set(this, value);
		else
			createMap(t, value);
		return value;
	}
	// cut for brevity
}

So here we have it. The Thread class keeps a reference to a ThreadLocal.ThreadLocalMap instance, which holds its keys (the ThreadLocal objects) through weak references. By building the structure in reverse, we have avoided thread contention altogether, as a ThreadLocal can only access the value in the current thread. Also, when the Thread has finished its work, its map can be garbage collected, so we have avoided the memory leak as well.
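From the caller's side, the get() and setInitialValue() path shown above can be exercised with a short example. The class and field names here are hypothetical; ThreadLocal, initialValue(), get() and remove() are the real JDK API.

```java
// Hypothetical demo class for illustration only.
class ThreadLocalUsageDemo {
    // Each thread lazily receives its own StringBuilder through initialValue(),
    // which get() invokes via setInitialValue() on first access.
    static final ThreadLocal<StringBuilder> BUFFER = new ThreadLocal<StringBuilder>() {
        @Override
        protected StringBuilder initialValue() {
            return new StringBuilder();
        }
    };

    static String run() {
        BUFFER.remove(); // drop any value left over on the calling thread
        BUFFER.get().append("main");
        String[] seen = new String[1];
        // The worker gets a fresh StringBuilder, not the caller's one.
        Thread worker = new Thread(() -> seen[0] = BUFFER.get().append("worker").toString());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return BUFFER.get() + "|" + seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "main|worker"
    }
}
```

Neither thread needs a lock around the StringBuilder, because each one only ever reaches the instance stored in its own map.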

I hope you felt enlightened while looking into the design, as it is indeed an elegant solution to a complex problem. I do feel that reading source code is a perfect way to learn about new concepts. And if you are a Java developer, what could be a better place to gain that knowledge than the source code Joshua Bloch and Doug Lea contributed to the JDK? Besides subscribing to our RSS or Twitter feeds of course …



COMMENTS

Came across another in-depth walkthrough of ThreadLocal’s underlying implementation on StackOverflow.

Dan D
