To blog |

G1 Garbage Collector in Action

January 19, 2016 by Gleb Smirnov Filed under: Garbage Collection

This post is an excerpt from the G1 chapter in our Garbage Collection Handbook. With the Java 9 release now just around the corner the adoption of G1 will likely skyrocket, so we went ahead and explained the ins and outs of the collector in more details in preparation.

One of the key design goals of G1 was to make the duration and distribution of stop-the-world pauses due to garbage collection predictable and configurable. In fact, Garbage-First is a soft real-time garbage collector, meaning that you can set specific performance goals to it. You can request the stop-the-world pauses to be no longer than x milliseconds within any given y-millisecond long time range, e.g. no more than 5 milliseconds in any given second. Garbage-First GC will do its best to meet this goal with high probability (but not with certainty, that would be hard real-time).

To achieve this, G1 builds upon a number of insights. First, the heap does not have to be split into contiguous Young and Old generation. Instead, the heap is split into a number (typically about 2048) smaller heap regions that can house objects. Each region may be an Eden region, a Survivor region or an Old region. The logical union of all Eden and Survivor regions is the Young Generation, and all the Old regions put together is the Old Generation:

G1 Heap Regions

This allows the GC to avoid collecting the entire heap at once, and instead approach the problem incrementally: only a subset of the regions, called the collection set will be considered at a time. All the Young regions are collected during each pause, but some Old regions may be included as well:

G1 Collection Set

Another novelty of G1 is that during the concurrent phase it estimates the amount of live data that each region contains. This is used in building the collection set: the regions that contain the most garbage are collected first. Hence the name: garbage-first collection.

To run the JVM with the G1 collector enabled, run your application as

java -XX:+UseG1GC com.mypackages.MyExecutableClass

Evacuation pauses: Fully Young


In the beginning of the application’s lifecycle, G1 does not have any additional information from the not-yet-executed concurrent phases, so it initially functions in the fully-young mode. When the Young Generation fills up, the application threads are stopped, and the live data inside the Young regions is copied to Survivor regions, or any free regions that thereby become Survivor.

The process of copying these is called Evacuation, and it works in pretty much the same way as the other Young collectors we have seen before. The full logs of evacuation pauses are rather large, so, for simplicity’s sake we will leave out a couple of small bits that are irrelevant in the first fully-young evacuation pause. We will get back to them after the concurrent phases are explained in greater detail. In addition, due to the sheer size of the log record, the parallel phase details and “Other” phase details are extracted to separate sections:

0.134: [GC pause (G1 Evacuation Pause) (young), 0.0144119 secs]1
    [Parallel Time: 13.9 ms, GC Workers: 8]2
        3
    [Code Root Fixup: 0.0 ms]4
    [Code Root Purge: 0.0 ms]5
    [Clear CT: 0.1 ms]
    [Other: 0.4 ms]6
        7
    [Eden: 24.0M(24.0M)->0.0B(13.0M) 8Survivors: 0.0B->3072.0K 9Heap: 24.0M(256.0M)->21.9M(256.0M)]10
     [Times: user=0.04 sys=0.04, real=0.02 secs] 11

  1. 0.134: [GC pause (G1 Evacuation Pause) (young), 0.0144119 secs] – G1 pause cleaning only (young) regions. The pause started 134ms after the JVM startup and the duration of the pause was 0.0144 seconds measured in wall clock time.
  2. [Parallel Time: 13.9 ms, GC Workers: 8] – Indicating that for 13.9 ms (real time) the following activities were carried out by 8 threads in parallel
  3. – Cut for brevity, see the following section below for the details.
  4. [Code Root Fixup: 0.0 ms] – Freeing up the data structures used for managing the parallel activities. Should always be near-zero. This is done sequentially.
  5. [Code Root Purge: 0.0 ms] – Cleaning up more data structures, should also be very fast, but non necessarily almost zero. This is done sequentially.
  6. [Other: 0.4 ms] – Miscellaneous other activities, many of which are also parallelized
  7. – See the section below for details
  8. [Eden: 24.0M(24.0M)->0.0B(13.0M) – Eden usage and capacity before and after the pause
  9. Survivors: 0.0B->3072.0K – Space used by Survivor regions before and after the pause
  10. Heap: 24.0M(256.0M)->21.9M(256.0M)] – Total heap usage and capacity before and after the pause.
  11. [Times: user=0.04 sys=0.04, real=0.02 secs] – Duration of the GC event, measured in different categories:
    • user – Total CPU time that was consumed by Garbage Collector threads during this collection
    • sys – Time spent in OS calls or waiting for system event
    • real – Clock time for which your application was stopped. With the parallelizable activities during GC this number is ideally close to (user time + system time) divided by the number of threads used by Garbage Collector. In this particular case 8 threads were used. Note that due to some activities not being parallelizable, it always exceeds the ratio by a certain amount.

For parallel phases of the collection, the log entry can be interpreted as follows:

[Parallel Time: 13.9 ms, GC Workers: 8]1
     [GC Worker Start (ms)2: Min: 134.0, Avg: 134.1, Max: 134.1, Diff: 0.1]
    [Ext Root Scanning (ms)3: Min: 0.1, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.2]
    [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
        [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
    [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
    [Code Root Scanning (ms)4: Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.2]
    [Object Copy (ms)5: Min: 10.8, Avg: 12.1, Max: 12.6, Diff: 1.9, Sum: 96.5]
    [Termination (ms)6: Min: 0.8, Avg: 1.5, Max: 2.8, Diff: 1.9, Sum: 12.2]
        [Termination Attempts7: Min: 173, Avg: 293.2, Max: 362, Diff: 189, Sum: 2346]
    [GC Worker Other (ms)8: Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
    GC Worker Total (ms)9: Min: 13.7, Avg: 13.8, Max: 13.8, Diff: 0.1, Sum: 110.2]
    [GC Worker End (ms)10: Min: 147.8, Avg: 147.8, Max: 147.8, Diff: 0.0]

  1. [Parallel Time: 13.9 ms, GC Workers: 8] – Indicating that for 13.9 ms (clock time) the following activities were carried out by 8 threads in parallel
  2. [GC Worker Start (ms) – The moment in time at which the workers started their activity, matching the timestamp at the beginning of the pause. If Min and Max differ a lot, then it may be an indication that too many threads are used or other processes on the machine are stealing CPU time from the garbage collection process inside the JVM
  3. [Ext Root Scanning (ms) – How long it took to scan the external (non-heap) roots such as classloaders, JNI references, JVM system roots, etc. Shows elapsed time, “Sum” is CPU time
  4. [Code Root Scanning (ms) – How long it took to scan the roots that came from the actual code: local vars, etc.
  5. [Object Copy (ms) – How long it took to copy the live objects away from the collected regions.
  6. [Termination (ms) – How long it took for the worker threads to ensure that they can safely stop and that there’s no more work to be done, and then actually terminate
  7. [Termination Attempts – How many attempts worker threads took to try and terminate. An attempt is failed if the worker discovers that there’s in fact more work to be done, and it’s too early to terminate.
  8. [GC Worker Other (ms) – Other miscellaneous small activities that do not deserve a separate section in the logs.
  9. GC Worker Total (ms) – How long the worker threads have worked for in total
  10. [GC Worker End (ms) – The timestamp at which the workers have finished their jobs. Normally they should be roughly equal, otherwise it may be an indication of too many threads hanging around or a noisy neighbor

For miscellaneous other records in the log:

[Other: 0.4 ms]1
    [Choose CSet: 0.0 ms]
    [Ref Proc: 0.2 ms]2
    [Ref Enq: 0.0 ms]3
    [Redirty Cards: 0.1 ms]
    [Humongous Register: 0.0 ms]
    [Humongous Reclaim: 0.0 ms]
    [Free CSet: 0.0 ms]4

  1. [Other: 0.4 ms] – Miscellaneous other activities, many of which are also parallelized
  2. [Ref Proc: 0.2 ms] – The time it took to process non-strong references: clear them or determine that no clearing is needed.
  3. [Ref Enq: 0.0 ms] – The time it took to enqueue the remaining non-strong references to the appropriate ReferenceQueue
  4. [Free CSet: 0.0 ms] – The time it takes to return the freed regions in the collection set so that they are available for new allocations.

To familiarize yourself with the other phases G1 collector goes through, check out the GC handbook, the sections about concurrent marking and mixed mode evacuation pauses cover more details in this regard.

ADD COMMENT

Comments

thankyou for this informaton

naveen yadav