Parallel GC
Parallel GC is a parallel stop-the-world collector, which means that when a GC occurs, it stops all application threads and performs the GC work using multiple threads. The GC work can thus be done very efficiently without any interruptions. This is normally the best way to minimize the total time spent doing GC work relative to application work. However, individual pauses of the Java application induced by GC can be fairly long.
Both the young and old generation collections in Parallel GC are parallel and stop-the-world. Old generation collections also perform compaction. Compaction moves live objects closer together to eliminate wasted space between them, leaving a densely packed heap layout. However, compaction may take a considerable amount of time, which is generally a function of the size of the Java heap and the number and size of live objects in the old generation.
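To make "time spent doing GC work relative to application work" concrete, the following is a minimal sketch that reads the accumulated collection time from the standard java.lang.management beans and expresses it as a fraction of JVM uptime. The class name GcOverhead and its helper methods are hypothetical names chosen for illustration. The sketch assumes a stop-the-world collector such as Parallel GC, where the reported collection time approximates the time application threads were paused; for other collectors the figure may mean something different.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for illustration; not part of HotSpot or the JDK.
class GcOverhead {

    /** Sums the accumulated collection time (ms) across all reported collectors. */
    static long totalGcMillis(List<GarbageCollectorMXBean> beans) {
        long total = 0;
        for (GarbageCollectorMXBean bean : beans) {
            long millis = bean.getCollectionTime(); // -1 when the JVM does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of elapsed time spent in GC, clamped to the range [0, 1]. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) gcMillis / uptimeMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    public static void main(String[] args) {
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        // One bean per collector (young and old generation); names vary by JVM.
        for (GarbageCollectorMXBean bean : beans) {
            System.out.printf("%s: %d collections, %d ms%n",
                    bean.getName(), bean.getCollectionCount(), bean.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double fraction = gcFraction(totalGcMillis(beans), uptime);
        System.out.printf("GC overhead: %.2f%% of %d ms uptime%n", fraction * 100.0, uptime);
    }
}
```

A low overall percentage indicates good throughput, but it says nothing about how long any individual pause was. That is the trade-off described above.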
When Parallel GC was first introduced in HotSpot, only the young generation used a parallel stop-the-world collector; old generation collections used a single-threaded stop-the-world collector. The HotSpot command-line option that enabled Parallel GC in this configuration was -XX:+UseParallelGC.
At that time, the most common use case for servers required throughput optimization, and hence Parallel GC became the default collector for the HotSpot Server VM. Additionally, most Java heaps were between 512MB and 2GB in size, which kept Parallel GC pause times relatively low, even for single-threaded stop-the-world collections. Latency requirements also tended to be more relaxed than they are today: it was common for Web applications to tolerate GC-induced latencies in excess of one second, and as much as three to five seconds.
As Java heap sizes and the number and size of live objects in the old generation grew, the time to collect the old generation became longer and longer. At the same time, hardware advances made more hardware threads available. As a result, Parallel GC was enhanced by adding a multithreaded old generation collector to be used with a multithreaded young generation collector. This enhanced Parallel GC reduced the time required to collect and compact the heap.
The enhanced Parallel GC was delivered in a Java 6 update release. It was enabled by a new command-line option called -XX:+UseParallelOldGC. When -XX:+UseParallelOldGC is enabled, parallel young generation collection is also enabled. This is what we think of today as Parallel GC in HotSpot, a multithreaded stop-the-world young generation collector combined with a multithreaded stop-the-world old generation collector.
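As a rough illustration of the two options, the sketch below inspects the arguments the running JVM was started with and reports which of the two Parallel GC flags named above was passed explicitly. The class ParallelGcFlags and its describe method are hypothetical. The sketch only looks for the flags on the command line: it does not detect a collector selected by default, it ignores the disabling -XX:- forms, and the exact behavior of these options depends on the JDK release in use.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for illustration; not part of HotSpot or the JDK.
class ParallelGcFlags {
    static final String PARALLEL_YOUNG = "-XX:+UseParallelGC";
    static final String PARALLEL_OLD = "-XX:+UseParallelOldGC";

    /** Describes which Parallel GC option, if any, appears in the given JVM arguments. */
    static String describe(List<String> jvmArgs) {
        if (jvmArgs.contains(PARALLEL_OLD)) {
            // Per the text, this option also enables parallel young collection.
            return "parallel young and parallel old collection requested";
        }
        if (jvmArgs.contains(PARALLEL_YOUNG)) {
            return "parallel young collection requested";
        }
        return "no Parallel GC flag on the command line";
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(describe(jvmArgs));
    }
}
```

On a JVM that accepts the option, starting the program with -XX:+UseParallelGC on the java command line should make it report the parallel young case.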
Parallel GC is a good choice in the following use cases:
1. Application throughput requirements are much more important than latency requirements.
A batch processing application is a good example since it is noninteractive. When you start a batch execution, you expect it to run to completion as fast as possible.
2. If worst-case application latency requirements can be met, Parallel GC will offer the best throughput. Worst-case latency requirements cover both the worst-case pause time and how frequently pauses occur. For example, an application may have a latency requirement of “pauses that exceed 500ms shall not occur more than once every two hours, and no pause shall exceed three seconds.”
An interactive application fits this use case when its live data size is small enough that a Parallel GC full GC event meets or beats the application's worst-case GC-induced latency requirements. However, since the amount of live data tends to be highly correlated with the size of the Java heap, the types of applications falling into this category are limited.
Parallel GC works well for applications that meet these requirements. For applications that do not meet these requirements, pause times can become excessively long, since a full GC must mark through the entire Java heap and also compact the old generation space. As a result, pause times tend to increase with increased Java heap sizes.
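The example latency requirement quoted above can be checked mechanically against a list of observed pauses. The following is a minimal sketch under stated assumptions: the class PauseRequirement, the Pause type, and the constants are hypothetical names, pause data is assumed to have been collected already (for example from GC logs), and "not more than once every two hours" is interpreted as "any two pauses longer than 500ms must start at least two hours apart."

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical checker for illustration; thresholds come from the example requirement.
class PauseRequirement {
    static final long LONG_PAUSE_MS = 500;                 // a pause above this is "long"
    static final long MAX_PAUSE_MS = 3_000;                // no pause may exceed this
    static final long WINDOW_MS = 2L * 60 * 60 * 1000;     // two hours

    /** One observed GC pause: start time (ms since JVM start) and duration (ms). */
    static final class Pause {
        final long startMs;
        final long durationMs;

        Pause(long startMs, long durationMs) {
            this.startMs = startMs;
            this.durationMs = durationMs;
        }
    }

    /** Returns true if the observed pauses satisfy the example requirement. */
    static boolean isMet(List<Pause> pauses) {
        List<Pause> sorted = new ArrayList<>(pauses);
        sorted.sort(Comparator.comparingLong(p -> p.startMs));

        boolean seenLongPause = false;
        long lastLongPauseStart = 0;
        for (Pause p : sorted) {
            if (p.durationMs > MAX_PAUSE_MS) {
                return false; // hard ceiling violated
            }
            if (p.durationMs > LONG_PAUSE_MS) {
                // Assumption: two long pauses less than two hours apart is a violation.
                if (seenLongPause && p.startMs - lastLongPauseStart < WINDOW_MS) {
                    return false;
                }
                seenLongPause = true;
                lastLongPauseStart = p.startMs;
            }
        }
        return true;
    }
}
```

Under this interpretation, a 600ms pause at startup followed by a 700ms pause three hours later passes, while two such pauses one hour apart fail, as does any single pause longer than three seconds.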
Figure 1.1 illustrates how the Java application threads (gray arrows) are stopped and the GC threads (black arrows) take over to do the garbage collection work. The diagram shows eight parallel GC threads and eight Java application threads, although in most applications the number of application threads exceeds the number of GC threads, especially when some application threads are idle. When a GC occurs, all application threads are stopped, and multiple GC threads execute for the duration of the collection.
Figure 1.1 How Java application threads are interrupted by GC threads when Parallel GC is used