6.6 Garbage Collection Pattern
Memory defects are among the most common errors, yet they are also among the most difficult to identify. They are common because programming languages provide very low-level access to memory but do not provide the means to verify that memory is being accessed properly. This can lead to memory leaks and dangling pointers. The insidious aspect of these defects is that they tend to have global, rather than local, impact, so while they can crash the entire system, they leave no trace as to where the defect occurred. The Garbage Collection Pattern addresses memory access defects in a way that is clean and simple as far as the application programmer is concerned. The standard implementation of this pattern does not address memory fragmentation (see the Garbage Compactor Pattern to get that benefit), but it does allow the system to operate properly in the face of poorly managed memory.
6.6.1 Abstract
The Garbage Collection Pattern can eliminate memory leaks in programs that must use dynamic memory allocation. Memory leaks occur because programmers make mistakes about when and how memory should be deallocated. The solution offered by the Garbage Collection Pattern removes the defects by taking the programmer out of the loop: the programmer no longer explicitly deallocates memory. By removing the programmer, that source of defects is effectively removed. The costs of this pattern are run-time overhead to identify and remove inaccessible memory and a loss of execution predictability because it cannot be determined at design time when it may be necessary to reclaim freed memory.
6.6.2 Problem
The Garbage Collection Pattern addresses the problem of how we can make sure we won't have any memory leaks. Many high-availability or high-reliability systems must function for long periods of time without being periodically shut down. Since memory leaks lead to unstable behavior, it may be necessary to completely avoid them in such systems. Furthermore, reference counting Smart Pointers (see Smart Pointer Pattern, earlier in this chapter) have the disadvantages that they require programmer discipline to use correctly and cannot be used when there are cyclic object references.
6.6.3 Pattern Structure
Figure 6-10 shows the pattern for what is called Mark and Sweep garbage collection. In Mark and Sweep, garbage collection takes place in two phases: a marking phase, followed by a reclamation phase. When objects are created, they are marked as live objects. The marking phase begins in response to a low-memory condition or an explicit request to perform garbage collection. In the marking phase, the collector searches from each of the root objects to find all live objects. Objects that cannot be reached in this way are marked as dead. In the subsequent sweep phase, all the objects marked as dead are reclaimed. The garbage collector must stop normal processing before performing its duties, reducing the predictability of real-time systems.
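The two phases can be sketched in a few lines. The following is a minimal illustration rather than the pattern's reference implementation: the names `Collectable`, `is_live`, `refs`, `mark`, `sweep`, and `collect` are chosen here for illustration, the heap is modeled as a plain Python list, and marking starts by clearing every flag so that only reachable objects end up live.

```python
class Collectable:
    """Base class carrying the liveness flag used by the collector."""

    def __init__(self, name):
        self.name = name
        self.is_live = True   # newly created objects start out live
        self.refs = []        # links to derived objects


def mark(roots, heap):
    """Marking phase: flag everything reachable from the roots as live."""
    for obj in heap:
        obj.is_live = False            # assume dead until proven reachable
    pending = list(roots)              # explicit stack avoids deep recursion
    while pending:
        obj = pending.pop()
        if obj.is_live:
            continue                   # already visited; also handles cycles
        obj.is_live = True
        pending.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep live objects in the heap, return the dead ones."""
    dead = [obj for obj in heap if not obj.is_live]
    heap[:] = [obj for obj in heap if obj.is_live]
    return dead


def collect(roots, heap):
    """One complete collection: mark, then sweep."""
    mark(roots, heap)
    return sweep(heap)
```

In this sketch the application is implicitly stopped while `collect` runs, since nothing else executes between the two phases.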
Figure 6-10: Garbage Collection Pattern (Mark and Sweep)
6.6.4 Collaboration Roles
Client
The Client is the user-defined object that allocates memory (generally, although not necessarily, in the form of objects). It is a subclass of Collectable and contains pointers to derived objects, allowing the garbage collector to search from the so-called root objects to all derived objects. When created, the object is marked as live with its isLive attribute, inherited from Collectable. On the second pass, all objects not marked as live are removed; that is, they are added back to the heap free memory.
Collectable
This is the base class for Client, and it provides the isLive Boolean attribute used during the garbage collection process.
Free Block List
A list of free blocks from which requests for dynamic memory are fulfilled.
Garbage Collector
The Garbage Collector manages the reclamation of memory by searching the object space starting with the root objects, looking at all blocks to ensure their liveness, and removing all those that are no longer live; in other words, those that cannot be reached in some fashion from a root or derived object.
Heap
The Heap is the owner of all the Memory Blocks and the Free Block List.
Memory Block
The Memory Block is just what it sounds like: a block of memory (normally of arbitrary size, in which case it contains a size parameter), usually, although not necessarily, associated with an object. Memory Blocks may be pointed to by the Free Block List, in which case they are not currently being pointed to by a Client, or they may be pointed to by a Client, in which case they are not pointed to by the Free Block List. Hence the {exclusive} constraint on the relations to those classes.
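The relationship among Heap, Free Block List, and Memory Block can be illustrated with a small sketch. Everything here is an assumption made for illustration: the names `MemoryBlock`, `Heap`, `allocate`, `release`, and `exclusive_ok`, the first-fit policy, and the choice to model blocks as fixed pre-carved regions. The point is only to show the {exclusive} constraint: a block is either on the free list or owned by a client, never both.

```python
class MemoryBlock:
    def __init__(self, address, size):
        self.address = address
        self.size = size
        self.owner = None      # the Client using this block, if any


class Heap:
    """Owns every MemoryBlock plus the Free Block List."""

    def __init__(self, block_sizes):
        self.blocks = []
        address = 0
        for size in block_sizes:
            self.blocks.append(MemoryBlock(address, size))
            address += size
        self.free_list = list(self.blocks)   # initially every block is free

    def allocate(self, client, size):
        """First-fit: hand the first big-enough free block to the client."""
        for block in self.free_list:
            if block.size >= size:
                self.free_list.remove(block)  # leaves the free list...
                block.owner = client          # ...as it gains an owner
                return block
        return None                           # low-memory condition

    def release(self, block):
        """Return a block to the Free Block List (the collector's job)."""
        block.owner = None
        self.free_list.append(block)

    def exclusive_ok(self):
        """Check {exclusive}: each block is free or owned, never both."""
        return all((b in self.free_list) != (b.owner is not None)
                   for b in self.blocks)
```

A `None` result from `allocate` corresponds to the low-memory condition that would normally trigger a collection pass.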
6.6.5 Consequences
This architectural pattern removes the vast majority of memory-related problems by effectively eliminating memory leaks and dangling pointers. It is still possible to do bad pointer arithmetic, but such errors account for a relatively small number of defects compared to the first two memory-management defects. Further, there is much less need to do pointer arithmetic when memory is collected and managed for you. The use of this pattern removes these user defects by eliminating reliance on the user to correctly deallocate memory.
The garbage collector runs episodically when a "low-memory" condition is detected and deallocates all inaccessible memory. Following garbage collection, all non-NULL pointers and references are valid, and all unreferenced memory is freed. The pattern correctly identifies and handles circular references, unlike the Smart Pointer Pattern.
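The difference in how the two approaches treat circular references can be seen in a small experiment. This is an illustrative sketch only: `Node`, `link`, `unlink`, and `reachable` are hypothetical helpers, and `ref_count` merely imitates what a reference counting smart pointer would track.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.ref_count = 0     # what a counting smart pointer would track


def link(src, dst):
    src.refs.append(dst)
    dst.ref_count += 1


def unlink(src, dst):
    src.refs.remove(dst)
    dst.ref_count -= 1


def reachable(roots):
    """Names of all objects found by tracing links from the roots."""
    seen, pending = set(), list(roots)
    while pending:
        node = pending.pop()
        if node.name not in seen:
            seen.add(node.name)
            pending.extend(node.refs)
    return seen


root, a, b = Node("root"), Node("a"), Node("b")
link(root, a)
link(a, b)
link(b, a)          # a and b now reference each other
unlink(root, a)     # the program drops its only outside reference

# Reference counting: which objects have a count of zero?
counts_say_garbage = [n.name for n in (a, b) if n.ref_count == 0]

# Tracing from the roots: which objects can no longer be reached?
tracing_says_garbage = [n.name for n in (a, b)
                        if n.name not in reachable([root])]
```

After the outside link is dropped, both counts remain at 1 because the two objects keep each other alive, so counting alone would never free them. Tracing from the root finds neither object and reports both as garbage.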
Since this pattern uses a two-pass mark-and-sweep algorithm, it takes a nontrivial amount of time to do a complete memory cleanup. This has two negative consequences. First, considerable processing time and effort may be required to perform the memory reclamation, and it cannot, in general, be predicted how much time and effort will be required. Second, because it is done in response to a low-memory condition (such as a request for memory that cannot be fulfilled), when it occurs is likewise unpredictable. This means that while the approach scales up well to large-scale systems in terms of managing complexity, it may not work well in systems with hard real-time constraints.
Another difficulty with this approach is that it does not affect fragmentation, a key problem in systems that must run for long periods of time. Memory will be reclaimed properly, but it will result in fragmented free space. This means that although enough memory may be free to fulfill a request for memory, there may not be a single contiguous block available to fulfill the request. In fact, with this pattern (and most other memory management patterns) fragmentation increases monotonically the longer the system runs. The Garbage Compactor Pattern, described in the next section, addresses this need.
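A toy model makes the fragmentation problem concrete. The sketch below assumes a heap of ten equal cells and a hypothetical helper, `largest_free_run`; neither comes from the pattern itself.

```python
def largest_free_run(cells):
    """Length of the longest run of contiguous free cells."""
    best = run = 0
    for used in cells:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


# Model a 10-cell heap; True means the cell is in use.
cells = [True] * 10

# Suppose the collector reclaims every other one-cell object.
for i in range(0, 10, 2):
    cells[i] = False

total_free = cells.count(False)          # half the heap is free
biggest_block = largest_free_run(cells)  # but only in one-cell pieces
can_allocate_three = biggest_block >= 3
```

Here half of the heap is free, yet a request for three contiguous cells cannot be satisfied because no free run is longer than one cell.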
6.6.6 Implementation Strategies
As with all such patterns, the simplest way to use this pattern is to buy it. Some languages, such as Java, provide memory management systems that use garbage collection out of the box. Where such languages are not available, implementing such a memory management scheme is easy in the naïve case and more difficult in the more optimized case.
A common optimization is to allow the application objects to explicitly request a garbage collection pass when it is convenient for the application, such as when the application is quiescent. For example, if the concurrency model is managed by a cyclic executive (see the Cyclic Executive Pattern), then at the end of the cycle, if there is sufficient time, a memory cleanup may be performed. If it cannot be guaranteed that the garbage collection will complete before the next cycle occurs, the garbage collector may be made preemptable, so that it can be stopped prior to completion, allowing the application to run and meet its deadlines. When using such a strategy, be careful not to assume that an object marked as live on a previous pass is still live.
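One way to picture this strategy is a collector that is given a fixed amount of work per cycle and gives up if it cannot finish. The sketch below is a simplification under several assumptions: preemption is modeled as a work budget counted in visited objects rather than real time, only the marking phase is charged against the budget, and the names `Obj`, `try_collect`, and `run_cycle` are hypothetical. An abandoned pass reclaims nothing, and the next attempt re-marks from scratch so that stale liveness marks are never trusted.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.is_live = True
        self.refs = []


def try_collect(roots, heap, budget):
    """Run one mark-and-sweep pass within `budget` work units.

    Returns the list of reclaimed objects, or None if the budget ran out.
    Marks from an abandoned pass are never reused: every call re-marks
    from scratch, because the application may have dropped links since.
    """
    for obj in heap:
        obj.is_live = False
    pending = list(roots)
    while pending:
        if budget == 0:
            for obj in heap:
                obj.is_live = True   # abandon the pass; nothing is reclaimed
            return None
        budget -= 1
        obj = pending.pop()
        if not obj.is_live:
            obj.is_live = True
            pending.extend(obj.refs)
    dead = [o for o in heap if not o.is_live]
    heap[:] = [o for o in heap if o.is_live]
    return dead


def run_cycle(tasks, roots, heap, slack):
    """One frame of a cyclic executive: run the tasks, then use the slack."""
    for task in tasks:
        task()
    return try_collect(roots, heap, slack)
```

Restarting the marking on every attempt is the most conservative choice. A production collector would more likely resume a partial pass and track changes made by the application in between, which is considerably harder to get right.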
6.6.7 Related Patterns
When the inherent unpredictability of the system cannot be tolerated, another approach, such as the Smart Pointer Pattern or Fixed Size Allocation Pattern, should be used. To eliminate memory fragmentation, the Garbage Compactor Pattern works well. The Static Allocation Pattern does not have fragmentation, and the Fixed Sized Allocation Pattern does its best to minimize fragmentation. The Smart Pointer Pattern cannot handle circular references, but the Garbage Collection and Garbage Compactor Patterns do.
6.6.8 Sample Model
Figures 6-11a, b, and c show instance snapshots of allocated memory. In the figures, Ob1 and Ob2 are root objects, known to the Garbage Collector. These might be, for example, initial instances created in the main() of the application. Objects Ob1a, Ob1b, and Ob1c are derived objects that can be found by traversing the links from Ob1 and Ob2 in Figure 6-11a. In Figure 6-11b, the link from Ob1 to Ob1a is broken. This means that Ob1a and Ob1b are no longer accessible to the system, since they cannot be found through a traversal of links from root objects. Note that Ob1c remains accessible via the link through the root object Ob2. In Figure 6-11c, we see that the memory used by Ob1a and Ob1b is reclaimed, and only accessible objects remain.
Figure 6-11: Garbage Collection Pattern
Figure 6-12 shows how the garbage collector proceeds. First, every object in the heap is marked as dead. Subsequently, each root object is searched. As objects are found, they are marked as live by setting the isLive attribute to TRUE. In the second pass, the garbage collector does a linear search through all the allocated memory blocks, removing those that are still marked dead. This is done by first calling the object's destructor (if one exists) and then adding the object to be deleted to the Free Block List.
Figure 6-12: Garbage Collection Pattern Example Scenario
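The scenario in Figures 6-11 and 6-12 can be replayed with a short simulation. This is a sketch under stated assumptions: the exact link layout among Ob1a, Ob1b, and Ob1c is assumed here to be a chain ending at Ob1c, the `destroy` method stands in for the object's destructor, and the Free Block List simply records the names of reclaimed objects.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.is_live = True
        self.refs = []

    def destroy(self, log):
        """Stand-in for the object's destructor."""
        log.append("destroyed " + self.name)


def collect(roots, allocated, free_block_list, log):
    for obj in allocated:              # first, mark every object dead
        obj.is_live = False
    pending = list(roots)              # then search from each root
    while pending:
        obj = pending.pop()
        if not obj.is_live:
            obj.is_live = True
            pending.extend(obj.refs)
    for obj in list(allocated):        # second pass: linear sweep
        if not obj.is_live:
            obj.destroy(log)           # destructor first...
            allocated.remove(obj)
            free_block_list.append(obj.name)   # ...then onto the free list


ob1, ob2 = Obj("Ob1"), Obj("Ob2")
ob1a, ob1b, ob1c = Obj("Ob1a"), Obj("Ob1b"), Obj("Ob1c")
ob1.refs = [ob1a]          # Figure 6-11a (assumed link layout)
ob1a.refs = [ob1b]
ob1b.refs = [ob1c]
ob2.refs = [ob1c]
allocated = [ob1, ob2, ob1a, ob1b, ob1c]
free_block_list, log = [], []

ob1.refs.remove(ob1a)      # Figure 6-11b: the Ob1 -> Ob1a link is broken
collect([ob1, ob2], allocated, free_block_list, log)   # Figure 6-11c
survivors = [o.name for o in allocated]
```

After the pass, Ob1a and Ob1b are on the free list, while Ob1c survives because it is still reachable from the root Ob2.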