Garbage Collection
Managed memory allocations are automatically reclaimed through a garbage collection algorithm. The CLR tracks the use of memory that is allocated on the managed heap, and any memory that is no longer referenced is marked as "garbage." When memory is low, the CLR traverses its data structure of tracked memory and reclaims all the memory marked as garbage. Thus the programmer is relieved of this responsibility.
While this prevents memory leaks in the managed heap, it does not help with the reclamation of other types of allocated resources. Examples of these resources include open files, database connections, and server login connections that have to be disconnected. The programmer may need to write explicit code to clean up these other resources. This can be done in your class destructor or in a specialized cleanup method. The CLR calls your destructor when the memory allocated for an object is reclaimed.
Another concern with garbage collection is performance. There is some overhead associated with automatic garbage collection. However, the CLR does provide an efficient multi-generational garbage collection algorithm.
Object Destruction
Unless you explicitly use the delete operator on a managed object, destruction time is non-deterministic. The destructor for a particular unreferenced object may run at any time during the garbage collection process, and the order of calling destructors for different objects cannot be predicted. Moreover, under exceptional circumstances, a destructor may not run at all (for example, a thread goes into an infinite loop or a process aborts without giving the runtime a chance to clean up). Also, unless you explicitly use the delete operator, the thread on which the destructor is called is not deterministic.
The ExplicitDelete example demonstrates that the destructor call is synchronous, and therefore deterministic, when you explicitly delete a managed pointer. The following code creates two objects. The first is passively finalized: its pointer is set to zero, and the garbage collector calls the destructor asynchronously on its own thread. The second is explicitly destroyed with the delete operator, and the destructor is called synchronously. The program uses hash codes to report what is happening to each object and on which thread. From the output, you can see that the destructor of the passively finalized object runs on a different thread than the Main method, whereas the destructor of the explicitly deleted object runs on the same thread as Main.
//ExplicitDelete.h
using namespace System;   // needed for Console and GC
using namespace System::Threading;

public __gc class SomeClass
{
public:
   ~SomeClass()
   {
      Console::Write(
         "Destructor running in thread: {0}, ",
         __box(Thread::CurrentThread->GetHashCode()));
      Console::WriteLine(
         "Destroying object: {0}",
         __box(this->GetHashCode()));
   }
};

public __gc class ExplicitDelete
{
public:
   static void Main()
   {
      Console::WriteLine(
         "Main thread: {0}",
         __box(Thread::CurrentThread->GetHashCode()));

      // first object: released passively, finalized by the collector
      SomeClass *sc = new SomeClass;
      Console::WriteLine(
         "Main thread creating object: {0}",
         __box(sc->GetHashCode()));
      Console::WriteLine(
         "Nulling pointer to object: {0}",
         __box(sc->GetHashCode()));
      sc = 0;
      GC::Collect();
      GC::WaitForPendingFinalizers();

      // second object: destroyed explicitly and synchronously
      sc = new SomeClass;
      Console::WriteLine(
         "Main thread creating object: {0}",
         __box(sc->GetHashCode()));
      Console::WriteLine(
         "Deleting pointer to object: {0}",
         __box(sc->GetHashCode()));
      delete sc;
      Console::WriteLine("All done.");
   }
};
Here is the output.
Main thread: 2
Main thread creating object: 5
Nulling pointer to object: 5
Destructor running in thread: 6, Destroying object: 5
Main thread creating object: 7
Deleting pointer to object: 7
Destructor running in thread: 2, Destroying object: 7
All done.
To avoid unnecessary overhead, you should not implement a destructor for a class unless you have a good reason for doing so. And if you do provide a destructor, you should probably provide an alternate, deterministic mechanism for a class to perform necessary cleanup. The .NET Framework recommends a Dispose design pattern for deterministic cleanup, which is described next.
Unmanaged Resources and Dispose
Consider an object that has opened a file, and is then no longer needed and marked for garbage collection. Eventually, the object's destructor will be called, and the file could be closed in that destructor. But, as we discussed, garbage collection is non-deterministic, and the file might remain open for an indefinitely long time. It would be more efficient to have a deterministic mechanism for a client program to clean up the object's resources when it is done with it. The .NET Framework recommends the IDisposable interface for this purpose.
public __gc __interface IDisposable
{
   void Dispose();
};
This design pattern specifies that a client program should call Dispose on the object when it is done with it. In the Dispose method implementation, the class does the appropriate cleanup. As backup assurance, the class should also implement a destructor in case Dispose never gets called, perhaps due to an exception being thrown. Since both Dispose and the destructor perform the cleanup, the cleanup code can be placed in Dispose, and the destructor can be implemented by calling Dispose. The Dispose method is designed such that a client program can call it when it is done with the object or knows that it is safe to free the resources associated with the object.
One detail is that once Dispose has been called, the object's destructor should not be called, because that would involve cleanup being performed twice. The object can be removed from the finalization queue by calling GC::SuppressFinalize. Also, it is a good idea for the class to maintain a bool flag such as disposeCalled, so that if Dispose is called twice, cleanup will not be performed a second time.
A Dispose method should also call the base class Dispose to make sure that all its resources are freed. It should also be written so that if a Dispose method is called after the resources have been already freed, no exception is thrown.
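The two rules above (chain to the base class, and tolerate repeated calls) can be sketched as follows. This is a minimal illustration in portable standard C++ rather than Managed C++, so that it compiles without the CLR. The names BaseResource, DerivedResource, and DisposeTwice are hypothetical, the "resources" are just counters, and there is no GC::SuppressFinalize call because standard C++ has no finalization queue.

```cpp
#include <cassert>

// Hypothetical base class. Its "resource" is modeled by a counter that
// records how many times the base-class cleanup actually ran.
class BaseResource {
public:
    BaseResource() : disposeCalled(false), baseCleanups(0) {}
    virtual ~BaseResource() {}

    virtual void Dispose() {
        if (disposeCalled) return;   // repeated call: do nothing, throw nothing
        disposeCalled = true;
        ++baseCleanups;              // stands in for freeing a base-class resource
    }

    int BaseCleanups() const { return baseCleanups; }

private:
    bool disposeCalled;
    int baseCleanups;
};

// Hypothetical derived class with its own resource and its own guard flag.
class DerivedResource : public BaseResource {
public:
    DerivedResource() : disposeCalled(false), derivedCleanups(0) {}

    virtual void Dispose() {
        if (disposeCalled) return;   // same idempotence rule as the base class
        disposeCalled = true;
        ++derivedCleanups;           // free the derived-class resource first
        BaseResource::Dispose();     // then chain to the base class
    }

    int DerivedCleanups() const { return derivedCleanups; }

private:
    bool disposeCalled;
    int derivedCleanups;
};

// Calls Dispose twice and returns the total number of cleanups performed.
int DisposeTwice(DerivedResource& r) {
    r.Dispose();
    r.Dispose();
    // Each level's cleanup must have run at most once.
    assert(r.BaseCleanups() <= 1 && r.DerivedCleanups() <= 1);
    return r.BaseCleanups() + r.DerivedCleanups();
}
```

Calling DisposeTwice on a fresh DerivedResource performs exactly one cleanup at each level; the second call returns immediately at the derived-class guard and never reaches the base class.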
Since finalization is expensive, an object that has released its resources and will not acquire any more should call the static method GC::SuppressFinalize, passing it the this pointer. If you have a try/finally block in your code, you can place a call to the object's Dispose method in the finally block to make sure that resources are freed.
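The idea of guaranteeing the Dispose call even when an exception is thrown can be illustrated outside the CLR as well. Standard C++ has no finally clause, so the sketch below swaps in a different technique with the same effect: a small guard object whose destructor calls Dispose during stack unwinding. It is written in portable C++ rather than Managed C++, and FakeLog, DisposeGuard, and UseLog are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a disposable object such as a log file wrapper.
struct FakeLog {
    bool disposed;
    FakeLog() : disposed(false) {}
    void Dispose() { disposed = true; }
};

// Guard whose destructor plays the role of the finally block:
// it runs on normal exit and during exception unwinding alike.
class DisposeGuard {
public:
    explicit DisposeGuard(FakeLog& log) : log_(log) {}
    ~DisposeGuard() { log_.Dispose(); }

private:
    FakeLog& log_;
    DisposeGuard(const DisposeGuard&);             // not copyable
    DisposeGuard& operator=(const DisposeGuard&);  // not assignable
};

// Uses the log inside a guarded region. Returns true if an exception
// was thrown in the region, false otherwise.
bool UseLog(FakeLog& log, bool fail) {
    try {
        DisposeGuard guard(log);
        if (fail) {
            throw std::runtime_error("error!!");
        }
        return false;   // guard disposes the log on the way out
    } catch (const std::runtime_error&) {
        // The guard's destructor has already run by the time we get here.
        assert(log.disposed);
        return true;
    }
}
```

Whether UseLog is called with fail set to true or false, the log ends up disposed, which is the guarantee that placing the Dispose call in a finally block provides.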
The example program DisposeDemo provides an illustration of the dispose pattern. The class SimpleLog implements logging to a file, making use of the StreamWriter class.
//SimpleLog.h
using namespace System;
using namespace System::IO;

public __gc class SimpleLog : public IDisposable
{
private:
   StreamWriter *writer;
   String *name;
   bool disposeCalled;
public:
   SimpleLog(String *fileName) : disposeCalled(false)
   {
      name = fileName;
      writer = new StreamWriter(fileName, false);
      writer->AutoFlush = true;
      Console::WriteLine(
         String::Format("logfile {0} created", name));
   }
   void WriteLine(String *str)
   {
      writer->WriteLine(str);
      Console::WriteLine(str);
   }
   void Dispose()
   {
      if(disposeCalled)
         return;
      writer->Close();
      GC::SuppressFinalize(this);
      Console::WriteLine(
         String::Format("logfile {0} disposed", name));
      disposeCalled = true;
   }
   ~SimpleLog()
   {
      Console::WriteLine(
         String::Format("logfile {0} finalized", name));
      Dispose();
   }
};
The class SimpleLog supports the IDisposable interface and thus implements Dispose. The cleanup code simply closes the StreamWriter object. To make sure that a disposed object will not also be finalized, GC::SuppressFinalize is called. The finalizer simply delegates to Dispose. To help monitor object lifetime, a message is written to the console in the constructor, in Dispose, and in the finalizer.
Here is the code for the test program:
//DisposeDemo.h
using namespace System;
using namespace System::Threading;

public __gc class DisposeDemo
{
public:
   static void Main()
   {
      SimpleLog *log = new SimpleLog("log1.txt");
      log->WriteLine("First line");
      Pause();
      log->Dispose(); // first log disposed
      log->Dispose(); // test Dispose twice
      log = new SimpleLog("log2.txt");
      log->WriteLine("Second line");
      Pause();
      log = new SimpleLog("log3.txt"); // previous (2nd) log released
      log->WriteLine("Third line");
      Pause();
      log = 0; // last log released
      GC::Collect();
      Thread::Sleep(100);
   }
private:
   static void Pause()
   {
      Console::Write("Press enter to continue");
      String *str = Console::ReadLine();
   }
};
The SimpleLog object pointer log is assigned in turn to three different object instances. The first time, it is properly disposed. The second time, log is reassigned to refer to a third object, before the second object is disposed, resulting in the second object becoming garbage. The Pause method provides an easy way to pause the execution of this console application, allowing us to investigate the condition of the files log1.txt, log2.txt, and log3.txt at various points in the execution of the program.
Running the program results in the following output:
logfile log1.txt created
First line
Press enter to continue
logfile log1.txt disposed
logfile log2.txt created
Second line
Press enter to continue
logfile log3.txt created
Third line
Press enter to continue
logfile log3.txt finalized
logfile log3.txt disposed
logfile log2.txt finalized
logfile log2.txt disposed
After the first pause, the file log1.txt has been created, and you can examine its contents in Notepad. If you try to delete the file, you will get a sharing violation, as illustrated in Figure 82.
Figure 82 Trying to delete an open file results in a sharing violation.
At the second pause point, log1.txt has been disposed, and you will be allowed to delete it. The log2.txt file has been created (and is open). At the third pause point, log3.txt has been created. But the object reference to log2.txt has been reassigned, and so there is now no way for the client program to dispose of the second object. If Dispose were the only mechanism to clean up the second object, we would be out of luck. Fortunately, the SimpleLog class has implemented a destructor, so the next time garbage is collected, the second object will be disposed of properly. We can see the effect of finalization by running the program through to completion. The second object is indeed finalized, and thence disposed. In fact, as the application domain shuts down, the destructor is called on all objects not exempt from finalization, even on objects that are still accessible.
In our code we explicitly make the third object inaccessible by the assignment log = 0, and we then force a garbage collection by a call to GC::Collect. Finally, we sleep briefly to give the garbage collector a chance to run through to completion before the application domain shuts down. Coding our test program in this way is a workaround for the fact that the order of garbage collection is non-deterministic. The garbage collector will be called automatically when the program exits and the application domain is shut down. However, at that point, system objects, such as Console, are also being closed. Since you cannot rely on the order of finalizations, you may get an exception from the WriteLine statement within the finalizer. The explicit call to GC::Collect forces a garbage collection while the system objects are still open. If we omitted the last three lines of the Main method, we might well get identical output, but we might also get an exception.
Alternate Name for Dispose
The standard name for the method that performs cleanup is Dispose. The convention is that once an object is disposed, it is finished. In some cases, the same object instance may be reused, as in the case of a file. A file may be opened, closed, and then opened again. In such a case the standard naming convention is that the cleanup method should be called Close. In other cases some other natural name may be used.
Our SimpleLog class could plausibly have provided an Open method, and then it would have made sense to name our cleanup method Close. For simplicity, we did not provide an Open method, and so we stuck to the name Dispose.
Generations
As an optimization, every object on the managed heap is assigned to a generation. A new object is in generation 0 and is considered a prime candidate for garbage collection. Older objects are in generation 1. Since such an older object has survived for a while, the odds favor its having a longer lifetime than a generation 0 object. Still older objects are assigned to generation 2 and are considered even more likely to survive a garbage collection. The maximum generation number in the current implementation of .NET is 2, as can be confirmed from the GC::MaxGeneration property.
In a normal sweep of the garbage collector, only generation 0 is examined, since this is where memory is most likely to be reclaimed. All surviving generation 0 objects are promoted to generation 1. If not enough memory is reclaimed, generation 1 is swept next, and its survivors are promoted to generation 2. Then, if necessary, generation 2 is swept, and so on up to MaxGeneration.
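The promotion rule can be illustrated with a toy model. This is a sketch in portable standard C++, not the CLR's collector: ToyObject, ToyHeap, and all of their members are hypothetical, reachability is a flag set by hand rather than discovered by tracing references, and the "not enough memory reclaimed" test is replaced by the caller choosing the highest generation to sweep.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kMaxGeneration = 2;   // mirrors the value the text gives for GC::MaxGeneration

// One tracked allocation in the toy heap.
struct ToyObject {
    int generation;   // 0, 1, or 2
    bool reachable;   // set by hand; a real collector traces references
    bool reclaimed;   // true once a sweep has collected the object
};

class ToyHeap {
public:
    // New objects always start in generation 0. Returns the object's id.
    std::size_t Allocate() {
        ToyObject o = { 0, true, false };
        objects_.push_back(o);
        return objects_.size() - 1;
    }

    // Models the last reference to the object going away.
    void MakeUnreachable(std::size_t id) {
        assert(id < objects_.size());
        objects_[id].reachable = false;
    }

    // Sweeps generations 0..maxGen. Unreachable objects in those generations
    // are reclaimed; survivors are promoted by one generation, up to the
    // maximum. Returns the number of objects reclaimed.
    int Collect(int maxGen) {
        assert(maxGen >= 0 && maxGen <= kMaxGeneration);
        int reclaimedCount = 0;
        for (std::size_t i = 0; i < objects_.size(); ++i) {
            ToyObject& o = objects_[i];
            if (o.reclaimed || o.generation > maxGen) {
                continue;                     // older generations are not examined
            }
            if (!o.reachable) {
                o.reclaimed = true;
                ++reclaimedCount;
            } else if (o.generation < kMaxGeneration) {
                ++o.generation;               // survivor is promoted
            }
        }
        return reclaimedCount;
    }

    int GetGeneration(std::size_t id) const {
        assert(id < objects_.size());
        return objects_[id].generation;
    }

    bool IsReclaimed(std::size_t id) const {
        assert(id < objects_.size());
        return objects_[id].reclaimed;
    }

private:
    std::vector<ToyObject> objects_;
};
```

In this model, an object that becomes unreachable after it has been promoted to generation 1 survives any number of Collect(0) calls and is reclaimed only by Collect(1) or Collect(2). That is the trade-off of the generational scheme: cheap sweeps of young objects, at the cost of older garbage lingering until a deeper sweep.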
Finalization and Stack Unwinding
As mentioned earlier, one of the virtues of the exception-handling mechanism is that as the call stack is unwound in handling the exception, local objects go out of scope and so can get marked for finalization. The program FinalizeStackUnwind provides a simple illustration. It uses the SimpleLog class discussed previously, which implements finalization.
//FinalizeStackUnwind.h
using namespace System;
using namespace System::Threading;

public __gc class FinalizeStackUnwind
{
public:
   static void Main()
   {
      try
      {
         SomeMethod();
      }
      catch(Exception *e)
      {
         Console::WriteLine(e->Message);
      }
      GC::Collect();
      Thread::Sleep(100);
   }
private:
   static void SomeMethod()
   {
      // local variable
      SimpleLog *alpha = new SimpleLog("alpha.txt");
      // force an exception
      throw new Exception("error!!");
   }
};
A local pointer variable alpha of type SimpleLog* is declared in SomeMethod and points to a newly allocated SimpleLog object. Before the method exits normally, an exception is thrown. As the stack unwinds, alpha goes out of scope, so the object it referred to is no longer accessible and becomes eligible for garbage collection. The call to GC::Collect forces a garbage collection, and we see from the output of the program that finalization is indeed carried out.
logfile alpha.txt created
error!!
logfile alpha.txt finalized
logfile alpha.txt disposed
Controlling Garbage Collection with the GC Class
Normally, it is the best practice simply to let the garbage collector perform its work behind the scenes. Sometimes, however, it may be advantageous for the program to intervene. The System namespace contains the class GC, which enables a program to affect the behavior of the garbage collector. We summarize a few of the important methods of the GC class.
SuppressFinalize
This method requests that the system not finalize the specified object (i.e., not call its destructor). As we saw previously, you should call this method in your implementation of Dispose to prevent a disposed object from also being finalized.
Collect
You can force a garbage collection by calling the Collect method. An optional parameter lets you specify which generations should be collected. Use this method sparingly, since normally the CLR has better information on the current state of memory. A possible use would be a case when your program has just released a number of large objects, and you would like to see all this memory reclaimed right away. Another example was provided in the previous section, where a call to Collect forced a collection while system objects were still valid.
MaxGeneration
This property returns the maximum number of generations that are supported.
GetGeneration
This method returns the current generation number of an object.
GetTotalMemory
This method returns the number of bytes currently allocated (not the free memory available, and not the total memory size of the heap). A parameter lets you specify whether the system should perform a garbage collection before returning. If no garbage collection is done, the indicated number of bytes is probably larger than the actual number of bytes being used by live objects.
Sample Program
The program GarbageCollection illustrates using these methods of the GC class. The example is artificial, simply illustrating object lifetime and the effect of the various GC methods. The class of objects that are allocated is called Member. This class has a String property called Name. Write statements are provided in the constructor, Dispose, and destructor. A Committee class maintains an array list of Member instances. The RemoveMember method simply removes the member from the array list. The DisposeMember method also calls Dispose on the member being expunged from the committee. The ShowGenerations method displays the generation number of each Member object. GarbageCollection.h is a test program to exercise these classes, showing the results of various allocations and deallocations and the use of methods of the GC class. The code and output should be quite easy to understand.
All the memory is allocated locally in a method named DemonstrateGenerations. After this method returns and its local memory has become inaccessible, we make an explicit call to GC::Collect. This forces the destructors to be called before the application domain shuts down, and so we avoid a possible random exception of a stream being closed when a WriteLine method is called in a finalizer. This is the same point mentioned previously for the earlier examples.