Gotcha #67: Failure to Employ Resource Acquisition Is Initialization
It's a shame that many newer C++ programmers don't appreciate the wonderful symmetry of constructors and destructors. For the most part, these are programmers who were reared on languages that tried to keep them safe from the vagaries of pointers and memory management. Safe and controlled. Ignorant and happy. Programming precisely the way the designer of the language has decreed that one should program. The one, true way. Their way.
Happily, C++ has more respect for its practitioners and provides much flexibility as to how the language may be applied. This is not to say we don't have general principles and guiding idioms (see Gotcha #10). One of the most important of these idioms is the "resource acquisition is initialization" idiom. That's quite a mouthful, but it's a simple and extensible technique for binding resources to memory and managing both efficiently and predictably.
The orders of execution of construction and destruction are mirror images of each other. When an object of a class is constructed, the order of initialization is always the same: the virtual base class subobjects first ("in the order they appear on a depth-first left-to-right traversal of the directed acyclic graph of base classes," according to the standard), followed by the immediate base classes in the order of their appearance on the base-list in the class's definition, followed by the non-static data members of the class, in the order of their declaration, followed by the body of the constructor. The destructor implements the reverse order: destructor body, members in the reverse order of their declarations, immediate base classes in the inverse order of their appearance, and virtual base classes. It's helpful to think of construction as pushing a sequence onto a stack and destruction as popping the stack to implement the reverse sequence. The symmetry of construction and destruction is considered so important that all of a class's constructors perform their initializations in the same sequence, even if their member initialization lists are written in different orders (see Gotcha #52).
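The ordering rules above are easy to observe directly. What follows is only a minimal sketch: every name in it (orderLog, note, VBase, Base1, Base2, MemberA, MemberB, Derived, traceLifetime) is hypothetical and invented for illustration. Each constructor and destructor appends a tag to a shared log, so the sequence can be read back after an object has come and gone.

```cpp
#include <cassert>
#include <string>

// Hypothetical shared log; a function-local static avoids any
// dependence on the order of initialization of globals.
std::string &orderLog() {
    static std::string log;
    return log;
}

// Append a tag to the log, separating entries with single spaces.
void note( const char *what ) {
    if( !orderLog().empty() )
        orderLog() += ' ';
    orderLog() += what;
}

struct VBase { VBase() { note( "+VBase" ); } ~VBase() { note( "-VBase" ); } };
struct Base1 : virtual VBase { Base1() { note( "+Base1" ); } ~Base1() { note( "-Base1" ); } };
struct Base2 : virtual VBase { Base2() { note( "+Base2" ); } ~Base2() { note( "-Base2" ); } };
struct MemberA { MemberA() { note( "+a" ); } ~MemberA() { note( "-a" ); } };
struct MemberB { MemberB() { note( "+b" ); } ~MemberB() { note( "-b" ); } };

class Derived : public Base1, public Base2 {
public:
    Derived() { note( "+Derived" ); }   // constructor body runs last
    ~Derived() { note( "-Derived" ); }  // destructor body runs first
private:
    MemberA a_;   // declared first, so initialized before b_
    MemberB b_;
};

// Construct and destroy one Derived object, then report what happened.
std::string traceLifetime() {
    orderLog().clear();
    {
        Derived d;
    }
    return orderLog();
}
```

Calling traceLifetime() should yield the virtual base, the immediate bases in base-list order, the members in declaration order, and the constructor body, followed by exactly the reverse sequence for destruction.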
As a side effect or result of initialization, a constructor gathers resources for the object's use as the object is constructed. Often, the order in which these resources are seized is essential (for example, you have to lock the database before you write it; you have to get a file handle before you write to the file), and typically, the destructor has the job of releasing these resources in the inverse of the order in which they were seized. That there may be many constructors but only a single destructor implies that all constructors must execute their component initializations in the same sequence.
(This wasn't always the case, by the way. In the very early days of the language, the order of initializations in constructors was not fixed, which caused much difficulty for projects of any level of complexity. Like most language rules in C++, this one is the result of thoughtful design coupled with production experience.)
This symmetry of construction and destruction persists even as we move from the object structure itself to the uses of multiple objects. Consider a simple trace class:
›› gotcha67/trace.h
    #include <iostream>

    class Trace {
    public:
        Trace( const char *msg )
            : m_( msg ) { std::cout << "Entering " << m_ << std::endl; }
        ~Trace()
            { std::cout << "Exiting " << m_ << std::endl; }
    private:
        const char *m_; // not copied; must outlive the Trace object
    };
This trace class is perhaps a little too simple, in that it makes the assumption that its initializer is valid and will have a lifetime at least as long as the Trace object, but it's adequate for our purposes. A Trace object prints out a message when it's created and again when it's destroyed, so it can be used to trace flow of execution:
›› gotcha67/trace.cpp
    #include "trace.h"

    Trace a( "global" );

    void loopy( int cond1, int cond2 ) {
        Trace b( "function body" );
    it: Trace c( "later in body" );
        if( cond1 == cond2 )
            return;
        if( cond1-1 ) {
            Trace d( "if" );
            static Trace stat( "local static" );
            while( --cond1 ) {
                Trace e( "loop" );
                if( cond1 == cond2 )
                    goto it;
            }
            Trace f( "after loop" );
        }
        Trace g( "after if" );
    }
Calling the function loopy with the arguments 4 and 2 produces the following:
    Entering global
    Entering function body
    Entering later in body
    Entering if
    Entering local static
    Entering loop
    Exiting loop
    Entering loop
    Exiting loop
    Exiting if
    Exiting later in body
    Entering later in body
    Exiting later in body
    Exiting function body
    Exiting local static
    Exiting global
The messages show clearly how the lifetime of a Trace object is associated with the current scope of execution. In particular, note the effect the goto and return have on the lifetimes of the active Trace objects. Neither of these branches is exemplary coding practice, but they're the kinds of constructs that tend to appear as code is maintained.
    void doDB() {
        lockDB();
        // do stuff with database . . .
        unlockDB();
    }
In the code above, we've been careful to lock the database before access and unlock it when we've finished accessing it. Unfortunately, this is the kind of careful code that breaks under maintenance, particularly if the section of code between the lock and unlock is lengthy:
    void doDB() {
        lockDB();
        // . . .
        if( i_feel_like_it )
            return;
        // . . .
        unlockDB();
    }
Now we have a bug whenever the doDB function feels like it; the database will remain locked, and this will no doubt cause much difficulty elsewhere. Actually, even the original code was not properly written, because an exception might have been thrown after the database was locked but before it was unlocked. This would have the same effect as any branch past the call to unlockDB: the database would remain locked.
We could try to fix the problem by taking exceptions explicitly into account and by giving stern lectures to maintainers:
    void doDB() {
        lockDB();
        try {
            // do stuff with database . . .
        }
        catch( ... ) {
            unlockDB();
            throw;
        }
        unlockDB();
    }
This approach is wordy, low-tech, slow, hard to maintain, and will cause you to be mistaken for a member of the Department of Redundancy Department. Properly written, exception-safe code usually employs few try blocks. Instead, it uses resource acquisition is initialization:
    class DBLock {
    public:
        DBLock() { lockDB(); }
        ~DBLock() { unlockDB(); }
    };

    void doDB() {
        DBLock lock;
        // do stuff with database . . .
    }
The creation of a DBLock object causes the database lock resource to be seized. When the DBLock object goes out of scope for whatever reason, the destructor will reclaim the resource and unlock the database. This idiom is so commonly used in C++ that it often passes unnoticed. But any time you use a standard string, vector, list, or a host of other types, you're employing resource acquisition is initialization.
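To see that "for whatever reason" really does cover every exit path, here is a small sketch. The database API is not shown in the text, so lockDB and unlockDB below are hypothetical stand-ins that merely flip a flag, and isLocked, Exit, and lockedAfter are likewise invented for this illustration. The DBLock class itself is the one given above.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the real database API: a flag records lock state.
static bool dbLocked = false;
void lockDB() { dbLocked = true; }
void unlockDB() { dbLocked = false; }
bool isLocked() { return dbLocked; }

class DBLock {
public:
    DBLock() { lockDB(); }
    ~DBLock() { unlockDB(); }
};

// The three ways we will leave doDB (hypothetical, for the demonstration only).
enum Exit { Normal, EarlyReturn, Throw };

void doDB( Exit how ) {
    DBLock lock;
    assert( isLocked() );   // the lock is held for the rest of the function
    if( how == EarlyReturn )
        return;             // ~DBLock unlocks here
    if( how == Throw )
        throw std::runtime_error( "database error" ); // and during unwinding
    // do stuff with database . . .
}                           // and at the normal end of scope

// Run doDB with the requested exit and report whether the lock leaked.
bool lockedAfter( Exit how ) {
    try {
        doDB( how );
    }
    catch( const std::runtime_error & ) {
        // swallowed only so the lock state can be inspected afterward
    }
    return isLocked();
}
```

Under these assumptions, lockedAfter should report false for all three exits, and doDB contains neither a try block nor an explicit call to unlockDB.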
By the way, be wary of two common problems often associated with the use of resource handle classes like DBLock:
    void doDB() {
        DBLock lock1; // correct
        DBLock lock2(); // oops!
        DBLock(); // oops!
        // do stuff with database . . .
    }
The declaration of lock1 is correct; it's a DBLock object that comes into scope just before the terminating semicolon of the declaration and goes out of scope at the end of the block that contains its declaration (in this case, at the end of the function). The declaration of lock2 declares it to be a function that takes no arguments and returns a DBLock (see Gotcha #19). It's not an error, but it's probably not what was intended, since no locking or unlocking will be performed.
The line that follows it, the bare DBLock() call, is an expression-statement that creates an anonymous temporary DBLock object. This will indeed lock the database, but because the anonymous temporary is destroyed at the end of the full expression (just before the semicolon), the database will be immediately unlocked. Probably not what you want.
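The difference between a named handle and an anonymous temporary can be made visible with a sketch like the following. As before, lockDB and unlockDB are hypothetical stand-ins; here they append a letter to a hypothetical events log ('L' for lock, 'U' for unlock), and the "work" on the database is recorded as 'W'. The function names are also invented for the illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical event log standing in for real database activity.
static std::string events;
void lockDB() { events += 'L'; }
void unlockDB() { events += 'U'; }

class DBLock {
public:
    DBLock() { lockDB(); }
    ~DBLock() { unlockDB(); }
};

// A named handle holds the lock until the end of its enclosing block.
std::string withNamedHandle() {
    events.clear();
    {
        DBLock lock;
        events += 'W';   // work done while the database is locked
    }
    return events;
}

// An anonymous temporary is destroyed at the end of its own statement.
std::string withAnonymousTemporary() {
    events.clear();
    {
        DBLock();        // locks, then unlocks again right away
        events += 'W';   // work done with the database already unlocked
    }
    return events;
}
```

With the named handle, the work should land between the lock and the unlock; with the temporary, the lock and unlock should both be over before any work begins.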
The standard auto_ptr template is a useful general-purpose resource handle for objects allocated on the heap. See Gotchas #10 and #68.
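One caution for readers on current compilers: auto_ptr was deprecated in C++11 and removed from the standard library in C++17, with std::unique_ptr as its replacement. The sketch below therefore uses unique_ptr rather than auto_ptr, which is a substitution on our part and requires C++11 or later. Widget, liveWidgets, useWidget, and widgetsLeakedBy are hypothetical names used only to count live heap objects.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical instance counter, used to detect a leaked heap object.
static int liveWidgets = 0;

struct Widget {
    Widget() { ++liveWidgets; }
    ~Widget() { --liveWidgets; }
};

void useWidget( bool fail ) {
    std::unique_ptr<Widget> w( new Widget );   // the handle owns the heap object
    if( fail )
        throw std::runtime_error( "failure while using widget" );
    // no explicit delete: the handle's destructor deletes the Widget
    // on every path out of the function
}

// Run useWidget and report how many Widgets were left on the heap.
int widgetsLeakedBy( bool fail ) {
    liveWidgets = 0;
    try {
        useWidget( fail );
    }
    catch( const std::runtime_error & ) {
        // swallowed only so the counter can be inspected afterward
    }
    return liveWidgets;
}
```

The design is the same as DBLock: the resource (here, heap memory) is bound to a local object, so the count of live Widgets should return to zero whether useWidget returns normally or throws.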