- When Bad Things Happen to Good Programs
- Guidebook to Memory Mis-Pointing
- Where Are We Now?
In the abstract, the only things that can go wrong with C and C++ memory management are bad pointers and leaked memory. Pointers might be uninitialized; they might be incorrectly initialized (often to NULL); they might point to memory that's no longer safe to use; or they might point to the wrong data, frequently mistyped data.
Let's start by looking at examples of all of these errors.
Here's an uninitialized pointer:

    void purchase()
    {
        struct CustomerRecord *purchaser;

        /* Oops! purchaser hasn't been set yet. */
        printf("Hello, %s; as this is your first ...", purchaser->first_name);
        other_stuff();
    }

What was probably intended was something like this:

    void purchase()
    {
        struct CustomerRecord *purchaser;

        purchaser = getCurrentPurchaser();
        printf("Hello, %s; as this is your first ...", purchaser->first_name);
        other_stuff();
    }

This second form might have other problems, though; perhaps getCurrentPurchaser() returns a special value such as NULL if there is no current purchaser or, more subtly, if a credit card authorization hasn't yet completed. In that case, we'd need this:

    void purchase()
    {
        struct CustomerRecord *purchaser;

        purchaser = getCurrentPurchaser();
        if (purchaser == NULL)
            handle_this_case();
        else {
            printf("Hello, %s; as this is your first ...", purchaser->first_name);
            other_stuff();
        }
    }

One of the difficulties of conventional procedural programming languages is that a variable can, well, vary. Perhaps you examine source code carefully to establish that a variable has the value it should at one point, but it still might be wrong elsewhere in the source. Pointers exhibit a secondary form of this difficulty: pointers can have the same value at two different times, but point to different data, or even to non-data, at those times. Here's an example:

    Thing *get_one_thing(void);   /* declared before its first use */

    void upper()
    {
        Thing *this_thing;

        this_thing = get_one_thing();
        do_other_stuff();
        puts("Uh oh! By this point, this_thing is likely to point to corrupt data.");
    }

    Thing *get_one_thing()
    {
        Thing reference_thing;

        reference_thing.code = 1;
        reference_thing.type = 2;
        /* reference_thing, and its address, are perfectly fine--*before* the return. */
        return &reference_thing;
    }

In this case, reference_thing is correct at first. As it's located in the stack, though (at least for conventional compilers), it's susceptible to corruption after a return surrenders the stack.
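One common way to avoid this particular failure is to let the caller own the storage, so that nothing outlives its stack frame. The following is only a sketch under assumptions: the Thing layout is a stand-in with just the two fields used above, and init_one_thing() and thing_sum() are hypothetical names, not part of the original example.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed stand-in for Thing: only the two fields the example uses. */
typedef struct {
    int code;
    int type;
} Thing;

/* Hypothetical replacement for get_one_thing(): the caller supplies
   the storage, and this function only fills it in. */
void init_one_thing(Thing *out)
{
    assert(out != NULL);
    out->code = 1;
    out->type = 2;
}

/* The Thing lives in this function's own frame, so it remains valid
   for as long as this function is running. */
int thing_sum(void)
{
    Thing this_thing;

    init_one_thing(&this_thing);
    return this_thing.code + this_thing.type;
}
```

Returning the struct by value, or allocating it on the heap, are other reasonable choices; the heap variant shifts the burden to remembering when to free().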
The corresponding heap memory error is even more common, in my experience:

    Thing *get_one_thing(void);   /* declared before its first use */

    void upper()
    {
        Thing *this_thing;

        this_thing = get_one_thing();
        operate(this_thing);
        free((void *) this_thing);
        /* Nope; this_thing's data are no longer safe, after the free(). */
        operate(this_thing);
    }

    Thing *get_one_thing()
    {
        Thing *reference_thing_ptr;

        reference_thing_ptr = (Thing *) malloc(sizeof(Thing));
        reference_thing_ptr->code = 1;
        reference_thing_ptr->type = 2;
        return reference_thing_ptr;
    }

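A defensive habit that catches some of these use-after-free mistakes is to reset a pointer to NULL in the same step that frees it, and to have consumers check for NULL. The sketch below assumes a two-field Thing; release_thing() and thing_code() are hypothetical helpers introduced only for illustration. It does nothing for other copies of the same pointer held elsewhere, so it narrows the problem rather than eliminating it.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-in for Thing: only the two fields the example uses. */
typedef struct {
    int code;
    int type;
} Thing;

/* Allocates and fills in a Thing; returns NULL if malloc() fails. */
Thing *get_one_thing(void)
{
    Thing *p = malloc(sizeof *p);

    if (p == NULL)
        return NULL;
    p->code = 1;
    p->type = 2;
    return p;
}

/* Hypothetical helper: frees the Thing and clears the caller's
   pointer, so a later use sees NULL instead of stale memory. */
void release_thing(Thing **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;
}

/* Hypothetical consumer: returns the code, or -1 if there is no Thing. */
int thing_code(const Thing *t)
{
    return t != NULL ? t->code : -1;
}
```

With this arrangement, a second operation after the release sees NULL and can fail in a controlled way, and releasing twice is harmless.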
A final model for memory corruption is mistyped dereferencing. This is the category of the buffer overflows that so often yield exploits. Here's an example:

    char *bad_implementation_of_strdup(char *string)
    {
        char *ptr;

        /* Oh no! Do you see the missing "+ 1"? */
        ptr = (char *) malloc(strlen(string));
        strcpy(ptr, string);
        return ptr;
    }

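Think of ptr here as the address of a LENGTH-long character array, while string has the distinct type of a (LENGTH + 1) array: strcpy() copies the terminating null character as well, so it writes one byte past the end of the block that malloc() returned.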
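For comparison, here is a sketch of how the allocation might be sized correctly. The name careful_strdup() is hypothetical, and the NULL return on allocation failure is an assumption about the desired behavior rather than something the original example specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical corrected version: returns a heap copy of string that
   the caller must free(), or NULL if the allocation fails. */
char *careful_strdup(const char *string)
{
    size_t size;
    char *ptr;

    assert(string != NULL);
    size = strlen(string) + 1;   /* + 1 for the terminating '\0' */
    ptr = malloc(size);
    if (ptr == NULL)
        return NULL;
    memcpy(ptr, string, size);   /* copies the terminator too */
    return ptr;
}
```

Computing the size once and using it for both the allocation and the copy keeps the two from drifting apart, which is exactly the mismatch the faulty version suffers from.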