- Runtime Type Information
- Exception Handling
- Static Initialization
Exception Handling
Nothing in the implementation of C++ is more complicated than exception handling, and I say this as someone who has written the runtime support library code and worked on a couple of C++ compilers. Exceptions in C++ are allowed to unwind the stack. This means that every stack frame must be given the chance to run destructors for variables with automatic storage and to catch exceptions. It must also be allowed to block exception propagation if an exception specification is violated.
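To make the unwinding requirement concrete, here is a minimal sketch of an exception passing through several frames, each of which owns a variable with automatic storage. The Tracer type, the function names, and the events log are hypothetical names invented for this illustration; they are not part of any runtime library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order in which things happen while the stack unwinds.
static std::vector<std::string> events;

// Hypothetical helper: logs its name when its destructor runs.
struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n)) {}
    ~Tracer() { events.push_back("~" + name); }
};

void innermost() {
    Tracer t("innermost");
    throw std::runtime_error("boom");  // unwinding starts here
}

void middle() {
    Tracer t("middle");
    innermost();  // the exception passes through this frame
}

std::vector<std::string> run_unwind_demo() {
    events.clear();
    try {
        Tracer t("outer");
        middle();
    } catch (const std::runtime_error& e) {
        // By the time the handler runs, every Tracer above has been destroyed.
        events.push_back(std::string("caught ") + e.what());
    }
    return events;
}
```

Calling run_unwind_demo() should yield the destructors in innermost-to-outermost order, followed by the handler's entry: each frame gets its chance to clean up before the catch block runs.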
A few difficulties are involved here. The first is unwinding the stack. Early implementations used setjmp() and longjmp() for this purpose. Anything that required cleanup or catch code to run would first save the current stored jump buffer onto the stack, and then call setjmp() and store the result in a per-thread buffer. On throwing an exception, it would call longjmp(), which would immediately return to the most recent setjmp() call. This would then run the cleanup code, restore the old jump target, and repeat the process. When the scope exited without throwing an exception, it would restore the old jump buffer without calling longjmp().
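The scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code any real compiler generated: current_handler, pending_exception, sjlj_throw(), and the frame functions are all hypothetical names, the "exception" is just an int, and the frames deliberately hold only trivial types, because calling longjmp() across objects with non-trivial destructors is undefined behavior in C++.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// Hypothetical per-thread state: the innermost active jump buffer and
// the value currently being "thrown".
static thread_local std::jmp_buf* current_handler = nullptr;
static thread_local int pending_exception = 0;

// Bookkeeping for the demo only.
static int cleanups_run = 0;

[[noreturn]] void sjlj_throw(int value) {
    if (current_handler == nullptr) std::abort();  // nothing can catch it
    pending_exception = value;
    std::longjmp(*current_handler, 1);  // resume at the most recent setjmp()
}

void leaf() { sjlj_throw(42); }

// A frame that needs cleanup but does not catch anything.
void frame_with_cleanup() {
    std::jmp_buf here;
    std::jmp_buf* saved = current_handler;  // save the old jump target
    current_handler = &here;
    if (setjmp(here) == 0) {
        leaf();
        current_handler = saved;  // normal exit: restore without longjmp()
    } else {
        current_handler = saved;        // restore the old jump target
        ++cleanups_run;                 // stands in for running destructors
        sjlj_throw(pending_exception);  // continue to the next frame out
    }
}

// A frame that catches the exception and returns its value.
int frame_with_catch() {
    std::jmp_buf here;
    std::jmp_buf* saved = current_handler;
    current_handler = &here;
    if (setjmp(here) == 0) {
        frame_with_cleanup();
        current_handler = saved;
        return 0;  // nothing was thrown
    }
    current_handler = saved;
    return pending_exception;  // the "catch" block
}
```

Note that both frame functions pay for the save, the setjmp() call, and the restore on every invocation, whether or not anything is thrown.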
This approach was very expensive. Every function that included destructors or catch statements had to run some extra code for every invocation, even if you never threw an exception. Since exceptions are meant to be for exceptional behavior, optimizing for the case where they weren't thrown seemed to make more sense.
Modern systems use so-called "zero-cost exceptions," so named because they aim to add no run-time cost when no exception is thrown. There are three common implementations: Itanium, ARM, and Win64. The Itanium ABI is used on most Free Software *NIX systems, including OS X (with some small modifications), and on most architectures, not just on Itanium. All three share the same underlying idea.
For every function, the compiler emits some metadata to a special section. In the example in part 1 of this series, this is the following symbol, which was shown by nm:
__ZN5outer5inner8functionEii.eh
In fact, it emitted two sets of tables: one for the generic unwinder, and one for the personality function. The personality function was the final symbol, which was a reference to an external definition:
___gxx_personality_v0
The generic unwinder is shared by all languages. The tables for it need to give the spill location for all callee-save registers (including the return address), so that it can restore the stack frame to the previous state. When you're unwinding through pure C code, this is all that's needed. Other languages (including GNU C and C++) need to be able to perform actions during the unwinding, and that's where the personality routine and the language-specific data area (LSDA) come in.
The personality function is passed the call-site address, the exception, and the address of the LSDA. It's responsible for determining what action (if any) should be taken for this frame. Most implementations use two-phase unwinding, where the first phase identifies the action for each stack frame, and then the second phase actually runs it. This structure allows you to terminate a program immediately on an uncaught exception, without needing to run all of the cleanup code.
The personality function parses the unwind tables, checks whether the exception should be caught, and indicates what the generic unwind framework should do.
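The two-phase structure can be modeled with a toy simulation. This is only a sketch of the control flow: ToyFrame, UnwindResult, toy_personality_catches(), and toy_unwind() are hypothetical names, the "tables" are plain struct fields rather than real unwind or LSDA data, and exception types are compared as strings. For simplicity, it also ignores any cleanup inside the frame that catches.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for what a personality function would learn from a
// frame's LSDA. This is not the real table layout.
struct ToyFrame {
    std::string name;
    bool has_cleanup;     // does this frame have destructors to run?
    std::string catches;  // type name caught here, empty if none
};

struct UnwindResult {
    bool handled;
    std::string handler_frame;
    std::vector<std::string> cleanups_run;
};

// Toy "personality": decides whether this frame catches the exception.
bool toy_personality_catches(const ToyFrame& f, const std::string& type) {
    return !f.catches.empty() && f.catches == type;
}

// stack[0] is the innermost (throwing) frame.
UnwindResult toy_unwind(const std::vector<ToyFrame>& stack,
                        const std::string& type) {
    UnwindResult r{false, "", {}};

    // Phase 1 (search): ask each frame whether it would catch.
    std::size_t handler = stack.size();
    for (std::size_t i = 0; i < stack.size(); ++i) {
        if (toy_personality_catches(stack[i], type)) {
            handler = i;
            break;
        }
    }
    // Uncaught: stop before running any cleanup. A real runtime would
    // typically terminate the program at this point.
    if (handler == stack.size()) return r;

    // Phase 2 (cleanup): run the cleanup for each frame below the handler.
    for (std::size_t i = 0; i < handler; ++i) {
        if (stack[i].has_cleanup) r.cleanups_run.push_back(stack[i].name);
    }
    r.handled = true;
    r.handler_frame = stack[handler].name;
    return r;
}
```

The early return after phase 1 is the point of the design: when no frame claims the exception, nothing has been torn down yet, so the program can stop with its stack intact.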