Static Initialization
In C, you can only initialize static variables with compile-time constant expressions. This limitation exists because static variables are implemented by emitting the initial value into a section of the executable file, which is then either copied or mapped into the process's address space when the program runs. In C++, this limitation doesn't exist. You can do things like this at the global scope:
FILE *myFile = fopen("someFile", "r");
This involves a system call, and the result contains a reference to a kernel resource, so there's no way that the compiler can evaluate this at compile time and store the result in the final binary. Instead, the C++ compiler will emit a private function that performs the initialization. This function will then be added to the global constructors section in the resulting binary, and the loader will call every function there. For C programmers, GCC and compatible compilers provide the same functionality via __attribute__((constructor)). A function declared with this attribute will be called by the loader. It's somewhat dangerous to use, because the order in which these functions will be called (especially between compilation units) is undefined, so it's usually better to have an explicit initialization function.
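To make this concrete, here is a minimal sketch, assuming GCC or Clang (the attribute is a compiler extension, not standard C++). The names initCount, computeAtStartup, dynamicGlobal, and explicitCtor are invented for illustration. Both the compiler-generated initializer for dynamicGlobal and the explicitly marked function run before main(), and the sketch deliberately avoids depending on which of the two runs first.

```cpp
#include <cstdio>

// Zero-initialized from the binary image, so it is valid before any
// startup code has run.
static int initCount = 0;

// Not a constant expression: the compiler has to emit a hidden function
// that calls this and register it in the global constructors section.
static int computeAtStartup() {
    return ++initCount;
}
int dynamicGlobal = computeAtStartup();

// GCC/Clang extension: ask for this function to be run before main().
// Its order relative to the initializer above should not be relied upon.
__attribute__((constructor))
static void explicitCtor() {
    ++initCount;
}

// Helper for callers that want to inspect the result after startup.
int startupInitializerCount() {
    return initCount;
}
```

Calling startupInitializerCount() from main() should report that both initializers have already run, even though main() never called either of them explicitly.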
This system works fine for variables declared in source files, but what about static variables declared in templates or functions in header files? The compiler may emit multiple copies of the variable, expecting the linker to combine them, but it will also need to emit multiple copies of the initialization code. It can't rely on the linker to combine these copies, but it needs to make sure that the variable is initialized. Each file will contain something similar to this:
FILE *myFile;
int64_t _ZGV6myFile;
__attribute__((constructor))
static void __cxx_ctors(void)
{
    if (__cxa_guard_acquire(&_ZGV6myFile))
    {
        try
        {
            myFile = fopen("someFile", "r");
            __cxa_guard_release(&_ZGV6myFile);
        }
        catch (...)
        {
            __cxa_guard_abort(&_ZGV6myFile);
        }
    }
}
This simple bit of source code expands to some very complex code emitted by the compiler. The linker will combine the multiple definitions of myFile and of its associated guard variable (named _ZGV6myFile in the Itanium name-mangling scheme), because they're globals in the same scope. It won't combine the __cxx_ctors() functions, because they don't have external linkage.
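The same guarded pattern is easy to observe with a function-local static, which the compiler also has to initialize at most once. The following is a small sketch rather than anything taken from a real codebase. The names initializerRuns, expensiveInit, and counter are hypothetical, and the comment about the guard calls assumes a compiler that follows the Itanium C++ ABI.

```cpp
#include <cstdio>

// Counts how many times the initializer actually executes.
static int initializerRuns = 0;

static int expensiveInit() {
    ++initializerRuns;
    return 42;
}

int &counter() {
    // One line of source. Under the Itanium ABI the compiler typically
    // wraps this initialization in __cxa_guard_acquire() and
    // __cxa_guard_release() calls on a hidden guard variable, so the
    // initializer runs only on the first call.
    static int value = expensiveInit();
    return value;
}
```

However many times counter() is called, expensiveInit() runs only once; later calls only check the guard and return the existing object.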
The guard variable is designed to be very quick in the common case, so it uses the low bit to indicate whether initialization is complete. The rest is typically used to provide a spin lock, using atomic compare and exchange operations.
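As a rough illustration of that design, here is a simplified guard built on std::atomic. This is only a sketch under stated assumptions and not the code of any real C++ runtime: SketchGuard, the sketchGuard* functions, and the choice of bit 1 as the lock bit are all hypothetical, and a production implementation would normally block waiting threads rather than spin indefinitely.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical layout: bit 0 set means "initialized",
// bit 1 set means "some thread is currently initializing".
constexpr std::uint64_t kDone = 1;
constexpr std::uint64_t kLocked = 2;

struct SketchGuard {
    std::atomic<std::uint64_t> state{0};
};

// Returns true if the caller must run the initializer, false if the
// variable has already been initialized.
bool sketchGuardAcquire(SketchGuard &g) {
    for (;;) {
        std::uint64_t seen = g.state.load(std::memory_order_acquire);
        if (seen & kDone) {
            return false;  // common case: one load and one bit test
        }
        if (seen == 0 &&
            g.state.compare_exchange_weak(seen, kLocked,
                                          std::memory_order_acquire)) {
            return true;   // this thread won the race to initialize
        }
        // Another thread holds the lock, or the exchange failed: retry.
    }
}

// Initialization succeeded: publish the "done" bit and drop the lock.
void sketchGuardRelease(SketchGuard &g) {
    g.state.store(kDone, std::memory_order_release);
}

// Initialization threw: clear the lock so that a later caller can retry.
void sketchGuardAbort(SketchGuard &g) {
    g.state.store(0, std::memory_order_release);
}
```

The point of the layout is that once initialization has completed, every later acquire costs a single atomic load and a bit test; the compare-and-exchange path is only taken while the variable is still uninitialized.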
As you can see, C++ loses one of the biggest advantages of C: the fact that the source code gives a good indication of the runtime complexity. In C++, the same statement can have very different implementations, depending on its type, or even its location.