Blocks
With OS X 10.6, Apple introduced a new extension to C, in the form of blocks. These are real closures that can be used in any C, C++, or Objective-C program. They have none of the disadvantages of nested functions, but introduce a disadvantage of their own: They cannot be used transparently in places where C function pointers can.
From the programmer's perspective, blocks are simple. Block types use a caret (^) instead of an asterisk (*), but are otherwise identical to function pointers. For example, int(*)(void) defines a function pointer that takes no arguments and returns an integer, whereas int(^)(void) defines a block pointer that takes no arguments and returns an integer. No explicit arguments, that is: the block still takes a hidden pointer to the block object as its first argument, which is why you can't use it where a real function pointer is expected.
When you create a block, it's compiled to a function with the hidden argument, along with a structure representing the block object itself. The structure looks something like this:
struct block_literal {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(struct block_literal*, ...);
    struct block_descriptor *descriptor;
};
The isa pointer is for Objective-C compatibility; it points to an Objective-C class, allowing blocks to receive Objective-C messages. The flags indicate some properties about the block, and the reserved field is unused by the compiler.
The invoke field contains a pointer to the function that implements the block. It's called by passing the structure as its argument. You can invoke blocks with a compiler that doesn't understand blocks by calling this function directly, passing the block as the argument. The following two are equivalent:
block();
((struct block_literal*)block)->invoke(block);
The first requires compiler support, and the second doesn't. We use macros for doing this in GNUstep, allowing us to implement methods that take blocks as arguments without breaking support for GCC.
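As a rough illustration of calling through the invoke field, here is a minimal sketch in plain C that needs no blocks support. The block object is built by hand as a stand-in for what a blocks-aware compiler would emit, and the names CALL_INT_BLOCK, forty_two_invoke, forty_two_block, and call_example are hypothetical. This is not the GNUstep macro itself, only the same idea under those assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct block_descriptor; /* opaque in this sketch */

/* Layout taken from the text above. */
struct block_literal {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(struct block_literal *, ...);
    struct block_descriptor *descriptor;
};

/* Hypothetical macro: call a block that returns int and takes no explicit
 * arguments. The invoke pointer is cast back to the function's real type
 * before the call, and the block itself is passed as the hidden argument. */
#define CALL_INT_BLOCK(b) \
    (((int (*)(struct block_literal *))((struct block_literal *)(b))->invoke)( \
        (struct block_literal *)(b)))

/* Stand-in for the function a compiler would generate for the block body. */
static int forty_two_invoke(struct block_literal *self)
{
    (void)self; /* this block captures nothing */
    return 42;
}

/* Hand-built stand-in for the block object (assumption: isa, flags, and
 * descriptor are left empty because nothing in this sketch reads them). */
static struct block_literal forty_two_block = {
    NULL, 0, 0,
    (void (*)(struct block_literal *, ...))forty_two_invoke,
    NULL
};

/* Invokes the block without any compiler support for blocks. */
int call_example(void)
{
    int result = CALL_INT_BLOCK(&forty_two_block);
    assert(result == 42);
    return result;
}
```

The cast inside the macro matters: the invoke field is declared with a generic variadic type, so the caller has to restore the real signature before calling through it.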
The final field contains metadata about the block. This is most commonly used when the block is copied or destroyed.
When you first create a block, the structure is allocated on the stack. This is very fast, but it means that the block can't outlive the stack frame that created it, so it can't be returned to the caller. When you want to keep a pointer to the block, you must call the _Block_copy() function. If it's passed an on-stack block, it copies the block to the heap. If it's passed an on-heap block, it increments the reference count and returns the original; a global block is simply returned as it is.
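The copy rules can be modeled in a few lines of plain C. This is only a sketch of the behavior described here, not the runtime's code: the structure, the flag values, and the functions sketch_copy() and sketch_release() are all hypothetical, and keeping the reference count in its own field is a simplification for readability.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flags for this sketch only. */
enum { SKETCH_ON_HEAP = 1, SKETCH_GLOBAL = 2 };

struct sketch_block {
    int flags;
    int refcount; /* simplification: a separate counter field */
    int captured; /* stands in for any captured state */
};

/* Models the copy operation: promote stack blocks, retain heap blocks. */
struct sketch_block *sketch_copy(struct sketch_block *b)
{
    if (b->flags & SKETCH_GLOBAL)
        return b;               /* global blocks never move */
    if (b->flags & SKETCH_ON_HEAP) {
        b->refcount++;          /* already on the heap: just retain */
        return b;
    }
    struct sketch_block *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return NULL;
    memcpy(heap, b, sizeof *heap); /* on-stack block: copy to the heap */
    heap->flags |= SKETCH_ON_HEAP;
    heap->refcount = 1;
    return heap;
}

/* Models the matching release; returns 1 if the block was freed. */
int sketch_release(struct sketch_block *b)
{
    if (!(b->flags & SKETCH_ON_HEAP))
        return 0;               /* stack and global blocks are not freed */
    if (--b->refcount > 0)
        return 0;
    free(b);
    return 1;
}
```

Copying an on-stack sketch_block yields a new heap object, copying that heap object again returns the same pointer with a higher count, and the object is freed only when every copy has been balanced by a release.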
Each call to _Block_copy() must have a matching call to _Block_release(), which decrements the block's reference count and frees it if it reaches zero. That's fairly simple for the block itself, but what about the other variables? Consider this function:
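#include <Block.h>  /* declares _Block_copy() and _Block_release() */
typedef int (^counter_t)(void);  /* a block type can't be written directly as a return type */
counter_t getCounter(int step)
{
    __block int c = 0;  /* shared with the block by reference */
    counter_t counter = ^int(void) { c += step; return c - step; };
    return _Block_copy(counter);  /* the caller must balance this with _Block_release() */
}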
This returns a block that refers to two variables. Both are primitive values, so this is the simplest case. Blocks can have two kinds of external references. The step variable is not qualified, so it's simply copied: the block has its own private copy, stored at the end of the block structure. The c variable has the __block storage qualifier, meaning that it's passed by reference into the block, via another structure. Therefore, multiple blocks can refer to c (although they don't in this case), and they'll all refer to the same version.
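The difference between the two kinds of capture can be sketched by lowering the counter by hand into plain C. Everything here is hypothetical and simplified (the structures byref_c and counter_block, the functions counter_invoke() and make_counter(), and the omission of the isa, flags, and descriptor fields); it shows only that step is stored per block while c lives in shared storage.

```c
#include <assert.h>

/* Shared, mutable storage standing in for the __block variable c. */
struct byref_c {
    int c;
};

/* Simplified block object: the header fields are left out of this sketch. */
struct counter_block {
    int (*invoke)(struct counter_block *);
    int step;          /* captured by value: a private copy per block */
    struct byref_c *c; /* captured by reference: may be shared */
};

/* Body of the block: same logic as the counter in the text. */
static int counter_invoke(struct counter_block *self)
{
    self->c->c += self->step;
    return self->c->c - self->step;
}

/* Builds a counter over the given shared storage. */
struct counter_block make_counter(struct byref_c *shared, int step)
{
    struct counter_block b = { counter_invoke, step, shared };
    return b;
}
```

Two counters built over the same byref_c with different steps each keep their own step but see each other's updates to c, which is the aliasing that makes copying a __block variable harder than copying a plain captured value.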
Copying step is trivial; it's immutable, so it's simply copied as part of the block; and because it's primitive, it can be copied with memcpy() or some equivalent. Copying c is harder. It's mutable and (potentially) aliased. Behind the scenes, a byref structure will be created, containing (among other things) a pointer to the on-stack version. The block structure will contain a pointer to this byref structure. The block descriptor will contain a pointer to an automatically generated "copy helper" function, which will call a blocks runtime function for copying it. Confused yet?
When you call _Block_copy(), the blocks runtime will look at the descriptor, see that it contains a copy helper, and then call the copy helper. This copy helper will call another (semi-private) blocks runtime function to copy the byref structure. This function will promote the byref structure to the heap and will update the forwarding pointer in the original, so that any access made through the original now reaches the on-heap version.
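The forwarding pointer is easier to follow in code. The following is a minimal plain C sketch of the idea, assuming a byref structure reduced to two fields; the names byref_int, sketch_byref_copy(), and BYREF_VALUE are hypothetical, and reference counting of the heap copy is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified byref structure: just the forwarding pointer and the value. */
struct byref_int {
    struct byref_int *forwarding; /* always points at the live copy */
    int value;
};

/* Every access goes through the forwarding pointer, never the field itself. */
#define BYREF_VALUE(v) ((v).forwarding->value)

/* Hypothetical stand-in for the runtime's byref copy function. */
struct byref_int *sketch_byref_copy(struct byref_int *src)
{
    if (src->forwarding != src)
        return src->forwarding;  /* already promoted: reuse the heap copy */
    struct byref_int *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return NULL;
    heap->value = src->value;
    heap->forwarding = heap;     /* the heap copy forwards to itself */
    src->forwarding = heap;      /* the stack copy now forwards to the heap */
    return heap;
}
```

Before the copy, BYREF_VALUE on the stack variable touches the stack storage. After the copy, the same expression reads and writes the heap version, so the enclosing function and any copied blocks keep agreeing on one value.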
The copy helpers can become quite complicated, especially when C++ gets involved. A variable marked __block may be a complex C++ object. In this case, the copy helper will invoke the object's copy constructor when copying it to the heap. Similarly, a "dispose helper" will call the object's destructor.