Fun with the Objective-C Runtime
One of the nicest things about Objective-C is that there is no magic. This is something it inherits from Smalltalk. In a typical Smalltalk environment, such as Smalltalk-80 or a modern version like Squeak, the virtual machine is written in a subset of Smalltalk. This is then compiled to native code, and the rest of the environment is written in interpreted (or just-in-time-compiled) Smalltalk. With Objective-C, the subset used to implement the dynamic aspects is called C. You have full access from Objective-C code to a great many aspects of the runtime model, and can inspect and modify them.
In a classical Objective-C system, like the old NeXT runtime or the GNU version, classes were instances of a C structure declared in a header. This structure could be modified just like any other C structure. You needed to be careful when doing this, and be aware of how the runtime system used the structures, but everything was present that you needed to construct new classes, add or remove methods from a class, introspect instance variables, and so on.
With Leopard, Apple introduced a new runtime library. Leopard shipped with both the legacy (NeXT) runtime and the modern (Apple) runtime. On Mac OS X, 32-bit programs continue to use the legacy runtime, while 64-bit programs use the modern one.
To ease porting, Apple introduced a new set of public APIs for interacting with the runtime. The class structure is now an opaque type, and you use the same set of functions to manipulate it whether you're using the NeXT or Apple runtime. You can now use these same APIs on other platforms with the GNU runtime if you link against the Étoilé Objective-C 2 compatibility framework.
Because there is now a portable, well-supported set of APIs (rather than an ad hoc collection of public data structures) for interacting with the runtime, it becomes a much more interesting thing to do.
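As a small taste of what these APIs allow, the following sketch builds a class at runtime and adds a method to it. It assumes Apple's runtime and Foundation, compiled with ARC (for example, clang -fobjc-arc -framework Foundation). The class name RuntimeGreeter, the selector greeting, and the helpers greet() and makeGreeterClass() are invented for illustration.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

// Hypothetical method body. Every IMP receives self and _cmd first.
static NSString *greet(id self, SEL _cmd) {
    return [NSString stringWithFormat:@"Hello from %s",
                                      class_getName(object_getClass(self))];
}

// Builds (or returns the already registered) RuntimeGreeter class.
static Class makeGreeterClass(void) {
    Class existing = NSClassFromString(@"RuntimeGreeter");
    if (existing != Nil) return existing;

    // Allocate a new subclass of NSObject with no extra bytes.
    Class cls = objc_allocateClassPair([NSObject class], "RuntimeGreeter", 0);
    // "@@:" means: returns an object, takes self (id) and _cmd (SEL).
    class_addMethod(cls, sel_registerName("greeting"), (IMP)greet, "@@:");
    // The class can be instantiated only after it is registered.
    objc_registerClassPair(cls);
    return cls;
}

int main(void) {
    @autoreleasepool {
        id obj = [[makeGreeterClass() alloc] init];
        // Cast objc_msgSend to the exact signature of the method being called.
        NSString *text = ((NSString *(*)(id, SEL))objc_msgSend)(
            obj, sel_registerName("greeting"));
        NSLog(@"%@", text); // logs "Hello from RuntimeGreeter"
    }
    return 0;
}
```

Nothing here touches the class structure directly; the class is created, extended, and registered entirely through opaque types and functions.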
Key-Value Coding
Several things in the Cocoa frameworks interact with the Objective-C runtime and then hide this interaction behind Objective-C methods. Much of the code in NSObject is responsible for ensuring that the average programmer doesn't have to deal with the runtime directly. One of the most obvious places where the low-level runtime interactions are hidden is in key-value coding (KVC).
The KVC mechanism allows you to hide how a property is accessed. When you send a -valueForKey: message to an object, it may be reading the value directly from an instance variable, calling an accessor, or calling some fallback code in -valueForUndefinedKey: if neither the instance variable nor the accessor method is available.
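The three routes can be seen side by side in a minimal sketch. The Person class and its keys are hypothetical, and the code assumes Foundation with ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class used only for illustration.
@interface Person : NSObject {
    NSString *nickname;   // no accessor: KVC reads the ivar directly
}
- (NSString *)name;       // accessor: KVC calls this method
@end

@implementation Person
- (id)init {
    if ((self = [super init])) {
        nickname = @"Ali";
    }
    return self;
}
- (NSString *)name {
    return @"Alice";
}
// Fallback for keys with neither an accessor nor an instance variable.
- (id)valueForUndefinedKey:(NSString *)key {
    return [NSString stringWithFormat:@"<no value for %@>", key];
}
@end

int main(void) {
    @autoreleasepool {
        Person *p = [[Person alloc] init];
        NSLog(@"%@", [p valueForKey:@"name"]);      // answered by -name
        NSLog(@"%@", [p valueForKey:@"nickname"]);  // read from the ivar
        NSLog(@"%@", [p valueForKey:@"age"]);       // -valueForUndefinedKey:
    }
    return 0;
}
```

The caller uses the same message in all three cases and never learns which route produced the value.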
While this looks a bit like magic, it isn't. You can do exactly the same thing in your code. As a slightly contrived example, you might want to construct an atomic version of something like KVC that will lock a property and then access it. Let's start by defining an interface. We want to add a method to NSObject with a signature like this:
- (id)atomicallyReadKey: (NSString*)aKey setToNewValue: (id)anObject;
This should use KVC to get and set the value for a given key. During this time, it should lock that property in one of three ways:
- If the object declares an instance variable with the key name and the suffix _lock, we should synchronize on the instance variable.
- If the object declares a -lock{key} and -unlock{key} method, we should call those methods.
- Otherwise, we should synchronize on the object containing the key.
We'll look at how to do this for each case. Before we start, let's define a simple convenience function that does the real test-and-set operation, setting the new value and returning the old one. This function assumes that the locking is done outside, and so is called for each of the three cases:
static inline id testAndSetValue(id object, NSString *key, id value)
{
    id result = [[[object valueForKey: key] retain] autorelease];
    [object setValue: value forKey: key];
    return result;
}
Case 1: Dedicated Instance Variable
Let's consider the first case, when we have a dedicated instance variable for the lock. We get the instance variable in three steps:
Construct a string containing the instance variable name. We do this by appending the _lock suffix to the key:
NSString *ivarName = [aKey stringByAppendingString: @"_lock"];
Look up the instance variable. The class_getInstanceVariable() function takes a class and a C string containing the instance variable name as arguments. The return value is of the opaque type Ivar. This is a pointer to a structure containing the metadata about the instance variable. On the NeXT runtime, this structure is visible and contains the name, type encoding, and offset of the instance variable. For the modern runtime, it also contains a few other things, such as the alignment of the instance variable, but is private and should only be accessed via other functions. You can cast this pointer to something if you want to inspect it directly, but then your code isn't guaranteed to work on any future version of OS X.
We get the relevant Ivar like this:
Ivar lockIvar = class_getInstanceVariable(object_getClass(self), [ivarName UTF8String]);
If the instance variable exists, we get back a pointer; otherwise, we get NULL. The fact that the instance variable exists isn't quite sufficient for our purpose, however; we also need it to be an object. So we test that the type encoding for the instance variable is the same as the type encoding for an object (@).
If the instance variable is of the correct type, we need to access the object to which the instance variable is pointing. Then we use this object as the lock:
if (NULL != lockIvar && strcmp(ivar_getTypeEncoding(lockIvar), @encode(id)) == 0)
{
    id lockObject = object_getIvar(self, lockIvar);
    @synchronized(lockObject)
    {
        return testAndSetValue(self, aKey, anObject);
    }
}
The ivar_getTypeEncoding() function returns a C string representing the type encoding of the instance variable. A simple string comparison tells us whether the returned encoding is the same as the encoding of an object pointer. We can get the value of object instance variables easily by using object_getIvar(). For other types, we need to use ivar_getOffset(), which returns the byte offset of the instance variable from the start of the object; adding that offset to the object's address gives a pointer to the instance variable.
We could have written this example in the following form, which is more general but far less readable:
id lockObject = *(id*)(((char*)self)+ivar_getOffset(lockIvar));
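The offset-based form is the one that works for instance variables that are not objects. Here is a minimal sketch that reads an int instance variable by name; the Counter class and the readIntIvar() helper are hypothetical, and the code assumes Apple's runtime with ARC (hence the __bridge cast).

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class with a non-object instance variable.
@interface Counter : NSObject {
    int count;
}
@end

@implementation Counter
- (id)init {
    if ((self = [super init])) {
        count = 42;
    }
    return self;
}
@end

// Reads an int ivar by name. Returns NO if it is missing or not an int.
static BOOL readIntIvar(id object, const char *name, int *outValue) {
    Ivar ivar = class_getInstanceVariable(object_getClass(object), name);
    if (ivar == NULL ||
        strcmp(ivar_getTypeEncoding(ivar), @encode(int)) != 0) {
        return NO;
    }
    // The offset is measured in bytes from the start of the object.
    char *base = (char *)(__bridge void *)object;
    *outValue = *(int *)(base + ivar_getOffset(ivar));
    return YES;
}

int main(void) {
    @autoreleasepool {
        int value = 0;
        if (readIntIvar([[Counter alloc] init], "count", &value)) {
            NSLog(@"count = %d", value); // logs "count = 42"
        }
    }
    return 0;
}
```

Checking the type encoding before dereferencing matters here, because the cast to int * is only valid if the instance variable really is an int.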
Case 2: Object-Declared Lock/Unlock Methods
If case 1 failed (there's no dedicated instance variable for the lock), we next try calling -lock{key} and -unlock{key}. First, we need to construct selectors for these two methods. KVC does some clever substitutions (for example, capitalizing the key), but for this simple example we just need to prepend lock and unlock:
NSString *lockMethodName = [@"lock" stringByAppendingString: aKey];
NSString *unlockMethodName = [@"unlock" stringByAppendingString: aKey];
SEL lockMethodSel = NSSelectorFromString(lockMethodName);
SEL unlockMethodSel = NSSelectorFromString(unlockMethodName);
So far, this is all standard Foundation stuff. In fact, we can do this whole step just using NSObject, with the -respondsToSelector: and -performSelector: methods. For most cases, this technique is sensible, but since the point of this example is to demonstrate how to use the runtime system, we won't use this technique.
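For comparison, here is a sketch of what that Foundation-only version might look like. The Account class and the lockedSwap() helper are hypothetical names; the code assumes Foundation, and under ARC the compiler may warn on -performSelector: because it cannot see which method the selector names.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class exposing -lock{key} and -unlock{key} for "balance".
@interface Account : NSObject {
    id balance;
    NSLock *balanceLock;
}
- (void)lockbalance;
- (void)unlockbalance;
@end

@implementation Account
- (id)init {
    if ((self = [super init])) {
        balance = [NSNumber numberWithInt:100];
        balanceLock = [[NSLock alloc] init];
    }
    return self;
}
- (void)lockbalance { [balanceLock lock]; }
- (void)unlockbalance { [balanceLock unlock]; }
@end

// Returns the old value, or nil if the object has no lock/unlock pair.
static id lockedSwap(id object, NSString *key, id newValue) {
    SEL lockSel = NSSelectorFromString([@"lock" stringByAppendingString:key]);
    SEL unlockSel = NSSelectorFromString([@"unlock" stringByAppendingString:key]);
    if (![object respondsToSelector:lockSel] ||
        ![object respondsToSelector:unlockSel]) {
        return nil;
    }
    [object performSelector:lockSel];
    id old = [object valueForKey:key];
    [object setValue:newValue forKey:key];
    [object performSelector:unlockSel];
    return old;
}

int main(void) {
    @autoreleasepool {
        Account *account = [[Account alloc] init];
        id old = lockedSwap(account, @"balance", [NSNumber numberWithInt:250]);
        NSLog(@"old = %@, new = %@", old, [account valueForKey:@"balance"]);
    }
    return 0;
}
```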
There are several ways of testing whether an object responds to a selector. The simplest is to use class_respondsToSelector(), which is the underlying call used to implement -respondsToSelector: in NSObject. Instead, we'll use a slightly more complex version:
Method lockMethod = class_getInstanceMethod(object_getClass(self), lockMethodSel);
Method unlockMethod = class_getInstanceMethod(object_getClass(self), unlockMethodSel);
This technique is analogous to the function we used earlier to get the instance variable. The return value is another opaque type, this time containing metadata about methods. You can inspect it in a way similar to inspecting the Ivar type. You can also modify it; we'll look at how to do that a bit later.
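Inspecting a Method through the public functions might look like the following sketch. It assumes Apple's runtime; the helper argumentCount() and the selector noSuchMethodForThisSketch are made up for the example.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Returns the argument count of an instance method, or 0 if it is missing.
static unsigned int argumentCount(Class cls, SEL sel) {
    Method m = class_getInstanceMethod(cls, sel);
    return (m != NULL) ? method_getNumberOfArguments(m) : 0;
}

int main(void) {
    @autoreleasepool {
        Class cls = [NSObject class];
        Method m = class_getInstanceMethod(cls, @selector(description));
        if (m != NULL) {
            NSLog(@"name: %s", sel_getName(method_getName(m)));
            // The exact encoding string varies by architecture.
            NSLog(@"types: %s", method_getTypeEncoding(m));
            // The hidden self and _cmd parameters are included in the count.
            NSLog(@"arguments: %u", argumentCount(cls, @selector(description)));
        }

        // A selector that no class is expected to implement.
        SEL missing = sel_registerName("noSuchMethodForThisSketch");
        NSLog(@"missing method is %s",
              class_getInstanceMethod(cls, missing) == NULL ? "NULL" : "not NULL");
    }
    return 0;
}
```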
If this call returns NULL, the method doesn't exist. If it returns a valid value, we need to call the method. Here's the simplest way:
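if (lockMethod != NULL && unlockMethod != NULL)
{
    // Cast objc_msgSend to the real signature; recent SDKs require this.
    void (*send)(id, SEL) = (void (*)(id, SEL))objc_msgSend;
    send(self, lockMethodSel);
    id result = testAndSetValue(self, aKey, anObject);
    send(self, unlockMethodSel);
    return result;
}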
Alternatively, we could use method_getImplementation() to get the IMP from the method and call that:
((void (*)(id, SEL))method_getImplementation(lockMethod))(self, lockMethodSel);
It may be tempting to think that you can avoid the separate lookup and call, and get the implementation in a single step, this way:
class_getMethodImplementation(object_getClass(self), lockMethodSel);
This returns an IMP, and you might think that it would be a null pointer if the object didn't respond to the method. Unfortunately, this isn't the case. If the object doesn't respond to the message, the return value will be a pointer to the private runtime function that's responsible for deconstructing the stack frame for forwarding.
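A short sketch makes the difference visible. It assumes Apple's runtime; the selector name and the helper looksImplemented() are invented for the example.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Naive test: treats any non-NULL IMP as proof that the method exists.
static BOOL looksImplemented(Class cls, SEL sel) {
    return class_getMethodImplementation(cls, sel) != NULL;
}

int main(void) {
    @autoreleasepool {
        Class cls = [NSObject class];
        // A selector that no class is expected to implement.
        SEL bogus = sel_registerName("selectorThatNothingImplements");

        // The naive test is fooled by the forwarding function.
        NSLog(@"naive test says: %d", looksImplemented(cls, bogus));
        // These two give the reliable answer.
        NSLog(@"class_respondsToSelector: %d",
              class_respondsToSelector(cls, bogus));
        NSLog(@"class_getInstanceMethod is %s",
              class_getInstanceMethod(cls, bogus) == NULL ? "NULL" : "not NULL");
    }
    return 0;
}
```

On Apple's runtime, the naive test is expected to report 1 even though class_respondsToSelector() reports 0 and class_getInstanceMethod() returns NULL.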
Case 3: Synchronize on the Object Containing the Key
Finally, if all of the preceding strategies failed, we just lock on the object:
@synchronized(self) { return testAndSetValue(self, aKey, anObject); }
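Putting the three cases together, the complete method might look like the sketch below. It assumes Apple's runtime and ARC, so the helper drops the explicit retain and autorelease. The category name AtomicKVC and the Box class are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

// ARC variant of the helper: ARC keeps the old value alive for the caller.
static inline id testAndSetValue(id object, NSString *key, id value) {
    id result = [object valueForKey:key];
    [object setValue:value forKey:key];
    return result;
}

@interface NSObject (AtomicKVC)
- (id)atomicallyReadKey:(NSString *)aKey setToNewValue:(id)anObject;
@end

@implementation NSObject (AtomicKVC)
- (id)atomicallyReadKey:(NSString *)aKey setToNewValue:(id)anObject {
    Class cls = object_getClass(self);

    // Case 1: an object-typed instance variable named {key}_lock.
    NSString *ivarName = [aKey stringByAppendingString:@"_lock"];
    Ivar lockIvar = class_getInstanceVariable(cls, [ivarName UTF8String]);
    if (lockIvar != NULL &&
        strcmp(ivar_getTypeEncoding(lockIvar), @encode(id)) == 0) {
        id lockObject = object_getIvar(self, lockIvar);
        @synchronized (lockObject) {
            return testAndSetValue(self, aKey, anObject);
        }
    }

    // Case 2: a -lock{key} / -unlock{key} method pair.
    SEL lockSel = NSSelectorFromString([@"lock" stringByAppendingString:aKey]);
    SEL unlockSel = NSSelectorFromString([@"unlock" stringByAppendingString:aKey]);
    if (class_getInstanceMethod(cls, lockSel) != NULL &&
        class_getInstanceMethod(cls, unlockSel) != NULL) {
        void (*send)(id, SEL) = (void (*)(id, SEL))objc_msgSend;
        send(self, lockSel);
        id result = testAndSetValue(self, aKey, anObject);
        send(self, unlockSel);
        return result;
    }

    // Case 3: fall back to locking the whole object.
    @synchronized (self) {
        return testAndSetValue(self, aKey, anObject);
    }
}
@end

// Hypothetical class that takes the case 1 path for the key "value".
@interface Box : NSObject {
    id value;
    id value_lock;
}
@end

@implementation Box
- (id)init {
    if ((self = [super init])) {
        value = @"old";
        value_lock = [[NSObject alloc] init];
    }
    return self;
}
@end

int main(void) {
    @autoreleasepool {
        Box *box = [[Box alloc] init];
        id previous = [box atomicallyReadKey:@"value" setToNewValue:@"new"];
        NSLog(@"previous = %@, current = %@", previous, [box valueForKey:@"value"]);
    }
    return 0;
}
```

One caveat about the strict comparison in case 1: on Apple's runtime, an instance variable declared with a concrete class (NSLock *, say) typically has an encoding that includes the class name, such as @"NSLock", so only variables declared as plain id match. Comparing just the first character of the encoding would be a more lenient alternative.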
As I said earlier, this is quite a contrived example. Unless you're using these accessors everywhere, you're not guaranteeing that some other thread won't change the value, ignoring your locking. It's also sufficiently slow that you're likely to offset a significant amount of the performance gain that you get from multithreaded code.
That's not to say that this kind of runtime introspection is not useful. I used a similar mechanism in an XML parser, for example. Each XML element is parsed by a separate object, which calls a method in the parent parser object with a name generated from the tag name, allowing very simple classes to parse complex tag structures. It's not as fast as hard-coding everything, but the bottleneck in that code is the I/O, so speed is not an issue.
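To give a flavor of name-based dispatch in general, here is an illustrative sketch. It is not the parser described above; the BookHandler class, the handle{Tag}: naming scheme, and the dispatchTag() helper are all invented for this example, which assumes Apple's runtime with ARC.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

// Hypothetical handler: one method per tag it understands.
@interface BookHandler : NSObject {
    NSMutableArray *seen;
}
- (void)handleTitle:(NSString *)text;
- (void)handleAuthor:(NSString *)text;
- (NSArray *)seen;
@end

@implementation BookHandler
- (id)init {
    if ((self = [super init])) {
        seen = [[NSMutableArray alloc] init];
    }
    return self;
}
- (void)handleTitle:(NSString *)text {
    [seen addObject:[@"title=" stringByAppendingString:text]];
}
- (void)handleAuthor:(NSString *)text {
    [seen addObject:[@"author=" stringByAppendingString:text]];
}
- (NSArray *)seen {
    return seen;
}
@end

// Maps a tag such as "title" to -handleTitle: and calls it if it exists.
static BOOL dispatchTag(id handler, NSString *tag, NSString *text) {
    if ([tag length] == 0) return NO;
    NSString *name = [NSString stringWithFormat:@"handle%@%@:",
                      [[tag substringToIndex:1] uppercaseString],
                      [tag substringFromIndex:1]];
    SEL sel = NSSelectorFromString(name);
    if (class_getInstanceMethod(object_getClass(handler), sel) == NULL) {
        return NO; // unknown tag: the caller decides what to do
    }
    ((void (*)(id, SEL, id))objc_msgSend)(handler, sel, text);
    return YES;
}

int main(void) {
    @autoreleasepool {
        BookHandler *handler = [[BookHandler alloc] init];
        dispatchTag(handler, @"title", @"Runtime Tricks");
        dispatchTag(handler, @"author", @"A. Writer");
        dispatchTag(handler, @"isbn", @"ignored"); // no -handleIsbn: method
        NSLog(@"%@", [handler seen]);
    }
    return 0;
}
```

Supporting a new tag then only requires adding one method to the handler class, with no dispatch table to maintain.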