Automatic Reference Counting in Objective-C, Part 1
- The Garbage Collection Sidetrack
- Moving Autorelease into the Language
- A New Memory Model
- Method Families
- Pointer Arguments
In the beginning, Objective-C had a class called Object. This had a method called +new, which wrapped malloc(), and a method called -free, which wrapped free(). This was problematic because Objective-C objects were frequently aliased, and managing object lifecycles became complex.
NeXT extended this with NSObject to provide reference counting. Object pointers were then divided into two categories: owning references and non-owning references. An owning reference is one that counts towards the object's reference count. If you are sure that a reference is going to be held somewhere else for the duration of a variable's lifetime, you can use a non-owning reference and avoid the overhead of the reference count manipulation.
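To make the distinction concrete, here is a minimal sketch of reference counting in plain C (which is also valid Objective-C). The Obj type and the obj_new(), obj_retain(), obj_release(), and obj_peek() functions are hypothetical names invented for this illustration. They model the roles of +alloc, -retain, and -release; they are not how NSObject is actually implemented.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object: a count plus a
 * payload. This is a model, not NSObject's real layout. */
typedef struct {
    unsigned refcount;
    int      value;
} Obj;

/* Plays the role of +alloc/-init: the caller receives an owning reference. */
Obj *obj_new(int value) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->value = value;
    return o;
}

/* Plays the role of -retain: turns a pointer into an owning reference. */
Obj *obj_retain(Obj *o) {
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
    return o;
}

/* Plays the role of -release: gives up one owning reference.
 * Returns 1 if this call destroyed the object, 0 otherwise. */
int obj_release(Obj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        return 1;
    }
    return 0;
}

/* Takes a non-owning reference: the caller guarantees that it holds an
 * owning reference for the duration of the call, so the count is untouched. */
int obj_peek(const Obj *o) {
    return o->value;
}
```

Code that stores the pointer for later use calls obj_retain() and, eventually, obj_release(). Code that only reads the object during a call, like obj_peek(), can take the pointer without touching the count, which is the saving that non-owning references buy you.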
Non-owning references are often used for autoreleased values. Autorelease pools allow you to return a non-owning reference to a temporary object. When you send an -autorelease message to an object, you add it to a list of objects that will each be sent a -release message at some later point, when the current autorelease pool is destroyed.
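The deferred release can be modeled in a few lines of plain C. This is only a sketch under simplifying assumptions: the Obj and Pool types, POOL_CAPACITY, and every function name here are hypothetical, the pool has a fixed size, and it is passed explicitly rather than looked up per thread. Real autorelease pools nest and grow as needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; a model, not NSObject's layout. */
typedef struct {
    unsigned refcount;
    int      value;
} Obj;

/* Hypothetical fixed-size pool: a list of objects owed one release each. */
enum { POOL_CAPACITY = 16 };
typedef struct {
    Obj     *pending[POOL_CAPACITY];
    unsigned count;
} Pool;

Obj *obj_new(int value) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcount = 1;   /* the caller owns the new object */
    o->value = value;
    return o;
}

void obj_release(Obj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) free(o);
}

/* Plays the role of -autorelease: hands the caller's owning reference to
 * the pool, which releases it later. The count does not change now. */
Obj *obj_autorelease(Pool *pool, Obj *o) {
    assert(pool->count < POOL_CAPACITY);
    pool->pending[pool->count++] = o;
    return o;
}

/* Plays the role of destroying the pool: sends each pending object one
 * release. Returns the number of releases sent. */
unsigned pool_drain(Pool *pool) {
    unsigned sent = pool->count;
    for (unsigned i = 0; i < sent; i++) {
        obj_release(pool->pending[i]);
    }
    pool->count = 0;
    return sent;
}

/* A factory that returns a non-owning reference to a temporary. The result
 * stays valid until the pool is drained, unless the caller retains it. */
Obj *make_temporary(Pool *pool, int value) {
    Obj *o = obj_new(value);
    return o ? obj_autorelease(pool, o) : NULL;
}
```

A caller of make_temporary() can use the result freely until pool_drain() runs. If it needs the object for longer, it must take its own owning reference first, which is the same rule that applies to autoreleased return values in Objective-C.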
At this year's Worldwide Developers Conference, Apple introduced Automatic Reference Counting (ARC). With ARC, the compiler inserts the retain, release, and autorelease operations for you. ARC also makes a few tweaks to the language model as a whole. It is supported by OS X 10.7, iOS 5, and GNUstep (using clang and version 1.5 or later of the GNUstep Objective-C runtime).
The Garbage Collection Sidetrack
You may remember that Apple previously tried to move away from reference counting with OS X 10.5. At the time, I was quite critical of their design. In my mind, it did several things wrong. First, it tried to shoehorn garbage collection into all of C, not just the object part. NSAllocateCollectable() was something that should never have been allowed in the language. The existence of NSScannedOption, which made NSAllocateCollectable() return a block of memory that might contain pointers (but might contain something else), meant that it would not be possible to use accurate garbage collection, only conservative garbage collection.
This was made worse by the fact that you could store object pointers anywhere, including C structures and arrays. This made it very difficult to verify whether code was actually safe. If someone gave you a pointer, you could safely write an object pointer into the memory if it had been allocated with NSAllocateCollectable() and NSScannedOption, but not if it had been allocated without NSScannedOption, or with malloc(). This meant that migrating to garbage collection was a huge pain. The nondeterminism made it worse: bugs like this only showed up if something triggered a collection.
The second problem was that it didn't interoperate with manual retain/release code at all. You had to recompile your code to use GC. Not only that, but you also had to recompile all of the code that you linked against. Apple recompiled most of their code (although some frameworks contained bugs in GC mode), but if you depended on someone else's framework, then you needed to recompile it. And, of course, it wasn't a straight recompile; you needed to migrate the code over to using GC.
I had a chance to chat to Chris Lattner, head of Apple's Compiler group, about this at FOSDEM this year. He listened to all of my criticisms, and told me to wait until the summer. On my birthday this year, I got a nice present from him: the public release into the LLVM and Clang repositories of code that implemented almost exactly what I'd told him I wanted, in terms of memory management for Objective-C. Apparently I wasn't the only person who didn't like Objective-C garbage collection.