What Language I Use for… Creating Reusable Libraries: Objective-C
When most people think of Objective-C, they think of it as "that language that you use for programming iPhones (or possibly Macs)." I write a lot of Objective-C code, but I run very little of it on an Apple platform.
The original version of Objective-C didn't have a standard library of its own. It was intended for packaging up C libraries into late-bound modules that could be reused, so its standard library was the C standard library. The idea was that you'd use C for the core of the implementation, and then use Objective-C on the edges of the library to expose functionality and for combining library code into an application.
This is increasingly important. Most applications today are more than 90% shared library code. In a sense, this is a very positive reflection on our industry because it means that code reuse is no longer just a good idea but a fact of deployment. It does, however, mean that if you want to stay competitive, you need to think about all the code you write in terms of how it can be reused in the future.
Today, Objective-C does have something akin to a standard library: the Foundation framework (sometimes called the Foundation Kit). This library was standardized in 1994 by Sun and NeXT in the OpenStep specification. It is implemented by Apple on OS X and iOS, and by GNUstep everywhere else. There are also some cut-down implementations of Foundation for systems with smaller resource footprints.
This library includes all the standard features that you'd expect from a modern language: basic data types such as strings; collections (arrays, dictionaries, sets and so on); ways of interacting with the system (file handles, sockets); an event-driven runloop model; and so on.
Interface vs. Implementation
When you're creating a library, the most important consideration (sadly missed by most LLVM developers and, indeed, many open source projects) is that users of your library don't want to have to rewrite their code to be able to use a new version.
In fact, most users don't want to have to recompile their code, either. This even applies to open source software being shipped as binary packages, where it's trivial for the packaging system to recompile everything. As an end user, I don't want to have to download new versions of every package that depends on a particular library just because a new version of the library has been released.
The key to this is ensuring that the library's interface is stable and does not depend on any implementation details that are likely to change over time. C++, for example, is particularly bad at this. Consider the following trivial C++ class interface:
    class point {
        int x, y;
        virtual int getDistanceFromOrigin();
    };
If you add or remove any of the fields, you've changed the class's binary interface, so any subclass, and anything that allocates an instance of the class on the stack, will need to be recompiled. Worse, if you add any virtual methods to the class, you've changed the vtable layout, which is also part of the class's binary interface, so anything that might contain subclasses of this class must be recompiled too.
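To make the layout problem concrete, here is a minimal sketch that puts two hypothetical releases of the class side by side in namespaces `v1` and `v2` (in a real library they would be the same header at different times). The `TaggedPoint` subclass and the `tagOffset` helper are illustrative assumptions, not part of any real library, and `struct` is used only so the fields are reachable from the helper. Exact sizes and offsets are implementation-defined, so the sketch only shows that they differ.

```cpp
#include <cstddef>

// Release 1 of a hypothetical library header.
namespace v1 {
struct Point {
    virtual ~Point() = default;
    virtual int getDistanceFromOrigin() { return 0; }
    int x = 0, y = 0;
};
// A client subclass. The position of `tag` is fixed when the client is compiled.
struct TaggedPoint : Point {
    int tag = 0;
};
}  // namespace v1

// Release 2 of the same header: the library author added one field.
namespace v2 {
struct Point {
    virtual ~Point() = default;
    virtual int getDistanceFromOrigin() { return 0; }
    int x = 0, y = 0, z = 0;
};
// The client's source is unchanged, but its layout is not.
struct TaggedPoint : Point {
    int tag = 0;
};
}  // namespace v2

// Byte offset of the subclass field within an instance, measured at run time.
template <typename T>
std::ptrdiff_t tagOffset() {
    T obj;
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* field = reinterpret_cast<const char*>(&obj.tag);
    return field - base;
}
```

On common ABIs `sizeof(v2::Point)` is larger than `sizeof(v1::Point)`, and `tagOffset<v1::TaggedPoint>()` differs from `tagOffset<v2::TaggedPoint>()`. A client binary built against release 1 would therefore read and write `tag` at the wrong address when handed objects from release 2.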
There are a few ways around this in C++. You can have a pure-virtual superclass. This superclass works around the problem of allocating the object on the stack (because it's no longer possible), but it still means that adding a new method will change the ABI. You can avoid that with the pImpl (pointer to implementation) pattern, which uses non-virtual functions for the public interface and has each of those call the corresponding (possibly virtual) function in the implementation object. The public class then just has a single field, which is a pointer to the real implementation.
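A minimal sketch of the pImpl arrangement just described might look like the following. The split into a header part and a source-file part is indicated only by comments, and the names `Impl` and `distance` are assumptions chosen for illustration.

```cpp
#include <cmath>
#include <memory>

// --- Would live in the public header (hypothetical point.h) ---
class Point {
public:
    Point(int x, int y);
    ~Point();
    Point(const Point&) = delete;
    Point& operator=(const Point&) = delete;

    // Non-virtual: adding more of these does not touch any vtable.
    int getDistanceFromOrigin() const;

private:
    struct Impl;                 // only declared here; defined in the source file
    std::unique_ptr<Impl> impl;  // the single field: pointer to the implementation
};

// --- Would live in the library's source file (hypothetical point.cpp) ---
struct Point::Impl {
    int x, y;
    Impl(int x, int y) : x(x), y(y) {}
    virtual ~Impl() = default;
    // Free to be virtual, and free to change between releases.
    virtual int distance() const {
        double sum = static_cast<double>(x) * x + static_cast<double>(y) * y;
        return static_cast<int>(std::sqrt(sum));
    }
};

Point::Point(int x, int y) : impl(new Impl(x, y)) {}
Point::~Point() = default;  // defined here, where Impl is a complete type
int Point::getDistanceFromOrigin() const { return impl->distance(); }
```

Client code only ever sees a class that holds one pointer, so fields and virtual methods can be added to `Impl` without changing the size or layout that clients were compiled against. The cost is an extra allocation and an extra indirection on every call.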
By this point, you've almost got Objective-C semantics, but you've had to fight the language every step of the way. In contrast, this class in Objective-C would have its implementation and interface written like this:
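    #import <Foundation/Foundation.h>
    #include <math.h>

    // Public interface: only the method is visible to users of the library.
    @interface Point : NSObject
    - (int)distanceFromOrigin;
    @end

    // Instance variables are declared in the implementation, not the interface.
    @implementation Point {
        int x;
        int y;
    }
    - (int)distanceFromOrigin
    {
        // sqrt() returns a double; convert explicitly to match the int return type.
        return (int)sqrt(x*x + y*y);
    }
    @end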
If you decide to change the class to use double-precision floating-point polar coordinates internally, the interface remains the same. If you want to add some methods (or even reorder the existing ones to make the header more readable), this has no effect on the binary interface.
Late Binding
An idea related to the separation of interface and implementation is that of late binding. The Point example applies even if someone subclasses the Point class. The two instance variables are not part of the interface, so subclasses can't access them except via the introspection mechanisms. Objective-C objects can't be allocated on the stack, and the offsets of instance variables are defined by the runtime library at load time, so you can make a superclass larger or smaller without breaking any of its subclasses.
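The effect of fixing instance-variable offsets at load time can be modelled in a few lines. This is a toy model only, not how any Objective-C runtime is actually implemented: `ClassInfo`, `load`, and `subclassIvarOffset` are hypothetical names, and alignment is ignored.

```cpp
#include <cstddef>

// Toy description of a class, filled in when the "library" is loaded.
struct ClassInfo {
    const ClassInfo* super;    // superclass, or nullptr for a root class
    std::size_t ownSize;       // bytes of instance variables this class declares
    std::size_t ivarBase;      // offset of this class's first ivar (set by load)
    std::size_t instanceSize;  // total instance size (set by load)
};

// Called once per class at load time, superclasses first.
void load(ClassInfo& cls) {
    cls.ivarBase = cls.super ? cls.super->instanceSize : 0;
    cls.instanceSize = cls.ivarBase + cls.ownSize;
}

// Subclass code reads ivarBase at run time instead of hard-coding the
// superclass size that it happened to see when it was compiled.
std::size_t subclassIvarOffset(std::size_t superOwnSize) {
    ClassInfo super{nullptr, superOwnSize, 0, 0};
    ClassInfo sub{&super, sizeof(int), 0, 0};
    load(super);
    load(sub);
    return sub.ivarBase;
}
```

If the superclass grows from 8 to 16 bytes of instance variables, `subclassIvarOffset` follows it from 8 to 16 with no change to the subclass's compiled code. That is the property that lets a superclass change size without breaking its subclasses.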
In Objective-C, when you call a method, it is looked up based on its name (and in the GNUstep implementation, the types of its arguments, which avoids some stack-corruption issues in Apple's version of the language), independently of the class hierarchy. You can override any method in a class or even replace one with a proxy or something with an entirely different implementation.
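Lookup by name can be sketched in C++ as a table from selector names to implementations, with a fallback hook for messages the receiver does not implement itself. This is a simplified model for illustration and not the mechanism used by either the Apple or GNUstep runtime. `Object`, `send`, `makePoint`, and `makeProxy` are all hypothetical names.

```cpp
#include <cmath>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Toy object: methods are found by name, not by position in a vtable.
struct Object {
    std::map<std::string, std::function<int()>> methods;
    // Optional fallback, invoked for selectors with no entry in `methods`.
    std::function<int(const std::string&)> forward;
};

// Look the method up by name at call time.
int send(Object& receiver, const std::string& selector) {
    auto it = receiver.methods.find(selector);
    if (it != receiver.methods.end()) return it->second();
    if (receiver.forward) return receiver.forward(selector);
    throw std::runtime_error("does not respond to " + selector);
}

Object makePoint(int x, int y) {
    Object p;
    p.methods["distanceFromOrigin"] = [x, y] {
        double sum = static_cast<double>(x) * x + static_cast<double>(y) * y;
        return static_cast<int>(std::sqrt(sum));
    };
    return p;
}

// A proxy implements nothing itself; it forwards every message to its target.
Object makeProxy(std::shared_ptr<Object> target) {
    Object proxy;
    proxy.forward = [target](const std::string& selector) {
        return send(*target, selector);
    };
    return proxy;
}
```

A caller that writes `send(obj, "distanceFromOrigin")` works equally well with a point, with a proxy wrapping a point, or with an object whose entry for that name has been replaced at run time, because nothing about the call depends on the receiver's class.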
This makes it very easy to refactor code in significant ways without impacting users of a library. With a small amount of care, you can produce adaptors that implement old interfaces, even when these no longer have any relationship to how the implementation really works.
This also means that the coupling between classes in Objective-C tends to be very loose. It's the only language I've ever used where it is common to write a class for one application and then pull it, unmodified, into another where it interacts with a very different set of classes.
Interfaces to Other Languages
The point of writing a shared library is for people to use it. This means that they must be able to call into your library. There are Objective-C bridges from a number of scripting languages; for example, in Python you can take an Objective-C object, subclass it, and use it as if it were a Python object. The LanguageKit framework that I've written also allows me to compile domain-specific languages and even dialects of Smalltalk and JavaScript to share an object model with Objective-C. The Java Interface to GNUstep (JIGS) allows you to expose Objective-C objects to a JVM.
Perhaps more importantly, however, it's trivial to integrate Objective-C with C and C++. If you need to be callable directly from C, you can automatically generate C wrappers for Objective-C classes. This matters because C is the lingua franca of modern languages: it's hard to find a programming language that doesn't have some way of calling C code.
This is actually how the GNUstep implementation of some parts of CoreFoundation works: a set of wrapper functions that were automatically generated to invoke Objective-C methods. Doing this is similar to using the pImpl pattern in C++, but it has the advantage that you need to do it only for languages that don't have a native bridge.
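The shape of such a wrapper is easy to show by hand. The sketch below uses a C++ class as a stand-in for the Objective-C one so that it compiles without Foundation. It is not GNUstep's generated code, and `PointImpl`, `PointHandle`, and the `point_*` functions are hypothetical names.

```cpp
#include <cmath>
#include <new>

// Implementation side: stands in for the object-oriented class being wrapped.
class PointImpl {
public:
    PointImpl(int x, int y) : x_(x), y_(y) {}
    int distanceFromOrigin() const {
        double sum = static_cast<double>(x_) * x_ + static_cast<double>(y_) * y_;
        return static_cast<int>(std::sqrt(sum));
    }
private:
    int x_, y_;
};

// What the C header would expose: an opaque handle and plain functions.
extern "C" {
typedef struct PointHandle PointHandle;  // C callers never see the definition
PointHandle* point_create(int x, int y);
int point_distance_from_origin(const PointHandle* p);  // p must be non-null
void point_destroy(PointHandle* p);
}

// Defined only in the implementation file.
struct PointHandle {
    PointImpl impl;
};

extern "C" PointHandle* point_create(int x, int y) {
    // nothrow: report failure as a null pointer instead of a C++ exception.
    return new (std::nothrow) PointHandle{PointImpl(x, y)};
}

extern "C" int point_distance_from_origin(const PointHandle* p) {
    return p->impl.distanceFromOrigin();
}

extern "C" void point_destroy(PointHandle* p) {
    delete p;  // deleting a null pointer is a no-op
}
```

Each wrapper function does nothing but forward to one method, which is why this kind of code is a good candidate for automatic generation. Any language with a C foreign-function interface can then call `point_create`, `point_distance_from_origin`, and `point_destroy` without knowing anything about the object model behind them.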
Static Compilation
One of the other advantages of Objective-C over other late-bound object oriented languages is its compilation model. Objective-C code is ahead-of-time compiled and requires only a lightweight runtime library to work.
This is not just an advantage for writing shared libraries; it also makes deploying applications that use them easier. Dependencies on large scripting language systems or virtual machines are often a serious barrier, especially if you need to depend on a specific version. I've had to use machines with five different versions of Python installed because different packages required different versions, and this quickly becomes an administrative nightmare.
This isn't always an advantage. For example, quite a lot of companies use GNUstep to port iOS applications to Android and, because it requires the use of the NDK, end up with an application that doesn't run on MIPS, restricting them to only 99.9% of the Android market. That's not really a problem today, but it might become one if MIPS (or even x86) smartphones become common. Of course, supporting these platforms is typically just a recompile away...