NSOperation: Simple, Parallel, Cocoa
- Dataflow Programming / Concurrent Operations
- Operation Types / Adding Dependencies
- Cancelling Operations / Returning to Synchronous Behavior
NSOperation is one of the most often overlooked parts of Cocoa, and understandably so. It debuted in OS X 10.5 with an implementation so painfully buggy that the common response to people asking why their code didn't work was, "You used NSOperation." A few bug fixes followed in point releases, and by OS X 10.6 NSOperation was quite stable. But OS X 10.6 also introduced libdispatch, which provided a lighter-weight way of doing many of the same things that NSOperation could do.
Since OS X 10.6 was released, I've seen quite a few people implement an ad-hoc subset of NSOperation's functionality on top of libdispatch, not realizing that this functionality was already available out of the box. In fact, in OS X 10.6, NSOperationQueue uses libdispatch to handle scheduling.
I wrote a bit about operations in Cocoa Programming Developer's Handbook, but it's worth revisiting this corner of Cocoa, especially since the new MacBook Pros are coming out with quad-core CPUs (eight hyperthreading contexts).
Dataflow Programming
At its core, NSOperation lets you write a set of discrete computational steps and run them in isolation. More importantly, it lets you establish dependencies between them, so one task will run only when a set of others has finished. This model doesn't fit all applications, but you can get very good throughput if it fits yours. NSOperation itself encapsulates one step in a computation, while NSOperationQueue is responsible for running those steps. You simply add your operations to the queue, and it will run each one as soon as all of its prerequisites are finished.
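As a rough illustration of this model, the sketch below builds three steps in which the last one depends on the other two. The function name RunPipeline and the values are made up for the example, and the code assumes ARC and a compiler with blocks and array literals, so it is newer than the OS X 10.5 API the article starts from.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: two independent steps feed a third step that combines them.
static NSInteger RunPipeline(void)
{
    __block NSInteger left = 0;
    __block NSInteger right = 0;
    __block NSInteger total = 0;

    // Two steps with no prerequisites; the queue is free to run them in parallel.
    NSBlockOperation *loadLeft = [NSBlockOperation blockOperationWithBlock:^{ left = 20; }];
    NSBlockOperation *loadRight = [NSBlockOperation blockOperationWithBlock:^{ right = 22; }];

    // A step that must not start until both of the others have finished.
    NSBlockOperation *combine = [NSBlockOperation blockOperationWithBlock:^{
        total = left + right;
    }];
    [combine addDependency:loadLeft];
    [combine addDependency:loadRight];

    // The order in which operations are added does not matter; dependencies decide.
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    [queue addOperations:@[combine, loadLeft, loadRight] waitUntilFinished:YES];
    return total;
}
```

Blocking with waitUntilFinished:YES keeps the sketch short; in an application you would more likely let the final operation hand its result back to the main thread.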
Concurrent Operations
The biggest confusion surrounding NSOperation comes from the distinction between concurrent and non-concurrent operations. The names are misleading: if you're using NSOperationQueue to run your operations, both kinds will run in parallel as long as their prerequisites are met and you have enough processors.
In the original NSOperation terminology, a concurrent operation had an asynchronous API. It had a -start method that was expected to return quickly. It would spawn a thread, make an asynchronous API call, or perform some equivalent in this method. The operation queue then used key-value observing on the isExecuting and isFinished keys to get a notification when the operation had finished.
The point of this distinction was to reduce the overhead of thread creation. Non-concurrent operations just have a -main method. The original operation queue would execute this method in a separate thread for each operation that was running, while concurrent operations could be multiplexed onto a single thread.
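To make the contrast concrete, here is a minimal sketch of what a concurrent operation in this sense might look like. SleepOperation, its didWork property, and the updateExecuting:finished: helper are hypothetical names invented for the example, the "work" is just a short sleep, and the code assumes ARC. It is one common way to arrange the bookkeeping, not the only one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical example class; the name and the fake "work" are illustrative only.
@interface SleepOperation : NSOperation
@property (atomic, assign) BOOL didWork;
@end

@implementation SleepOperation {
    BOOL _executing;
    BOOL _finished;
}

- (BOOL)isConcurrent { return YES; }
- (BOOL)isExecuting { @synchronized (self) { return _executing; } }
- (BOOL)isFinished { @synchronized (self) { return _finished; } }

// Changes both flags and posts the KVO notifications that the queue observes.
- (void)updateExecuting:(BOOL)executing finished:(BOOL)finished
{
    [self willChangeValueForKey:@"isExecuting"];
    [self willChangeValueForKey:@"isFinished"];
    @synchronized (self) {
        _executing = executing;
        _finished = finished;
    }
    [self didChangeValueForKey:@"isFinished"];
    [self didChangeValueForKey:@"isExecuting"];
}

// -start returns quickly: it only marks the operation as running and spawns a thread.
- (void)start
{
    if ([self isCancelled]) {
        [self updateExecuting:NO finished:YES];
        return;
    }
    [self updateExecuting:YES finished:NO];
    [NSThread detachNewThreadSelector:@selector(work:) toTarget:self withObject:nil];
}

// Runs on the spawned thread and reports completion through the KVO keys.
- (void)work:(id)unused
{
    @autoreleasepool {
        [NSThread sleepForTimeInterval:0.05]; // stand-in for real work
        self.didWork = YES;
        [self updateExecuting:NO finished:YES];
    }
}
@end

// Hypothetical driver: runs one operation on a queue and reports whether it completed.
static BOOL RunSleepOperation(void)
{
    SleepOperation *op = [[SleepOperation alloc] init];
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    [queue addOperation:op];
    [queue waitUntilAllOperationsAreFinished];
    return op.didWork && [op isFinished] && ![op isExecuting];
}
```

Most of the code is state tracking rather than work, which is the difficulty with this kind of operation.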
With OS X 10.6, NSOperationQueue uses libdispatch. As soon as an operation is ready to run, NSOperationQueue adds it to a concurrent libdispatch work queue, where the operation will start executing as soon as libdispatch and the kernel agree that enough spare CPU exists to schedule it.
With this change, there is no longer a reason to use concurrent operations, which is very convenient, because they were the most difficult kind of operation to write. A non-concurrent operation just requires you to implement a -main method on an NSOperation subclass, or use one of the existing operation types.
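For comparison, a non-concurrent operation can be as small as the following sketch. SumOperation, its properties, and the SumUpTo helper are hypothetical names chosen for the example, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical example class: sums the integers from 1 to `limit`.
@interface SumOperation : NSOperation
@property (nonatomic, assign) NSUInteger limit;
@property (nonatomic, assign) NSUInteger result;
@end

@implementation SumOperation

// The only method a non-concurrent operation has to provide.
// The queue decides which thread calls it and tracks completion itself.
- (void)main
{
    NSUInteger sum = 0;
    for (NSUInteger i = 1; i <= self.limit; i++) {
        sum += i;
    }
    self.result = sum;
}

@end

// Hypothetical driver: runs the operation on a queue and waits for the result.
static NSUInteger SumUpTo(NSUInteger limit)
{
    SumOperation *op = [[SumOperation alloc] init];
    op.limit = limit;

    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    [queue addOperation:op];
    [queue waitUntilAllOperationsAreFinished];
    return op.result;
}
```

There is no state tracking and no key-value observing code here; the queue handles both once -main returns.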