Text Processing with Objective-C, Part 2
- Façades for Integration
- Walking the Tree
- Generating HTML
- Syntax Highlighting
- Autoreleasing Performance
- Final Thoughts
In the last article, I looked at some of the Cocoa patterns used in parsing the LaTeX sources for my latest book, Objective-C Phrasebook. This week I'll look at the ones used in generating the HTML.
Façades for Integration
Internally, this code uses the EtoileText framework, which uses a model inspired by a DOM tree and the existing NSAttributedString class from the Foundation framework. It handles text in a tree, with arbitrary attributes in any node and the text in leaf nodes. This allows things like chapters, sections, and so on to be stored as groupings in the tree.
Sometimes, it's useful to be able to do a quick preview of text stored in this kind of structured form, or even edit it in something like a WYSIWYG editor. Cocoa provides NSTextView for this, but it expects the text to be stored in an NSTextStorage instance.
The NSTextStorage class is a subclass of NSMutableAttributedString and expects to contain presentation markup: for example, something saying "this is 14-point bold text" rather than "this is a section heading."
The solution to this mismatch is to create a façade class. NSTextStorage is an abstract class, implemented as a class cluster. It has a small collection of primitive methods that subclasses must implement, and all of the other methods are implemented in terms of these.
These methods are:
- (NSString*)string;
- (void)replaceCharactersInRange: (NSRange)aRange withString: (NSString*)str;
- (NSDictionary*)attributesAtIndex: (NSUInteger)index effectiveRange: (NSRangePointer)aRange;
- (void)setAttributes: (NSDictionary*)attributes range: (NSRange)aRange;
Subclassing NSTextStorage and implementing these methods provides a very easy way of integrating a nonstandard text storage mechanism into the Cocoa text system. The first two of these methods deal with the character data. In EtoileText, the first returns another façade class, which implements the two primitive methods of NSString to access the character data in the text tree directly. The second traverses the tree to find the leaf node or nodes containing the range, deletes any intermediate nodes, and inserts the text at the start of the new one.
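To make the string façade idea concrete, here is a minimal sketch of an NSString subclass that implements only the two primitive methods, length and characterAtIndex:. The class name LeafString and its flat array of leaf strings are hypothetical stand-ins for the real text tree, not EtoileText's actual implementation, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical façade: presents the text of several leaf nodes as one NSString.
// A flat array of strings stands in for the leaves of the text tree.
@interface LeafString : NSString
- (instancetype)initWithLeaves:(NSArray<NSString *> *)leaves;
@end

@implementation LeafString {
    NSArray<NSString *> *_leaves;
}

- (instancetype)initWithLeaves:(NSArray<NSString *> *)leaves {
    if ((self = [super init])) {
        _leaves = [leaves copy];
    }
    return self;
}

// Primitive 1: the total length is the sum of the leaf lengths.
- (NSUInteger)length {
    NSUInteger total = 0;
    for (NSString *leaf in _leaves) {
        total += leaf.length;
    }
    return total;
}

// Primitive 2: find the leaf containing the index and ask it for the character.
- (unichar)characterAtIndex:(NSUInteger)index {
    NSUInteger remaining = index;
    for (NSString *leaf in _leaves) {
        if (remaining < leaf.length) {
            return [leaf characterAtIndex:remaining];
        }
        remaining -= leaf.length;
    }
    [NSException raise:NSRangeException
                format:@"Index %lu out of bounds", (unsigned long)index];
    return 0; // Not reached; keeps the compiler happy.
}
@end
```

An instance created with [[LeafString alloc] initWithLeaves: @[@"Hello, ", @"world"]] reports a length of 12 without ever copying the leaves into one buffer. The inherited NSString methods are expected to work on top of these two primitives, although a real façade would probably override a few of them for speed.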
The remaining two methods are related to attributes on the text. Because EtoileText doesn't use presentation attributes internally, it walks the tree from root to leaf to build a set of presentation attributes, just as CSS attributes are constructed in an HTML document.
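The root-to-leaf walk can be sketched roughly as follows. Everything here is an assumption made for illustration: the TextNode class, the style table, and the plain string keys such as "fontSize" (real code would use keys like NSFontAttributeName). It is not how EtoileText names or stores these things. The code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical tree node carrying only a semantic type, such as "heading".
@interface TextNode : NSObject
// Strong only to keep this sketch simple; a tree whose parents own their
// children would make this reference weak to avoid a retain cycle.
@property (nonatomic, strong) TextNode *parent;
@property (nonatomic, copy) NSString *type;
@end

@implementation TextNode
@end

// Builds presentation attributes for a leaf by applying the style of every
// ancestor from the root down, so that inner nodes override outer ones,
// much as CSS properties cascade.
static NSDictionary *PresentationAttributesForLeaf(
    TextNode *leaf, NSDictionary<NSString *, NSDictionary *> *styles) {
    // Collect the path with the root first and the leaf last.
    NSMutableArray<TextNode *> *path = [NSMutableArray array];
    for (TextNode *node = leaf; node != nil; node = node.parent) {
        [path insertObject:node atIndex:0];
    }
    NSMutableDictionary *result = [NSMutableDictionary dictionary];
    for (TextNode *node in path) {
        NSDictionary *style = node.type ? styles[node.type] : nil;
        if (style != nil) {
            // Later (deeper) entries replace earlier ones with the same key.
            [result addEntriesFromDictionary:style];
        }
    }
    return result;
}
```

With a style table that maps "document" to a 12-point font size and "heading" to a 14-point bold one, a leaf inside a heading ends up with the 14-point size, because the heading is applied after the document root.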
The class cluster is a very powerful idea in Cocoa. It provides a lot of flexibility in the underlying implementation. This example shows how it can be used to make one data representation fit into a set of classes designed for another one, but it's also useful for optimization. The Cocoa collection classes are also class clusters. The standard versions are fairly efficient, but it's also possible to provide your own versions for specialized usage. For example, if you are doing a lot of concatenation of immutable arrays, you may implement an NSArray subclass that uses skip lists to traverse a group of existing NSArrays.
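As an illustration of the concatenation case, here is a small sketch of an NSArray subclass that presents several existing arrays as one without copying their contents. Note that it uses a plain linear scan over the segments rather than the skip list mentioned above, so lookups cost time proportional to the number of segments. The class name ConcatArray is hypothetical, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical NSArray subclass that views several arrays as one.
// Only the two primitive methods, count and objectAtIndex:, are implemented.
@interface ConcatArray : NSArray
- (instancetype)initWithSegments:(NSArray<NSArray *> *)segments;
@end

@implementation ConcatArray {
    NSArray<NSArray *> *_segments;
    NSUInteger _count;
}

- (instancetype)initWithSegments:(NSArray<NSArray *> *)segments {
    if ((self = [super init])) {
        _segments = [segments copy];
        // Cache the total so that count is constant time.
        for (NSArray *segment in _segments) {
            _count += segment.count;
        }
    }
    return self;
}

- (NSUInteger)count {
    return _count;
}

// Linear scan: skip whole segments until the index falls inside one.
- (id)objectAtIndex:(NSUInteger)index {
    NSUInteger remaining = index;
    for (NSArray *segment in _segments) {
        if (remaining < segment.count) {
            return [segment objectAtIndex:remaining];
        }
        remaining -= segment.count;
    }
    [NSException raise:NSRangeException
                format:@"Index %lu out of bounds", (unsigned long)index];
    return nil; // Not reached.
}
@end
```

Because the segments are assumed to be immutable, building a ConcatArray costs time proportional to the number of segments, not the number of elements. Replacing the linear scan with a skip list or a table of cumulative offsets would be the natural next step for long chains.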
I've also seen code that implemented a custom NSDictionary class containing a single pair and a link to another dictionary, optimized for a usage where most dictionaries contained only one pair, but could be combined easily.
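I don't have that code to hand, but a dictionary along those lines might look something like the sketch below. The class name PairDictionary, its initializer, and the rule that the local pair shadows the same key in the linked dictionary are all my assumptions. It implements the primitive methods count, objectForKey:, and keyEnumerator, assumes the linked dictionary is immutable, and assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical NSDictionary subclass: one key-value pair plus a link to
// another dictionary that supplies the remaining entries.
@interface PairDictionary : NSDictionary
- (instancetype)initWithKey:(id<NSCopying>)key
                     object:(id)object
                       next:(NSDictionary *)next;
@end

@implementation PairDictionary {
    id _key;
    id _object;
    NSDictionary *_next; // May be nil; assumed immutable, so it is not copied.
}

- (instancetype)initWithKey:(id<NSCopying>)key
                     object:(id)object
                       next:(NSDictionary *)next {
    if ((self = [super init])) {
        _key = [key copyWithZone:NULL]; // Dictionaries copy their keys.
        _object = object;
        _next = next;
    }
    return self;
}

- (NSUInteger)count {
    // The local pair shadows an entry with the same key in the linked dictionary.
    NSUInteger rest = _next.count;
    return ([_next objectForKey:_key] != nil) ? rest : rest + 1;
}

- (id)objectForKey:(id)aKey {
    if ([_key isEqual:aKey]) {
        return _object;
    }
    return [_next objectForKey:aKey];
}

- (NSEnumerator *)keyEnumerator {
    NSMutableArray *keys = [NSMutableArray arrayWithObject:_key];
    NSEnumerator *rest = [_next keyEnumerator];
    id key;
    while ((key = [rest nextObject]) != nil) {
        if (![key isEqual:_key]) {
            [keys addObject:key];
        }
    }
    return [keys objectEnumerator];
}
@end
```

Combining two such dictionaries is then just a matter of creating a new pair whose link points at the existing one, which costs a single small allocation.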