Building the Parser
Most parsers recognize a fairly well-defined language, with something like a grammar describing it. For my LaTeX sources, the language is quite flexible and gets added to periodically. For example, in this book I use a \class macro. If I write \class{NSObject}, it adds 'NSObject class' to the index, along with the current page number, and inserts the word NSObject in the text at the current point, in the font used for code listings. I have similar commands for protocols and C types.
These commands are defined in LaTeX in the preamble. Because TeX is a programming language, it's easy to extend in this way. The TeX parser needed a similar mechanism for extension. I did this by providing one class for each decoder. This is a fairly simple way of implementing flexible parsers. It's not very efficient, and I thought that I'd have to spend a while optimizing it after I'd finished. My first test run over the entire book took approximately 0.1 seconds of CPU time for parsing, so I decided that it was probably in the category of “fast enough.”
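A minimal sketch of what one such per-command class might look like, using the \class macro as the example. The protocol name LaTeXCommandHandler, the class name ClassCommandHandler, and the method signature are all hypothetical and are not taken from the actual framework. The sketch also appends tagged text to a string as a stand-in for "use the code-listing font", whereas the real parser builds a tree.

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol: each handler receives the macro's argument and
// records what should appear in the index and in the running text.
@protocol LaTeXCommandHandler <NSObject>
- (void)handleArgument: (NSString *)argument
                onPage: (NSUInteger)page
                 index: (NSMutableDictionary *)index
                output: (NSMutableString *)output;
@end

// Hypothetical handler for \class{...}.
@interface ClassCommandHandler : NSObject <LaTeXCommandHandler>
@end

@implementation ClassCommandHandler
- (void)handleArgument: (NSString *)argument
                onPage: (NSUInteger)page
                 index: (NSMutableDictionary *)index
                output: (NSMutableString *)output
{
    // Index entry of the form "NSObject class", mapped to a list of pages.
    NSString *entry = [argument stringByAppendingString: @" class"];
    NSMutableArray *pages = [index objectForKey: entry];
    if (pages == nil)
    {
        pages = [NSMutableArray array];
        [index setObject: pages forKey: entry];
    }
    [pages addObject: [NSNumber numberWithUnsignedInteger: page]];
    // Assumption: <code> tags stand in for the code-listing font.
    [output appendFormat: @"<code>%@</code>", argument];
}
@end
```

Under these assumptions, handlers for protocols or C types would differ only in the suffix they add to the index entry, which is why one small class per command is cheap to write.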
Extending the parser is quite simple. I wanted to make it possible to extend it without needing to recompile the framework, so I used some of the dynamic features of Objective-C. The book is described by a property list file, which contains a dictionary of macro name to class name mappings. When it loads the plist, the code calls NSClassFromString() on each of these class names to get the parser class to instantiate whenever a new command is encountered in the token stream.
This is a very common pattern when developing plugins in Objective-C. You can extend it even more by providing a list of selectors and using NSSelectorFromString() to map them. When you load a property list file, you get one of the standard Cocoa collections, typically a dictionary. You can then create a dictionary of classes from their names easily, like this:
    NSDictionary *plist = [NSDictionary dictionaryWithContentsOfFile: pluginPath];
    NSMutableDictionary *handlers = [NSMutableDictionary new];
    for (NSString *key in plist)
    {
        Class cls = NSClassFromString([plist objectForKey: key]);
        // Unknown class names give Nil, which cannot be stored in a dictionary.
        if (cls != Nil) { [handlers setObject: cls forKey: key]; }
    }
To instantiate a handler for the specified key, you then just do something like this:
[[handlers objectForKey: command] new];
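The selector-based variant mentioned above can be sketched in the same way. This is an illustration only: the MacroActions class, its two methods, and the PerformCommand() helper are hypothetical names, and the dictionary is assumed to map command names to selector names such as "emphasize:". Only NSSelectorFromString(), respondsToSelector: and performSelector:withObject: are standard Foundation calls.

```objc
#import <Foundation/Foundation.h>

// Hypothetical plugin object whose methods are looked up by name at run time.
@interface MacroActions : NSObject
- (NSString *)emphasize: (NSString *)text;
- (NSString *)code: (NSString *)text;
@end

@implementation MacroActions
- (NSString *)emphasize: (NSString *)text
{
    return [NSString stringWithFormat: @"<em>%@</em>", text];
}
- (NSString *)code: (NSString *)text
{
    return [NSString stringWithFormat: @"<code>%@</code>", text];
}
@end

// Looks up the selector name registered for a command and sends the
// corresponding message to the target. Returns nil when the command is
// unmapped or the target does not implement the method.
static NSString *PerformCommand(NSDictionary *selectorNames, id target,
                                NSString *command, NSString *argument)
{
    NSString *name = [selectorNames objectForKey: command];
    if (name == nil) { return nil; }
    SEL selector = NSSelectorFromString(name);
    if (selector == NULL || ![target respondsToSelector: selector])
    {
        return nil;
    }
    // Compilers using ARC may warn here because the selector is not known
    // at compile time; the methods above all return autoreleased strings.
    return [target performSelector: selector withObject: argument];
}
```

With a dictionary that maps "emph" to "emphasize:", calling PerformCommand(map, [MacroActions new], @"emph", @"text") sends emphasize: to the plugin object. Checking respondsToSelector: first means a typo in the property list yields nil instead of an unrecognized-selector exception.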
You can use this same pattern to populate menus, possibly by providing a submenu for each plugin and a dictionary for mapping each menu item title to classes or selectors in the plugin. This makes implementing your own plugin infrastructure in Objective-C very easy. I didn't really need it for this application, but it was so easy to do that it took less time to implement and use than going through the recompile cycle a couple of times.
Because I'm running this code on a FreeBSD system with the GNUstep and Étoilé frameworks, instead of on OS X, I can extend it with new parser classes written in Smalltalk and JIT compiled.
The parser constructs a tree containing the semantic attributes. In the next article, I'll describe the structure of this tree and how it's then exported as HTML.