SAX and State
Using a streaming event parser such as SAX means managing state within your application, because SAX exposes only a small window onto the document at any moment. In Listing 1, the startElement method on line 10 checks the element name. If the name is book, the method steps through the attributes until it finds isbn. The solution is simple only because the task is simple: extracting the value of an isbn attribute from a book element. As the XML gets more complex, so does the logic. For example, if the XML includes several different book elements (some for orders, some for returns, some for restoration), each under a different parent element, the SAX program must record the parent elements on each startElement call and discard them as the matching endElement calls arrive. With that state-maintenance code added, the once-simple SAX program becomes more complex and potentially more error-prone.
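To make the parent-tracking idea concrete, here is a minimal sketch of a handler that keeps a stack of open elements and collects an isbn only when the book element sits directly under a chosen parent. It is not the code from Listing 1: the class name ParentTrackingHandler, the extract helper, and parent element names such as orders and returns are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers which elements are currently open so that
// a book element can be judged by its parent.
class ParentTrackingHandler extends DefaultHandler {
    private final Deque<String> openElements = new ArrayDeque<>();
    private final List<String> isbns = new ArrayList<>();
    private final String wantedParent; // e.g. "orders" (assumed element name)

    ParentTrackingHandler(String wantedParent) {
        this.wantedParent = wantedParent;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // peek() is the innermost open element, i.e. this element's parent
        // (null when we are at the document root).
        if ("book".equals(qName) && wantedParent.equals(openElements.peek())) {
            String isbn = atts.getValue("isbn");
            if (isbn != null) {
                isbns.add(isbn);
            }
        }
        openElements.push(qName); // remember this element until it closes
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.pop(); // forget the element once its end tag arrives
    }

    List<String> getIsbns() {
        return isbns;
    }

    // Convenience helper: parse an XML string and return the matching ISBNs.
    static List<String> extract(String xml, String wantedParent) {
        try {
            ParentTrackingHandler handler = new ParentTrackingHandler(wantedParent);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getIsbns();
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }
}
```

Even in this small form, the stack and the push/pop bookkeeping are code that exists purely to reconstruct context SAX does not retain. Calling ParentTrackingHandler.extract(xml, "orders") on a document that holds book elements under both orders and returns would collect only the ISBNs found under orders.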
This is why the ZwiftBooks team, in the discussion in our previous article, balked at having to handle an XML format from AsiaBooks that differed from that of EuroBooks. Different document structures imply different state management; in effect, different parsing logic is needed to locate the data.