Seven Steps to XML Mastery, Step 4: Parsing and Processing XML (Part 2 of 2)
Working with SAX Filters
In part 1 of step 4, we looked at SAX and DOM parsers and found that SAX provides a simple way to extract data from streaming XML without incurring the high memory cost of building a DOM tree. Another dimension of SAX is its ability to link handlers to create SAX filter chains. SAX filters are modeled after the pipe-and-filter architecture, in which applications are built from filters that need no knowledge of the other filters in the pipeline. In a filter-based application, the final behavior of the system depends on the composition and arrangement of the filters.
Filters are related to the UNIX concept of pipes, which let programmers string together a number of processes to produce a specific output. The idea was originally developed by Doug McIlroy at Bell Labs and was made popular by Ken Thompson and Dennis Ritchie, who built pipes and filters into the UNIX operating system. Basically, a filter transforms (filters) the data it receives via the pipes (the connectors that pass data from one filter to the next).
SAX filters apply the pipe-and-filter architecture to XML processing: SAX handlers are strung together to perform multiple tasks on incoming XML. Because different filters can be mixed and matched, the design stays flexible. This concept of filters fits well with the trend toward the component assembly of complex applications.
Linked SAX filters can do the following:
- Take independent action based on the incoming XML
- Postprocess SAX events and pass new events to the next filter in the chain
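The two roles in the list above can be sketched with Python's standard xml.sax module, whose XMLFilterBase class forwards every event to the next handler unless a method is overridden. This is a minimal sketch, not a definitive implementation: the class names ElementCounter, UpperCaseText, and TextCollector, the run_chain helper, and the sample XML are all invented for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class ElementCounter(XMLFilterBase):
    """Takes independent action: counts elements, then passes events on unchanged."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1
        super().startElement(name, attrs)  # forward to the next link


class UpperCaseText(XMLFilterBase):
    """Postprocesses events: passes a modified characters event downstream."""

    def characters(self, content):
        super().characters(content.upper())


class TextCollector(ContentHandler):
    """Final handler at the end of the chain (hypothetical client code)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def run_chain(xml_bytes):
    """Wire parser -> ElementCounter -> UpperCaseText -> TextCollector."""
    parser = xml.sax.make_parser()
    counter = ElementCounter(parser)   # the filter's parent is the real parser
    upper = UpperCaseText(counter)     # this filter's parent is another filter
    collector = TextCollector()
    upper.setContentHandler(collector)
    upper.parse(io.BytesIO(xml_bytes)) # parsing starts at the last filter
    return counter.count, "".join(collector.chunks)


if __name__ == "__main__":
    print(run_chain(b"<order><item>pen</item><item>ink</item></order>"))
    # prints (3, 'PENINK')
```

Neither filter knows about the other. Swapping their order, or dropping one, only changes the wiring in run_chain.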
Figure 1 illustrates how SAX filters can be used to build complex XML applications—provided that you can break the application into a collection of simple SAX components, where each filter handles events passed to it from the previous filter in the chain.
A SAX filter sits between the client application and the actual parser. Because it’s positioned in the middle, the filter can change the event stream. For example, a filter can change not just the values of elements and attributes but also their names. SAX filters can also add namespaces to elements, add elements and attributes not present in the original XML, and perform logging (for example, writing all data that passes through to a database).
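As a rough illustration of a filter changing the event stream, the following sketch (again using Python's standard xml.sax module) renames an element and adds an attribute that was not in the original XML. The class name RenameAndTag, the helper rename_items, the element names item and product, and the source attribute are all hypothetical choices made for this example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class RenameAndTag(XMLFilterBase):
    """Renames one element and adds an attribute to each renamed element."""

    def __init__(self, parent, old_name, new_name):
        super().__init__(parent)
        self._old = old_name
        self._new = new_name

    def startElement(self, name, attrs):
        if name == self._old:
            # Copy the incoming attributes, then add one of our own.
            new_attrs = dict(attrs.items())
            new_attrs["source"] = "filter"  # hypothetical marker attribute
            super().startElement(self._new, AttributesImpl(new_attrs))
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        # The end tag must be renamed too, or the output is not well formed.
        super().endElement(self._new if name == self._old else name)


def rename_items(xml_bytes):
    """Parse xml_bytes through the filter and return the rewritten XML text."""
    out = io.StringIO()
    flt = RenameAndTag(xml.sax.make_parser(), "item", "product")
    # XMLGenerator plays the client here: it serializes the events it receives.
    flt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    flt.parse(io.BytesIO(xml_bytes))
    return out.getvalue()


if __name__ == "__main__":
    print(rename_items(b'<order><item sku="1">pen</item></order>'))
```

The client at the end of the chain sees only product elements. It has no way to tell that the parser originally reported item, which is what sitting in the middle of the event stream makes possible.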
Figure 1 SAX filters can be used to build a pipe-and-filter architecture.