- XML Reference Guide
When you process XML, it's natural to think in terms of two models: DOM and SAX. DOM is flexible, in that it lets you navigate around the document tree and make changes. SAX is fast, in that it doesn't load the entire document into memory; instead, it reports a single node at a time in a forward-only, read-only manner.
But .NET doesn't readily support SAX. Why? Because Microsoft found a way to provide much of the same benefit without the complex programming that SAX requires. Where SAX pushes events to callbacks that you write, XmlReader lets your code pull the next node when it's ready for it. The XmlReader object is like a cross between a DOM document and a read-only, forward-only ADO cursor. It gives you easy access to the properties of the current node (including some measure of positioning within the document), but it also provides the speed and power of a lightweight object. In fact, unless you're going to make changes to the document tree itself, you'll probably find XmlReader much easier and more convenient to use than XmlDocument.
In DOM and .NET, we looked at what it takes to navigate a document, so let's look at what it takes using an XmlReader implementation. Let's say that we wanted to look at a tidied up version of the candy output:
    <candy>
      <product productNumber="Product 0">
        MINTS
        <updated>1/11/2004 9:37:35 PM</updated>
      </product>
      <product productNumber="Product 1">
        CHOCOLATE
        <updated>1/11/2004 9:37:35 PM</updated>
      </product>
      <product productNumber="Product 2">
        CIRCUS PEANUTS
        <updated>1/11/2004 9:37:35 PM</updated>
      </product>
    </candy>
We'd go about it by creating an XmlReader object and looping through each node:
    Imports System
    Imports System.IO
    Imports System.Xml

    public class ReaderSample

        public shared sub Main()
            dim reader as XmlTextReader = new XmlTextReader("candyout.xml")
            while reader.Read()
                Indent(reader.Depth)
                Console.WriteLine("Name: [{0}] Type: [{1}] Value: [{2}] ", _
                    reader.Name, reader.NodeType, reader.Value)
            end while
        end sub

        public shared sub Indent(depth as Integer)
            dim i as Integer
            for i = 0 to depth
                Console.Write(" ")
            next
        end sub

    end class
First we create the XmlReader object. Note that XmlReader is an abstract class, so you must use one of its three derived classes: XmlTextReader (for serialized text, such as a file), XmlNodeReader (for documents that have already been parsed into an XmlNode object), or XmlValidatingReader (for enforcing DTDs and XML Schemas). (In .NET Framework 2.0 and later, XmlValidatingReader is marked obsolete, and readers are typically created with XmlReader.Create().)
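As a rough sketch of how two of these reader types are constructed, the following builds an XmlTextReader over an in-memory string and an XmlNodeReader over an already-parsed XmlDocument. The inline XML string and the CountElements helper are assumptions made up for this illustration; they are not part of the candy sample.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Class ReaderKinds

    ' Hypothetical helper: counts the element start tags seen by any XmlReader.
    Public Shared Function CountElements(ByVal reader As XmlReader) As Integer
        Dim count As Integer = 0
        While reader.Read()
            If reader.NodeType = XmlNodeType.Element Then
                count += 1
            End If
        End While
        Return count
    End Function

    Public Shared Sub Main()
        ' Illustrative data, not the article's candyout.xml file.
        Dim source As String = _
            "<candy><product>MINTS</product><product>CHOCOLATE</product></candy>"

        ' XmlTextReader parses serialized text, here supplied by a StringReader.
        Dim fromText As New XmlTextReader(New StringReader(source))
        Console.WriteLine("XmlTextReader saw {0} elements", CountElements(fromText))
        fromText.Close()

        ' XmlNodeReader walks a tree that has already been parsed into nodes.
        Dim doc As New XmlDocument()
        doc.LoadXml(source)
        Dim fromNode As New XmlNodeReader(doc)
        Console.WriteLine("XmlNodeReader saw {0} elements", CountElements(fromNode))
        fromNode.Close()
    End Sub

End Class
```

Because CountElements is written against the abstract XmlReader type, the same loop works no matter which derived reader supplies the nodes.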
When using an XmlReader, you look at one node at a time, using the Read() method. Each time you call the Read() method, it advances to the next node. From there, you can access information about the node through the members of the class, such as Value and Name. Note that at any given time, the reader object represents the current node rather than the overall document.
When the application gets to the end of the document, Read() returns false and the loop exits.
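The sample never closes its reader. One way to handle that, shown here only as a minimal sketch, is to wrap the Read() loop in a Try/Finally block so that Close() runs even if parsing fails. The in-memory document and the node counter are assumptions for illustration, standing in for candyout.xml.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Class ReadLoopSample

    Public Shared Sub Main()
        ' Illustrative in-memory document standing in for candyout.xml.
        Dim source As String = "<candy><product>MINTS</product></candy>"
        Dim reader As New XmlTextReader(New StringReader(source))
        Dim nodes As Integer = 0

        Try
            ' Read() returns False once the last node has been consumed.
            While reader.Read()
                nodes += 1
            End While
            ' EOF reports whether the reader has reached the end of the input.
            Console.WriteLine("Nodes read: {0}, EOF: {1}", nodes, reader.EOF)
        Finally
            ' Release the underlying input even if parsing throws.
            reader.Close()
        End Try
    End Sub

End Class
```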
The output at this stage is simple:
    Name: [candy] Type: [Element] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ MINTS ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ CHOCOLATE ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ CIRCUS PEANUTS ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [product] Type: [EndElement] Value: []
    Name: [] Type: [Whitespace] Value: [ ]
    Name: [candy] Type: [EndElement] Value: []
We can clean it up slightly by telling the application to ignore whitespace:
    ...
            dim reader as XmlTextReader = new XmlTextReader("candyout.xml")
            reader.WhitespaceHandling = WhitespaceHandling.None
            while reader.Read()
    ...
Now let's look at the results:
    Name: [candy] Type: [Element] Value: []
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ MINTS ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [product] Type: [EndElement] Value: []
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ CHOCOLATE ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [product] Type: [EndElement] Value: []
    Name: [product] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [ CIRCUS PEANUTS ]
    Name: [updated] Type: [Element] Value: []
    Name: [] Type: [Text] Value: [1/11/2004 9:37:35 PM]
    Name: [updated] Type: [EndElement] Value: []
    Name: [product] Type: [EndElement] Value: []
    Name: [candy] Type: [EndElement] Value: []
Notice that the structure is much like that of the XML document itself, with both the start and the end of each element noted, and the "depth" easily accessible. Both of these features can be achieved with SAX -- but not easily.
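To make the point about structure and depth concrete, here is a small sketch that uses only NodeType, Name, and Depth to print an indented outline of start and end tags. The inline document and the two-spaces-per-level indent are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Class OutlineSample

    Public Shared Sub Main()
        ' Illustrative in-memory document, not the article's candyout.xml file.
        Dim source As String = _
            "<candy><product>MINTS<updated>1/11/2004</updated></product></candy>"
        Dim reader As New XmlTextReader(New StringReader(source))

        While reader.Read()
            ' Depth is maintained by the reader, so no counter is needed here.
            Dim pad As New String(" "c, reader.Depth * 2)
            Select Case reader.NodeType
                Case XmlNodeType.Element
                    Console.WriteLine("{0}<{1}>", pad, reader.Name)
                Case XmlNodeType.EndElement
                    Console.WriteLine("{0}</{1}>", pad, reader.Name)
            End Select
        End While

        reader.Close()
    End Sub

End Class
```

With SAX, the equivalent outline would typically require your own counter, incremented in the start-element callback and decremented in the end-element callback. Note that an empty element such as `<updated/>` produces an Element node but no EndElement node, so this sketch would print only its start tag.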
We can also look at the attributes on each element:
    ...
            while reader.Read()
                Indent(reader.Depth)
                Console.WriteLine("Name: [{0}] Type: [{1}] Value: [{2}] ", _
                    reader.Name, reader.NodeType, reader.Value)
                if reader.HasAttributes then
                    Indent(reader.Depth)
                    Indent(reader.Depth)
                    Console.Write("Attributes: ")
                    Dim i As Integer
                    For i = 0 To reader.AttributeCount - 1
                        reader.MoveToAttribute(i)
                        Console.Write(" {0}=[{1}]", reader.Name, reader.Value)
                    Next i
                    reader.MoveToElement()
                    Console.WriteLine("")
                end if
            end while
        end sub
    ...
If attributes exist on the current node, we can output their values by accessing them directly by index. (You can also pair the MoveToFirstAttribute() and MoveToNextAttribute() methods.) When you move to an attribute, that attribute becomes the current node, so members such as Name and Value refer to that attribute. When you're done with the attributes, use the MoveToElement() method to return to the element that contains them.
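The MoveToFirstAttribute() and MoveToNextAttribute() pairing mentioned above might look like the following sketch. Both methods return a Boolean, so they can drive the loop directly without an index. The inline document is illustrative, and its color attribute is a made-up addition that does not appear in the candy sample.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Class AttributeWalk

    Public Shared Sub Main()
        ' Illustrative data; the color attribute is hypothetical.
        Dim source As String = _
            "<product productNumber='Product 0' color='white'>MINTS</product>"
        Dim reader As New XmlTextReader(New StringReader(source))

        While reader.Read()
            ' MoveToFirstAttribute() returns False when the node has no attributes.
            If reader.NodeType = XmlNodeType.Element AndAlso reader.MoveToFirstAttribute() Then
                Do
                    ' The attribute is now the current node.
                    Console.WriteLine("{0}=[{1}]", reader.Name, reader.Value)
                Loop While reader.MoveToNextAttribute()

                ' Step back to the owning element before reading on.
                reader.MoveToElement()
            End If
        End While

        reader.Close()
    End Sub

End Class
```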
The XmlReader classes have a variety of useful methods we haven't even touched on, but this should be enough to get you started. If you're only reading XML data, chances are an XmlReader is the way to go.
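As one example of those other members, the sketch below uses GetAttribute() to fetch an attribute by name without moving the reader, and ReadString() to collect the text of a simple element. The inline document is a cut-down, illustrative version of the candy data, and the output labels are assumptions.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Class ExtractSample

    Public Shared Sub Main()
        ' Cut-down, illustrative version of the candy data.
        Dim source As String = _
            "<candy>" & _
            "<product productNumber='Product 0'>MINTS<updated>1/11/2004 9:37:35 PM</updated></product>" & _
            "<product productNumber='Product 1'>CHOCOLATE<updated>1/11/2004 9:37:35 PM</updated></product>" & _
            "</candy>"
        Dim reader As New XmlTextReader(New StringReader(source))

        While reader.Read()
            If reader.NodeType = XmlNodeType.Element Then
                If reader.Name = "product" Then
                    ' GetAttribute() leaves the reader on the element itself.
                    Console.WriteLine("Product number: {0}", _
                        reader.GetAttribute("productNumber"))
                ElseIf reader.Name = "updated" Then
                    ' ReadString() gathers the text and stops at the next markup,
                    ' so the following Read() call continues from there.
                    Console.WriteLine("  Updated: {0}", reader.ReadString())
                End If
            End If
        End While

        reader.Close()
    End Sub

End Class
```

ReadString() is called only on the updated element here because product has mixed content: calling it there would stop at the `<updated>` start tag, and the next Read() would then step past that tag.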