.NET Tools for Working with XML
A lot of people associate the .NET Framework with XML, and for good reason. .NET uses XML behind the scenes to implement many of its features, such as SOAP and Web services. Beyond that, however, .NET provides a powerful set of classes for working with XML directly. Whatever you need to do with XML (sequential or random access, validation, transforms, or output), the .NET Framework provides you with tools that are not only powerful but easy to use.
This article provides an overview of the most important of these classes, and some examples of what you can do with them. All of .NET's XML classes are in the System.Xml namespace and its child namespaces, and support the following standards:
- XML 1.0 including DTDs
- XML Namespaces, both stream-level and DOM
- XSD Schemas
- XPath expressions
- XSLT transformations
- DOM Level 1 Core
- DOM Level 2 Core
XmlTextReader
The XmlTextReader class provides non-cached, forward-only access to a stream of XML data. It is designed specifically for fast access to XML data while placing minimal demands on the system's resources. Functionally, XmlTextReader is similar to the Simple API for XML (SAX), another technique for reading XML that is popular with non-.NET programmers.
XmlTextReader steps through the XML data one node at a time. At each node, your program can use the class properties to obtain information about the node: its type (element or text, for example), name, value, number of attributes, and so on. You use the Read method to advance to the next node. Read returns False when there are no more nodes, and the EOF property also reports whether the end of the data has been reached.
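To make the node-by-node model concrete, here is a minimal sketch that prints the type, name, value, and attribute count of every node in a file. The module name NodeDump and the file name checkbook.xml are placeholders chosen for illustration; any well-formed XML file will do. The sketch also shows that Read does not stop on attributes, so they have to be visited explicitly with MoveToNextAttribute.

```vb
Imports System
Imports System.Xml

Module NodeDump
    Sub Main()
        ' "checkbook.xml" is a placeholder; substitute any XML file.
        Dim rdr As New XmlTextReader("checkbook.xml")
        Try
            ' Skip the insignificant whitespace between elements.
            rdr.WhitespaceHandling = WhitespaceHandling.None
            While rdr.Read()
                Console.WriteLine("{0}: name=""{1}"" value=""{2}"" attributes={3}", _
                    rdr.NodeType, rdr.Name, rdr.Value, rdr.AttributeCount)
                ' Read() does not stop on attributes; visit them explicitly.
                While rdr.MoveToNextAttribute()
                    Console.WriteLine("    @{0} = {1}", rdr.Name, rdr.Value)
                End While
            End While
            ' Once Read() has returned False, EOF is True.
            Console.WriteLine("EOF = " & rdr.EOF.ToString())
        Finally
            rdr.Close()
        End Try
    End Sub
End Module
```

Running this against a small file is a quick way to see which node types the reader reports before writing code that depends on them.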
This class does not perform data validation; that's one reason why it is so fast. Nor does it support default attributes or resolve external entities. It does, however, enforce the rules of well-formed XML, which makes it a good well-formedness parser. Because of its speed, it is also well-suited for searching an XML file for a specific piece of information, or for processing an entire XML file sequentially, for example when you are generating HTML from the XML data.
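As a rough sketch of the well-formedness check described above, the following function reads a file to the end and reports whether the reader raised an XmlException along the way. The names WellFormedCheck and IsWellFormed are hypothetical, as is the file name; other failures, such as a missing file, are deliberately left to propagate to the caller.

```vb
Imports System
Imports System.Xml

Module WellFormedCheck
    ' Hypothetical helper: True if the file parses as well-formed XML.
    Function IsWellFormed(ByVal path As String) As Boolean
        Dim rdr As XmlTextReader = Nothing
        Try
            rdr = New XmlTextReader(path)
            ' Read every node; the reader throws XmlException at the
            ' first well-formedness violation it encounters.
            While rdr.Read()
            End While
            Return True
        Catch ex As XmlException
            Console.WriteLine("Not well-formed: " & ex.Message)
            Return False
        Finally
            If Not rdr Is Nothing Then rdr.Close()
        End Try
    End Function

    Sub Main()
        ' "checkbook.xml" is a placeholder file name.
        Console.WriteLine(IsWellFormed("checkbook.xml"))
    End Sub
End Module
```

Because the loop body is empty, the check costs little more than one pass over the file, and nothing is cached in memory.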
Let's look at an example. Listing 1 shows part of an XML data file that will be used in this example. It is data from a checkbook register. Listing 2 shows Visual Basic code that will display the number of checks written to the category "groceries" and the total amount of those checks.
Listing 1. The XML data file used in the examples.
<?xml version="1.0"?>
<checkbook>
  <check number="100" date="2004-04-05">
    <payee>Wilson Oil Co.</payee>
    <amount>156.25</amount>
    <category>utilities</category>
  </check>
  <check number="101" date="2004-04-07">
    <payee>Kroger Foods</payee>
    <amount>98.25</amount>
    <category>groceries</category>
  </check>
  <check number="102" date="2004-04-07">
    <payee>Cancer Society</payee>
    <amount>100.00</amount>
    <category>charity</category>
  </check>
</checkbook>
Listing 2. Using the XmlTextReader class to extract data from an XML file.
' Requires "Imports System.Xml" at the top of the source file.
Dim rdr As XmlTextReader = Nothing
Dim amount As String = ""
Dim total As Single = 0
Dim count As Integer = 0
Dim isAmountElement As Boolean
Dim isCategoryElement As Boolean

Try
    rdr = New XmlTextReader("checkbook.xml")
    While rdr.Read()
        ' Look for a start node.
        If rdr.NodeType = XmlNodeType.Element Then
            ' Is it an "amount" or "category" element?
            ' If so, set the corresponding flag.
            isAmountElement = (rdr.Name = "amount")
            isCategoryElement = (rdr.Name = "category")
        End If
        If rdr.NodeType = XmlNodeType.Text Then
            ' Is it a "category" element with the value "groceries"?
            ' If so, increment the count and add the amount to the total.
            ' This relies on <amount> preceding <category> in each check.
            If isCategoryElement AndAlso rdr.Value = "groceries" Then
                count += 1
                ' Convert explicitly; XmlConvert ignores the current culture.
                total += XmlConvert.ToSingle(amount)
            End If
            ' If it is an "amount" element, save the value for possible future use.
            If isAmountElement Then
                amount = rdr.Value
            End If
        End If
    End While
Catch ex As Exception
    MsgBox("XML error " & ex.Message)
Finally
    If Not rdr Is Nothing Then rdr.Close()
End Try

MsgBox("You wrote " & count.ToString & " checks for groceries totaling " _
    & Format(total, "C"))