.NET's XML Architecture
.NET's core XML architecture (at least, as of Beta 2) is shown in Figure 3.4. It differs slightly from Beta 1, in which the XPathNavigator shown in Figure 3.4 was called XmlNavigator. The basic layout is largely the same, however.
Figure 3.4 .NET's core XML architecture.
The main XML class that you'll use here is XPathNavigator. With this class, you can move through the XML document, using movement commands such as MoveToNext() and MoveToFirstChild(). Or, as the name suggests, you can provide the XPathNavigator an XPath query, and it will return to you an iterator that represents the nodeset returned from the query. Although it isn't discussed here, you have XML DOM Level 2 support using XmlDocument. We won't use XmlDocument here because we'll be primarily interested in XPath queries, and XPathDocument is optimized for quick searches. It might be helpful to know that XmlDocument exists if you're used to working with the XML DOM. However, we will be interested in using other .NET XML classes, starting with the .NET class that enables you to read XML data from a stream.
Reading XML Data
If you have an XML file on disk, XPathDocument will accept the file's filename and path, and will happily load the file for processing. .NET, however, is stream-based, so it might not be surprising to find that .NET has an XML class designed for reading XML streams. This class is XmlReader, and if you have a stream that you know represents XML data, you can wrap that stream with a reader and it will manage movement through the stream in an XML-friendly fashion.
To see XmlReader in action, let's simulate the file read that XPathDocument performs when loading XML data. Imagine that you have an XML file called Activities.xml. To load this file into .NET, and to demonstrate XmlReader, you would use this code:
FileStream fstmActivities = new FileStream("Activities.xml", FileMode.Open);
XmlTextReader xrdActivities = new XmlTextReader(fstmActivities);
Note that XmlReader is an abstract class; you don't use it directly. Instead, you use a class derived from XmlReader, such as XmlTextReader. And although you could load the file directly into XmlTextReader, the stream-based constructor was used in this brief example simply because that's how you'll accept the SOAP information from .NET when processing a SoapExtension.
XmlReader gives you a fast, forward-only XML data stream. XmlTextReader is ideal for quickly determining whether your XML document is well-formed and allowing you to progress through the document. You'll see an example of XmlTextReader in action in the next section, "Writing XML Data."
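As a small illustration of the well-formedness check just mentioned, here is a minimal sketch. It assumes the XML arrives as a string and wraps it in a StringReader so that the example is self-contained; the class name WellFormedCheck and the method IsWellFormed() are hypothetical names chosen for this sketch, not part of the .NET Framework.

```csharp
using System;
using System.IO;
using System.Xml;

static class WellFormedCheck
{
    // Hypothetical helper: returns true if XmlTextReader can read the
    // text to the end without raising an XmlException.
    public static bool IsWellFormed(string xmlText)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xmlText));
        try
        {
            while (reader.Read())
            {
                // Nothing to do here: moving forward is the check.
            }
            return true;
        }
        catch (XmlException)
        {
            // The reader found markup that is not well-formed.
            return false;
        }
        finally
        {
            reader.Close();
        }
    }

    static void Main()
    {
        Console.WriteLine(IsWellFormed("<a><b>text</b></a>"));
        Console.WriteLine(IsWellFormed("<a><b>text</a>"));
    }
}
```

Because the reader is forward-only, the check costs a single pass over the data and builds no in-memory document.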
Writing XML Data
If there is a .NET XmlReader, then there probably should be an equivalent writer. In fact, there is; not surprisingly, its class name is XmlWriter. And, like XmlReader, XmlWriter is an abstract class that must be implemented in a derived class. For XmlWriter, the only class provided by .NET for this is XmlTextWriter.
XmlTextWriter also accepts a stream in its constructor, and it is to this stream that XmlTextWriter will write the XML that is created or otherwise slated for output. Listing 3.2 shows both XmlTextReader and XmlTextWriter in action.
Listing 3.2 Reading and Writing XML Data Using .NET Framework Classes
using System;
using System.IO;
using System.Text;
using System.Xml;

namespace RdrWtr
{
    /// <summary>
    /// Summary description for Class1.
    /// </summary>
    class Class1
    {
        static void Main(string[] args)
        {
            try
            {
                // Create a file stream
                FileStream fstmXmlOut = new FileStream("MyXML.xml", FileMode.Create);

                // Create an encoding
                UTF8Encoding objEncoding = new UTF8Encoding();

                // Create an XML text writer
                XmlTextWriter objXmlWriter = new XmlTextWriter(fstmXmlOut, objEncoding);

                // Create some XML
                objXmlWriter.WriteStartDocument();
                objXmlWriter.WriteStartElement("m", "Employees", "http://www.myurl.com");
                objXmlWriter.WriteAttributeString("xmlns", "m", null, "http://www.myurl.com");

                // Write an employee element
                objXmlWriter.WriteStartElement("m", "Employee", "http://www.myurl.com");
                objXmlWriter.WriteStartAttribute("m", "id", "http://www.myurl.com");
                objXmlWriter.WriteString("175-A15");
                objXmlWriter.WriteEndAttribute();
                objXmlWriter.WriteStartElement("m", "Name", "http://www.myurl.com");
                objXmlWriter.WriteString("Kenn Scribner");
                objXmlWriter.WriteEndElement(); // Name
                objXmlWriter.WriteStartElement("m", "Title", "http://www.myurl.com");
                objXmlWriter.WriteString("Code Gecko");
                objXmlWriter.WriteEndElement(); // Title
                objXmlWriter.WriteEndElement(); // Employee

                // Write another employee element
                objXmlWriter.WriteStartElement("m", "Employee", "http://www.myurl.com");
                objXmlWriter.WriteStartAttribute("m", "id", "http://www.myurl.com");
                objXmlWriter.WriteString("129-B68");
                objXmlWriter.WriteEndAttribute();
                objXmlWriter.WriteStartElement("m", "Name", "http://www.myurl.com");
                objXmlWriter.WriteString("Mark Stiver");
                objXmlWriter.WriteEndElement(); // Name
                objXmlWriter.WriteStartElement("m", "Title", "http://www.myurl.com");
                objXmlWriter.WriteString("Code Godzilla");
                objXmlWriter.WriteEndElement(); // Title
                objXmlWriter.WriteEndElement(); // Employee

                // Finish off the document
                objXmlWriter.WriteEndElement(); // Employees

                // Flush it to the file
                objXmlWriter.Flush();
                objXmlWriter.Close();
                fstmXmlOut.Close();

                // Create a file stream
                FileStream fstmXmlIn = new FileStream("MyXML.xml", FileMode.Open);

                // Create an XML text reader
                XmlTextReader objXmlReader = new XmlTextReader(fstmXmlIn);
                while (objXmlReader.Read())
                {
                    switch (objXmlReader.NodeType)
                    {
                        case XmlNodeType.XmlDeclaration:
                            Console.WriteLine("<?xml version=\"1.0\"?>");
                            break;

                        case XmlNodeType.Element:
                            Pad(objXmlReader.Depth);
                            Console.Write("<{0}", objXmlReader.Name);
                            if ( objXmlReader.HasAttributes )
                            {
                                bool bAttr = objXmlReader.MoveToFirstAttribute();
                                while ( bAttr )
                                {
                                    Console.Write(" {0}=\"{1}\"", objXmlReader.Name, objXmlReader.Value);
                                    bAttr = objXmlReader.MoveToNextAttribute();
                                } // while
                            } // if
                            Console.WriteLine(">");
                            break;

                        case XmlNodeType.Text:
                            Pad(objXmlReader.Depth);
                            Console.WriteLine(objXmlReader.Value);
                            break;

                        case XmlNodeType.EndElement:
                            Pad(objXmlReader.Depth);
                            Console.WriteLine("</{0}>", objXmlReader.Name);
                            break;

                        default:
                            break;
                    } // switch
                } // while

                // Close the file
                objXmlReader.Close();
                fstmXmlIn.Close();
            } // try
            catch (Exception ex)
            {
                Console.WriteLine("Exception: {0}\n", ex.Message);
            } // catch
        }

        static void Pad(int iDepth)
        {
            for ( int i = 0; i < iDepth; i++ )
            {
                Console.Write(" ");
            } // for
        }
    }
}
As you see in Listing 3.2, we provide XmlTextReader and XmlTextWriter with streams instead of asking them to open the files directly. This is again because you'll be given streams of XML when dealing with SOAP from within .NET in Chapter 6.
Creating XML elements with XmlTextWriter is a simple matter of deciding what type of element to create and then writing the element information to the stream using the XmlTextWriter method that is most appropriate. We used the combination of WriteStartElement(), WriteString(), and WriteEndElement(), but you could have also used WriteElementString() or created entirely different sorts of elements, like those dedicated to a DTD or comment.
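To make the WriteElementString() alternative concrete, the following is a minimal sketch rather than a rewrite of Listing 3.2. It assumes output to an in-memory StringWriter instead of a file, drops the namespace prefix for brevity, and uses the hypothetical names CompactWriter and BuildEmployee().

```csharp
using System;
using System.IO;
using System.Xml;

static class CompactWriter
{
    // Hypothetical helper: writes one employee element using the
    // shorter WriteElementString() and WriteAttributeString() calls.
    public static string BuildEmployee()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        writer.WriteStartElement("Employee");
        writer.WriteAttributeString("id", "175-A15");

        // A comment node, one of the other node types you can write
        writer.WriteComment("written with WriteElementString");

        // Each call writes the start tag, the text, and the end tag
        writer.WriteElementString("Name", "Kenn Scribner");
        writer.WriteElementString("Title", "Code Gecko");

        writer.WriteEndElement(); // Employee
        writer.Close();

        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildEmployee());
    }
}
```

WriteElementString() is convenient when an element contains only text. The start, string, and end calls remain the better fit when an element carries attributes or child elements.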
After we created the XML file, we read it back into memory and displayed it on the console screen using XmlTextReader. In this case, to format the output in more familiar terms, we determine what type of element XmlTextReader is currently indicating and spit out formatted text accordingly. We treated the start of an element differently than we treated the content or ending element tag, for example.
It's nice having XmlTextReader and XmlTextWriter, but they work at a rather low level when it comes to dealing with XML. Next you'll see how to move up the abstraction hierarchy and work with XML at a slightly higher level.
Navigating XML with .NET
When people who know .NET think of navigating an XML document, they probably think of XPathDocument and XPathNavigator before XmlReader and XmlWriter. This is because these two XPath classes let you deal with the XML document in a couple of different ways. First, you can access the XML information using a pull model. That is, you obtain XML information when you request it. Second, you have the option of providing an XPath expression to extract a nodeset, which you access using XPathNodeIterator. In most cases, you'll probably use a combination of the two.
Pulling XML Element Information with .NET
If you are familiar with XML and XML technologies, you've undoubtedly heard of SAX, the Simple API for XML. We even mentioned it earlier in the chapter. SAX is a stream-based approach to reading XML information. You create callback functions that the SAX parser will invoke when it senses information that you're interested in accepting. That is, if you create a callback function for the start of an element tag, SAX will invoke your callback function whenever it reads a starting tag. This is a push model because SAX is shoving XML your way as it encounters the information within the document.
.NET, however, works the other way. .NET provides you with a water fountain instead of a fire hose, and you can sip from the fountain when you want. As you sip (read XML element information), more information is readied for your next request. You pull in more information as you're ready.
The .NET XPath classes expose a window through which you examine the XML. Whenever you're working with these classes, it's important to remember that you are working with only a single XML element at any one time. To work with another XML element, you must move the "window." If you're new to .NET and .NET's XML handling capabilities, this is probably the hardest concept to comprehend. This is most likely because of the nature of the classes: by appearance, you perform an element-based action upon the entire XML document, which at first seems odd.
For example, say that you have an instance of XPathNavigator, which contains an XML document. At first, it seems odd to execute code written like this:
' objXPN is an instance of XPathNavigator that contains
' an XML document
Dim strElementValue As String = objXPN.Value
This seems odd because it isn't apparent that when you ask for the "value" of the XPathNavigator, you're really asking for the value associated with the XML element that the XPathNavigator object is currently referencing (the XML element in the "window"). If you "pull" more data, by executing XPathNavigator's MoveToNext() method, for example, the XML element sitting in the window will change and you'll retrieve the value for that element rather than the initial element.
When you're comfortable with the pull model, however, it makes a lot of sense. With SAX, whenever one of your callback methods is invoked, you must deal with that piece of data then and there. If you want that information cached in any way, you implement the caching. With XPathNavigator, though, the data waits for you. When you're ready for more data, just pull in more data. We'll demonstrate this more graphically in the next section, ".NET and XPath," where you'll learn about retrieving a nodeset using XPath only to recursively examine the nodeset as the results are displayed in a .NET Windows Forms tree view.
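The moving "window" described in this section can be sketched in a few lines of C#. This is an illustration under stated assumptions: the XML is an inline string invented for the example, and the names PullWindow and ReadTwoNames() are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class PullWindow
{
    // Hypothetical helper: reads Value twice from the same navigator,
    // moving the "window" between the two reads.
    public static string[] ReadTwoNames()
    {
        string xml =
            "<Employees>" +
            "<Name>Kenn Scribner</Name>" +
            "<Name>Mark Stiver</Name>" +
            "</Employees>";

        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        nav.MoveToFirstChild();    // root node -> <Employees>
        nav.MoveToFirstChild();    // <Employees> -> first <Name>
        string first = nav.Value;  // value of the element in the window

        nav.MoveToNext();          // pull: move to the next sibling <Name>
        string second = nav.Value; // same property, different element

        return new string[] { first, second };
    }

    static void Main()
    {
        foreach (string name in ReadTwoNames())
        {
            Console.WriteLine(name);
        }
    }
}
```

The same Value property yields a different string after MoveToNext() because the navigator, not the document, holds the current position.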
.NET and XPath
You've already done the hard work if you've created and debugged the XPath query that you intend to apply to an XML document contained within XPathDocument. To execute the query, you create an instance of the associated XPathNavigator and execute its Select() method. Select() accepts the XPath expression as a string and returns an instance of XPathNodeIterator. You then use XPathNodeIterator to rummage through the nodeset.
The code that you'll need to perform the XPath query is simply this:
Dim objNodeItr As XPathNodeIterator = _
    objXPathNav.Select({XPath query text})
Of course, a bit more is involved to set up the objects, but making the Select() call is about all you really need. Given the iterator, you can access all the nodes in the nodeset, extracting data from each as necessary.
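For reference, the object setup around the Select() call might look like the following C# sketch. The inline XML, the class name SelectDemo, and the method SelectValues() are assumptions made for illustration. The sketch also uses List<string>, a generic collection that arrived after the Beta 2 release this chapter describes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

static class SelectDemo
{
    // Hypothetical helper: runs an XPath query against the XML text and
    // returns the string value of every node in the resulting nodeset.
    public static List<string> SelectValues(string xml, string xpath)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // Select() returns an iterator positioned before the first node
        XPathNodeIterator itr = nav.Select(xpath);

        List<string> values = new List<string>();
        while (itr.MoveNext())
        {
            // Current is a navigator positioned on the matched node
            values.Add(itr.Current.Value);
        }
        return values;
    }

    static void Main()
    {
        string xml =
            "<Employees>" +
            "<Employee><Name>Kenn Scribner</Name></Employee>" +
            "<Employee><Name>Mark Stiver</Name></Employee>" +
            "</Employees>";

        foreach (string value in SelectValues(xml, "/Employees/Employee/Name"))
        {
            Console.WriteLine(value);
        }
    }
}
```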
To demonstrate this, we created a utility, called XPathExerciser, to write and test XPath queries. A portion of the Windows Forms application is shown in Listing 3.3.
Listing 3.3 XPath Queries Using XPathNavigator
Private Sub cmdQuery_Click(ByVal sender As System.Object, _
                           ByVal e As System.EventArgs) _
                           Handles cmdQuery.Click
    ' Execute the query
    Try
        ' Check for text
        If txtQuery.Text.Length >= 1 Then
            ' Perform the query
            Dim objNodeItr As XPathNodeIterator = _
                m_objXPathNav.Select(txtQuery.Text)
            If Not objNodeItr Is Nothing Then
                ' Iterate through the nodeset and display in the
                ' tree control
                FillNodesetTree(objNodeItr)
            Else
                ' No nodes to add...
                tvwNodeSet.Nodes.Add("No nodes returned from XPath query...")
            End If
        End If
    Catch ex As Exception
        ' Show error
        MsgBox(ex.Message, _
               MsgBoxStyle.Critical, _
               "XPath Exerciser Error")
        tvwNodeSet.Nodes.Clear()
        tvwNodeSet.Nodes.Add(New TreeNode("***Error executing XPath query...", 6, 6))
    End Try
End Sub

Private Sub FillNodesetTree(ByVal objNodeItr As XPathNodeIterator)
    ' Clear the tree
    tvwNodeSet.Nodes.Clear()

    ' Cut off screen update
    tvwNodeSet.BeginUpdate()

    ' Create the root node
    Dim node As TreeNode = New TreeNode("(Context node)", 6, 6)
    tvwNodeSet.Nodes.Add(node)

    ' Advance through the nodeset
    While objNodeItr.MoveNext
        AddTreeNode(objNodeItr.Current, node)
    End While

    ' Update screen
    tvwNodeSet.EndUpdate()
End Sub

Private Sub AddTreeNode(ByRef objCurrXMLNode As XPathNavigator, _
                        ByRef nodParent As TreeNode)
    Try
        ' Create the new node
        Dim node As TreeNode
        Select Case objCurrXMLNode.NodeType
            Case XPathNodeType.Text
                nodParent.Nodes.Add(New TreeNode(objCurrXMLNode.Name & _
                    "{" & objCurrXMLNode.Value & "}", 2, 2))
                Exit Sub
            Case XPathNodeType.Comment
                nodParent.Nodes.Add(New TreeNode(objCurrXMLNode.Name & _
                    "{" & objCurrXMLNode.Value & "}", 4, 4))
                Exit Sub
            Case XPathNodeType.Attribute
                nodParent.Nodes.Add(New TreeNode(objCurrXMLNode.Name & _
                    "{" & objCurrXMLNode.Value & "}", 3, 3))
                Exit Sub
            Case XPathNodeType.Root
                node = New TreeNode("{root}", 1, 1)
            Case Else
                node = New TreeNode(objCurrXMLNode.Name, 0, 0)
        End Select

        ' Add ourselves to the tree
        nodParent.Nodes.Add(node)

        ' Look for attributes
        Dim objNodeClone As XPathNavigator
        If objCurrXMLNode.HasAttributes Then
            objNodeClone = objCurrXMLNode.Clone
            objNodeClone.MoveToFirstAttribute()
            AddTreeNode(objNodeClone, node)
            While objNodeClone.MoveToNextAttribute
                AddTreeNode(objNodeClone, node)
            End While
        End If

        ' Find children
        objNodeClone = objCurrXMLNode.Clone
        If objNodeClone.HasChildren Then
            objNodeClone.MoveToFirstChild()
            AddTreeNode(objNodeClone, node)
            While objNodeClone.MoveToNext()
                AddTreeNode(objNodeClone, node)
            End While
        End If
    Catch ex As Exception
        ' Show error
        MsgBox(ex.Message, _
               MsgBoxStyle.Critical, _
               "XPath Exerciser Error")
        tvwNodeSet.Nodes.Clear()
        tvwNodeSet.Nodes.Add(New TreeNode("***Error executing XPath query...", 6, 6))
    End Try
End Sub
XPathExerciser enables you to select an XML file against which you want to apply XPath queries. XPathExerciser then displays the file in the Web Browser ActiveX control after you've browsed for your particular XML file. Then, with the file in hand, you type in your XPath query and click the Query button. If things go well, the tree control displays the resulting nodeset. If this operation sounds familiar, it should. As you might remember, the XPathExerciser user interface was presented in Figure 3.2.
What is important to see from Listing 3.3 is the combination of XPath and pull methods, such as MoveToFirstChild() and MoveToNext(). Note also that, given an instance of XPathNavigator, you can recursively read the XML element information and build a tree view.
NOTE
This book was written with the second beta version of Visual Studio .NET and the .NET Framework. As of this second beta, an issue exists with ActiveX and COM interoperability: If the source files were created on the Windows XP operating system (originally called Whistler) and later were transferred to an older version of Windows, the source files will be modified by Visual Studio .NET and the reference to the ActiveX control will be dropped. When Visual Studio .NET drops the ActiveX control, it also drops any sibling controls (controls at the same user interface scope, such as when grouped on a form or group box). Therefore, we have included two versions of XPathExerciser, one for XP and one for other versions of Windows. The executable itself executes without error in either case, and you can mix and match executables at will. Just the source and project files present problems.
.NET and XLink
You might be wondering if special support for XLink is built into the .NET Framework. There is, but it requires a bit of additional overhead.
XPathNavigator has a method, MoveToId(), that is designed to work with id attributes. The additional overhead mentioned is that the id attribute must be explicitly defined in the XML DTD or schema that is associated with your XML document. If you have no DTD or schema, MoveToId() won't work for you. However, you still may use XPath to search for all elements with an id attribute, or even an id attribute containing a specific identifying value.
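The XPath fallback mentioned above, which needs no DTD or schema, can be sketched as follows. The sample XML and the names IdSearch, CountWithId(), and NameForId() are hypothetical and exist only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class IdSearch
{
    // Sample data invented for this sketch; no DTD or schema is attached.
    const string Xml =
        "<Employees>" +
        "<Employee id=\"175-A15\"><Name>Kenn Scribner</Name></Employee>" +
        "<Employee id=\"129-B68\"><Name>Mark Stiver</Name></Employee>" +
        "<Note>no id here</Note>" +
        "</Employees>";

    static XPathNavigator CreateNavigator()
    {
        return new XPathDocument(new StringReader(Xml)).CreateNavigator();
    }

    // Counts every element that carries an id attribute.
    public static int CountWithId()
    {
        return CreateNavigator().Select("//*[@id]").Count;
    }

    // Returns the Name of the element whose id attribute equals the given
    // value, or null if there is no match. The id is concatenated into the
    // query, so this sketch assumes it contains no quote characters.
    public static string NameForId(string id)
    {
        XPathNodeIterator itr =
            CreateNavigator().Select("//*[@id='" + id + "']/Name");
        return itr.MoveNext() ? itr.Current.Value : null;
    }

    static void Main()
    {
        Console.WriteLine(CountWithId());
        Console.WriteLine(NameForId("129-B68"));
    }
}
```

Unlike MoveToId(), these queries treat id as an ordinary attribute name, so they work on any document but gain none of the uniqueness guarantees that a DTD-declared ID type provides.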
.NET and XSL
The final general-purpose XML technology we mentioned is XSL, and .NET supports XSL through its XslTransform class. If you load an XML document into an instance of XPathDocument and then load an XSL style sheet into XslTransform, you can run the style sheet in this manner:
objXSL.Transform(objXMLDoc,null,objWriter);
Here, objXSL is the XslTransform instance, objXMLDoc is the XPathDocument, and objWriter is the XmlTextWriter that receives the transformed output.
To provide a useful demonstration of transformations within the .NET Framework, we created the XSLExerciser application, whose user interface you see in Figure 3.5. If you browse for an XML file and an associated style sheet, you can click the XForm button to execute the transform. If you've elected to save the results, the new XML document will be recorded within the file that you specified. You can also view the output either as text or as XML contained within the Web Browser control.
Figure 3.5 The XSLExerciser user interface.
Listing 3.4 shows you how to handle the transform action itself. The code that you see is executed when you click the XForm button. You don't need to load any files until you actually want to perform the transformation, so if you need to adjust the style sheet, as often happens, you can do so easily and retransform the document.
Listing 3.4 XSL Transformations Using XslTransform
private void cmdXForm_Click(object sender, System.EventArgs e)
{
    try
    {
        // Open the original XML document
        m_objXMLDoc = new XPathDocument(m_strXMLPath);

        // Open the stylesheet
        XslTransform objXSL = new XslTransform();
        objXSL.Load(m_strXSLPath);

        // Create a stream to contain the results
        m_stmOutput = new MemoryStream();

        // Create and overlay an XML writer
        ASCIIEncoding objEncoding = new ASCIIEncoding();
        XmlTextWriter objWriter = new XmlTextWriter(m_stmOutput, objEncoding);

        // Do the transform
        objXSL.Transform(m_objXMLDoc, null, objWriter);

        // Push any buffered output into the memory stream
        objWriter.Flush();

        // Save output as a string
        byte[] bytes = m_stmOutput.ToArray();
        m_strXMLOutput = objEncoding.GetString(bytes);

        // Check to see if we need to save the file...
        if ( optSave.Checked && m_strOutPath.Length > 0 )
        {
            FileStream fstmXMLOut = new FileStream(m_strOutPath, FileMode.Create);
            m_stmOutput.Position = 0;
            TextReader objTRFrom = new StreamReader(m_stmOutput);
            TextWriter objTWTo = new StreamWriter(fstmXMLOut);
            objTWTo.WriteLine(objTRFrom.ReadToEnd());
            objTWTo.Flush();
            fstmXMLOut.Close();
        } // if

        // Close the stream
        m_stmOutput.Close();

        // Enable the UI
        cmdView.Enabled = true;

        // Show results
        lblResults.ForeColor = SystemColors.ControlText;
        lblResults.Text = "Transformation successful, now click View";
    } // try
    catch (Exception ex)
    {
        // Show results
        lblResults.ForeColor = Color.Red;
        lblResults.Text = "Transformation Unsuccessful";
        MessageBox.Show(ex.Message,
                        "XSLExerciser Error",
                        MessageBoxButtons.OK,
                        MessageBoxIcon.Error);
    } // catch
}
A lot of this code you saw previously. One thing that we do differently is to create a memory-based stream, into which we shove the transformed XML:
// Create a stream to contain the results
m_stmOutput = new MemoryStream();
We do this because we then have the transformed XML in one location, ready to display or store into a file.
No single book has everything that you'll need to work with a technology as diverse as XML. However, a few books are well worth examining to gain a bit more detail regarding this fascinating and pliant tool. Unfortunately, as of this printing, no .NET-specific XML books are available, but hopefully this chapter has given you the basics that you'll need to move forward into the .NET world.