An Overview of XML Serialization
The main class used in XML serialization is the XmlSerializer class, with its Serialize and Deserialize methods. Nearly everything else in the System.Xml.Serialization namespace has to do with modifying the behavior or output of the XmlSerializer. XML serialization persists only public read-write properties and fields. It does not save methods, private fields or properties, read-only properties, or indexers.
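As a minimal sketch of that rule, the following program serializes a small type to a string so you can see which members appear in the output. The Account class, its members, and the VisibilityDemo helper are hypothetical names invented for this illustration; they are not part of the chapter's sample project.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class Account
{
    private string _secret = "hidden";   // private field: not serialized
    private int _balance;                // private backing field: not serialized
    public string Owner = "";            // public field: serialized

    public Account() { }                 // parameterless constructor

    public int Balance                   // public read/write property: serialized
    {
        get { return _balance; }
        set { _balance = value; }
    }

    public string Summary                // read-only property: not serialized
    {
        get { return Owner + ":" + _balance + ":" + _secret.Length; }
    }
}

public static class VisibilityDemo
{
    // Serializes the account into an in-memory string instead of a file.
    public static string ToXml(Account account)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Account));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, account);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Account account = new Account();
        account.Owner = "Alice";
        account.Balance = 42;
        Console.WriteLine(ToXml(account));
    }
}
```

The output should contain Owner and Balance elements only; Summary and the private fields should be absent.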
Basic XML Serialization
How do you actually serialize an object to XML? First, you need to create an object to serialize; for the purposes of this demonstration, let's create a class that can hold all the information associated with a particular link in a web-based menu. We'll need to store the URL itself, of course, and a title for the link (the text to display for the link). In addition, we'll include a ToolTip property to provide an extra description of the link when the user hovers over it, and a flag that indicates whether the link is external to the current site. This "IsExternal" flag enables us to treat external links differently from links to other pages in the same site, whether it's displaying a graphic next to external links (as Microsoft does) or opening external links in a different browser window.
Creating an Object to Serialize
Start by creating a new C# Console application project in Visual Studio .NET. Then add a new class to the project called MenuLink.cs. Recall that the restrictions specific to XML serialization state that you need a default (or parameterless) constructor and that only public read/write properties and fields are serialized. With the four properties we've already defined, the code for the MenuLink class looks like this:
    public class MenuLink
    {
        private string _Url, _Title, _ToolTip;
        private bool _IsExternal;

        public MenuLink()
        {
            _Url = "";
            _Title = "";
            _ToolTip = "";
            _IsExternal = false;
        }

        public string Url
        {
            get { return _Url; }
            set { _Url = value; }
        }

        public string Title
        {
            get { return _Title; }
            set { _Title = value; }
        }

        public string ToolTip
        {
            get { return _ToolTip; }
            set { _ToolTip = value; }
        }

        public bool IsExternal
        {
            get { return _IsExternal; }
            set { _IsExternal = value; }
        }
    }
This class does not contain any extra code or attributes that have to do with XML serialization. This is what makes XML serialization so easy to use: because it works with only public read/write properties and fields, it can be used with virtually any object without modifying the object.
Serializing an Object to XML
Now that you have an object to serialize, you can begin the process of converting an instance of that object to its XML representation:
- Switch back to the default Class1.cs file that was created for you when you opened the project.

- Add a few using statements to the file:

    using System;
    using System.IO;
    using System.Xml;
    using System.Xml.Serialization;

- Inside the Main method, create an instance of the MenuLink object and populate it with some data:

    // create and populate a new MenuLink object
    MenuLink link = new MenuLink();
    link.Title = "GotDotNet";
    link.Url = "http://www.gotdotnet.com/";
    link.ToolTip = "Click here for Microsoft's GotDotNet developer site.";
    link.IsExternal = true;

  Now that you have an object with data in it, you can serialize the object to an XML file in the current directory.

- Create an instance of the XmlSerializer object. The XmlSerializer constructor takes a Type object that gives it the information on the public properties and fields that it needs to perform the serialization.

    Console.WriteLine("Serializing link to XML...");
    XmlSerializer serializer = new XmlSerializer(typeof(MenuLink));

  In this example, you're using the C# typeof keyword to get the Type object for the MenuLink. This could also be accomplished by calling the GetType method on the existing MenuLink instance, link.

- If you look at the MSDN documentation on the XmlSerializer.Serialize method, you can see that it accepts a Stream, a TextWriter, or an XmlWriter as the object to which to serialize. This allows for a great deal of flexibility because the output stream could be pointing to a text file, to a database, to a location in memory, to the ASP.NET Response.Output object (allowing serialization directly to the browser), or quite a few other locations. In this case, write the XML to a text file:

    // serialize the object to a file
    string path = ".\\link.xml";
    FileStream fs = File.OpenWrite(path);

  You could just stop here and pass the FileStream object directly to the overload of XmlSerializer.Serialize that accepts a Stream object, but that writes an unformatted mass of XML to the text file. With just two more lines of code, you can have nicely formatted and indented XML written to the text file; this gives the XML better readability.

- Create an XmlTextWriter object that points to the FileStream. Set the Formatting property to Indented, and pass the XmlTextWriter to the XmlSerializer instead of passing the FileStream directly.

    XmlTextWriter writer = new XmlTextWriter(fs, System.Text.Encoding.UTF8);
    writer.Formatting = Formatting.Indented;

- You're ready to perform the serialization. Wrap the call to Serialize in a try block and close the XmlTextWriter in the finally block so that you can ensure that the file isn't held open if something unforeseen happens.

    try
    {
        // perform the XML serialization
        serializer.Serialize(writer, link);
    }
    finally
    {
        // close the writer, which closes the underlying FileStream
        writer.Close();
    }
    Console.WriteLine("Serialization Complete!");

Compile and run the project. You've successfully serialized the MenuLink object to an XML file! Here's what the generated link.xml file looks like:

    <?xml version="1.0" encoding="utf-8"?>
    <MenuLink xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Url>http://www.gotdotnet.com/</Url>
      <Title>GotDotNet</Title>
      <ToolTip>Click here for Microsoft's GotDotNet developer site.</ToolTip>
      <IsExternal>true</IsExternal>
    </MenuLink>
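The steps above note that Serialize accepts a Stream, a TextWriter, or an XmlWriter. The following sketch shows two of those overloads writing to memory instead of a file, and also uses GetType as the alternative to typeof. It assumes a condensed stand-in for the MenuLink class (written with auto-implemented properties, which need a newer C# compiler than the one the chapter targets); the SerializeTargets helper is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Condensed stand-in for the chapter's MenuLink class (assumption: auto-properties).
public class MenuLink
{
    public string Url { get; set; }
    public string Title { get; set; }
    public string ToolTip { get; set; }
    public bool IsExternal { get; set; }
}

public static class SerializeTargets
{
    // TextWriter overload: capture the XML in a string.
    public static string ToText(MenuLink link)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MenuLink));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, link);
            return writer.ToString();
        }
    }

    // Stream overload: capture the XML as bytes in memory.
    // GetType on the instance yields the same Type as typeof(MenuLink).
    public static byte[] ToBytes(MenuLink link)
    {
        XmlSerializer serializer = new XmlSerializer(link.GetType());
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.Serialize(stream, link);
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        MenuLink link = new MenuLink();
        link.Title = "GotDotNet";
        link.Url = "http://www.gotdotnet.com/";
        link.IsExternal = true;

        Console.WriteLine(ToText(link));
        Console.WriteLine("Bytes written to the stream: {0}", ToBytes(link).Length);
    }
}
```

Because both helpers write to memory, they are convenient for experimenting with the serializer without leaving files on disk.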
Deserializing an Object from XML
You've created an XML file from a live object, but what about deserialization? How do you get a live object back from that XML file? The XmlSerializer makes this task just as easy, if not easier. It starts the same way:
- Create an instance of the XmlSerializer and initialize it with the Type object for the MenuLink class. For the purposes of this demonstration, you can comment out the serialization code that you've just been through in the Main routine and insert the deserialization code in its place. You create the XmlSerializer object in exactly the same way that you did for serializing the MenuLink.

    Console.WriteLine("Deserializing link from XML...");
    XmlSerializer serializer = new XmlSerializer(typeof(MenuLink));

- Open the XML file for reading:

    string path = ".\\link.xml";
    FileStream fs = File.OpenRead(path);

- Declare a new MenuLink instance and perform the actual deserialization. Again, you're going to wrap the action in a try block and close the FileStream in the corresponding finally block.

    MenuLink loadedLink;
    try
    {
        loadedLink = (MenuLink)serializer.Deserialize(fs);
    }
    finally
    {
        fs.Close();
    }

  As you can see, you're passing the FileStream object to the Deserialize method as the source from which to deserialize. Again, this allows for flexibility because the stream's source could be many things, including, in this case, a simple local file. The Deserialize method always returns an object type, which must then be cast to the correct destination type before it's used.

Finally, you can output the properties of the newly loaded MenuLink to verify that everything was correctly deserialized:

    Console.WriteLine("Link information:");
    Console.WriteLine("Title: {0}", loadedLink.Title);
    Console.WriteLine("URL: {0}", loadedLink.Url);
    Console.WriteLine("ToolTip: {0}", loadedLink.ToolTip);
    Console.WriteLine("IsExternal: {0}", loadedLink.IsExternal);
That's it. With a relative handful of code, you serialized an arbitrary object to XML and restored it to its original state from that XML.
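To see the whole round trip in one place, here is a minimal sketch that serializes a link to a string and immediately deserializes it again, without touching the file system. It assumes a condensed stand-in for the MenuLink class and uses C# using blocks in place of the explicit try/finally shown earlier; the RoundTripDemo name is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Condensed stand-in for the chapter's MenuLink class (assumption: auto-properties).
public class MenuLink
{
    public string Url { get; set; }
    public string Title { get; set; }
    public string ToolTip { get; set; }
    public bool IsExternal { get; set; }
}

public static class RoundTripDemo
{
    // Serializes the link to an XML string, then rebuilds a new object from it.
    public static MenuLink RoundTrip(MenuLink original)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MenuLink));

        string xml;
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, original);
            xml = writer.ToString();
        }

        using (StringReader reader = new StringReader(xml))
        {
            // Deserialize returns object, so a cast is required.
            return (MenuLink)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        MenuLink link = new MenuLink();
        link.Title = "GotDotNet";
        link.Url = "http://www.gotdotnet.com/";
        link.IsExternal = true;

        MenuLink copy = RoundTrip(link);
        Console.WriteLine("Title: {0}", copy.Title);
        Console.WriteLine("IsExternal: {0}", copy.IsExternal);
    }
}
```

The using blocks dispose the reader and writer even if an exception is thrown, which is the same guarantee the try/finally blocks provide in the file-based version.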
Using Attributes to Control XML Serialization
If the operation illustrated in the previous example were all that XML serialization could do, it would be a useful tool for many developers. However, closer examination reveals that the System.Xml.Serialization namespace contains a number of attribute classes that can control the format of the XML generated during serialization of any given object. These attributes, when applied to a class or its properties, enable you to control whether a given item is serialized as an XML element or an attribute on an existing element, what the name of the element or attribute should be, what XSD data type is specified in the XML, and many other aspects of serialization over which you might want fine-grained control. This can be useful if you need your serialized XML to conform to a certain schema, as we discover in the next section. This level of control can also be useful for simply making the serialized output look a certain way.
Returning to the serialized MenuLink object, take a closer look at the default XML generated by the serialization process:
<?xml version="1.0" encoding="utf-8"?> <MenuLink xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Url>http://www.gotdotnet.com/</Url> <Title>GotDotNet</Title> <ToolTip>Click here for Microsoft's GotDotNet developer site.</ToolTip> <IsExternal>true</IsExternal> </MenuLink>
As you can see, the MenuLink class becomes the root element of the XML document, with an element name that matches the name of the class. Each property becomes a subelement, with a name that exactly matches the name of the property. This is the method that the XmlSerializer object follows when it is not given any special instructions on how to format a class. It's good enough for many situations, but say that you have a requirement that the root element of a link should be named simply "Link" instead of "MenuLink". In addition, you might be required to make the IsExternal property serialize as an attribute of the new Link element, but this attribute should be named simply "External" rather than "IsExternal".
To change the name of the root element in this case, go back to the MenuLink class in MenuLink.cs:
- Add a new using statement so that you have easy access to the attribute classes in the System.Xml.Serialization namespace:

    using System.Xml.Serialization;

- Because the class itself becomes the root element of the XML document, you need to add an XmlRootAttribute to the class definition, like this:

    [XmlRootAttribute()]
    public class MenuLink
    {
        ...
    }

  You might have noticed that attributes have their own special syntax that is not shared by any other programming element. Because of this, many people find typing the word "Attribute" on an attribute to be redundant, especially when working with items such as the XmlAttributeAttribute attribute class (say that quickly three times). For this reason, the .NET Framework accepts either the full name of the attribute class, or an abbreviated version that leaves off the word "Attribute" from the end of the class name. For example, XmlRootAttribute becomes simply XmlRoot and XmlAttributeAttribute becomes XmlAttribute. The shortened version is used for the remainder of this chapter.

- You've attached an XmlRoot attribute to the MenuLink class, but because the class is already rendered as the XML root element, you really haven't changed much yet. To change the name of the root element, modify the ElementName property of the XmlRoot attribute that we're using:

    [XmlRoot(ElementName = "Link")]
    public class MenuLink
    {
        ...
    }

- To make the IsExternal property into an attribute rather than an element, a similar application of the XmlAttribute attribute to the property declaration is needed:

    [XmlAttribute(AttributeName = "External")]
    public bool IsExternal
    {
        get { return _IsExternal; }
        set { _IsExternal = value; }
    }

  The XmlAttribute attribute tells the XML serializer to serialize the property it is applied to as an attribute of the parent tag, rather than as its own element. If the AttributeName property is set, that value is used for the XML attribute name instead of the name of the property.

- Return to the Class1.cs file and comment out the deserialization code. Then uncomment the original serialization code that you used to generate the first version of the link.xml file.

  You might need to delete the existing link.xml file before you run the program again. File.OpenWrite does not truncate an existing file, so writing a shorter document over a longer one can leave stale bytes from the previous run at the end of the file. Compile the project and run it.

With those two attributes applied and no other changes to the program, the serialized XML now looks exactly the way you want it to. The root element is now called Link, and it has an attribute called External that stores the value of the IsExternal property:

    <?xml version="1.0" encoding="utf-8"?>
    <Link xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" External="true">
      <Url>http://www.gotdotnet.com/</Url>
      <Title>GotDotNet</Title>
      <ToolTip>Click here for Microsoft's GotDotNet developer site.</ToolTip>
    </Link>
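XmlRoot and XmlAttribute are only two of the attribute classes in the namespace. The sketch below combines them with XmlElement (to rename an element) and XmlIgnore (to leave a member out entirely). The element name "Href", the decision to ignore ToolTip, and the AttributeDemo helper are assumptions made purely for illustration; the class is a condensed stand-in for MenuLink.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Condensed stand-in for the chapter's MenuLink class (assumption: auto-properties).
[XmlRoot(ElementName = "Link")]
public class MenuLink
{
    // Serialized as <Href> instead of <Url> (hypothetical name).
    [XmlElement(ElementName = "Href")]
    public string Url { get; set; }

    // No attribute: serialized as <Title>.
    public string Title { get; set; }

    // Skipped by the serializer even though it is public and read/write.
    [XmlIgnore]
    public string ToolTip { get; set; }

    // Serialized as an External="..." attribute on the root element.
    [XmlAttribute(AttributeName = "External")]
    public bool IsExternal { get; set; }
}

public static class AttributeDemo
{
    public static string ToXml(MenuLink link)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MenuLink));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, link);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        MenuLink link = new MenuLink();
        link.Url = "http://www.gotdotnet.com/";
        link.Title = "GotDotNet";
        link.ToolTip = "This text should not appear in the output.";
        link.IsExternal = true;
        Console.WriteLine(ToXml(link));
    }
}
```

Note that none of these attributes change how the class is used from code; they only affect the XML that the serializer reads and writes.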
Using XSD Schemas for Strongly Typed XML Serialization
The uses for XML serialization that you've examined so far work best when you have control over the XML structures involved, but what if you needed to write a program to consume and/or produce XML that conforms to a particular schema? You could always use the classes in the System.Xml namespace to manually manipulate the XML, and rely on your own understanding of the XSD Schema in question to make sure that you produce valid XML. Perhaps you've realized that it is possible to create some classes that, through the use of XML serialization attributes, can be automatically serialized to an XML structure that conforms to the XSD Schema, allowing you to work with the XML data in a purely object-oriented fashion.
Creating serializable classes manually is certainly possible, but might become tedious when you work with a large or complicated schema. This again relies on your thorough understanding of the requirements of the XSD Schema in question to ensure that your classes never serialize to a nonconformant XML structure. Fortunately, Microsoft provided a better solution in the form of a command-line utility called xsd.exe, which is provided with the .NET Framework SDK. This utility has numerous uses that are beyond the scope of this chapter. What we're interested in is its ability to take an XSD Schema as input and produce the source code to a set of classes that can deserialize from any XML that conforms to the XSD Schema, and serialize into XML that can be understood by any other tool that expects the format specified in the schema. The generated source code can be in any language you specify, as long as you specify C#, Visual Basic .NET, or JScript .NET.
This ability has tremendous potential. It makes it almost trivial to exchange data with a business partner in an industry-standard XML format without ever having to touch the XML directly in your code. Here's an example: Let's go to the source of the XSD standard, the World Wide Web Consortium (W3C). We'll use the canonical Purchase Order Schema sample found in Section 2.1 at http://www.w3.org/TR/xmlschema-0/.
For those of you who aren't sitting in front of a computer as you read this, the Purchase Order Schema is reproduced in full in Listing 10.1.
Listing 10.1 The Purchase Order Schema
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:annotation>
        <xsd:documentation xml:lang="en">
          Purchase order schema for Example.com.
          Copyright 2000 Example.com. All rights reserved.
        </xsd:documentation>
      </xsd:annotation>

      <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

      <xsd:element name="comment" type="xsd:string"/>

      <xsd:complexType name="PurchaseOrderType">
        <xsd:sequence>
          <xsd:element name="shipTo" type="USAddress"/>
          <xsd:element name="billTo" type="USAddress"/>
          <xsd:element ref="comment" minOccurs="0"/>
          <xsd:element name="items" type="Items"/>
        </xsd:sequence>
        <xsd:attribute name="orderDate" type="xsd:date"/>
      </xsd:complexType>

      <xsd:complexType name="USAddress">
        <xsd:sequence>
          <xsd:element name="name" type="xsd:string"/>
          <xsd:element name="street" type="xsd:string"/>
          <xsd:element name="city" type="xsd:string"/>
          <xsd:element name="state" type="xsd:string"/>
          <xsd:element name="zip" type="xsd:decimal"/>
        </xsd:sequence>
        <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
      </xsd:complexType>

      <xsd:complexType name="Items">
        <xsd:sequence>
          <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="productName" type="xsd:string"/>
                <xsd:element name="quantity">
                  <xsd:simpleType>
                    <xsd:restriction base="xsd:positiveInteger">
                      <xsd:maxExclusive value="100"/>
                    </xsd:restriction>
                  </xsd:simpleType>
                </xsd:element>
                <xsd:element name="USPrice" type="xsd:decimal"/>
                <xsd:element ref="comment" minOccurs="0"/>
                <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
              </xsd:sequence>
              <xsd:attribute name="partNum" type="SKU" use="required"/>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>

      <!-- Stock Keeping Unit, a code for identifying products -->
      <xsd:simpleType name="SKU">
        <xsd:restriction base="xsd:string">
          <xsd:pattern value="\d{3}-[A-Z]{2}"/>
        </xsd:restriction>
      </xsd:simpleType>

    </xsd:schema>
Generating Classes from an XSD Schema
Creating a set of classes from this schema is easy:
- Create a text file on your hard drive with the XSD Schema as the contents. (It's easiest to copy and paste from the W3C website.)

- Rename the text file to PurchaseOrder.xsd.

- Open a command prompt and navigate to the directory where you placed this file.

  To use xsd.exe from the command prompt, the directory containing xsd.exe must be in your Path environment variable. If you have Visual Studio .NET installed, the Visual Studio .NET Command Prompt link located in the Visual Studio .NET Tools folder in your Start menu takes care of setting the proper environment variables for you.

- Enter the following command at the prompt:

    xsd.exe PurchaseOrder.xsd /classes

This command generates a set of classes from the XSD file in the default language (C#). The generated classes are created in a file called PurchaseOrder.cs and look something like what's shown in Listing 10.2.
Listing 10.2 C# Code Generated from PurchaseOrder.xsd
    using System.Xml.Serialization;

    [System.Xml.Serialization.XmlRootAttribute("purchaseOrder", Namespace="", IsNullable=false)]
    public class PurchaseOrderType
    {
        public USAddress shipTo;

        public USAddress billTo;

        public string comment;

        [System.Xml.Serialization.XmlArrayItemAttribute("item", IsNullable=false)]
        public ItemsItem[] items;

        [System.Xml.Serialization.XmlAttributeAttribute(DataType="date")]
        public System.DateTime orderDate;

        [System.Xml.Serialization.XmlIgnoreAttribute()]
        public bool orderDateSpecified;
    }

    public class USAddress
    {
        public string name;

        public string street;

        public string city;

        public string state;

        public System.Decimal zip;

        [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
        [System.ComponentModel.DefaultValueAttribute("US")]
        public string country = "US";
    }

    public class ItemsItem
    {
        public string productName;

        [System.Xml.Serialization.XmlElementAttribute(DataType="positiveInteger")]
        public string quantity;

        public System.Decimal USPrice;

        public string comment;

        [System.Xml.Serialization.XmlElementAttribute(DataType="date")]
        public System.DateTime shipDate;

        [System.Xml.Serialization.XmlIgnoreAttribute()]
        public bool shipDateSpecified;

        [System.Xml.Serialization.XmlAttributeAttribute()]
        public string partNum;
    }
Using these classes, you can take any XML document that conforms to the Purchase Order Schema, such as the sample purchase order on the same page that the XSD Schema came from, and deserialize it into a PurchaseOrderType object by using the techniques introduced in the previous section. You can then manipulate the object and its properties just like you would with any other object, and serialize it to a new XML document that conforms to the original XSD Schema. You do this without ever touching the XML directly in your code. Let's do exactly that.
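One detail of the generated code is worth a small experiment: optional value-type members such as shipDate are paired with an ignored bool field whose name ends in Specified. The XmlSerializer treats that pairing as a convention and writes the value only when the flag is true. The sketch below demonstrates this with a cut-down type; the Shipment class and SpecifiedDemo helper are hypothetical and merely mirror the shape of the generated ItemsItem class.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical cut-down version of the generated ItemsItem class.
public class Shipment
{
    public string productName;

    [XmlElement(DataType = "date")]
    public DateTime shipDate;

    // Controls whether shipDate is written; never serialized itself.
    [XmlIgnore]
    public bool shipDateSpecified;
}

public static class SpecifiedDemo
{
    public static string ToXml(Shipment shipment)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Shipment));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, shipment);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Shipment withoutDate = new Shipment();
        withoutDate.productName = "Lawnmower";
        // shipDateSpecified stays false, so no <shipDate> element is expected.
        Console.WriteLine(ToXml(withoutDate));

        Shipment withDate = new Shipment();
        withDate.productName = "Baby Monitor";
        withDate.shipDate = new DateTime(1999, 5, 21);
        withDate.shipDateSpecified = true;
        // The flag is true, so a <shipDate> element is expected.
        Console.WriteLine(ToXml(withDate));
    }
}
```

When you set an optional date on one of the generated classes, remember to set the matching Specified flag as well; otherwise the value is silently left out of the output.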
Manipulating XML Using Schema-Generated Classes
You are going to use the classes that xsd.exe just generated for you from PurchaseOrder.xsd to read and modify the sample purchase order listed on the same page of the W3C site that we got the XSD file from. This purchase order is reproduced in Listing 10.3.
Listing 10.3 The Purchase Order, po.xml
    <?xml version="1.0"?>
    <purchaseOrder orderDate="1999-10-20">
      <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
      </shipTo>
      <billTo country="US">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
      </billTo>
      <comment>Hurry, my lawn is going wild!</comment>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower</productName>
          <quantity>1</quantity>
          <USPrice>148.95</USPrice>
          <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
          <productName>Baby Monitor</productName>
          <quantity>1</quantity>
          <USPrice>39.98</USPrice>
          <shipDate>1999-05-21</shipDate>
        </item>
      </items>
    </purchaseOrder>
Use the classes generated for you by xsd.exe to programmatically add a new item to the purchase order:
- Start by creating a new Console application called PurchaseOrder. Add a new XML file to the project called po.xml and enter the contents of Listing 10.3 into the file (or copy and paste the listing from the W3C website referenced earlier, if you can).

- Add an existing item to the project and browse to the location where you saved the PurchaseOrder.cs file that was generated earlier by xsd.exe. If you want, create a new class file and enter the code from Listing 10.2 instead.

- Open the Class1.cs file that was created for you with the project, and add the following using statements:

    using System.IO;
    using System.Xml;
    using System.Xml.Serialization;

- Inside the Main routine, enter the code to deserialize po.xml into a set of objects:

    // deserialize the existing purchase order
    XmlSerializer serializer = new XmlSerializer(typeof(PurchaseOrderType));
    FileStream fs = File.OpenRead(".\\po.xml");
    PurchaseOrderType order;
    try
    {
        order = (PurchaseOrderType)serializer.Deserialize(fs);
    }
    finally
    {
        fs.Close();
    }

  Referring back to Listing 10.2, you can see that the first class listed is PurchaseOrderType. It has an XmlRoot attribute attached to it that tells you that this class will serialize to (and deserialize from) an XML root element called purchaseOrder. Because this class corresponds to the XML root element, it is the class that you pass to the XmlSerializer constructor. The rest of the deserialization code should look pretty familiar to you: You open the po.xml file, declare an instance of the PurchaseOrderType object, and fill it with the object returned from the XmlSerializer's Deserialize method.

- Create a new item to add to the purchase order. Referring back to Listing 10.2, you see that the generated class for holding items is called ItemsItem.

    ItemsItem newItem = new ItemsItem();
    newItem.partNum = "352-AA";
    newItem.productName = "Hedge Trimmer";
    newItem.quantity = "1";
    newItem.USPrice = 27.95m;

  The m suffix on the value you're passing to newItem.USPrice simply tells the C# compiler to treat a literal number as a decimal data type because that's the type defined for the USPrice field of the ItemsItem class.

- You need to add the newly created item to the list of items in the purchase order. Referring back to Listing 10.2, you see that the items are held in an array of type ItemsItem in the PurchaseOrderType class. What you must do is create a new ItemsItem array that's the same size as the original array, plus one. Then you must copy the current array into the new array and fill the empty slot with the newly created item.

    // create a new array of type ItemsItem to hold the current items plus the new one
    ItemsItem[] allItems = new ItemsItem[order.items.Length + 1];
    // copy the current items into the new array
    Array.Copy(order.items, allItems, order.items.Length);
    // add our new item to the array
    allItems[allItems.Length - 1] = newItem;
    // set the order's item array to our new array
    order.items = allItems;

  The last line in the previous code simply replaces the list of items in the purchase order with the new list.

- All you need to do is serialize the modified PurchaseOrderType object back to XML. Create a new file so the original file is not overwritten.

    // serialize the modified purchase order
    fs = File.Open(".\\po_new.xml", FileMode.OpenOrCreate);
    XmlTextWriter writer = new XmlTextWriter(fs, System.Text.Encoding.UTF8);
    writer.Formatting = Formatting.Indented;
    try
    {
        serializer.Serialize(writer, order);
    }
    finally
    {
        // close the XmlTextWriter, which closes the underlying stream
        writer.Close();
    }
    Console.WriteLine("Purchase order modified!");

- You must do one more thing before you can compile the project and run it. The code you wrote expects to find po.xml in the current directory, but the directory that the code will be running from is actually the PurchaseOrder\bin\Debug subdirectory of the Project directory. Use Windows Explorer to copy po.xml into this directory. If the directory doesn't exist, you can click Build Solution in the Build menu to create it.

After everything is ready, compile and run the project. You should find a file called po_new.xml in the same \bin\Debug directory into which you just copied po.xml. Open it; the contents should match what's shown in Listing 10.4.
Listing 10.4 The Modified Purchase Order
    <?xml version="1.0" encoding="utf-8"?>
    <purchaseOrder xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" orderDate="1999-10-20">
      <shipTo>
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
      </shipTo>
      <billTo>
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
      </billTo>
      <comment>Hurry, my lawn is going wild!</comment>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower</productName>
          <quantity>1</quantity>
          <USPrice>148.95</USPrice>
          <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
          <productName>Baby Monitor</productName>
          <quantity>1</quantity>
          <USPrice>39.98</USPrice>
          <shipDate>1999-05-21</shipDate>
        </item>
        <item partNum="352-AA">
          <productName>Hedge Trimmer</productName>
          <quantity>1</quantity>
          <USPrice>27.95</USPrice>
        </item>
      </items>
    </purchaseOrder>
As you can see, the new hedge trimmer has been added to the purchase order, but at no time did we manipulate any XML, or even the Document Object Model, directly.
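The grow-and-copy step used to add the new item can be wrapped in a small reusable helper. The following is a sketch only: ArrayHelper and Append are hypothetical names, and the generic method assumes a compiler and framework version that support generics, which is newer than the tooling the chapter targets. Unlike the inline version, it also tolerates a null array, which is what a generated array field holds if the corresponding element is missing from the XML.

```csharp
using System;

public static class ArrayHelper
{
    // Returns a new array containing every element of source followed by item.
    // The source array itself is left unchanged.
    public static T[] Append<T>(T[] source, T item)
    {
        if (source == null)
        {
            return new T[] { item };
        }

        T[] result = new T[source.Length + 1];
        Array.Copy(source, result, source.Length);
        result[result.Length - 1] = item;
        return result;
    }

    public static void Main()
    {
        string[] partNumbers = { "872-AA", "926-AA" };
        partNumbers = Append(partNumbers, "352-AA");

        // prints "872-AA, 926-AA, 352-AA"
        Console.WriteLine(string.Join(", ", partNumbers));
    }
}
```

With such a helper in the project, the four array-handling statements in the walkthrough would reduce to a single assignment of the form order.items = ArrayHelper.Append(order.items, newItem).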
Another feature of the xsd.exe utility bears mentioning: it works in reverse as well. You can pass it a compiled assembly and the name of a type located in that assembly, and xsd.exe generates an XSD Schema from that class that conforms to any public properties or fields of the class, and any XML serialization-related attributes that have been applied to the class. The utility is described in the .NET Framework documentation if you're interested in this functionality.