18.6 Validation
Whenever the MSXML parser loads an XML document, some level of validation must occur: unless a document is a valid XML document, applications will not want to treat it as one. With MSXML, you can validate XML documents at different levels, depending on the needs of your application. Specifically, you can use the methods of DOMDocument40 or SAXXMLReader40 to validate an XML document against an XML schema. The steps are the same in both cases: load the XML schema by using the SOM, tell the parser about the loaded schema documents, and let the parser validate against them during parsing.
If your application already uses the DOM or SAX2 to access XML documents and you want to use the SOM from the same application, refer to the code in Listing 18.14. If you must choose between adding DOM code or SAX2 code to perform validation, the DOM requires less code to achieve the same effect.
18.6.1 XML Document Samples
The sample address documents are shown in Listings 18.10 and 18.11. As you might expect, address.xml (Listing 18.10) conforms to the earlier XML schema, whereas badaddress.xml (Listing 18.11) does not. In particular, one of the businessCustomer elements in badaddress.xml lacks the required customerID attribute.
LISTING 18.10 Sample document (address.xml)
<?xml version="1.0"?>
<customerList
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:cat="http://www.XMLSchemaReference.com/Examples/Ch18"
    xsi:schemaLocation="http://www.XMLSchemaReference.com/Examples/Ch18address.xsd">
  <businessCustomer customerID="SAM132E57">
    <name>Cliff Binstock</name>
    <phoneNumber>503-555-0000</phoneNumber>
    <address>
      <street>123 Gravel Road</street>
      <city>Nowheresville</city>
      <state>OR</state>
      <country>US</country>
      <zip>97000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <businessCustomer customerID="SAM132E58" primaryContact="Joe Sr.">
    <name>Joe Schmendrick</name>
    <phoneNumber>212-555-0000</phoneNumber>
    <phoneNumber>212-555-1111</phoneNumber>
    <URL>http://www.Joe.Schmendrick.name</URL>
    <address>
      <street>88888 Mega Apartment Bldg</street>
      <street>Apt 5315</street>
      <city>New York</city>
      <state>NY</state>
      <country>US</country>
      <zip>10000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <businessCustomer customerID="SAM132E58" primaryContact="Ellie" sequenceID="88742">
    <name>Ellen Boxer</name>
    <phoneNumber xsi:nil="true"/>
    <address zipPlus4="20000-1234">
      <POBox>123</POBox>
      <city>Small Town</city>
      <state>VA</state>
      <country>US</country>
      <zip>20000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <businessCustomer customerID="SAM132E59" primaryContact="Lydia" sequenceID="88743">
    <name>Ralph McKenzie</name>
    <phoneNumber xsi:nil="true"/>
    <address>
      <street>123 Main Street</street>
      <pmb>12345</pmb>
      <city>Metropolis</city>
      <state>CO</state>
      <country>US</country>
      <zip>80000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <privateCustomer customerID="SAM01234P" sequenceID="88743">
    <name>I. M. Happy</name>
    <phoneNumber>303-555-0000</phoneNumber>
    <phoneNumber>303-555-1111</phoneNumber>
    <address>
      <street>123 Main Street</street>
      <pmb>12345</pmb>
      <city>Metropolis</city>
      <state>CO</state>
      <country>US</country>
      <zip>80000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </privateCustomer>
</customerList>
LISTING 18.11 Sample document with errors (badaddress.xml)
<?xml version="1.0"?>
<customerList
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:cat="http://www.XMLSchemaReference.com/Examples/Ch18"
    xsi:schemaLocation="http://www.XMLSchemaReference.com/Examples/Ch18address.xsd">
  <businessCustomer>
    <name>Cliff Binstock</name>
    <phoneNumber>503-555-0000</phoneNumber>
    <address>
      <street>123 Gravel Road</street>
      <city>Nowheresville</city>
      <state>OR</state>
      <country>US</country>
      <zip>97000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <businessCustomer customerID="SAM132E58" primaryContact="Joe Sr.">
    <name>Joe Schmendrick</name>
    <phoneNumber>212-555-0000</phoneNumber>
    <phoneNumber>212-555-1111</phoneNumber>
    <URL>http://www.Joe.Schmendrick.name</URL>
    <address>
      <street>88888 Mega Apartment Bldg</street>
      <street>Apt 5315</street>
      <city>New York</city>
      <state>NY</state>
      <country>US</country>
      <zip>10000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </businessCustomer>
  <privateCustomer customerID="SAM01234P" sequenceID="88743">
    <name>I. M. Happy</name>
    <phoneNumber>303-555-0000</phoneNumber>
    <phoneNumber>303-555-1111</phoneNumber>
    <address>
      <street>123 Main Street</street>
      <pmb>12345</pmb>
      <city>Metropolis</city>
      <state>CO</state>
      <country>US</country>
      <zip>80000</zip>
      <effectiveDate>2001-02-14</effectiveDate>
    </address>
  </privateCustomer>
</customerList>
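To make the difference between the two samples concrete, the following Python sketch checks by hand for the one error that badaddress.xml contains. It is not schema validation and does not use MSXML; it only looks for businessCustomer elements without a customerID attribute. The SAMPLE string is a trimmed stand-in for Listing 18.11, and the function name customers_missing_id is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for badaddress.xml (an assumption for illustration):
# two businessCustomer elements, the first without customerID.
SAMPLE = """<customerList>
  <businessCustomer>
    <name>Cliff Binstock</name>
  </businessCustomer>
  <businessCustomer customerID="SAM132E58" primaryContact="Joe Sr.">
    <name>Joe Schmendrick</name>
  </businessCustomer>
</customerList>"""


def customers_missing_id(xml_text):
    """Return the <name> text of each businessCustomer lacking customerID."""
    root = ET.fromstring(xml_text)
    return [
        customer.findtext("name")
        for customer in root.iter("businessCustomer")
        if customer.get("customerID") is None
    ]


print(customers_missing_id(SAMPLE))  # prints ['Cliff Binstock']
```

A validating parser reports this same omission automatically, along with any other violation of the schema, which is why the remainder of this section relies on MSXML rather than on hand-written checks.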
18.6.2 Validation by Using the DOM
To validate an XML document against an XML schema, you only need to load the document into a DOMDocument40 component that knows about the schema in question. To do that, first load the schema into the cache used by that DOMDocument40 component.
The cache in question is represented by the XMLSchemaCache40 component. During parsing, the DOMDocument40 uses the cache of schemas referenced by its schemas property, so before loading the document you want to validate, set the schemas property to your already loaded XMLSchemaCache40. If the DOM loads the document without reporting a validation error, the document is valid against the schemas in the cache. The example in the next section shows how the DOM can be used to perform validation.
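As a rough illustration of that sequence, the sketch below drives MSXML 4.0 from Python. It assumes a Windows machine with MSXML 4.0 and the pywin32 package installed; the function name validate_with_dom and its create parameter are hypothetical and exist only so that the COM factory can be swapped out. The target namespace is passed in as an argument because the schema itself is not reproduced in this section.

```python
def validate_with_dom(xml_path, namespace_uri, xsd_path, create=None):
    """Load xml_path with MSXML 4.0 and validate it against xsd_path.

    Returns (True, "") if the document loads and validates, otherwise
    (False, reason). `create` maps a ProgID to a COM object; the default
    is pywin32's Dispatch (assumption: Windows, MSXML 4.0, pywin32).
    """
    if create is None:
        from win32com.client import Dispatch as create

    # 1. Load the schema into a schema cache.
    cache = create("Msxml2.XMLSchemaCache.4.0")
    cache.add(namespace_uri, xsd_path)

    # 2. Attach the cache to the document before loading anything.
    doc = create("Msxml2.DOMDocument.4.0")
    setattr(doc, "async", False)  # `async` is a reserved word in Python
    doc.validateOnParse = True
    doc.schemas = cache

    # 3. Load the document; load() returns False on any parse or validation error.
    if doc.load(xml_path):
        return True, ""
    return False, doc.parseError.reason
```

Called with address.xml, the function would be expected to return (True, ""); with badaddress.xml, the second element of the result would carry the parser's description of the problem.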
18.6.3 Validation by Using SAX2
Validation with the DOMDocument40 component is simple and requires very little code. To use the SAX2 API to perform similar validation, additional code is required, but the complexity does not increase. All that is required to validate by using SAX2 is to define a relationship between the reader and the schema cache and to have content and error handlers in place during the parsing.
To use the schema cache with SAX, you must use the configuration methods of the SAX API. There is one property associated with XML schema validation, and one feature as well. The property is schemas, which must be set to the schema cache. The feature is schema-validation, which must be set to TRUE. When these are configured properly and the reader initiates a parse, the parser will validate against all the XML schemas in the cache. If the validation succeeds, the endDocument method of ContentHandler is fired. If validation fails, the error method of the ErrorHandler is fired.
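The configuration step can be sketched in Python as follows, again assuming Windows with MSXML 4.0 and pywin32. The function name build_validating_reader and its create parameter are hypothetical. The sketch stops short of parsing: content and error handlers must be COM objects, and attaching them is left to the caller.

```python
def build_validating_reader(namespace_uri, xsd_path, create=None):
    """Return a SAXXMLReader40 configured to validate against xsd_path.

    `create` maps a ProgID to a COM object; the default is pywin32's
    Dispatch (assumption: Windows, MSXML 4.0, pywin32). Handlers are
    not attached here; set them on the reader before calling parse.
    """
    if create is None:
        from win32com.client import Dispatch as create

    # Load the schema into a cache, exactly as for the DOM.
    cache = create("Msxml2.XMLSchemaCache.4.0")
    cache.add(namespace_uri, xsd_path)

    reader = create("Msxml2.SAXXMLReader.4.0")
    # The "schemas" property tells the reader which cache to use.
    reader.putProperty("schemas", cache)
    # The "schema-validation" feature switches validation on.
    reader.putFeature("schema-validation", True)
    return reader
```

Both settings are needed: a reader that has the cache but not the feature has no instruction to validate against it.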
The example application that follows uses SAX and DOM for validation, and provides sample code for those operations.