18.5 Schema Object Model (SOM)
So far, this chapter has focused on the generic approach that MSXML provides for accessing XML schemas. This consists of accessing the XML schema documents directly as ordinary XML documents. In this manner, you can work with them programmatically, using the same DOM you use for any other document. This approach is limited, however, because you are working with the least common denominator. XML schemas have a higher structure that has already been defined for us, so it makes sense to use a higher level API to access them if one is available.
Beginning with version 4.0 of MSXML, this higher level API is available and is called the "Schema Object Model" (SOM). The SOM is a set of components and interfaces that represent the compiled structure of an XML schema rather than just the structure of its XML.
18.5.1 SOM Fundamentals
The SOM loads an XML schema document into a tree structure that closely models the relationships within the schema. Rather than a DOM-based tree that treats the elements of the schema document as ordinary XML, the SOM understands the schema-specific relationships between the parts of a schema document and models those relationships in its own structure. For example, each element element in an XML schema document is represented by the ISchemaElement interface, which has a type property. That property corresponds to a simpleType or complexType element in the schema document, which the SOM represents with the ISchemaType interface. This compiled model makes the SOM much more attractive than the DOM for developers who need to examine XML schemas. Table 18.3 describes how each interface in the model relates to an XML schema, but it does not show how the interfaces relate to one another.
TABLE 18.3 Schema Object Model Interfaces
Interface | Description
ISchema | schema element
ISchemaAny | any element
ISchemaAttribute | attribute element
ISchemaAttributeGroup | attributeGroup element
ISchemaComplexType | complexType element
ISchemaElement | element element
ISchemaIdentityConstraint | key, keyref, or unique element
ISchemaItem | Base interface
ISchemaItemCollection | Collection of SOM objects
ISchemaModelGroup | Model group (all, choice, or sequence)
ISchemaNotation | notation element
ISchemaParticle | Piece of a model group
ISchemaStringCollection | Collection of strings in the SOM
ISchemaType | simpleType element or complexType element
Many of the SOM interfaces inherit from other SOM interfaces. Figure 18.3 shows where that inheritance occurs.
To help explain the SOM, we can take a look at some of the principal interfaces that make up the SOM. The first of those interfaces is ISchemaItem, the foundation of all the SOM interfaces.
This chapter does not provide a complete reference to all the properties and methods of the Schema Object Model. For a complete list, see the documentation provided with the MSXML 4.0 SDK.
18.5.2 The ISchemaItem Interface
In most object models, you will find some root class or interface from which all the objects in the system inherit. In Java, this root is the Object class. In the .NET Framework, the same is true: A root class called "Object" exists for all object classes. Within the framework of the SOM, all interfaces inherit from the ISchemaItem interface. Through the common properties of the ISchemaItem interface, you can learn a great deal about any of the component instances in the model. Table 18.4 lists the properties of the ISchemaItem interface.
FIGURE 18.3 SOM diagram.
TABLE 18.4 Property Summary of the ISchemaItem Interface
Property | Description
Id | Value of the id attribute of the element.
itemType | Constant defining the type of the object.
Name | Item name.
namespaceURI | URI of the associated namespace.
Schema | ISchema of this model.
writeAnnotation | Writes top-level annotations into an output document (a method rather than a property).
unhandledAttributes | Any attributes not defined in the XML Schema Recommendation.
The ISchemaItem interface provides two properties you might find yourself relying upon. The first is the Schema property, which returns the ISchema interface for the XML schema of which this particular item is a part. This means that from any item in the SOM, you can reach the root. This is not true, however, for built-in types and the ISchema interface itself (to avoid a circular reference problem).
The second property is the itemType property, which returns one of a list of constants provided in the SOM. These constants define the type of item this instance of ISchemaItem represents (for example, built-in type, complex type, attribute, notation). These constants are contained in the SOMITEM_TYPE enumeration.
Some of the SOMITEM_TYPE constants are used alone to indicate the type of the object in question, whereas others are combined to create new values that determine the type. For example, SOMITEM_SCHEMA indicates that an item is a schema and therefore is also an ISchema interface. On the other hand, SOMITEM_DATATYPE and SOMITEM_DATATYPE_BOOLEAN are combined in a bit-mask to indicate a built-in Boolean datatype. With one or more of these constants, all the types can be expressed.
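The two properties just described can be combined in a small helper. The following is a minimal sketch, not a definitive implementation: the procedure name DescribeItem is hypothetical, the project is assumed to reference the Microsoft XML, v4.0 type library, and the mask test assumes the bit-mask layout of the SOMITEM_TYPE constants described above.

```vb
' Sketch only: DescribeItem is a hypothetical helper. It assumes a project
' reference to Microsoft XML, v4.0 and an ISchemaItem obtained elsewhere
' (for example, from one of the ISchema collections).
Public Sub DescribeItem(ByVal item As ISchemaItem)
    ' Mask test: true when every bit of SOMITEM_DATATYPE is set in itemType.
    ' This relies on the bit-mask layout described in the text.
    If (item.itemType And SOMITEM_DATATYPE) = SOMITEM_DATATYPE Then
        If item.itemType = (SOMITEM_DATATYPE Or SOMITEM_DATATYPE_BOOLEAN) Then
            Debug.Print item.Name & " is the built-in Boolean datatype"
        Else
            Debug.Print item.Name & " is some other built-in datatype"
        End If
    ElseIf item.itemType = SOMITEM_SCHEMA Then
        ' A standalone constant: the item is itself the ISchema.
        Debug.Print "The item is the schema itself"
    Else
        ' Any other item can reach the root through its Schema property.
        ' Built-in types and the schema itself were excluded above.
        Debug.Print item.Name & " belongs to the schema for " & _
            item.Schema.targetNamespace
    End If
End Sub
```

Checking the built-in types and the schema first keeps the final branch from touching the Schema property on items for which, as noted above, it is not available.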
18.5.3 The ISchema Interface
While the ISchemaItem is the foundation of the SOM, the ISchema interface is at the root of the schema model itself. The ISchema interface of the Schema Object Model directly corresponds to the schema element of the XML schema. Through the ISchema interface, you have access to all the other elements that make up the schema.
To get an ISchema interface from an XML document, you use a utility component included in MSXML, XMLSchemaCache40. XMLSchemaCache40 functions as a map of target namespaces to valid XML schemas. When the add method of XMLSchemaCache40 is called, as shown in Listing 18.8, the XML schema document is read to make sure it conforms to the XML Schema Recommendation. If it does, it is loaded into the cache. If it does not, an error is thrown by the add method.
LISTING 18.8 Getting the ISchema Interface
' Create the cache and add the schema under its namespace.
Dim schemaCache As XMLSchemaCache40
Set schemaCache = New XMLSchemaCache40
schemaCache.Add "http://www.XMLSchemaReference.com/examples/theme/addr", _
    "c:\temp\address.xsd"

' Retrieve the compiled schema for that namespace.
Dim schema As ISchema
Set schema = schemaCache.getSchema( _
    "http://www.XMLSchemaReference.com/examples/theme/addr")
The add method in the preceding code takes two parameters: the namespace to associate with the XML schema and the XML schema itself (either as a URL or as a DOMDocument). Typically, the namespace you would associate with the XML schema would be the targetNamespace of the XML schema document, but the SOM does not enforce this relationship. Once we have an ISchema interface, we have access to all the parts that make up an XML Schema. These are provided as properties of the ISchema interface, as listed in Table 18.5.
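Because the add method also accepts a DOMDocument and raises an error for an invalid schema, a slightly more defensive variant of Listing 18.8 is possible. The following is a sketch under stated assumptions: the function name LoadAddressSchema is hypothetical, the namespace and file path are the ones used in Listing 18.8, and the project is assumed to reference Microsoft XML, v4.0.

```vb
' Sketch only: LoadAddressSchema is a hypothetical helper. It returns
' Nothing when the document cannot be parsed or is rejected by the cache.
Public Function LoadAddressSchema() As ISchema
    Const ADDR_NS As String = _
        "http://www.XMLSchemaReference.com/examples/theme/addr"

    ' Load the schema document through the DOM first, so that a
    ' well-formedness problem is reported separately from a schema problem.
    Dim xsdDoc As DOMDocument40
    Set xsdDoc = New DOMDocument40
    xsdDoc.async = False
    If Not xsdDoc.Load("c:\temp\address.xsd") Then
        Debug.Print "Not well formed: " & xsdDoc.parseError.reason
        Exit Function
    End If

    Dim schemaCache As XMLSchemaCache40
    Set schemaCache = New XMLSchemaCache40

    ' add raises an error if the document is not a valid XML schema.
    On Error GoTo InvalidSchema
    schemaCache.Add ADDR_NS, xsdDoc
    Set LoadAddressSchema = schemaCache.getSchema(ADDR_NS)
    Exit Function

InvalidSchema:
    Debug.Print "Schema rejected: " & Err.Description
End Function
```

Passing a DOMDocument rather than a URL is a design choice: it lets the caller distinguish a parse failure from a schema that is well formed but does not conform to the XML Schema Recommendation.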
Two of the properties listed in Table 18.5, targetNamespace and version, are simple strings. The properties that give programmatic access to the rest of the XML schema's components are more complex. The other seven properties are collections, such as the attributes or the types. Because the SOM is implemented as a COM programming interface, these are COM collections that are easily traversed with just a few lines of Visual Basic code. The collections of SOM objects are each represented by an ISchemaItemCollection interface; schemaLocations, which holds strings, is an ISchemaStringCollection.
Listing 18.9 illustrates how to traverse one of the collections provided by the ISchema interface. In this instance, a For Each loop cycles through all the elements in the XML schema and prints the name of each one to the debugger.
LISTING 18.9 Walking the SOM Elements Collection
Dim elem As ISchemaElement
For Each elem In schema.elements
    Debug.Print elem.Name
Next elem
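The same traversal can be extended to follow the type property of ISchemaElement and to walk a second collection. The following is a minimal sketch rather than part of the original listings: the procedure name PrintElementTypes is hypothetical, the project is assumed to reference Microsoft XML, v4.0, and the index-based loop assumes that ISchemaItemCollection is zero-based, as other MSXML collections are.

```vb
' Sketch only: PrintElementTypes is a hypothetical helper that expects an
' ISchema such as the one obtained in Listing 18.8.
Public Sub PrintElementTypes(ByVal schema As ISchema)
    Dim el As ISchemaElement
    Dim i As Long

    ' Index-based traversal using the length and item members.
    For i = 0 To schema.elements.length - 1
        Set el = schema.elements.Item(i)
        ' The type property returns the ISchemaType for the element.
        ' An anonymous type has an empty name.
        Debug.Print el.Name & " : " & el.Type.Name
    Next i

    ' For Each works on the other item collections in the same way.
    Dim t As ISchemaType
    For Each t In schema.types
        Debug.Print "type " & t.Name
    Next t
End Sub
```

A call such as PrintElementTypes schema, placed after the code in Listing 18.8, would print one line per global element and one per global type.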
TABLE 18.5 Property Summary of the ISchema Interface
Property | Description
attributeGroups | Collection of ISchemaAttributeGroup
attributes | Collection of ISchemaAttribute
elements | Collection of ISchemaElement
modelGroups | Collection of ISchemaModelGroup
notations | Collection of ISchemaNotation
schemaLocations | Locations of linked schemas
targetNamespace | String, target namespace for the XML schema
types | Collection of ISchemaType
version | String, version of the XML schema
18.5.4 DOM versus SOM
We have now worked with XML schema documents by using both the DOM and the Schema Object Model (SOM), and each model has limitations. Because the two are tightly integrated in the MSXML components, it can be difficult to separate the core MSXML components from the SOM components and to decide which approach a particular problem calls for.
The SOM is designed to read XML schema documents, and only XML schema documents. The DOM, on the other hand, works with any well-formed XML document. In cases where an application must examine an XML schema document, it makes sense to use the SOM, the higher-level API, to determine the components of the schema.
This advantage of the SOM, however, applies only to the reading of an XML schema. As mentioned earlier, the SOM is a read-only representation of an XML schema, loaded from a document. The SOM cannot be used to create or modify an XML schema document, only to examine it. If your application must create an XML schema document or modify an existing XML schema document, the application should use the DOM or SAX, or perhaps even a transformation for that purpose.
18.5.5 Creating XML Schemas
From an application, you are most likely to access an XML schema programmatically in order to read its contents. XML schema documents are typically written to function as a binding contract; as such, their creation typically involves collaboration between developers and designers and even outside parties. The SOM reflects this: reading a schema is the rule and creating or modifying one programmatically is the exception, so the SOM provides no mechanism for either creation or modification. The SOM is only a tool for reading XML schemas and using them to validate an XML instance.
Of course, we can still create an XML schema document with MSXML. There are cases where creating or modifying an XML schema might be necessary. For example, an application that exports a database schema to an XML schema document needs to create an XML schema document. In those cases, developers must return to the standard APIs for XML. The DOMDocument40 component provides not only the capability to read an XML document and traverse the tree, but also to create and modify documents.
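As an illustration of that fallback, a very small schema document can be built with the DOM alone. The following is a hedged sketch, not the chapter's own code: the procedure name, the target namespace, the element name, and the output path are all hypothetical, and the project is assumed to reference Microsoft XML, v4.0.

```vb
' Sketch only: builds a one-element XML schema document with the DOM.
' Every name and path below is a placeholder chosen for illustration.
Public Sub CreateTinySchema()
    Const XSD_NS As String = "http://www.w3.org/2001/XMLSchema"

    Dim doc As DOMDocument40
    Set doc = New DOMDocument40

    ' createNode takes a namespace URI, so the xsd prefix is bound to the
    ' XML Schema namespace when the document is serialized.
    Dim root As IXMLDOMElement
    Set root = doc.createNode(NODE_ELEMENT, "xsd:schema", XSD_NS)
    root.setAttribute "targetNamespace", "http://example.com/hypothetical"
    doc.appendChild root

    ' Add a single global element declaration of type xsd:string.
    Dim el As IXMLDOMElement
    Set el = doc.createNode(NODE_ELEMENT, "xsd:element", XSD_NS)
    el.setAttribute "name", "street"
    el.setAttribute "type", "xsd:string"
    root.appendChild el

    doc.save "c:\temp\generated.xsd"
End Sub
```

The saved document could then be handed to the add method of XMLSchemaCache40, which would confirm that the generated schema is valid before it is used.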