Writing a Custom Schema
The best way of understanding XML Schema is by writing a simple schema with which you can constrain a document. This example doesn't explore any of the type-system features of XML Schema; we'll just concentrate on constraining a document as a first step. Listing 1 recaps the document that we want to constrain.
Listing 1 An XML Document Containing DVD Information
<?xml version="1.0" encoding="utf-8"?>
<dvd xmlns="http://dvd.example.com" region="2">
  <title>The Phantom Menace</title>
  <year>2001</year>
</dvd>
The document in Listing 1 contains an element called dvd, which itself contains two elements, title and year. All three elements are qualified with the namespace http://dvd.example.com, so we know immediately that the schema's targetNamespace must be http://dvd.example.com. We also know that the schema needs one globally scoped element (dvd) and two elements nested within it. Now we can construct the schema shown in Listing 2.
Listing 2 DVD Schema
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://dvd.example.com"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
  <element name="dvd">
    <complexType>
      <sequence>
        <element name="title" type="string"/>
        <element name="year" type="positiveInteger"/>
      </sequence>
      <attribute name="region" type="positiveInteger"/>
    </complexType>
  </element>
</schema>
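Because the elements in the document in Listing 1 are in a namespace that matches the targetNamespace of the schema in Listing 2, the schema's declarations apply to the document. The namespace match alone doesn't guarantee validity, though. The document is a valid instance of the schema because its structure and values also satisfy those declarations, as the following walkthrough shows.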
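One common way to tell a validating processor which schema to use for a namespace is the xsi:schemaLocation attribute. The following is a minimal sketch of Listing 1 with that hint added. The file name dvd.xsd is an assumption made for illustration (the schema in Listing 2 saved under that name), and a processor is free to treat the attribute as a hint only.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- dvd.xsd is a hypothetical file name for the schema in Listing 2 -->
<dvd xmlns="http://dvd.example.com"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://dvd.example.com dvd.xsd"
     region="2">
  <title>The Phantom Menace</title>
  <year>2001</year>
</dvd>
```

The value of xsi:schemaLocation is a pair: the namespace name first, then the location of a schema document for that namespace.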
The schema body begins by dictating (in the line <element name="dvd">) that the instance document must have an opening element with the name dvd. The schema then goes on to declare that there must be a sequence of two nested elements within that first dvd element, called title and year. This specification takes four elements:
The complexType element indicates that the parent dvd element consists of other elements nested within it.
Inside the complexType element is a sequence element. A sequence element places the constraint on any conformant document that elements nested within must follow the same sequence as the schema. In this case, because the elements nested within the sequence are the title element followed by the year element, conformant documents must also specify title before year.
The title element must contain information in string form because its type attribute is set to the string type from the XML Schema namespace.
Similarly, the year element specifies that its information must be encoded as an XML Schema positiveInteger type.
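To see the sequence constraint at work, consider a variation of Listing 1 with the two nested elements swapped. This is an illustrative counterexample rather than a document from the original text. A validating processor checking it against Listing 2 should reject it, because year appears where title is expected.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Not valid against Listing 2: year appears before title -->
<dvd xmlns="http://dvd.example.com" region="2">
  <year>2001</year>
  <title>The Phantom Menace</title>
</dvd>
```

The type constraints work the same way. Replacing 2001 with a value such as MMI would also make the document invalid, because that text is not a positiveInteger.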
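The final aspect of this schema specifies that the outermost dvd element can carry an attribute to hold region information. This constraint is applied with the <attribute> element, which declares an attribute called region whose value must be of type positiveInteger. Note that attribute declarations are optional by default, so a document that omits region is still valid. To make the attribute mandatory, the declaration would need use="required".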
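In XML Schema, an attribute declaration's use defaults to optional, so the declaration in Listing 2 allows region but doesn't insist on it. If the intent is for every dvd element to carry a region, the complexType could be tightened as in the sketch below. This is a suggested variation, not part of the original listing.

```xml
<complexType>
  <sequence>
    <element name="title" type="string"/>
    <element name="year" type="positiveInteger"/>
  </sequence>
  <!-- use="required" makes a missing region attribute a validation error -->
  <attribute name="region" type="positiveInteger" use="required"/>
</complexType>
```

With this change, the document in Listing 1 remains valid, whereas a dvd element without a region attribute would be rejected.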
While we can now begin to create simple schemas to constrain simple documents, scaling this approach to large schemas and large documents is usually impractical and undesirable. Instead, we need to look beyond the document, which is only the serialized, readable form of XML, to the real power of XML Schema: its type system.