XML Schema: An Overview
This article provides a high-level overview of W3C XML Schema. It describes the basic features of XML Schemas and provides examples.
Example 1 shows a small XML document. It consists of an employee element that has two child elements (number and status) as well as one attribute (hireDate).
Example 1. Employee Document
<employee hireDate="2001-04-02">
  <number>557</number>
  <status>FT</status>
</employee>
The DTD shown in Example 2 can be used to validate that the employee element contains a number element followed by an optional status element. It can also ensure that the required attribute hireDate is present. DTDs are useful for their simple, compact syntax and their wide support in XML processors. However, DTDs also have a number of weaknesses. They cannot enforce the data types of elements and attributes; for example, they cannot ensure that the hireDate attribute contains a valid date. They also do not support namespaces, and they have their own non-XML syntax, which is not extensible.
Example 2. Employee DTD
<!ELEMENT employee (number, status?)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT status (#PCDATA)>
<!ATTLIST employee hireDate CDATA #REQUIRED>
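To make the data-type weakness concrete, here is a minimal sketch of a document that the DTD in Example 2 accepts even though its values are not meaningful. The values are invented for illustration. The DTD only checks that number contains character data and that hireDate is present as a CDATA attribute, so nothing flags the problems noted in the comment.

```xml
<!-- Valid against the DTD in Example 2, although hireDate is not a date
     and number is not numeric (illustrative values only) -->
<employee hireDate="sometime last spring">
  <number>five-five-seven</number>
</employee>
```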
An XML Schema such as the one shown in Example 3 can also be used to validate the XML document. It declares the three elements (employee, number and status) and the attribute hireDate. It also assigns data types to each of the elements and attributes.
Example 3. Employee Schema
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="employee" type="EmployeeType"/>
  <xsd:complexType name="EmployeeType">
    <xsd:sequence>
      <xsd:element name="number" type="xsd:integer"/>
      <xsd:element name="status" type="StatusType" default="FT"/>
    </xsd:sequence>
    <xsd:attribute name="hireDate" type="xsd:date"/>
  </xsd:complexType>
  <xsd:simpleType name="StatusType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="FT"/>
      <xsd:enumeration value="PT"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
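Example 3 gives the status element a default value of "FT". In XML Schema, an element default applies when the element is present but empty, not when it is omitted. The instance below is a small sketch of that case: a schema processor validating it against Example 3 would treat status as if it contained "FT".

```xml
<!-- status is present but empty, so the declared default "FT" is used -->
<employee hireDate="2001-04-02">
  <number>557</number>
  <status/>
</employee>
```

Because Example 3 does not specify minOccurs="0" on status, the element itself must still appear for the document to be valid.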
Benefits of Schemas
The World Wide Web Consortium (W3C) began work on XML Schema in 1998, and the first version became an official W3C Recommendation in May 2001. The intent was to create a schema language that is more expressive than DTDs, supports namespaces, and, unlike DTDs, uses XML syntax.
Validating XML documents is an important use of schemas. XML Schema provides features for verifying the following:
The structure of elements and attributes. For example, an employee must have a number, and may optionally have a hire date and a status.
The order of elements. For example, number must appear before status.
The data values of attributes and elements, based on ranges, enumerations, and pattern matching. For example, employee status must be either "FT" or "PT", and hire date must be a valid date.
The uniqueness of values in an instance. For example, all employee numbers in a document must be unique.
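The four kinds of checks listed above can be sketched in a single schema. This is an illustration under stated assumptions rather than a definitive design: the employees wrapper element, the EmployeeNumberType name, the 1 to 99999 range, and the uniqueEmployeeNumber constraint name are all hypothetical and do not come from the examples earlier in this article.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical wrapper element that holds many employee elements -->
  <xsd:element name="employees">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="employee" type="EmployeeType"
                     maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Uniqueness: no two employee elements may share a number value -->
    <xsd:unique name="uniqueEmployeeNumber">
      <xsd:selector xpath="employee"/>
      <xsd:field xpath="number"/>
    </xsd:unique>
  </xsd:element>

  <xsd:complexType name="EmployeeType">
    <!-- Structure and order: number is required and must come
         before the optional status -->
    <xsd:sequence>
      <xsd:element name="number" type="EmployeeNumberType"/>
      <xsd:element name="status" type="StatusType" minOccurs="0"/>
    </xsd:sequence>
    <!-- Data value: an optional attribute that must be a valid date -->
    <xsd:attribute name="hireDate" type="xsd:date"/>
  </xsd:complexType>

  <!-- Range: an assumed limit on employee numbers -->
  <xsd:simpleType name="EmployeeNumberType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1"/>
      <xsd:maxInclusive value="99999"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Pattern matching: accepts exactly "FT" or "PT" -->
  <xsd:simpleType name="StatusType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[FP]T"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

The pattern facet is used here only to show pattern matching. For a fixed list of two values, an enumeration is usually the clearer choice.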
Schemas serve other purposes as well. They can be used as system documentation, and can be shared with your trading partners to be used as a "contract" for e-commerce. Schema processing can also be used to add default information to XML documents and to normalize white space by element type. Finally, schemas can contain information that is useful to an application during processing of an XML document. For example, a schema may indicate the mapping between an XML element and a database table, signaling the processor to update a particular table with the contents of that element.
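One way to carry such application information is the xsd:appinfo element inside an xsd:annotation. The fragment below is a sketch only: the db namespace URI, the db:column element, and the table and column names are invented for illustration. XML Schema does not define them, so an application would have to agree on its own vocabulary.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:db="http://example.com/db-mapping">
  <xsd:element name="number" type="xsd:integer">
    <xsd:annotation>
      <!-- Human-readable documentation for the element -->
      <xsd:documentation>The employee number.</xsd:documentation>
      <!-- Machine-readable hint; db:column is a hypothetical
           application-defined element, not part of XML Schema -->
      <xsd:appinfo>
        <db:column table="EMPLOYEE" name="EMP_NUM"/>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>
```

A schema validator ignores the content of xsd:appinfo when checking instance documents. It is up to the application to read the annotation and act on it.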