4.2 XPointer
XML schema documents use schemaLocation attributes to locate other schema documents and parts of schema documents. The value of a schemaLocation is always a URI, which may include an XPointer. The schemaLocation attributes of schema, import, and include are examples of places where an XPointer might locate a schema document or a portion of one.
An XPointer is nominally an extension of an XPath. The XPointer Recommendation permits, even encourages, the use of the XPath id function. There are also several XPointer-specific extensions to XPath.
The XPointer Recommendation specifies expressions for returning portions of an XML document. The expression may evaluate to a node, a set of nodes, a portion of a node, or a portion of an XML document that spans nodes. Just as XML Schema limits the use of XPath expressions, it also limits the use of XPointer expressions. In particular, an XML schema may only import individual components (nodes) or sets of components (node sets). Because a component corresponds to an entire XML element (as opposed to a portion thereof), this section covers only XPointer constructs pertinent to extracting complete nodes.
4.2.1 Location Sets
The previous section mentions that an XPointer expression may evaluate to a node, a set of nodes, a portion of a node, or a portion of an XML document that spans nodes. Unlike an XPath expression, which must evaluate to a node set, an XPointer expression may theoretically return results that do not correspond to whole nodes. Therefore, the XPointer infrastructure requires an XPointer to return a location set.
A location set is an extension of the node set that an XPath expression normally returns. In addition to nodes, a location set may contain points and ranges. A point consists of a node and an index. The index is a character offset into the node. A range consists of two points. The concepts of both point and range exist because of the XPointer requirement that an expression might return a subset of a node or possibly a set of characters that spans nodes.
Because an XML schema can only make use of an entire node or set of nodes, the remainder of this chapter covers only the subset of location information (and location sets) specified by nodes (and node sets). The terms 'node set' and 'location set' have similar meanings in the context of XML Schema.
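The relationship between nodes, points, and ranges can be sketched as a small data model. The following Python fragment is only an illustration under stated assumptions: the class names Point and Range and the helper whole_nodes are hypothetical, not part of any XPointer API, and the sample document is invented.

```python
from dataclasses import dataclass
from typing import Union
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Point:
    """A position inside a node: the node plus a character offset."""
    node: ET.Element
    index: int


@dataclass(frozen=True)
class Range:
    """Everything between two points, possibly spanning several nodes."""
    start: Point
    end: Point


# In this sketch a location is a whole node, a point, or a range.
Location = Union[ET.Element, Point, Range]


def whole_nodes(locations):
    """Keep only the locations that are complete nodes, the only kind
    of location a schema document reference can use."""
    return [loc for loc in locations if isinstance(loc, ET.Element)]


doc = ET.fromstring("<a><b>hello</b><c>world</c></a>")
b, c = doc[0], doc[1]
location_set = [b, Point(b, 2), Range(Point(b, 3), Point(c, 1)), c]
print([node.tag for node in whole_nodes(location_set)])  # ['b', 'c']
```

Filtering a location set down to whole nodes yields an ordinary node set, which is why the two terms are nearly interchangeable for schema purposes.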
4.2.2 Namespaces
The XPointer notation permits the identification of one or more namespaces. An XPointer expression specifies a namespace with the xmlns function. The argument to this function resembles a namespace attribute applicable to any XML element:
xmlns(xsd=http://www.w3.org/2001/XMLSchema)
The following XPointer locates the catalogEntryDescriptionType complex type by name. Namespace declarations always precede the locating expression:
http://www.XMLSchemaReference.com/theme/catalog.xsd# xmlns(xsd=http://www.w3.org/2001/XMLSchema) xpointer( xsd:schema/xsd:complexType[@name="catalogEntryDescriptionType"] )
In general, an XPointer may specify any number of namespaces. The previous example suffices for locating most schema components. An XPointer reference in a schemaLocation may require multiple namespaces when an XML document that is not a schema document has embedded Schema elements.
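The effect of binding a prefix before evaluating the locating expression can be tried with the limited XPath support in Python's standard xml.etree.ElementTree module. This is a sketch, not an XPointer processor: the schema text is a hypothetical, trimmed-down stand-in for catalog.xsd, and the NAMESPACES dictionary plays the role of the xmlns part of the XPointer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for catalog.xsd.
SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="catalogEntry" type="catalogEntryDescriptionType"/>
  <xsd:complexType name="catalogEntryDescriptionType">
    <xsd:sequence/>
  </xsd:complexType>
</xsd:schema>
"""

# Plays the role of the xmlns part: bind the prefix used in the expression.
NAMESPACES = {"xsd": "http://www.w3.org/2001/XMLSchema"}

root = ET.fromstring(SCHEMA)  # root is the xsd:schema element itself
matches = root.findall(
    "xsd:complexType[@name='catalogEntryDescriptionType']", NAMESPACES
)
print([m.get("name") for m in matches])  # ['catalogEntryDescriptionType']

# The same path without a prefix binding matches nothing, because the
# unprefixed name complexType is in no namespace.
unbound = root.findall("complexType[@name='catalogEntryDescriptionType']")
print(unbound)  # []
```

The prefix in the expression does not have to match the prefix used inside the schema document; only the namespace URI has to agree.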
4.2.3 Subelement Sequences
Subelement sequences are notations that provide a shortcut to elements in an XML document. A subelement sequence has the following grammar:
bareName? ('/' [1-9] [0-9]*)+
where the optional bareName is replaced by the ID name of an element, as discussed in Section 4.2.2. The numerals represent the Nth subelement (counting from 1) at each level.
WARNING
Subelement sequences provide an extremely compact and convenient notation. Unfortunately, this notation is highly susceptible to failure. In particular, the structure of the expected XML document (in this case, most likely an XML schema document) must be extremely stable. Any change in the element structure of the document can produce surprising results.
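The following example locates the fourth subelement of the third subelement of the document element (the leading /1 selects the document element itself, which is the first and only element child of the document root):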
../some.xsd#/1/3/4
Similarly, the following example locates the fourth subelement of the third subelement of the element whose ID is 'yadayada':
../some.xsd#yadayada/3/4
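A resolver for this notation is short enough to sketch, which also makes the fragility described in the warning easy to demonstrate. The following Python code is a minimal sketch, assuming input that matches the grammar above; the function name resolve_child_sequence, the ids dictionary, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET


def resolve_child_sequence(root, sequence, ids=None):
    """Resolve a sequence such as '/1/3/4' or 'yadayada/3/4'.

    root is the document element; ids maps ID values to elements
    (hypothetical: a real processor derives IDs from the DTD or schema).
    Returns the located element, or None if any step has no match.
    """
    name, _, steps = sequence.partition("/")
    numbers = [int(step) for step in steps.split("/")]
    if name:
        current = (ids or {}).get(name)
    else:
        # With no bare name, the first number counts children of the
        # document root, so only /1 (the document element) can match.
        current = root if numbers.pop(0) == 1 else None
    for n in numbers:
        if current is None or n > len(current):
            return None
        current = current[n - 1]  # steps count from 1
    return current


doc = ET.fromstring(
    "<schema>"
    "<a/><b/>"
    "<c><p/><q/><r/><s/></c>"
    "<d id='yadayada'><e/><f/><g><h/><i/><j/><k/></g></d>"
    "</schema>"
)
ids = {el.get("id"): el for el in doc.iter() if el.get("id")}

first = resolve_child_sequence(doc, "/1/3/4")
second = resolve_child_sequence(doc, "yadayada/3/4", ids)
print(first.tag, second.tag)  # s k

# One new element at the top of the document shifts every position.
doc.insert(0, ET.Element("annotation"))
shifted = resolve_child_sequence(doc, "/1/3/4")
print(shifted)  # None
```

After the insertion, the purely positional sequence no longer finds anything, while the sequence anchored at the ID still resolves to the same element. Anchoring at an ID reduces, but does not remove, the dependence on document structure.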
4.2.4 XPointer Extensions to XPath
This section covers only those XPointer extensions applicable to XML schemas. In fact, only one extension (the range-to function) has any applicability with respect to XML schemas. The use of this function is not common.
The XPointer Recommendation adds the range-to function as an option for an XPath step. A step consists of an axis and a node test, optionally followed by predicates. A convenient use of the range-to function is to locate a set of nodes to incorporate into a schema. The range-to function is likely to appear in conjunction with the XPath id function. The following example locates four schema components that appear in sequence in pricing.xsd: fullPriceType, freePriceType, salePriceType, and clearancePriceType.
http://www.XMLSchemaReference.com/theme/pricing.xsd# xpointer(id("fullPriceType.pricing.cType")/ range-to(id("clearancePriceType.pricing.cType")))
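The standard Python library has no range-to function, but its effect on whole schema components can be approximated. The sketch below assumes that both IDs belong to top-level components of the same schema element; the schema text is a hypothetical stand-in for pricing.xsd, and components_in_range is an illustrative helper, not an XPointer API. A true range-to returns a range location rather than a list of elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for pricing.xsd; the id values follow the
# pattern used in the example above.
PRICING = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="currencyType" id="currencyType.pricing.sType"/>
  <xsd:complexType name="fullPriceType" id="fullPriceType.pricing.cType"/>
  <xsd:complexType name="freePriceType" id="freePriceType.pricing.cType"/>
  <xsd:complexType name="salePriceType" id="salePriceType.pricing.cType"/>
  <xsd:complexType name="clearancePriceType"
                   id="clearancePriceType.pricing.cType"/>
  <xsd:element name="price" id="price.pricing.element"/>
</xsd:schema>
"""


def components_in_range(schema, start_id, end_id):
    """Return the top-level components from start_id through end_id,
    inclusive, in document order. Returns [] if either ID is missing
    or the end component precedes the start component."""
    children = list(schema)
    positions = {child.get("id"): i for i, child in enumerate(children)}
    if start_id not in positions or end_id not in positions:
        return []
    return children[positions[start_id]:positions[end_id] + 1]


schema = ET.fromstring(PRICING)
selected = components_in_range(
    schema, "fullPriceType.pricing.cType", "clearancePriceType.pricing.cType"
)
print([c.get("name") for c in selected])
# ['fullPriceType', 'freePriceType', 'salePriceType', 'clearancePriceType']
```

Like subelement sequences, this selection depends on document order: a component inserted between the two endpoints silently becomes part of the result.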
4.2.5 Using XPointer and XPath to Locate Schemas
This section describes the portions of the XPath Recommendation that apply to XPointers. An XPointer may reference any of the XPath axes touched on in Section 4.1.1. Table 4.3 lists all the axes supported by the XPath Recommendation and notes where each axis applies with respect to schema document locations. Such an XPointer might contain a reference to any axis; however, validation is likely to fail for many of these axes. Table 4.3 duly notes these likely failures in the Caveats column. A check mark (✓) in the Caveats column indicates that an XPointer can reference the corresponding axis and expect positive results.
An XPath expression, in general, can return a node set, a Boolean value, a string, or a floating-point number. In an XML schema, both an XPath expression (in an identity constraint) and an XPointer expression (in a schemaLocation value) must evaluate to a node set. Because both expressions return a node set, this chapter does not describe the other result types (Boolean, string, and number) in detail. However, a predicate can refine a node set in an XPointer expression, and the expression inside a predicate may evaluate to a value that is not a node set. Therefore, Table 4.4 provides a few examples of XPointers that contain predicates. This chapter does not provide a comprehensive tutorial on the full XPath expression options that may appear in a predicate. Note that each example is only the XPointer part of a URI. An entire URI that includes an XPointer has the following form:
http://www.example.com/some.xml#xpointer(exampleXPointer)
See Section 4.1.3 or 4.2.4 for examples of complete URIs. The cells in the Example column of Table 4.4 provide a substitution for exampleXPointer in the previous code excerpt.
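Separating the document URI from the XPointer part is a matter of splitting at the fragment identifier. The following Python sketch uses urldefrag from the standard library; the helper xpointer_expression is hypothetical and handles only a fragment that consists of a single xpointer(...) part, with no preceding namespace declarations.

```python
from urllib.parse import urldefrag

uri = 'http://www.example.com/some.xml#xpointer(id("foo"))'

# Everything after '#' is the fragment identifier.
document_uri, fragment = urldefrag(uri)


def xpointer_expression(fragment):
    """Return the text inside xpointer(...), or None if the fragment
    does not have that form."""
    prefix, suffix = "xpointer(", ")"
    if fragment.startswith(prefix) and fragment.endswith(suffix):
        return fragment[len(prefix):-len(suffix)]
    return None


expression = xpointer_expression(fragment)
print(document_uri)  # http://www.example.com/some.xml
print(expression)    # id("foo")
```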
Table 4.4 provides a nice illustration of the power of XPointer expressions enhanced by XPath predicates. For a complete tutorial on predicates, refer to the XPath Recommendation.
TABLE 4.3 XPath Axes Potentially Used in schemaDocument References
Axis | Meaning | Caveats
child | All subelements of the context node | ✓
descendant | Element descendants of the context node | ✓
parent | The parent of the context node | ✓
ancestor | Element ancestors of the context node | An XPointer referencing this axis is technically okay, but the schema is probably bizarre at best.
following-sibling | All siblings of the context node that appear after the context node | Not recommended, as this will include attribute and namespace nodes (that is, undesirable nodes that are just portions of an element). Use the following axis instead.
preceding-sibling | All siblings of the context node that appear before the context node | Not recommended, as this will include attribute and namespace nodes (that is, undesirable nodes that are just portions of an element). Use the preceding axis instead.
following | Element siblings of the context node that appear after the context node | ✓
preceding | Element siblings of the context node that appear before the context node | ✓
attribute | Attributes of the context node | Not recommended: If the XPointer returns anything, the validation will fail.
namespace | Namespace nodes of the context node | Not recommended: If the XPointer returns anything, the validation will fail.
self | The context node | Why bother in an XPointer?
descendant-or-self | Descendant or self axes | ✓
ancestor-or-self | Ancestor or self axes |
TABLE 4.4 XPointer Examples
Example | Meaning
id("foo") | The element whose ID is 'foo' (see Section 4.1.3 for a detailed example).
id("foo")/range-to(id("bar")) | The elements whose IDs are 'foo' and 'bar', as well as all elements in between.
x[position()=1] or x[1] | The first x.
x[position()>1] | All x elements except the first one.
x[5] | The fifth x.
x[last()] | The last x.
../x | The x elements that are subelements of the parent of the current context.
x[y="hello"] | The x elements that have a subelement y whose value is 'hello'.
x[@y and @z] | The x elements that have both y and z attributes.
x[@y or @z] | The x elements that have either a y or z attribute.
x[@y < "10"] | The x elements that have a y attribute whose value is less than '10'. Note that XPath supports '=', '!=', '<', '<=', '>', and '>='. Escape these as necessary with the standard XML entity reference, such as '&lt;' for the less-than character.
x[starts-with(y,"abc")] | The x elements that contain a y subelement whose value starts with 'abc'.
x[sum(y) > 50] | The x elements that contain one or more y subelements, the sum of whose values is greater than 50.
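Several of the predicates in Table 4.4 can be tried with the limited XPath subset in Python's standard xml.etree.ElementTree module. This is a sketch on an invented document: the helper positions is hypothetical, and ElementTree does not support the and/or operators, comparisons such as position()>1, or functions such as starts-with and sum, so those rows require a full XPath 1.0 engine. Chained predicates stand in for 'and' here.

```python
import xml.etree.ElementTree as ET

# Invented sample: three x elements with differing attributes and
# y subelements.
doc = ET.fromstring(
    "<root>"
    "<x y='1'><y>hello</y></x>"
    "<x z='2'><y>bye</y></x>"
    "<x y='3' z='4'><y>hello</y></x>"
    "</root>"
)


def positions(path):
    """Return the 1-based positions, among the children of the root,
    of the elements that the path selects."""
    children = list(doc)
    return [children.index(el) + 1 for el in doc.findall(path)]


print(positions("x[1]"))          # [1]     the first x
print(positions("x[last()]"))     # [3]     the last x
print(positions("x[y='hello']"))  # [1, 3]  x with a y subelement 'hello'
print(positions("x[@y]"))         # [1, 3]  x with a y attribute
print(positions("x[@y][@z]"))     # [3]     same result as x[@y and @z]
```

In every case the predicate narrows a node set and the overall result is still a node set, which is what a schema document reference requires.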