XSLT
Listing 3 shows the XSLT used to generate the HTML in Listing 2. Because XSLT is an XML vocabulary, its structure should be familiar to you. As you examine it, notice that the element names fall into two categories: those that begin with the prefix xsl: and those that don’t. The elements that begin with xsl: are part of the XSLT vocabulary; all the other elements are HTML (<html>, <body>, <head>, <h1>, and so on).
Namespaces
For our parser to distinguish between different vocabularies, we need to understand namespaces. XML namespaces is an official W3C recommendation that uses Uniform Resource Identifiers (URIs) as the basis for distinguishing different XML vocabularies. A URI is a character string, formed according to a generic URI syntax, that identifies a resource.
Since URIs are difficult to write and may contain characters that violate XML element-naming rules, XML namespaces permits the use of shorthand prefixes to stand in for a URI. Thus, while the official XSLT namespace is http://www.w3.org/1999/XSL/Transform, common practice is to use the prefix xsl:. In our XSLT example, lines 2–3 establish this relationship so that, when an XSLT processor sees the xsl: prefix, it knows to process the element as XSLT. Figure 4 shows how to interpret the namespace declaration in our example.
Figure 4 Defining a namespace reference for XSLT elements.
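To see that the prefix is only shorthand for the URI, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The two tiny stylesheets are hypothetical (they are not Listing 3), and the t: prefix is an arbitrary choice made for illustration. The parser reports each element by its namespace URI plus local name, so both documents yield the same element names.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Two hypothetical stylesheets that differ only in the prefix they bind
# to the XSLT namespace URI.
WITH_XSL_PREFIX = f"""<xsl:stylesheet version="1.0" xmlns:xsl="{XSLT_NS}">
  <xsl:template match="/"><html/></xsl:template>
</xsl:stylesheet>"""

WITH_OTHER_PREFIX = f"""<t:stylesheet version="1.0" xmlns:t="{XSLT_NS}">
  <t:template match="/"><html/></t:template>
</t:stylesheet>"""


def expanded_names(xml_text):
    """Return the tag of every element as the parser reports it."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


names_xsl = expanded_names(WITH_XSL_PREFIX)
names_other = expanded_names(WITH_OTHER_PREFIX)

# Each XSLT element appears as {namespace-URI}local-name; <html> has no namespace.
print(names_xsl)
print(names_xsl == names_other)
```

Because the comparison is made on the URI rather than the prefix, any prefix bound to the XSLT namespace would work; xsl: is simply the convention.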
Deconstructing XSLT
So now that we can use namespaces to identify the XSLT elements, let’s walk through the code in Listing 3 to see how XSLT and XPath combine to generate our HTML.
- Line 5: Here we inform the XSLT processor that we will be outputting HTML. This helps set up the output correctly, since the output of XSLT can be just about anything. Options here include HTML, XML, or text.
- Line 7: <xsl:template match="/"> is the beginning of an xsl:template (rule). An XSLT processor works by trying to match templates against the XML. If a match occurs, the content of the template element is generated as output. In this template, we’re matching against /, which is XPath shorthand for the root of the document (refer to Figure 2). Since every document has a root, this rule always matches, and the content of this element is generated.
- Lines 8–16: These lines specify the output given a successful match. Look carefully and you should be able to see that what’s being generated is the HTML for the web page we’re targeting, along with some XSLT inserted where we want details of the most popular books. As the HTML is generated, the XSLT on lines 17–22 is used to dip back into the XML and extract the book title, author, and ISBN, filling out the HTML list.
- Lines 17–22: Here we see the XSLT that generates the actual list of favorite books. To loop over all the book titles in our XML, we use the powerful xsl:for-each element, which works much like a for-each loop in Java or a range-based for loop in C++:
<xsl:for-each select="/books/book/title">
- Here our XPath select phrase (written as the value of the select attribute) tells XSLT to search from the root, find all books elements, find all their book children, and then collect all their title children. Behind the scenes, the XSLT processor gathers all the title elements and applies lines 18–21 to each one.
- Lines 18–21: For each title located in our XML, we want to generate a single HTML list item of the following form:
<li> ... </li>
- Within each list item, we want to display the title, the last name of the author (in parentheses), and the ISBN. To accomplish this objective, we need to switch back to XSLT. However, keep in mind that since we’re technically within the context of one iteration of the XSLT for-each loop, when we ask for title, author, and ISBN our request will be relative to our current working title element (which is what we asked for in select="/books/book/title").
Because we’re already at a title element, we can obtain its value (the title of the book) by using this expression:
<xsl:value-of select="."/>
where the dot (".") is an XPath expression that selects the current node; xsl:value-of then outputs that node's text value.
To obtain the last name of the author, though, we have to navigate back up the tree to the book element and then down to author/lname. Just like in UNIX or DOS, where .. is used to navigate up a directory hierarchy, XPath uses .. to move up the XML node hierarchy. Thus, to display the last name from our title context and surround the value with parentheses, we use this:
(<xsl:value-of select="../author/lname"/>)
To extract the isbn attribute value, which is part of the book element, we again need to go back up the tree (..) from our title element. To get the value of an attribute that’s part of an element, we use @ followed by the attribute’s name. The final piece of our dynamic output is the following:
<xsl:value-of select="../@isbn"/>
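The navigation in this walk-through can be sketched outside of XSLT as well. The following Python snippet, using only the standard library, mimics the for-each loop and its three relative lookups. The sample XML is a hypothetical stand-in with the same shape as our data (the titles, names, and second ISBN are invented), and the exact layout of the list item is an assumption. ElementTree elements don't record their parent, so a lookup table plays the role of the .. step.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with the shape the walk-through assumes:
# books/book/title, books/book/author/lname, and an isbn attribute on book.
SAMPLE_XML = """<books>
  <book isbn="0-321-33019-6">
    <title>First Sample Title</title>
    <author><fname>Ann</fname><lname>Example</lname></author>
  </book>
  <book isbn="0-000-00000-0">
    <title>Second Sample Title</title>
    <author><fname>Bob</fname><lname>Placeholder</lname></author>
  </book>
</books>"""

root = ET.fromstring(SAMPLE_XML)  # root is the <books> element

# Map every element to its parent so we can emulate XPath's ".." step.
parent_of = {child: parent for parent in root.iter() for child in parent}

list_items = []
# Counterpart of <xsl:for-each select="/books/book/title">
for title in root.findall("./book/title"):
    book = parent_of[title]                    # ..
    last_name = book.findtext("author/lname")  # ../author/lname
    isbn = book.get("isbn")                    # ../@isbn
    # title.text corresponds to <xsl:value-of select="."/>
    list_items.append(f"<li>{title.text} ({last_name}) {isbn}</li>")

print("\n".join(list_items))
```

The point to notice is that every lookup inside the loop starts from the current title element, just as the relative XPath expressions do inside xsl:for-each.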
XPath Overview
XSLT makes use of XPath expressions to specify which parts of the input XML document to match against. In XSLT, whenever you see select="...", the phrase inside the quotation marks is XPath; the match attribute of xsl:template uses the same syntax. We’ve already seen match="/" (which matches against the root of a document). Additional slashes in XPath, much like the navigation through a file directory structure, move us down through child nodes in a tree.
Table 1 extends our knowledge of XPath by showing some matching expressions and how they’re commonly used in XSLT.
Table 1 Some XPath examples.
XPath Expression | What’s Matched | Comments
/ | Root. | Every XML document has a root that conceptually sits above the top-level element, so this match always works.
/books | The element books that is our top-level element. | Since there can only be one top-level element, only one element is returned.
//book | All the book elements that appear anywhere in the document. | In our example, book only appears as a child of books. If our document were more complex, with book elements appearing elsewhere, this XPath match would find them all.
//book/title | All the title elements that are children of book, where book can appear anywhere in the document. |
/*/*/title | All the title elements that are two levels down from the top-level element. | The asterisk or star (*) matches any element at that step of the path.
//@isbn | Matches the attribute isbn on any element in the document. | The @ symbol refers to attributes.
//book/@isbn | Matches the attribute isbn only when it appears in the book element, which can be anywhere in the document. |
//@* | Matches every attribute in the entire document. | * acts as a wildcard in XPath.
//book[@isbn] | Matches all the book elements that have an attribute named isbn. | Using brackets [..] acts as a qualifier. Here we’re using the presence of an attribute to control the selection of elements. With our sample data, all the book elements are returned in the match.
//book[@isbn="0-321-33019-6"] | Matches only the book elements that have this isbn value. | Here only one book element is returned.
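Several of the expressions in Table 1 can be tried out with Python's standard xml.etree.ElementTree module, as in the rough sketch below. Two caveats apply. First, the sample XML is a hypothetical stand-in for our data (only the ISBN 0-321-33019-6 comes from the table). Second, ElementTree supports just a subset of XPath: paths are written relative to the top-level element (hence the leading dot), and attributes can't be returned as matches, so the attribute rows are emulated by reading values from each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; titles, names, and the second ISBN are invented.
SAMPLE_XML = """<books>
  <book isbn="0-321-33019-6">
    <title>First Sample Title</title>
    <author><fname>Ann</fname><lname>Example</lname></author>
  </book>
  <book isbn="0-000-00000-0">
    <title>Second Sample Title</title>
    <author><fname>Bob</fname><lname>Placeholder</lname></author>
  </book>
</books>"""

root = ET.fromstring(SAMPLE_XML)  # the element that /books would select

# //book
books = root.findall(".//book")

# //book/title
titles = root.findall(".//book/title")

# /*/*/title: root already stands for the first "*", so one more "*" is needed.
wildcard_titles = root.findall("./*/title")

# //book[@isbn]
books_with_isbn = root.findall(".//book[@isbn]")

# //book[@isbn="0-321-33019-6"]
one_book = root.findall(".//book[@isbn='0-321-33019-6']")

# //@isbn, emulated: collect the attribute value from every element that has it.
isbns = [element.get("isbn") for element in root.iter() if "isbn" in element.attrib]

print(len(books), [t.text for t in titles], isbns)
```

A full XPath 1.0 engine, such as the one inside an XSLT processor, accepts the expressions exactly as written in the table; the adjustments here are only needed because of ElementTree's limited support.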