Sams Teach Yourself XML in 21 Days

Sep 16, 2005

Sams Teach Yourself XML in 21 Days, 3rd Edition

Learn More Buy

Working with the `select` Attribute and XPath

You can assign XPath expressions to the select attribute to indicate exactly which node or nodes you want to use in an XML document. XPath has been a W3C recommendation since November 16, 1999. You can find the recommendation for the current version, 1.0, at http://www.w3.org/TR/xpath. Version 2.0 of XPath is on the way and is currently in working draft form; see http://www.w3.org/TR/xpath20. (Very little software supports XPath 2.0 yet; the Saxon XSLT processor, at http://saxon.sourceforge.net, provides some support for it.)

XPath expressions are more powerful than the match expressions you've already seen; for one thing, they're not restricted to working with the current node or direct child nodes; you can use them to work with parent nodes, ancestor nodes, and more.

To specify a node or set of nodes in XPath, you use a location path. A location path consists of one or more location steps, separated by / (to refer to a child node) or // (to refer to any descendant node). If you start the location path with /, the location path is called an absolute location path because you're specifying the path from the root node; otherwise, the location path is relative. The node that an XPath expression is evaluated against is called the context node.

Location steps are made up of an axis, a node test, and zero or more predicates. For example, in the expression child::state[position() = 2] (which picks out the second <state> child of the context node), child is the name of the axis, state is the node test, and [position() = 2] is a predicate. You can create location paths with one or more location steps. For example, /descendant::state/child::name selects all the <name> elements that have a <state> parent. You'll get the details about what kind of axes, node tests, and predicates XPath supports in the following sections.
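
As a rough illustration of the parts of a location step, the following sketch writes one relative and one absolute location path in full, unabbreviated form. It assumes the sample document has a <states> document element that contains <state> elements, each with a <name> child; that structure is inferred from the listings in this section rather than shown here.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- The matched <states> element is the context node for the paths below. -->
    <xsl:template match="states">
        <HTML>
            <BODY>
                <!-- Relative path with two steps.
                     Step 1: axis child, node test state, predicate [position() = 2].
                     Step 2: axis child, node test name. -->
                <P><xsl:value-of select="child::state[position() = 2]/child::name"/></P>

                <!-- Absolute path: it starts at the root node, so it does not
                     depend on which node is the context node. -->
                <P><xsl:value-of select="count(/descendant::state/child::name)"/></P>
            </BODY>
        </HTML>
    </xsl:template>

</xsl:stylesheet>
```

Under those assumptions, the first paragraph should hold the name of the second state and the second paragraph the number of <name> elements that have a <state> parent.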

Using Axes

In the location step child::bird, which refers to a <bird> element that is a child of the context node, child is called the axis. XPath supports 13 axes, and it's important to know what they are. Here's the list:

ancestor — This axis contains the ancestors of the context node. An ancestor node is the parent of the context node, the parent of the parent, and so forth, back to (and including) the root node.
ancestor-or-self — This axis contains the context node and the ancestors of the context node.
attribute — This axis contains the attributes of the context node.
child — This axis contains the children of the context node.
descendant — This axis contains the descendants of the context node. A descendant is a child or a child of a child and so on.
descendant-or-self — This axis contains the context node and the descendants of the context node.
following — This axis contains all nodes that come after the context node in document order, excluding its descendants and any attribute or namespace nodes.
following-sibling — This axis contains all the following siblings of the context node.
namespace — This axis contains the namespace nodes of the context node.
parent — This axis contains the parent of the context node.
preceding — This axis contains all nodes that come before the context node in document order, excluding its ancestors and any attribute or namespace nodes.
preceding-sibling — This axis contains all the preceding siblings of the context node.
self — This axis contains the context node.

Note that although the match attribute can use only the child or attribute axes in location steps (that's the major restriction on the match attribute compared to the select attribute), the select attribute can use any of the 13 axes. (The term sibling refers to a node that has the same parent as the context node.)

For example, this template extracts the value of the <name> element by using the location path child::name:

<xsl:template match="state">
    <HTML>
        <BODY>
            <xsl:value-of select="child::name"/>
        </BODY>
    </HTML>
</xsl:template>

This is really the same as the version you've already been using because, as mentioned, you can abbreviate it by omitting the child:: part:

<xsl:template match="state">
    <HTML>
        <BODY>
            <xsl:value-of select="name"/>
        </BODY>
    </HTML>
</xsl:template>

In the location step child::name, child is the axis and name is the node test, which is described in the following section.
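
The other axes work the same way in a select attribute. Here is a minimal sketch that uses the descendant, parent, self, following-sibling, and ancestor axes. It assumes each <state> element in the sample document contains <name>, <bird>, and <flower> children, with <flower> coming after <bird>; that layout is an assumption based on the output shown elsewhere in this section.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <!-- descendant axis: every <bird> below <states>, at any depth -->
                <xsl:apply-templates select="descendant::bird"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="bird">
        <P>
            <!-- parent axis: the <state> element that contains this <bird> -->
            <xsl:value-of select="parent::state/child::name"/>:
            <!-- self axis: the <bird> element itself -->
            <xsl:value-of select="self::bird"/>,
            <!-- following-sibling axis: a <flower> that follows this <bird>
                 inside the same <state> -->
            <xsl:value-of select="following-sibling::flower"/>
            <!-- ancestor axis: counts the element ancestors of this <bird> -->
            (<xsl:value-of select="count(ancestor::*)"/> ancestor elements)
        </P>
    </xsl:template>

</xsl:stylesheet>
```

Note that the match attributes here still use only the child axis (in abbreviated form); the other axes appear only in select attributes.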

Using Node Tests

After you specify the axis you want to use in a location step, you specify the node test. A node test indicates what type of node you want to match. You can use names of nodes as node tests, or you can use the wildcard * to select element nodes. For example, the expression child::*/child::flower selects all <flower> elements that are grandchildren of the context node. Besides node names and the wildcard character, you can also use these node tests:

comment() — This node test selects comment nodes.
node() — This node test selects any type of node.
processing-instruction() — This node test selects a processing instruction node. You can specify, in the parentheses, the name of the processing instruction to select.
text() — This node test selects a text node.
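
To see how the node tests differ, the following sketch counts the children of each <state> element three ways. It assumes the same <states>/<state>/<name> structure as the other listings in this section. The count for node() can vary between processors, because some parsers drop whitespace-only text nodes before the style sheet sees them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <xsl:apply-templates select="child::state"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P>
            <!-- name is a node test by element name; text() then keeps
                 only the text-node children of that element -->
            <xsl:value-of select="child::name/child::text()"/>:
            <!-- * matches element children only -->
            <xsl:value-of select="count(child::*)"/> elements,
            <!-- node() matches children of any type, including text nodes
                 and comments -->
            <xsl:value-of select="count(child::node())"/> nodes,
            <!-- comment() matches comment children only -->
            <xsl:value-of select="count(child::comment())"/> comments
        </P>
    </xsl:template>

</xsl:stylesheet>
```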

Using Predicates

The last part of a location step is the predicate. In a location step, the (optional) predicate narrows the search down even more. For example, the location step child::state[position() = 1] uses the predicate [position() = 1] to select not just a child <state> element but the first <state> child element.

Predicates can get pretty involved because there are all kinds of XPath expressions that you can work with in predicates. And there are various types of legal XPath expressions; here are the possible types:

Booleans
Node sets
Numbers
Strings

The following sections look at how expressions help you in XSLT.

Boolean Expressions

XPath Boolean values are true/false values, and you can use the built-in XPath comparison operators to produce Boolean results. These are the comparison operators:

!= — This stands for "is not equal to."
< — This stands for "is less than." (You write this as &lt; in XML documents, because a literal < is not allowed in attribute values.)
<= — This stands for "is less than or equal to." (You write this as &lt;= in XML documents.)
= — This stands for "is equal to."
> — This stands for "is greater than."
>= — This stands for "is greater than or equal to."

For example, here's how to use a comparison operator to match all <state> elements after the first three, using the position() function (which you'll see in the next section):

<xsl:template match="state[position() > 3]">
    <xsl:value-of select="."/>
</xsl:template>

You can also use the keywords and and or to connect Boolean expressions. The following example selects all <state> elements after the first three and before the tenth one:

<xsl:template match="state[position() > 3 and position() &lt; 10]">
    <xsl:value-of select="."/>
</xsl:template>

In addition, you can use the not() function to reverse the logical sense of an expression. The following example selects all <state> elements except the last one, using the last() function (which you'll see in the next section):

<xsl:template match="state[not(position() = last())]">
    <xsl:value-of select="."/>
</xsl:template>
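
Boolean expressions work the same way in select attributes. The sketch below combines comparisons with and, or, and not(). It assumes that each <state> element has numeric <population> and <area> children, as the later examples in this section suggest; the thresholds are arbitrary values chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <H1>Small states</H1>
                <!-- "less than" has to be written as &lt; inside an attribute -->
                <xsl:apply-templates
                    select="state[population &lt; 10000000 and not(area >= 10000)]"/>

                <H1>First and last states</H1>
                <!-- or is true when either comparison is true -->
                <xsl:apply-templates
                    select="state[position() = 1 or position() = last()]"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P><xsl:value-of select="name"/></P>
    </xsl:template>

</xsl:stylesheet>
```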

Node Sets

Besides Boolean values, XPath can also work with node sets. A node set is just a set of nodes. By collecting nodes into a set, XPath lets you work with multiple nodes at once. For example, the location path child::state/child::bird returns a node set of all <bird> elements that are children of the <state> children of the context node.

You can use various XPath functions to work with node sets. For example, the last() function returns the number of nodes in the node set being processed, which lets you pick out the last node. The following are the node set functions:

last() — Returns the number of nodes in the node set.
position() — Returns the position of the context node in the node set. (The first node is at position 1.)
count( node-set ) — Returns the number of nodes in node-set.
id( ID ) — Returns a node set that contains the element whose ID value matches ID.
local-name( node-set ) — Returns the local name (the name without any namespace prefix) of the first node in node-set.
namespace-uri( node-set ) — Returns the URI of the namespace of the first node in node-set.
name( node-set ) — Returns the qualified name of the first node in node-set.

Some of these functions can be very useful. For example, you can number the states in the XML sample from earlier today by using the position() function, as shown in Listing 9.12.

Example 9.12. An XSL Style Sheet That Uses `position()` (`ch09_12.xsl`)

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <HEAD>
                <TITLE>
                    The States
                </TITLE>
            </HEAD>
            <BODY>
                <H1>
                    The States
                </H1>
                <xsl:apply-templates select="state"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P>
            <xsl:value-of select="position()"/>.
            <xsl:value-of select="name"/>
        </P>
    </xsl:template>

</xsl:stylesheet>

Here's what an XSLT processor produces when you use this style sheet on the sample XML document:

<HTML>
    <HEAD>
        <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <TITLE>
            The States
        </TITLE>
    </HEAD>

    <BODY>
        <H1>
            The States
        </H1>
        <P>1. California</P>
        <P>2. Massachusetts</P>
        <P>3. New York</P>
    </BODY>
</HTML>

Note that the states are indeed numbered. Also, as with today's other examples, the whitespace and indenting here have been cleaned up. Figure 9.5 shows the result of this transformation.

Figure 9.5 Numbering items by using XSLT.

When you're working on the nodes in a node set, you can use functions such as position() to target specific nodes. For example, child::state[position() = 1] selects the first <state> child of the context node (the node where you apply this location step), and child::state[position() = last()] selects the last.
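
The other node set functions can be combined in the same way. This sketch, which again assumes the <states>/<state>/<name> structure, prints a running "n of total" label by using position() and last(), reports the overall count with count(), and uses name() to show the name of each state's last child element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <!-- count() takes a node set and returns its size -->
                <P>States listed: <xsl:value-of select="count(state)"/></P>
                <xsl:apply-templates select="state"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P>
            <!-- position() and last() refer to the node set selected by
                 the apply-templates element above -->
            <xsl:value-of select="position()"/> of <xsl:value-of select="last()"/>:
            <xsl:value-of select="name"/>
            <!-- name() returns the name of the first node in its argument;
                 *[last()] is the last child element of this state -->
            (last child element: <xsl:value-of select="name(*[last()])"/>)
        </P>
    </xsl:template>

</xsl:stylesheet>
```

Because the first template selects only <state> elements, position() counts states and is not thrown off by any whitespace text nodes between them.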

Numbers

XPath can use numbers in expressions (for example, the 1 in the expression child::state[position() = 1]). There are also some operators that you can use to work with numbers:

+ — Addition.
- — Subtraction.
* — Multiplication.
div — Division. Note that the / character that stands for division in other languages is used for other purposes in XML and XPath.
mod — Modulus. This operation returns the remainder after one number is divided by another.

For example, if you use <xsl:value-of select="2 + 2"/>, you get the string "4" in the output document. The following example selects all states that have more than 200 people per square mile:

<xsl:template match="states">
    <HTML>
        <BODY>
            <P>
               <xsl:apply-templates select="state[population div area > 200]"/>
            </P>
        </BODY>
    </HTML>
</xsl:template>

Besides the numeric operators, XPath also has these functions that work with numbers:

ceiling() — Returns the smallest integer that is not less than the number you pass in the parentheses. For example, ceiling(4.6) returns 5.
floor() — Returns the largest integer that is not greater than the number you pass it. For example, floor(4.6) returns 4.
round() — Rounds the number you pass it to the nearest integer. For example, round(4.6) returns 5.
sum() — Returns the sum of the numeric values of the nodes in the node set you pass it.

For example, here's how to find the total population of the states in ch09_01.xml by using sum(). Because each <population> element is a child of a <state> element, the location path goes through child::state:

<xsl:template match="states">
    <HTML>
        <BODY>
            <P>
                The total population is:
                <xsl:value-of select="sum(child::state/child::population)"/>
            </P>
        </BODY>
    </HTML>
</xsl:template>
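
The numeric operators and functions can also be mixed in a single expression. The following sketch assumes, as the examples above do, that each <state> element has numeric <population> and <area> children. It rounds each state's population density, flags even-numbered entries by using mod, and reports the average population.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <xsl:apply-templates select="state"/>
                <P>
                    Average population:
                    <!-- sum() and count() both take node sets; floor() drops
                         any fractional part of the quotient -->
                    <xsl:value-of select="floor(sum(state/population) div count(state))"/>
                </P>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P>
            <xsl:value-of select="name"/>:
            <!-- div is division; round() gives the nearest integer -->
            <xsl:value-of select="round(population div area)"/> people per square mile
            <!-- mod gives the remainder, so this is true for entries 2, 4, ... -->
            <xsl:if test="position() mod 2 = 0"> (even-numbered entry)</xsl:if>
        </P>
    </xsl:template>

</xsl:stylesheet>
```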

Strings

Strings in XPath are sequences of Unicode characters. A number of XPath functions are specially designed to work on strings. Here they are:

concat( string1, string2, ... ) — Returns the strings joined together.
contains( string1, string2 ) — Returns true if the first string contains the second one.
format-number( number1, string2, string3 ) — Returns a string that holds the formatted version of number1, using string2 as a formatting string and string3 as the optional name of a decimal format declared with <xsl:decimal-format>. (You create formatting strings as you would for Java's java.text.DecimalFormat class. Strictly speaking, format-number() is an XSLT function rather than part of XPath itself.)
normalize-space( string1 ) — Returns string1 after stripping leading and trailing whitespace and replacing each run of consecutive whitespace characters with a single space.
starts-with( string1, string2 ) — Returns true if the first string starts with the second string.
string-length( string1 ) — Returns the number of characters in string1.
substring( string1, offset, length ) — Returns length characters from the string, starting at offset. (The first character is at offset 1; if you omit length, you get the rest of the string.)
substring-after( string1, string2 ) — Returns the part of string1 after the first occurrence of string2.
substring-before( string1, string2 ) — Returns the part of string1 before the first occurrence of string2.
translate( string1, string2, string3 ) — Returns string1 with all occurrences of the characters in string2 replaced with the matching characters in string3.
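
Here is a short sketch that puts several of the string functions to work on the state names. It assumes the same <states>/<state> structure as the other listings, with <name> and <population> children in each <state>; the '#,###' formatting string is just one illustrative pattern.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <xsl:apply-templates select="state"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <!-- concat() joins its arguments; the number returned by
             string-length() is converted to a string automatically -->
        <P><xsl:value-of select="concat(normalize-space(name), ' has ',
            string-length(normalize-space(name)), ' characters')"/></P>

        <!-- translate() maps each lowercase letter to its uppercase partner -->
        <P><xsl:value-of select="translate(name,
            'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/></P>

        <!-- substring() offsets start at 1, so this is the first three characters -->
        <P><xsl:value-of select="substring(name, 1, 3)"/></P>

        <!-- contains() guards substring-before(), which would otherwise
             return an empty string for one-word names -->
        <xsl:if test="contains(name, ' ')">
            <P>First word: <xsl:value-of select="substring-before(name, ' ')"/></P>
        </xsl:if>

        <!-- format-number() with a grouping separator every three digits -->
        <P><xsl:value-of select="format-number(population, '#,###')"/></P>
    </xsl:template>

</xsl:stylesheet>
```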

Now you know what items can go into location steps—axes, node tests, and predicates. XPath syntax is far from intuitive, so let's see some more examples as you take a look at XPath abbreviations and default rules.

XPath Abbreviations and Default Rules

So far you have specifically indicated what axis you wanted to use when writing location steps, but there are ways to abbreviate location steps to make things easier. For example, as mentioned earlier, the location step child::state points to a <state> element that is a child element of the context node, but you can abbreviate that location step simply as state. These are the legal abbreviations:

Location Step	Abbreviation
`self::node()`	`.`
`parent::node()`	`..`
`child::childname`	`childname`
`attribute::childname`	`@childname`
`/descendant-or-self::node()/`	`//`

You can also abbreviate predicate expressions. For example, you can abbreviate [position() = 8] as [8].

Here are some examples of location paths using abbreviated syntax:

* — Matches all element children of the context node.
*/*/state — Matches all <state> great-grandchildren of the context node.
. — Matches the context node.
.. — Matches the parent of the context node.
../@units — Matches the units attribute of the parent of the context node.
.//state — Matches all <state> element descendants of the context node.
//state — Matches all <state> descendants of the root node.
//state/name — Matches all <name> elements that have a <state> parent.
/states/state[4]/name[3] — Matches the third <name> element of the fourth <state> element of the <states> element.
@* — Matches all the attributes of the context node.
@units — Matches the units attribute of the context node.
state — Matches the <state> element children of the context node.
state[@nickname and @units] — Matches all the <state> children of the context node that have both a nickname attribute and a units attribute.
state[@units = "people"] — Matches all <state> children of the context node that have a units attribute that has the value "people".
state[7] — Matches the seventh <state> child of the context node.
state[7][@units = "people"] — Matches the seventh <state> child of the context node if that child has a units attribute with the value "people".
state[last()] — Matches the last <state> child of the context node.
state[name] — Matches the <state> children of the context node that themselves have <name> children.
state[name="Massachusetts"] — Matches the <state> child nodes of the context node that have <name> children whose text value is "Massachusetts".
states//state — Matches all <state> element descendants of the <states> element children of the context node.
text() — Matches all child text nodes of the context node.
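
To make the correspondence concrete, the following sketch writes the same three selections twice, once abbreviated and once in full. It assumes that each <population> element sits inside a <state> element next to a <name> element and carries a units attribute; the attribute is an assumption based on the @units examples in the list above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <xsl:apply-templates select="state/population"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="population">
        <!-- Abbreviated syntax -->
        <P>
            <xsl:value-of select="../name"/>:
            <xsl:value-of select="."/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="@units"/>
        </P>
        <!-- The same three location paths written out in full -->
        <P>
            <xsl:value-of select="parent::node()/child::name"/>:
            <xsl:value-of select="self::node()"/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="attribute::units"/>
        </P>
    </xsl:template>

</xsl:stylesheet>
```

Each <population> element should produce two identical paragraphs, which is a quick way to confirm that the two forms are equivalent.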

Listing 9.13 shows an example that uses abbreviated syntax. This example picks out the state bird for each state and lists it by using text such as "The Quail is the California state bird." Inside the template that matches a <bird> element, the context node is that <bird> element, so you can reach the <name> element of the enclosing state by using ../name, as shown in this example.

Example 9.13. An XSL Style Sheet That Uses Abbreviated Syntax (`ch09_13.xsl`)

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="states">
        <HTML>
            <BODY>
                <xsl:apply-templates select="state"/>
            </BODY>
        </HTML>
    </xsl:template>

    <xsl:template match="state">
        <P>
            <xsl:apply-templates select="bird"/>
        </P>
    </xsl:template>

    <xsl:template match="bird">
        The <xsl:value-of select="."/>
        is the <xsl:value-of select="../name"/>
        state bird.
    </xsl:template>

</xsl:stylesheet>

Here are the results of applying this style sheet to the sample XML document:

<HTML>
    <BODY>
        <P>
            The Quail
            is the California
            state bird.
        </P>
        <P>
            The Chickadee
            is the Massachusetts
            state bird.
        </P>
        <P>
            The Bluebird
            is the New York
            state bird.
        </P>
    </BODY>
</HTML>

Figure 9.6 shows these results. This is a good example of how to extract and work with data from XML documents by using XSLT.

Figure 9.6 Using abbreviated syntax.

While we're discussing built-in abbreviated syntax, it's worth noting that XSLT also has some built-in default rules, some of which you've already seen in action.

The most important default rule applies to elements, and here's how you might put it in XSLT syntax:

<xsl:template match="/ | *">
    <xsl:apply-templates/>
</xsl:template>

What this means is that if you don't supply a template for an element, that element is still processed with <xsl:apply-templates/> to handle the element's child nodes.

Similarly, the default rule for attributes is to place the value of the attribute in the output document, as in this example:

<xsl:template match="@*">
    <xsl:value-of select="."/>
</xsl:template>

The default rule for text is to just insert the text into the output document. That rule can be expressed like this, where the XPath text() node test matches text nodes:

<xsl:template match="text()">
    <xsl:value-of select="."/>
</xsl:template>

However, the content of processing instructions (which may be matched by using the XPath processing-instruction() node test) and comments (which may be matched by using the XPath comment() node test) is not inserted into the output document by default. You might express their default rules like this:

<xsl:template match="processing-instruction()"/>
<xsl:template match="comment()"/>

In fact, you can create whole style sheets that rely entirely on default rules. Here's what that might look like:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>

Here's what you get when you apply this default-rules-only style sheet to ch09_01.xml:

<?xml version="1.0" encoding="UTF-8"?>

    California
    33871648
    Sacramento
    Quail
    Golden Poppy
    155959

    Massachusetts
    6349097
    Boston
    Chickadee
    Mayflower
    7840

    New York
    18976457
    Albany
    Bluebird
    Rose
    47214

Note that just the raw data in the document is transferred to the output document, which is the way things work by default in XSLT.
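
Because the default rules are ordinary template rules, you can override them. One common approach, sketched here under the assumption that the sample document contains <name> elements as in the earlier listings, is to replace the default text rule with an empty template so that only the elements you handle explicitly produce output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Overrides the default text rule: text nodes that are reached
         through apply-templates now produce no output. -->
    <xsl:template match="text()"/>

    <!-- Only <name> elements are handled explicitly. The default element
         rule still walks down from the root node to reach them. -->
    <xsl:template match="name">
        <P><xsl:value-of select="."/></P>
    </xsl:template>

</xsl:stylesheet>
```

With this style sheet, the output should contain one <P> element per <name> element and none of the other raw data, because <xsl:value-of select="."/> reads the string value of the element directly and does not go through the text rule.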

XPath Tools

There's no question that it can take some time to get used to XPath syntax. Fortunately, there are some good tools out there to help, such as the XPath Visualiser by Dimitre Novatchev, which you can get for free at http://www.vbxml.com/downloads/default.asp?id=visualiser. To use this tool, you just browse to the XML document you want to work with and enter the XPath expression you want to check. The XPath Visualiser then highlights the nodes that match your expression in yellow. For example, Figure 9.7 shows this tool working on the sample XML document with the XPath expression //*[@units]. This is a great way to test your XPath expressions until you get them to do what you want; all you need in order to use this tool is a browser.

Figure 9.7 Using the XPath Visualiser.
