Using XML Namespaces
There's a lot of freedom in XML, because you get to create your own markup. As time went on, however, XML authors started noticing a problem that the original creators of XML hadn't really anticipated: conflicting tag names.
For example, you've already seen that two popular XML applications are XHTML, which is the derivation of HTML in XML, and MathML, which lets you format and display math equations. Suppose that you want to display an equation in an XHTML Web page. That could be a problem, because the tag sets in XHTML and MathML overlap; in particular, each XML application defines a <var> and a <select> element.
The way to solve this problem is to use namespaces. Namespaces give you a way to make sure that one set of tags will not conflict with another. You prefix a name to tag and attribute names, changing the resulting names so that they won't conflict with others that have a different prefix.
XML namespaces are one of those XML companion recommendations that keep being added to the XML specification. You can find the specification for namespaces at http://www.w3.org/TR/REC-xml-names/. There's still a lot of debate about this one (mostly because namespaces can make writing DTDs difficult), but it's an official W3C recommendation now.
Creating Namespaces
An example will make it clearer what namespaces are and why they're important. Suppose you're the boss of one of the employees in our sample document, ch03_01.xml:
<employee>
  <name>
    <lastname>Kelly</lastname>
    <firstname>Grace</firstname>
  </name>
  <hiredate>October 15, 2005</hiredate>
  <projects>
    <project>
      <product>Printer</product>
      <id>111</id>
      <price>$111.00</price>
    </project>
    <project>
      <product>Laptop</product>
      <id>222</id>
      <price>$989.00</price>
    </project>
  </projects>
</employee>
Now suppose that you want to add your own comments to this employee's data in a <comment> element. The problem with that is that the XML data on this employee comes from the Human Resources department, and they haven't created an element named <comment>. You can indeed create your own <comment> element, but first you should confine the Human Resources department's XML data to its own namespace to indicate that your comments are not part of the Human Resources department's set of XML tags.
To define a new namespace, use the xmlns:prefix attribute, where prefix is the prefix you want to use for the namespace. In this case, you'll define a new namespace called hr for the Human Resources department:
<employee
    xmlns:hr="http://www.superduperbigco.com/human_resources">
  <name>
    <lastname>Kelly</lastname>
    <firstname>Grace</firstname>
  </name>
  <hiredate>October 15, 2005</hiredate>
  <projects>
    <project>
      <product>Printer</product>
      <id>111</id>
      <price>$111.00</price>
    </project>
    <project>
      <product>Laptop</product>
      <id>222</id>
      <price>$989.00</price>
    </project>
  </projects>
</employee>
To define a namespace, you assign a unique identifier to the xmlns:prefix attribute. In XML, this identifier is usually a URI that might direct the XML processor to a DTD for the namespace (but doesn't have to). So what's a URI?
Defining Namespaces with URIs
The XML specification expands the idea of standard URLs (Uniform Resource Locators) into URIs (Uniform Resource Identifiers). In HTML and on the Web, you use URLs; in XML, you use URIs. URIs are supposed to be more general than URLs, as we'll see when we discuss XLinks and XPointers in Day 14, "Handling XLinks, XPointers, and XForms."
For example, in theory, a URI can point not just to a single resource, but to a cluster of resources, or to arcs of resources along a path. The truth is that the whole idea of URIs as the next step after URLs is still being developed, and in practice, URLs are almost invariably used in XML, but you still call them URIs. Some software accepts more general forms of URIs, letting you, for example, access only a specific section of an XML document, but such usage and the associated syntax are far from standardized yet.
TIP
You might want to look up the current formal definition of URIs, which you can find in its entirety at http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt.
When you define a namespace with the xmlns:prefix attribute, you usually assign a URI to that attribute (in practice, this URI is always a URL today). The document that URI points to can describe more about the namespace you're creating; an example of this is the XHTML namespace, which uses the URI http://www.w3.org/1999/xhtml:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns:xhtml="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
. . .
A namespace's URI can also hold a DTD or XML schema that defines the syntax for the XML elements you can use in that namespace. It's then up to the XML processor to use that DTD or XML schema, if it's been written to be smart enough to interpret namespaces in this way (most aren't). All that's really necessary, however, is that you assign a unique identifier, which can be any text, to the xmlns:prefix attribute.
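To illustrate that the identifier, not the prefix, is what names a namespace, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The urn:example:notes identifier and the <note> element are made up for this illustration; they are not part of the chapter's sample files.

```python
import xml.etree.ElementTree as ET

# Two tiny documents that bind DIFFERENT prefixes to the SAME identifier.
# "urn:example:notes" is a hypothetical identifier used only for this sketch.
doc_a = '<a:note xmlns:a="urn:example:notes">hi</a:note>'
doc_b = '<b:note xmlns:b="urn:example:notes">hi</b:note>'

# ElementTree reports each tag as "{identifier}localname".
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

print(tag_a)
print(tag_a == tag_b)
```

A namespace-aware parser treats both elements as the same <note>, because the prefixes a: and b: are only local shorthand for the identifier.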
After defining the hr namespace in our example, you can preface every tag and attribute name in this namespace with hr: like this:
<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <hr:projects>
    <hr:project>
      <hr:product>Printer</hr:product>
      <hr:id>111</hr:id>
      <hr:price>$111.00</hr:price>
    </hr:project>
    <hr:project>
      <hr:product>Laptop</hr:product>
      <hr:id>222</hr:id>
      <hr:price>$989.00</hr:price>
    </hr:project>
  </hr:projects>
</hr:employee>
Now you've made it clear that all these tags come from the Human Resources department. Note how this works: the actual tag names themselves have been changed, because a colon is a legal character to use in tag names. (Now you know why you shouldn't use colons in tag names, although they're legal: they can make it look like you're using namespaces when you're not.) For example, the <product> tag has now become the <hr:product> tag. In other words, using namespaces keeps elements separate by actually changing tag and attribute names. This was a clever solution to the problem of tag and attribute name conflicts, because this way, even XML processors that have never heard of namespaces can still "support" them.
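A processor that does understand namespaces goes one step further and resolves each prefix to its URI. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a shortened copy of the employee document; the variable names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

HR = "http://www.superduperbigco.com/human_resources"

# A shortened version of the hr-prefixed employee document.
doc = f"""<hr:employee xmlns:hr="{HR}">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
</hr:employee>"""

root = ET.fromstring(doc)

# The parser replaces the prefix with the namespace URI in braces.
print(root.tag)

# When searching, a prefix-to-URI mapping lets you keep using short prefixes.
lastname = root.find("hr:name/hr:lastname", {"hr": HR})
print(lastname.text)
```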
At this point, all tag and attribute names from the hr namespace are in their own namespace, so you can add your own namespace to the document, allowing you to use your own elements without fear of conflict. Since you're the boss, you might start by defining a new namespace named boss:
<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources"
    xmlns:boss="http://www.superduperbigco.com/big_boss">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <hr:projects>
    <hr:project>
      <hr:product>Printer</hr:product>
      <hr:id>111</hr:id>
      <hr:price>$111.00</hr:price>
    </hr:project>
    <hr:project>
      <hr:product>Laptop</hr:product>
      <hr:id>222</hr:id>
      <hr:price>$989.00</hr:price>
    </hr:project>
  </hr:projects>
</hr:employee>
Now you can use the new boss namespace to add your own markup to the document, as you see in Listing 3.2.
Listing 3.2 XML Document with Namespaces (ch03_02.xml)
<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources"
    xmlns:boss="http://www.superduperbigco.com/big_boss">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <boss:comment>Needs much supervision.</boss:comment>
  <hr:projects>
    <hr:project>
      <hr:product>Printer</hr:product>
      <hr:id>111</hr:id>
      <hr:price>$111.00</hr:price>
    </hr:project>
    <hr:project>
      <hr:product>Laptop</hr:product>
      <hr:id>222</hr:id>
      <hr:price>$989.00</hr:price>
    </hr:project>
  </hr:projects>
</hr:employee>
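One way to see the separation in Listing 3.2 is to ask a namespace-aware parser which elements belong to which namespace. This is a sketch only, assuming Python's standard xml.etree.ElementTree module and a shortened copy of the listing; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

HR = "http://www.superduperbigco.com/human_resources"
BOSS = "http://www.superduperbigco.com/big_boss"

# A shortened copy of Listing 3.2.
doc = f"""<hr:employee xmlns:hr="{HR}" xmlns:boss="{BOSS}">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <boss:comment>Needs much supervision.</boss:comment>
</hr:employee>"""

root = ET.fromstring(doc)

# Collect the elements whose resolved name lies in the boss namespace.
boss_elements = [el for el in root.iter() if el.tag.startswith("{" + BOSS + "}")]
hr_elements = [el for el in root.iter() if el.tag.startswith("{" + HR + "}")]

print(len(hr_elements), "HR elements;", len(boss_elements), "boss element(s)")
print(boss_elements[0].text)
```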
You can also add your own attributes in the boss namespace as long as you prefix them with boss: this way:
<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources"
    xmlns:boss="http://www.superduperbigco.com/big_boss">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <boss:comment boss:date="10/15/2006">
    Needs much supervision.
  </boss:comment>
  <hr:projects>
    <hr:project>
      <hr:product>Printer</hr:product>
      <hr:id>111</hr:id>
      <hr:price>$111.00</hr:price>
    </hr:project>
    <hr:project>
      <hr:product>Laptop</hr:product>
      <hr:id>222</hr:id>
      <hr:price>$989.00</hr:price>
    </hr:project>
  </hr:projects>
</hr:employee>
And that's how namespaces work: you can use them to separate tags, even tags with the same name, so there's no conflict. As you can see, using multiple namespaces in the same document is no problem at all; just use the xmlns:prefix attribute in the enclosing element to define the appropriate namespace. In fact, you can use this attribute in child elements to redefine an enclosing namespace, if you want to.
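Prefixed attributes are resolved the same way as prefixed elements. The following sketch, again assuming Python's standard xml.etree.ElementTree module and a cut-down version of the document, reads the boss:date attribute by its namespace-qualified name.

```python
import xml.etree.ElementTree as ET

HR = "http://www.superduperbigco.com/human_resources"
BOSS = "http://www.superduperbigco.com/big_boss"

# A cut-down version of the document with a namespaced attribute.
doc = f"""<hr:employee xmlns:hr="{HR}" xmlns:boss="{BOSS}">
  <boss:comment boss:date="10/15/2006">Needs much supervision.</boss:comment>
</hr:employee>"""

root = ET.fromstring(doc)
comment = root.find("boss:comment", {"boss": BOSS})

# The attribute is stored under "{namespace URI}date", not under plain "date".
qualified_date = comment.get("{" + BOSS + "}date")
plain_date = comment.get("date")

print(qualified_date)
print(plain_date)
```

The plain lookup returns nothing, which shows that boss:date and an unprefixed date attribute would be two different attributes.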
Namespace prefixes are really just text prefixed to (prepended is the official term) tag and attribute names. They follow the same rules for naming tags and attributes. For example, in XML 1.0, a namespace prefix can start with a letter or an underscore. The following characters can include underscores, letters, digits, hyphens, and periods. Note also that although colons are legal in tag names, you can't use a colon in a namespace prefix, for obvious reasons. Also, there are two prefixes that are reserved: xml and xmlns.
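The naming rules above can be sketched as a small check. This is a simplification, not the full XML grammar: it only considers ASCII letters (XML also allows many non-ASCII letters), and the function name is_usable_prefix is hypothetical.

```python
import re

# ASCII-only approximation of the rules described above: a letter or
# underscore first, then letters, digits, underscores, hyphens, or periods.
# No colon is allowed anywhere in a prefix.
PREFIX_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

# Prefixes you cannot claim for your own namespaces.
RESERVED = {"xml", "xmlns"}

def is_usable_prefix(prefix: str) -> bool:
    """Return True if prefix passes the simplified rules and is not reserved."""
    if prefix in RESERVED:
        return False
    return PREFIX_PATTERN.fullmatch(prefix) is not None

for candidate in ["hr", "boss", "_x-1.2", "1hr", "hr:x", "xmlns"]:
    print(candidate, is_usable_prefix(candidate))
```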
Creating Local Namespaces
The xmlns:prefix attribute can be used in any element, not just the document element. Just bear in mind that this attribute defines a namespace for the current element and any enclosed element, which means you shouldn't use the namespace prefix until you've defined the namespace with an attribute like xmlns:prefix.
For example, you can create the boss: namespace prefix and use it in the same element, as you see in Listing 3.3.
Listing 3.3 XML Document with a Local Namespace (ch03_03.xml)
<hr:employee
    xmlns:hr="http://www.superduperbigco.com/human_resources">
  <hr:name>
    <hr:lastname>Kelly</hr:lastname>
    <hr:firstname>Grace</hr:firstname>
  </hr:name>
  <hr:hiredate>October 15, 2005</hr:hiredate>
  <boss:comment
      xmlns:boss="http://www.superduperbigco.com/big_boss"
      boss:date="10/15/2006">
    Needs much supervision.
  </boss:comment>
  <hr:projects>
    <hr:project>
      <hr:product>Printer</hr:product>
      <hr:id>111</hr:id>
      <hr:price>$111.00</hr:price>
    </hr:project>
    <hr:project>
      <hr:product>Laptop</hr:product>
      <hr:id>222</hr:id>
      <hr:price>$989.00</hr:price>
    </hr:project>
  </hr:projects>
</hr:employee>
You can see ch03_03.xml in Internet Explorer, complete with namespaces, in Figure 3.1.
Figure 3.1 Viewing an XML document with local namespaces.
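The scoping rule for local namespaces can be checked with a namespace-aware parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the <boss:note> element and the parses helper are hypothetical and exist only to show a prefix being used outside the element that declared it.

```python
import xml.etree.ElementTree as ET

BOSS = "http://www.superduperbigco.com/big_boss"

# The prefix is declared and used on the same element: this is in scope.
in_scope = f"""<employee>
  <boss:comment xmlns:boss="{BOSS}" boss:date="10/15/2006">ok</boss:comment>
</employee>"""

# The sibling <boss:note> lies outside the element that declared the prefix.
out_of_scope = f"""<employee>
  <boss:comment xmlns:boss="{BOSS}">ok</boss:comment>
  <boss:note>late</boss:note>
</employee>"""

def parses(text: str) -> bool:
    """Return True if a namespace-aware parse of text succeeds."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(parses(in_scope))
print(parses(out_of_scope))
```

The second document is rejected because the boss prefix is not bound where <boss:note> appears.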
Creating Default Namespaces
You can use the xmlns:prefix attribute to define a namespace, or you can use the xmlns attribute by itself to define a default namespace. When you define a default namespace, elements without a namespace prefix are in that default namespace. (Unprefixed attributes are the exception: under the namespaces recommendation, a default namespace does not apply to attribute names.)
To see how this works, we'll come full circle and put to work the example that introduced our discussion of namespaces in the first place: mixing XHTML with MathML. We'll start with some XHTML (all the details on XHTML are coming up in Day 11, "Extending HTML with XHTML," and Day 12, "Putting XHTML to Work"), like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>
      Using XHTML and MathML Together
    </title>
  </head>
  <body>
    <center>
      <h1>
        Using XHTML and MathML Together
      </h1>
    </center>
    <br/>
    Consider the equation
    . . .
  </body>
</html>
You'll see what you need to create XHTML documents like this, such as the <!DOCTYPE> declaration, in Day 11. Note in particular here that in the <html> element, the xmlns attribute defines a default namespace for the <html> element and all enclosed elements. (This namespace is the XHTML namespace, which W3C defines as "http://www.w3.org/1999/xhtml".) When you use the xmlns attribute alone this way, without specifying any prefix, you are defining a default namespace. The current element and all unprefixed child elements are assumed to belong to that namespace. Making use of a default namespace in this way, you can use the standard XHTML tag names without any prefix, as you see here.
However, we also want to use MathML markup in this document, and to do that, we add a new namespace with the prefix m to this document, using the namespace W3C has specified for MathML, "http://www.w3.org/1998/Math/MathML":
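To see what a default namespace does to names, here is a minimal sketch using Python's standard xml.etree.ElementTree module on a stripped-down XHTML document (the DOCTYPE is omitted and the title text is shortened for this illustration). It also shows a detail from the namespaces recommendation: a default namespace applies to unprefixed elements but not to unprefixed attributes.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # what the xml: prefix is bound to

# A stripped-down XHTML document that declares a default namespace.
doc = f"""<html xmlns="{XHTML}" xml:lang="en" lang="en">
  <head><title>Example</title></head>
  <body/>
</html>"""

root = ET.fromstring(doc)

# Unprefixed elements pick up the default namespace.
print(root.tag)
qualified_head = root.find("{" + XHTML + "}head")
unqualified_head = root.find("head")  # no element has the bare name "head"

# Unprefixed attributes do NOT pick up the default namespace.
attribute_names = set(root.attrib)
print(attribute_names)
```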
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"
    xmlns:m="http://www.w3.org/1998/Math/MathML">
  <head>
    <title>
      Using XHTML and MathML Together
    </title>
  </head>
  <body>
    <center>
      <h1>
        Using XHTML and MathML Together
      </h1>
    </center>
    <br/>
    Consider the equation
    . . .
  </body>
</html>
Now you can use MathML as you like, as long as you prefix it with the m namespace. You can see this at work in ch03_04.html (XHTML documents use the extension .html), shown in Listing 3.4, where we're using the MathML we developed in Day 1 to display an equation.
Listing 3.4 An XML Document Combining XHTML and MathML (ch03_04.html)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"
    xmlns:m="http://www.w3.org/1998/Math/MathML">
  <head>
    <title>
      Using XHTML and MathML Together
    </title>
  </head>
  <body>
    <center>
      <h1>
        Using XHTML and MathML Together
      </h1>
    </center>
    <br/>
    Consider the equation
<m:math>
<m:mrow>
<m:mrow>
<m:mn>4</m:mn>
<m:mo>⁢</m:mo>
<m:msup>
<m:mi>x</m:mi>
<m:mn>2</m:mn>
</m:msup>
<m:mo>-</m:mo>
<m:mrow>
<m:mn>5</m:mn>
<m:mo>⁢</m:mo>
<m:mi>x</m:mi>
</m:mrow>
<m:mo>+</m:mo>
<m:mn>6</m:mn>
</m:mrow>
<m:mo>=</m:mo>
<m:mn>0.</m:mn>
</m:mrow>
</m:math>
    <br/>
    What, you may ask, are this equation's roots?
  </body>
</html>
Thanks to namespaces, this XHTML/MathML document works just as it should, as you can see in the W3C Amaya browser in Figure 3.2.
Figure 3.2 Viewing an XML document with local namespaces.
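A namespace-aware program can tell the XHTML and MathML vocabularies apart in the same way a browser must. The following sketch, assuming Python's standard xml.etree.ElementTree module and a much smaller equation than the one in Listing 3.4, counts the elements in each namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# A much-reduced stand-in for Listing 3.4: XHTML by default, MathML via m:.
doc = f"""<html xmlns="{XHTML}" xmlns:m="{MATHML}">
  <body>
    Consider the equation
    <m:math>
      <m:mrow>
        <m:mi>x</m:mi>
        <m:mo>=</m:mo>
        <m:mn>0</m:mn>
      </m:mrow>
    </m:math>
  </body>
</html>"""

root = ET.fromstring(doc)

# Every tag here has the form "{namespace URI}localname"; tally by URI.
counts = Counter(el.tag[1:].split("}")[0] for el in root.iter())

print(counts[XHTML], "XHTML elements")
print(counts[MATHML], "MathML elements")
```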
You'll be seeing XML namespaces throughout this book, especially when we use the popular XML applications available, such as XHTML.
That finishes the main topics for today's discussion: well-formed documents and namespaces. Before getting into validation in tomorrow's work, however, we'll round off our discussion of XML documents by taking a look at XML infosets and canonical XML. These two topics are worth discussing before we start talking about validation, because they're terms you'll run across as you work with XML, but we're going to consider them optional topics. If you want to skip them and get directly to DTDs, just turn to Day 4.