XML and Namespaces
- Why Namespaces Are Needed: Resolving Name Conflicts
- Qualified Names, Prefixes, Local Names, and Other Terminology
- Declaring Namespaces in XML Documents
- Default Namespace
- Handling Namespaces in a DTD or XML Schema
- Validating Documents with Namespaces
- What Does a Namespace Point To?
- Namespace Support and Use
- Special Attributes: xmlns, xml:space, xml:lang, and xml:base
- Common Namespaces
- Summary
- For Further Exploration
Briefly stated, the primary purpose of XML namespaces is to define a mechanism for uniquely naming elements and attributes so that different vocabularies can be mixed in an XML document without name conflicts. For example, if you want to refer to both the price element defined in the SuperDuperCatalog.dtd and the price element from the DiscountHouseCatalog.dtd, you need a way to unambiguously identify which type of price you are referring to at any point in an XML instance that references both DTDs. This name collision becomes a real problem when you consider that the content models and/or attribute lists for these elements may differ. Consider, for example, these fragments:
<!-- from SuperDuperCatalog.dtd -->
<!ELEMENT price (#PCDATA) >
<!ATTLIST price currency CDATA "US" >

<!-- from DiscountHouseCatalog.dtd -->
<!ELEMENT price (#PCDATA) >
<!ATTLIST price type (wholesale | retail) #REQUIRED >
Although these two declarations have the same content model, their attributes differ, so what is valid for one is invalid for the other, not to mention the fact that the two price elements may be based on completely different factors. (In fact, if we were using XML Schema, the two elements could even have different datatypes.) So we clearly need a way to differentiate which price element we mean at all times.
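To make the conflict concrete, here is a minimal Python sketch. It is not a DTD validator: the two dictionaries are hand-written, hypothetical summaries of the two ATTLIST declarations above, and the attribute_problems helper and the sample price value are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written summaries of the two ATTLIST declarations.
# Each maps attribute name -> (set of allowed values, or None for CDATA; required?).
SUPER_DUPER = {"currency": (None, False)}
DISCOUNT_HOUSE = {"type": ({"wholesale", "retail"}, True)}

def attribute_problems(element, attlist):
    """Return the reasons why the element's attributes violate attlist."""
    problems = []
    for name, value in element.attrib.items():
        if name not in attlist:
            problems.append(f"undeclared attribute: {name}")
            continue
        allowed, _required = attlist[name]
        if allowed is not None and value not in allowed:
            problems.append(f"bad value for {name}: {value}")
    for name, (_allowed, required) in attlist.items():
        if required and name not in element.attrib:
            problems.append(f"missing required attribute: {name}")
    return problems

# One and the same price element, checked against both sets of rules.
price = ET.fromstring('<price currency="US">19.95</price>')
super_duper_problems = attribute_problems(price, SUPER_DUPER)
discount_house_problems = attribute_problems(price, DISCOUNT_HOUSE)
print(super_duper_problems)
print(discount_house_problems)
```

The same element passes the first set of rules and fails the second on two counts, which is why a bare price is ambiguous in a document that draws on both vocabularies.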
The Namespaces in XML Recommendation (http://www.w3.org/TR/REC-xml-names/) wasn't published until January 1999, nearly a full year after the XML 1.0 Recommendation. Therefore, although the concept of namespaces is now considered part of the core XML technology, you won't see them mentioned in the original XML 1.0 Recommendation or in any of the W3C specifications that appeared in 1998. You will, however, find a few references to namespaces in the XML 1.0 Recommendation, Second Edition, from October 2000. (The original Recommendation and the Second Edition are at http://www.w3.org/TR/1998/REC-xml-19980210 and http://www.w3.org/TR/REC-xml, respectively.) Virtually every major W3C specification from 1999 onwards contains some mention of the role played by namespaces, or at least states the namespaces that apply to that spec. Namespaces aren't fully supported by some tools (especially those that handle DTDs but not XML Schema), but they are very important to XSLT, XML Schema, XLink, and most of the more recent XML family of specifications.
NOTE
Even if you don't plan to create documents that use mixed vocabularies, you need to understand what XML namespaces are and how they are used because you will encounter them in XSLT, XML Schema, and XLink, as well as in programming APIs such as DOM Level 2 and SAX2.
Although the Namespaces in XML specification itself is only fourteen pages, it has managed to generate far more pages of controversy since its publication. For example, Ronald Bourret's excellent XML Namespace FAQ (http://www.rpbourret.com/xml/NamespacesFAQ.htm) is forty-six pages, more than three times the length of the W3C recommendation on which it is based. How can this be possible? In his XML.com article, "Namespace Myths Exploded" (http://www.xml.com/pub/a/2000/03/08/namespaces/), Bourret points out that the actual Namespaces in XML Recommendation omits many details about issues that programmers have raised, although the specification does achieve its stated purpose of defining a two-part naming scheme for elements and attributes. In his FAQ, Bourret also contends that namespaces do not themselves provide a technology for merging documents that reference different DTDs, although they are useful in developing this capability. Furthermore, the URIs1 used as XML namespace names need not point to anything at all since they are merely intended to be unique identifiers, a distinction that often confuses newcomers. Another point of confusion and controversy is the problem namespaces pose for validation based on DTDs, which are to some degree incompatible with the Namespaces in XML Recommendation.
Why is there so much controversy? And what are namespaces anyway? Why do we need them, and how do we use them? This chapter addresses these questions. Readers interested in many more subtle issues concerning namespaces should refer to Bourret's two previously cited resources.
Why Namespaces Are Needed: Resolving Name Conflicts
The element name title appears in many vocabularies: XHTML, SVG, XLink (as both an element and an attribute), XSLFO, Schematron, RSS, and Dublin Core (as Title). The familiar table element from XHTML also appears in XSLFO (and as mtable in MathML2). Both SVG and MathML define a set element. Both SVG and XSLT have a text element. (You can check for element names, attribute names, function names, keywords, and much more using the Smart Reference Search of XML specifications on ZVON.org at http://zvon.org/php/Search/codes.php.)
What happens when we need to refer to identically named elements from different XML languages from within the same XML document? This need can arise when we are combining W3C languages or, as in the example at the beginning of this chapter, referencing multiple custom languages that include common element names like price and item. It also arises if you wish to convert between two versions of the same language, such as when translating an XSLT stylesheet originally written for the early MSXML parser from Microsoft to a stylesheet compatible with the XSLT 1.0 Recommendation, or eventually when converting from an XSLT 1.0 stylesheet to a 2.0 stylesheet. We need a way to unambiguously differentiate "the element named title from the XLink namespace" from "the element title from the SVG namespace" from "the element title from the XSLFO namespace" from "the element title from the XHTML namespace."
One way to achieve unique naming is to leverage the uniqueness provided by URIs. Since the Domain Name System guarantees unique names, identifiers based on unique addresses are also unique. That is, if we combine an element name with a URI in some manner, an element with that same name but combined with a different URI will not be considered identical by parsers. For example, we could consider four references to a unique title element, each from a separate namespace, as indicated by a URI for each language (thinking of each language as defining a separate namespace):
{http://www.w3.org/1999/xlink}title
{http://www.w3.org/2000/svg}title
{http://www.w3.org/1999/XSL/Format}title
{http://www.w3.org/1999/xhtml}title
If the languages are created and maintained by different organizations, this scheme still works fine:
{http://www.EverythingUNeed.com/2002/SuperDuperCatalog}price
{http://www.HouseOfDiscounts.com/namespaces/Discounts}price
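The idea of a name as a (URI, local name) pair can be sketched in a few lines of Python. This is only an illustration of the concept: it assumes the QName helper from Python's standard xml.etree.ElementTree module, which happens to render such pairs in the same curly-brace form, and the variable names are invented for the example.

```python
from xml.etree.ElementTree import QName

# Namespace URIs taken from the text; they serve purely as unique identifiers.
XLINK_NS = "http://www.w3.org/1999/xlink"
SVG_NS = "http://www.w3.org/2000/svg"

xlink_title = QName(XLINK_NS, "title")
svg_title = QName(SVG_NS, "title")

# The string form is the conceptual {URI}local-name notation.
print(svg_title.text)

# Same local name, different URI: the names are not equal.
same_local_different_uri = (xlink_title == svg_title)
# Same local name, same URI: the names are equal.
same_local_same_uri = (svg_title == QName(SVG_NS, "title"))
print(same_local_different_uri, same_local_same_uri)
```

The braces here belong to the library's string representation of a name, not to anything that would be typed into an XML document.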
While this URI-based naming approach solves the problem of possible name collision, it certainly makes for ungainly element names. As it turns out, the syntax shown here is conceptual, rather than literal, since slashes and curly braces are not legal characters in XML Names.3 So, how do we really declare and use these unique names?