- Sams Teach Yourself XML in 21 Days, Third Edition
Handling Entities
An entity in XML is simply a data item. Entities are usually text in common usage, but they can also be binary data. If you want an XML document that uses entities to be valid, you can declare an entity in a DTD and refer to it in the document (for text entities, the entity reference is replaced by the entity itself when parsed by an XML processor).
There are many different ways of dealing with data, so you probably won't be surprised to learn that there are different ways of working with entities. DTDs know about two types of entities: general entities and parameter entities. General entities are for use in the body of XML documents, and parameter entities are for use in a document's DTD. You'll see both today. General entity references start with & and end with ;, and parameter entity references start with % and end with ;.
Entities can also be internal or external. An internal entity is defined completely inside the XML document that uses it. An external entity, on the other hand, is stored externally, such as in a file; to refer to an external entity in XML, you can use a URI.
Here's how it works: You declare an entity in a DTD, and then you can refer to it with an entity reference in the rest of the XML document. In fact, you've already seen that there are five general entity references that are predefined in XML: &lt;, &gt;, &amp;, &quot;, and &apos;, which stand for the characters <, >, &, ", and ', respectively. Because these entities are predefined in XML, you don't need to define them in a DTD; you can just use the entity references. For example, you can see all five predefined entity references at work in Listing 5.3.
Listing 5.3. A Sample XML Document That Uses Predefined General Entity References (ch05_03.xml)
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE data [
<!ELEMENT data (#PCDATA)>
]>
<data>
    Welcome to Marge &amp; Maggie&apos;s XML document!
    Marge says, &quot;Do you like your &lt;data&gt; element&quot;?
</data>
An XML processor will replace each of the entity references in ch05_03.xml with the corresponding character. Figure 5.1 shows the results of Listing 5.3 in Internet Explorer.
Figure 5.1 Using the predefined entity references.
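You can also check this replacement outside a browser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module as the XML processor (the chapter itself uses Internet Explorer); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The body of ch05_03.xml, with all five predefined entity references.
document = """<!DOCTYPE data [
<!ELEMENT data (#PCDATA)>
]>
<data>Welcome to Marge &amp; Maggie&apos;s XML document!
Marge says, &quot;Do you like your &lt;data&gt; element&quot;?</data>"""

# Parsing replaces each entity reference with the character it stands for.
text = ET.fromstring(document).text
print(text)
```

The printed text contains the literal characters &, ', ", <, and >, with no entity references left.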
Although the five predefined general entity references are useful when you want to make sure text isn't interpreted as markup, they're very limited. To create your own entities, you use the <!ENTITY> declaration, as described in the following section.
Creating Internal General Entity References
In much the same way that you use an <!ELEMENT> declaration to declare an element in a DTD, you use an <!ENTITY> declaration to declare an entity. You declare a general entity like this:
<!ENTITY name definition>
In this case, name is the entity's name and definition is its definition. The name of the entity is just the name you want to use to refer to the entity, but an entity's definition can take several different forms. The simplest way of defining an entity is just to use the text that you want XML processors to replace entity references with. For example, here's how you might create a new entity named copyright that will be replaced with the text "(c) XML Power Corp. 2005":
<!ENTITY copyright "(c) XML Power Corp. 2005">
Now when you declare this entity in a DTD and refer to it in your document as &copyright;, that entity reference will be replaced with the text "(c) XML Power Corp. 2005". Listing 5.4 shows an example, ch05_04.xml, which declares this entity in the DTD and uses it in the body of the XML document in an element named <copy>. (Note that you also have to declare <copy> in the DTD.)
Listing 5.4. Defining a General Entity (ch05_04.xml)
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (copy, name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT copy (#PCDATA)>
<!ATTLIST employee supervisor CDATA #IMPLIED>
<!ENTITY copyright "(c) XML Power Corp. 2005">
]>
<document>
    <employee supervisor="no">
        <copy>&copyright;</copy>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
    <employee supervisor="yes">
        <copy>&copyright;</copy>
        <name>
            <lastname>Grant</lastname>
            <firstname>Cary</firstname>
        </name>
        <hiredate>October 20, 2005</hiredate>
        <projects>
            <project>
                <product>Desktop</product>
                <id>333</id>
                <price>$2995.00</price>
            </project>
            <project>
                <product>Scanner</product>
                <id>444</id>
                <price>$200.00</price>
            </project>
        </projects>
    </employee>
</document>
Figure 5.2 shows the document ch05_04.xml in Internet Explorer. Note that &copyright; has indeed been replaced by the text "(c) XML Power Corp. 2005".
Figure 5.2 Creating and using a user-defined entity.
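The same expansion can be observed programmatically. This is a minimal sketch that assumes Python's standard xml.etree.ElementTree module as the processor and uses a trimmed-down version of the listing (only the <copy> elements are kept, which is a simplification made here). ElementTree does not validate against the DTD; it only expands the entities declared in it.

```python
import xml.etree.ElementTree as ET

# A trimmed version of ch05_04.xml: one declared entity, used twice.
document = """<!DOCTYPE document [
<!ELEMENT document (copy)*>
<!ELEMENT copy (#PCDATA)>
<!ENTITY copyright "(c) XML Power Corp. 2005">
]>
<document>
    <copy>&copyright;</copy>
    <copy>&copyright;</copy>
</document>"""

root = ET.fromstring(document)
# Each reference has been replaced by the entity's replacement text.
notices = [copy.text for copy in root.findall("copy")]
print(notices)
```

Changing the text in the single <!ENTITY> declaration changes every place the reference is used.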
The replacement text for internal general entities doesn't have to be spelled out as literal text; you can also use character references, which give a character's numeric code (for example, &#62; for >). For example, here's how to modify the example that uses the predefined general entities quot, amp, lt, and so on by defining your own internal general entities quot2, amp2, lt2, and so on, using character references:
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE data [
<!ENTITY amp2 "&#38;#38;">
<!ENTITY apos2 "&#39;">
<!ENTITY gt2 "&#62;">
<!ENTITY lt2 "&#38;#60;">
<!ENTITY quot2 "&#34;">
<!ELEMENT data (#PCDATA)>
]>
<data>
    Welcome to Marge &amp2; Maggie&apos2;s XML document!
    Marge says, &quot2;Do you like our &lt2;data&gt2; element&quot2;?
</data>
This XML gives the same results as the previous example, which simply uses the predefined general entities quot, amp, lt, and so on.
What's happening here is that when you use an entity reference such as &gt2;, it's replaced with the character reference &#62;, which the XML processor then replaces with ">". Among other things, this indicates that you can nest entity references.
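The sketch below, which assumes Python's standard xml.etree.ElementTree module as the processor, shows this two-step replacement for three hypothetical entities. The ampersand and less-than entities are declared with a doubly escaped character reference (&#38;#38; and &#38;#60;), the form the XML specification uses for its own predefined entities, so that the resulting & or < is treated as character data rather than as the start of markup when the entity is expanded.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE data [
<!ENTITY amp2 "&#38;#38;">
<!ENTITY lt2 "&#38;#60;">
<!ENTITY gt2 "&#62;">
<!ELEMENT data (#PCDATA)>
]>
<data>Marge &amp2; Maggie like the &lt2;data&gt2; element</data>"""

# Each user-defined entity expands to a character reference,
# which in turn yields the literal character.
text = ET.fromstring(document).text
print(text)
```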
The following is another example, in which the entity reference &me; in the second entity declaration will be replaced with "Ferdinand Magellan" from the first entity declaration:
<!ENTITY me "Ferdinand Magellan">
<!ENTITY copyright "(c) &me; 1519">
Note that although you can nest entity references, they can't be circular, or the XML processor will go nuts. For example, this isn't legal:
<!ENTITY me "&copyright; Ferdinand Magellan">
<!ENTITY copyright "(c) &me; 1519">
Circular entity references like this one violate XML's well-formedness rules, so an XML processor must reject any document that uses them.
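To see both behaviors side by side, here is a small sketch that assumes Python's standard xml.etree.ElementTree module as the processor. The <notice> element is a hypothetical wrapper introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

nested = """<!DOCTYPE notice [
<!ENTITY me "Ferdinand Magellan">
<!ENTITY copyright "(c) &me; 1519">
]>
<notice>&copyright;</notice>"""

circular = """<!DOCTYPE notice [
<!ENTITY me "&copyright; Ferdinand Magellan">
<!ENTITY copyright "(c) &me; 1519">
]>
<notice>&copyright;</notice>"""

# Nested references are expanded from the outside in.
expanded = ET.fromstring(nested).text
print(expanded)

# A circular reference makes the parser give up with an error.
try:
    ET.fromstring(circular)
    circular_rejected = False
except ET.ParseError as error:
    circular_rejected = True
    print("Rejected:", error)
```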
General entity references, such as &copyright;, are valid only in the body of the XML document, not in the DTD itself. For example, this is not legal:
<!ENTITY employeeContent "(copy, name, hiredate, projects)">
<!ELEMENT employee &employeeContent;>
The way you should handle a situation like this, where an entity reference is used in the DTD itself, is by using parameter entities, which you'll take a look at later today.
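As a quick check of this restriction, the following sketch (assuming Python's standard xml.etree.ElementTree module as the processor, with a shortened hypothetical content model) feeds the parser a DTD that uses a general entity reference inside an element declaration.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE employee [
<!ENTITY employeeContent "(name, hiredate)">
<!ELEMENT employee &employeeContent;>
]>
<employee/>"""

# The & inside the <!ELEMENT> declaration is not allowed there,
# so the parser stops before it ever reaches the document body.
try:
    ET.fromstring(document)
    accepted = True
except ET.ParseError as error:
    accepted = False
    print("Rejected:", error)
```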
Creating External General Entity References
In addition to the internal general entities just described, you can also work with external general entities. In this case, you use a URI to direct the XML processor to the external entity. As you're going to see, you can also indicate that such an entity should not be parsed, which is how to associate binary data with an XML document; it's something like associating images with an HTML document. (Note that even though you don't want the XML processor to parse the external entity, most processors will still check to make sure the external entity exists and is at the URI you've given.)
Just as you can with external DTDs, you can use the SYSTEM keyword or the PUBLIC keyword when declaring external general entities. As with external DTDs, you use SYSTEM when working with an external entity that's private to you or your organization, and you use PUBLIC when you're using an external entity that's public. As with external DTDs, when you use a public external entity, you need to use an FPI when you refer to it. Here's the syntax for declaring an external general entity:
<!ENTITY name SYSTEM URI>
<!ENTITY name PUBLIC FPI URI>
For example, you can place the text "(c) XML Power Corp. 2005" for the copyright general entity in the file ch05_05.xml, which appears in Listing 5.5.
Listing 5.5. Storing Text as an External General Entity (ch05_05.xml)
<?xml version = "1.0" encoding="UTF-8"?>
(c) XML Power Corp. 2005
You use the following to create an external general entity reference named copyright that refers to the external document ch05_05.xml:
<!ENTITY copyright SYSTEM "ch05_05.xml">
Now you can use the &copyright; external entity reference just as you did before, when it was an internal entity reference. You can see this at work in ch05_06.xml, which is shown in Listing 5.6. (Note that you also have to change the value of the standalone attribute in the XML declaration from "yes" to "no".)
Listing 5.6. Using an External General Entity (ch05_06.xml)
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (copy, name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT copy (#PCDATA)>
<!ATTLIST employee supervisor CDATA #IMPLIED>
<!ENTITY copyright SYSTEM "ch05_05.xml">
]>
<document>
    <employee supervisor="no">
        <copy>&copyright;</copy>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
    <employee supervisor="yes">
        <copy>&copyright;</copy>
        <name>
            <lastname>Grant</lastname>
            <firstname>Cary</firstname>
        </name>
        <hiredate>October 20, 2005</hiredate>
        <projects>
            <project>
                <product>Desktop</product>
                <id>333</id>
                <price>$2995.00</price>
            </project>
            <project>
                <product>Scanner</product>
                <id>444</id>
                <price>$200.00</price>
            </project>
        </projects>
    </employee>
</document>
If you open this new XML document, ch05_06.xml, in Internet Explorer, you'll see the same results shown in Figure 5.2. The external entity (that is, the text in ch05_05.xml) is picked up, and its text appears in the resulting display.
By using external general entities in this way, you can assemble XML documents together from various pieces stored in their own files. That can be very useful if, for example, you have standard headers or footers or copyright notices that you want to use. Note that if you need to change those items (such as the date in a copyright notice), you need to make your changes only in one file.
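How a processor pulls in an external entity can be sketched with Python's standard xml.parsers.expat module, which lets the application supply the external text itself. Everything here is an assumption made for illustration: the FILES dictionary stands in for the file system, the document is a trimmed version of the listing, and parse_text and load_external_entity are hypothetical helper names.

```python
import xml.parsers.expat as expat

# Stand-in for the file system: maps each system identifier to the
# bytes the processor would read from that file (hypothetical data).
FILES = {
    "ch05_05.xml": b'<?xml version="1.0" encoding="UTF-8"?>(c) XML Power Corp. 2005',
}

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (copy)>
<!ELEMENT copy (#PCDATA)>
<!ENTITY copyright SYSTEM "ch05_05.xml">
]>
<document><copy>&copyright;</copy></document>"""


def parse_text(document, files):
    """Return the character data of document, pulling in external entities."""
    chunks = []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def load_external_entity(context, base, system_id, public_id):
        # A child parser reads the external entity; it inherits the
        # parent's handlers, so its text lands in the same list.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(files[system_id], True)
        return 1  # a nonzero result tells expat the entity was handled

    parser.ExternalEntityRefHandler = load_external_entity
    parser.Parse(document, True)
    return "".join(chunks)


text = parse_text(DOCUMENT, FILES)
print(text)
```

Because the copyright text lives in one place, editing the entry for ch05_05.xml changes what every referencing document displays.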
Associating Non-XML Data with an XML Document
Earlier in today's discussion, you saw that you can associate non-XML data—an image file, in fact—by using an external entity. You created an entity named PHOTO1221 that referred to an external file named 1221.gif and an attribute of the ENTITY type to which you could assign PHOTO1221:
<!ENTITY PHOTO1221 SYSTEM "1221.gif">
<!ATTLIST employee photo ENTITY #IMPLIED>
. . .
<employee photo="PHOTO1221">
This associates the image file 1221.gif with the current XML document, but you can make things even clearer to the XML processor. In particular, you can indicate that 1221.gif is an external entity that should not be parsed. That's the way you normally associate binary data with an XML document—by treating it as an unparsed external entity.
To declare an external unparsed entity, you use an <!ENTITY> declaration with either the SYSTEM keyword or the PUBLIC keyword, like this (note the keyword NDATA, which indicates that you're referring to an unparsed entity):
<!ENTITY name SYSTEM value NDATA type>
<!ENTITY name PUBLIC FPI value NDATA type>
Here, name is the name of the external unparsed entity, value is the value of the entity, such as the name of an external file (for example, 1221.gif), and type is a declared notation (which you create by using a <!NOTATION> declaration). For example, to explicitly indicate that 1221.gif is an external entity that should not be parsed, you can create a notation named gif for GIF files:
<!NOTATION gif SYSTEM "image/gif">
Next, you can declare 1221.gif as an unparsed entity that uses the gif notation:
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
And you can create an ENTITY attribute named photo for the <employee> element:
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
<!ATTLIST employee photo ENTITY #IMPLIED>
Finally, you can assign the photo attribute the value PHOTO1221:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
<!ATTLIST employee photo ENTITY #IMPLIED>
]>
<document>
    <employee photo="PHOTO1221">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
Note that in this example, you do not use an entity reference (that is, &PHOTO1221;) because you do not want the XML processor to parse 1221.gif. Note also that when you use external unparsed entities like this, validating XML processors won't try to read and parse them, but they will usually check to make sure that the entities exist at the URI you specify to ensure that the whole XML document is considered complete.
You can also associate multiple unparsed external entities with an XML document if you create an attribute of the ENTITIES type, like this:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
<!ENTITY PHOTO1222 SYSTEM "1222.gif" NDATA gif>
<!ATTLIST employee photos ENTITIES #IMPLIED>
]>
<document>
    <employee photos="PHOTO1221 PHOTO1222">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
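Because the processor does not read an unparsed entity, it is up to the application to follow the chain from attribute value to entity to notation. The following is a minimal sketch of that lookup, assuming Python's standard xml.parsers.expat module and a trimmed-down document; the handler names and the three collections are hypothetical.

```python
import xml.parsers.expat as expat

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee EMPTY>
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
<!ENTITY PHOTO1222 SYSTEM "1222.gif" NDATA gif>
<!ATTLIST employee photos ENTITIES #IMPLIED>
]>
<document><employee photos="PHOTO1221 PHOTO1222"/></document>"""

notations = {}  # notation name -> system identifier
unparsed = {}   # entity name -> (system identifier, notation name)
photos = []     # (file, type) pairs found on <employee> elements


def on_notation(name, base, system_id, public_id):
    notations[name] = system_id


def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
    if notation is not None:  # only NDATA entities carry a notation
        unparsed[name] = (system_id, notation)


def on_start(tag, attributes):
    # An ENTITIES value is a whitespace-separated list of entity names.
    for entity_name in attributes.get("photos", "").split():
        system_id, notation = unparsed[entity_name]
        photos.append((system_id, notations[notation]))


parser = expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.EntityDeclHandler = on_entity
parser.StartElementHandler = on_start
parser.Parse(DOCUMENT, True)
print(photos)
```

The parser reports the declarations and the attribute value but never opens the image files; what to do with each file name and type is left to the application.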
Now that you've seen general entities, let's take a look at entities that are specially designed to be used in DTDs only: parameter entities.
Creating Internal Parameter Entities
General entities are limited when it comes to working with DTDs. You can declare general entities in DTDs, but you can't use general entity references in a DTD and expect the XML processor to expand them into DTD markup. However, it can be useful to use parameters in DTDs, and for that you use parameter entities and parameter entity references (which can be used only in DTDs). In fact, there's one more restriction: parameter entity references that you use inside another DTD declaration can appear only in the DTD's external subset, which is the part of the DTD stored outside the document. You'll see what this means shortly.
Unlike general entity references, parameter entity references don't start with &; they start with % instead. As with general entities, you declare a parameter entity by using an <!ENTITY> declaration, but you include a % to show that you're declaring a parameter entity. Here's the syntax for declaring an internal parameter entity:
<!ENTITY % name definition>
As you might expect, when you declare an external parameter entity, you can use the SYSTEM and PUBLIC keywords, like this:
<!ENTITY % name SYSTEM URI>
<!ENTITY % name PUBLIC FPI URI>
The following is an example that shows how to use an internal parameter entity. In this case, you just declare the parameter entity project to refer to the standard declaration of the <project> element in the sample XML document:
<!ENTITY % project "<!ELEMENT project (product, id, price)>">
Now when you use the parameter entity reference %project; in the DTD, it will be replaced with the text "<!ELEMENT project (product, id, price)>". Listing 5.7 shows this at work in ch05_07.xml.
Listing 5.7. Using an Internal Parameter Entity (ch05_07.xml)
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ENTITY % project "<!ELEMENT project (product, id, price)>">
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
%project;
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee supervisor CDATA #IMPLIED>
]>
<document>
    <employee supervisor="no">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
This turns out to be about as far as you can go with parameter entities in a DTD's internal subset, because there you can't use parameter entity references inside other declarations. To see how parameter entities can really be useful, you have to turn to external parameter entities, which are described in the following section.
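One way to confirm that a %project; reference between declarations really produces an element declaration is to ask a parser which elements it saw declared. The following is a sketch under stated assumptions: it uses Python's standard xml.parsers.expat module, a shortened DTD with a hypothetical <projects> root, and an illustrative declared list. Expat leaves parameter entity references unexpanded unless parameter entity parsing is switched on, which is why the sketch calls SetParamEntityParsing.

```python
import xml.parsers.expat as expat

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE projects [
<!ENTITY % project "<!ELEMENT project (product, id, price)>">
<!ELEMENT projects (project)*>
%project;
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
]>
<projects/>"""

declared = []  # element names, in the order the parser sees them declared

parser = expat.ParserCreate()
# Ask expat to expand parameter entity references in the DTD.
parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
parser.ElementDeclHandler = lambda name, model: declared.append(name)
parser.Parse(DOCUMENT, True)
print(declared)
```

The project element appears in the list even though no <!ELEMENT project ...> declaration is written out directly in the DTD.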
Creating External Parameter Entities
When you use a parameter entity in a DTD's external subset, you can refer to that entity anywhere in the DTD, including inside other element declarations. To see an example, you need an XML document that uses an external DTD, like ch05_08.xml, which uses an external DTD named ch05_09.dtd (see Listing 5.8).
Listing 5.8. Using External Parameter Entities (ch05_08.xml)
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document SYSTEM "ch05_09.dtd">
<document>
    <employee supervisor="no">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>
Let's say that in an external DTD, you want to create three elements that might appear in <employee> elements to record comments about the employee: <supervisorComment>, <customerComment>, and <employeeComment>. All three of these elements have the same content model. Say that each of these elements has the content model (date, text), where <date> contains the date of the comment and <text> contains the text of the comment. You can create a new parameter entity named record for this content model:
<!ENTITY % record "(date, text)">
Now in the external DTD, you can use a reference to this entity when you want to use the content model for the <supervisorComment>, <customerComment>, and <employeeComment> elements:
<!ELEMENT supervisorComment %record;>
<!ELEMENT customerComment %record;>
<!ELEMENT employeeComment %record;>
That's all it takes; Listing 5.9 shows the entire external DTD, which uses the record parameter entity.
Listing 5.9. An External DTD That Uses Parameter Entities (ch05_09.dtd)
<?xml version = "1.0" encoding="UTF-8"?>
<!ENTITY % record "(date, text)">
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects,
    supervisorComment*, customerComment*, employeeComment*)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT supervisorComment %record;>
<!ELEMENT customerComment %record;>
<!ELEMENT employeeComment %record;>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT text (#PCDATA)>
<!ATTLIST employee supervisor CDATA #IMPLIED>
Using parameter entities as in this example can be very useful because it means you can store all the content models you use in one location and change them in that one place as needed rather than having to hunt through an entire document. You can also use parameter entities to centralize your attribute declarations in the same way. You can even collect attribute declarations into groups and use them in element declarations as needed. For example, you might decide that a new element named <imager> should support both hyperlink attributes (such as a targetURI attribute) and image attributes (such as an imageURI attribute), and if you've grouped your attributes by functionality, here's how you could add those attributes to this element:
<!ATTLIST imager %hyperlink_attributes; %image_attributes;>
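The substitution that happens inside these declarations can be modeled at the text level. The sketch below is a toy, not a real XML processor: it uses Python regular expressions, handles only internal parameter entities with double-quoted values, and the two attribute-group declarations (targetURI and imageURI as CDATA #IMPLIED) are assumptions made up for the illustration. It does mirror one real detail: when a reference is expanded in the DTD, the XML specification pads the replacement text with one space on each side.

```python
import re

# Matches internal parameter entity declarations such as
# <!ENTITY % record "(date, text)">
DECLARATION = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
# Matches parameter entity references such as %record;
REFERENCE = re.compile(r"%([\w.-]+);")


def expand_parameter_entities(dtd_text):
    """Toy expansion: substitute each %name; with its replacement text."""
    entities = dict(DECLARATION.findall(dtd_text))
    remaining = DECLARATION.sub("", dtd_text)
    # Pad with one space on each side, as the specification requires.
    expanded = REFERENCE.sub(lambda m: " " + entities[m.group(1)] + " ", remaining)
    return expanded.strip()


dtd = """<!ENTITY % record "(date, text)">
<!ENTITY % hyperlink_attributes "targetURI CDATA #IMPLIED">
<!ENTITY % image_attributes "imageURI CDATA #IMPLIED">
<!ELEMENT supervisorComment %record;>
<!ATTLIST imager %hyperlink_attributes; %image_attributes;>"""

result = expand_parameter_entities(dtd)
print(result)
```

Editing a single <!ENTITY % ...> line changes every declaration that refers to it, which is the point of centralizing content models and attribute groups this way.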
Using INCLUDE and IGNORE to Parameterize DTDs
There are two important directives that you need to know about when it comes to working with DTDs: INCLUDE and IGNORE. Directives are special commands to the XML processor, and INCLUDE and IGNORE (which the XML specification calls conditional sections) are specially designed to customize a DTD by including or omitting sections of that DTD. Like parameter entity references inside declarations, they can be used only in the external DTD subset. The following is the syntax of INCLUDE and IGNORE:
<![ INCLUDE [DTD Section]]>
and
<![ IGNORE [DTD Section]]>
Here are two examples of what these directives might look like in action:
<![ INCLUDE [
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
]]>
<![ IGNORE [
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
]]>
In the first of these examples, the contained DTD fragment will be included by the XML processor, and in the second example, the contained DTD fragment will be ignored.
So why are INCLUDE and IGNORE useful? Can't you just include or ignore sections of DTDs yourself, by adding or deleting them as needed? Can't you just use standard XML comments to hide sections of DTDs if you need to? Yes, you can. The reason you see INCLUDE and IGNORE in DTDs is that by using these directives, you can create parameterized DTDs. DTDs can be dozens of pages long (like the ones for XHTML), and you might miss some sections you want to exclude if you just rely on XML comments. But when you parameterize a DTD, you can just set a parameter entity to "INCLUDE" or "IGNORE" to include or ignore many DTD sections at once.
Let's use the DTD for XHTML 1.1, which is a parameterized DTD, as an example. The main DTD for XHTML 1.1 is set up to include or ignore other sections of the DTD (a DTD that works like this is sometimes called a DTD driver), depending on how you want to customize the DTD. For example, some devices that can support some XHTML can't support everything. Cell phones might be fine with bold text and hyperlinks but might have trouble displaying tables, for instance. For that reason, you can customize the XHTML 1.1 DTD to include or ignore the DTD section that has to do with tables. In particular, the XHTML 1.1 DTD declares a parameter entity named xhtml-table.module that is set to "INCLUDE" by default and includes the table DTD module with an INCLUDE section, like this:
<!ENTITY % xhtml-table.module "INCLUDE" >
. . .
<![%xhtml-table.module;[
<!ENTITY % xhtml-table.mod
    PUBLIC "-//W3C//ELEMENTS XHTML 1.1 Tables 1.0//EN"
           "xhtml11-table-1.mod" >
%xhtml-table.mod;]]>
If you wanted to, you could exclude all references to XHTML tables in your own version of the XHTML 1.1 DTD just by setting xhtml-table.module to "IGNORE". In this way, you can centralize control over a parameterized DTD, which might be dozens of pages long, simply by changing the values of a few parameter entities in one location. The XHTML 1.1 DTD is written in modules that can be expressly included or ignored if you want, making the entire XHTML 1.1 DTD fully parameterized.
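The switching behavior can be modeled with a few lines of text processing. This is a toy sketch in Python, not how a real XML processor is implemented: it resolves the keyword from a dictionary of switches (a hypothetical stand-in for the parameter entity declarations), does not handle nested sections, and the table declaration in the sample DTD is a placeholder rather than the real XHTML 1.1 table module.

```python
import re

REFERENCE = re.compile(r"%([\w.-]+);")
CONDITIONAL = re.compile(r"<!\[\s*(INCLUDE|IGNORE)\s*\[(.*?)\]\]>", re.DOTALL)


def apply_conditional_sections(dtd_text, switches):
    """Toy model: resolve %name; keywords, then keep or drop each section."""
    resolved = REFERENCE.sub(lambda m: switches[m.group(1)], dtd_text)
    return CONDITIONAL.sub(
        lambda m: m.group(2).strip() if m.group(1) == "INCLUDE" else "",
        resolved,
    ).strip()


# The table declaration is a stand-in, not the real XHTML 1.1 module.
dtd = """<!ELEMENT p (#PCDATA)>
<![%xhtml-table.module;[
<!ELEMENT table (tr)+>
]]>"""

with_tables = apply_conditional_sections(dtd, {"xhtml-table.module": "INCLUDE"})
without_tables = apply_conditional_sections(dtd, {"xhtml-table.module": "IGNORE"})
print(with_tables)
print(without_tables)
```

Flipping one value in the switches dictionary adds or removes the whole section, just as changing one parameter entity does in a parameterized DTD.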