- Sams Teach Yourself XML in 21 Days, Third Edition
Specifying Attribute Types
Although CDATA is the most common attribute type, DTDs support other types as well. These types are not specific enough to let you declare, say, the format of numbers (such as integer, floating point, and so on—which you would be able to declare in XML schemas), but they do let you check the syntax of XML documents to some extent. The following sections describe some of the attribute type possibilities.
The CDATA Attribute Type
As you've already seen, CDATA stands for character data. A CDATA attribute value is plain text, but the XML processor still reads and parses it, expanding any entity references it finds. Among other things, that means you cannot use the characters < and & literally in attribute values, because those characters look like markup, and you cannot use " inside a value that is itself delimited by double quotation marks. To include those characters, use their predefined entity references (&lt;, &amp;, and &quot;) instead; the processor parses these references and replaces them with the corresponding characters.
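As a small sketch of the escaping rule above, the following document stores a value containing <, &, and " in a CDATA attribute. The note element and its text attribute are hypothetical names chosen only for this illustration; they are not part of the employee example used in the rest of this section.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (note)*>
<!ELEMENT note EMPTY>
<!-- Hypothetical attribute, declared only for this illustration -->
<!ATTLIST note text CDATA #REQUIRED>
]>
<document>
    <!-- After parsing, the attribute value is: 5 < 6 & "six" > 5 -->
    <note text="5 &lt; 6 &amp; &quot;six&quot; &gt; 5"/>
</document>
```

Writing the raw characters instead of the references would make the document not well formed, so the processor would stop before validation even begins.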
You've already been using CDATA attributes, the most basic type of attributes, in examples, such as this one:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee supervisor CDATA #IMPLIED>
]>
<document>
    <employee supervisor="no">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        . . .
</document>
The CDATA type is the most general type of attribute. From this point on, however, you'll get into increasingly more specific types.
Enumerated Types
An attribute enumeration is just a list of possible values that an attribute can take. Each possible value must be a valid XML name token, which is the same rule that applies to the NMTOKEN type described later in this section. In the following example, the supervisor attribute has two possible values, "yes" and "no", and a default value of "no":
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee supervisor (yes | no) "no">
]>
<document>
    <employee supervisor="no">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
    </employee>
    . . .
</document>
Using an enumeration is a good choice if you want to restrict an attribute to a set of allowed values. For example, if you have an attribute named month, you might want to allow only values such as "January", "February", "March", "April", and so on.
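The month idea could be sketched as follows. This is only an illustration under stated assumptions: the month attribute is hypothetical, the list is shortened to three values, and the employee content model is reduced to plain text to keep the listing short.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified content model, assumed for brevity -->
<!ELEMENT employee (#PCDATA)>
<!-- Hypothetical month attribute; only three of the twelve values are listed -->
<!ATTLIST employee month (January | February | March) #REQUIRED>
]>
<document>
    <!-- Valid: the value is one of the enumerated tokens -->
    <employee month="February">Grace Kelly</employee>
    <!-- A validating parser should reject month="Feb" or month="february",
         because enumerated values are matched exactly and are case sensitive -->
</document>
```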
The NMTOKEN Attribute Type
The attribute type NMTOKEN stands for name token. Attributes of this type can take only values made up of characters that are legal in XML names, but without the extra restrictions on the first character of a name (a name cannot begin with a digit, a hyphen, or a period, for example, but a name token can). In XML 1.0, NMTOKEN characters are letters, digits, hyphens, underscores, colons, and periods. (Note that NMTOKEN values cannot include whitespace.) In XML 1.1, the rules are the same, except for the differences in which characters are considered legal, as discussed on Day 2, "Creating XML Documents."
In other words, the NMTOKEN type restricts an attribute to a single token of name characters with no whitespace. The following example adds a state attribute of the NMTOKEN type to hold a two-letter state abbreviation:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee state NMTOKEN #REQUIRED>
]>
<document>
    <employee state="NY">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
        . . .
    </employee>
</document>
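To show where the NMTOKEN boundary lies, here is a hedged sketch with a few sample values. The employee element is reduced to an empty element for brevity, and the values other than "NY" are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified for illustration: employee is empty here -->
<!ELEMENT employee EMPTY>
<!ATTLIST employee state NMTOKEN #REQUIRED>
]>
<document>
    <employee state="NY"/>
    <!-- Valid: a name token, unlike a name, may begin with a digit -->
    <employee state="2nd-floor.east"/>
    <!-- Invalid for NMTOKEN: state="New York" (contains a space)
         and state="NY/NJ" (the slash is not a name character) -->
</document>
```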
The NMTOKENS Attribute Type
The preceding section describes the NMTOKEN attribute type, so what's NMTOKENS? You can use the NMTOKENS attribute type when you want to list multiple NMTOKEN values, separated by whitespace. The following example allows whitespace in attribute values because you want to store the first and last names of supervisors, making supervisor a required NMTOKENS attribute:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee supervisor NMTOKENS #REQUIRED>
]>
<document>
    <employee supervisor="Tom Brown">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        . . .
    </employee>
</document>
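Each whitespace-separated token in an NMTOKENS value is checked on its own, which limits what kinds of names the attribute can hold. The following sketch, with employee again reduced to an empty element and with supervisor names other than "Tom Brown" invented for illustration, shows one value that passes and one that does not.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified for illustration: employee is empty here -->
<!ELEMENT employee EMPTY>
<!ATTLIST employee supervisor NMTOKENS #REQUIRED>
]>
<document>
    <!-- Valid: two name tokens separated by a space -->
    <employee supervisor="Tom Brown"/>
    <!-- Valid: the hyphen is a name character -->
    <employee supervisor="Mary-Ann Smith"/>
    <!-- Invalid for NMTOKENS: supervisor="Tom O'Brien", because the
         apostrophe is not a name character -->
</document>
```

If names like the last one matter, a CDATA attribute is the safer choice.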
The ID Attribute Type
An important attribute type is the ID type. There's a special meaning to an element's ID value because sometimes XML processors use an ID attribute to identify an element. (They don't have to, but some XML processors pass on ID values of XML elements to underlying software.) Therefore, XML processors are supposed to make sure that no two elements have the same value for the attribute that is of the type ID in a document; in addition, you can give an element only one attribute of this type.
The value you assign to an attribute of the ID type must be a proper XML name. The following example adds an ID attribute to a DTD:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee id ID #REQUIRED>
]>
<document>
    <employee id="A1112">
        . . .
    </employee>
    <employee id="A1114">
        . . .
    </employee>
    <employee id="A1115">
        . . .
    </employee>
</document>
You can declare ID attributes as #REQUIRED or #IMPLIED. You cannot give an ID attribute an explicit default value or a #FIXED value, because each ID value must be unique within the document.
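To make the ID rules concrete, here is a minimal sketch that trims the employee element down to an empty element (a simplification assumed for brevity, not the full content model used above) and notes in comments which values and declarations a validating parser should reject.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified for illustration: employee is empty here -->
<!ELEMENT employee EMPTY>
<!ATTLIST employee id ID #REQUIRED>
]>
<document>
    <employee id="A1112"/>
    <employee id="A1114"/>
    <!-- Invalid: id="1115" begins with a digit, so it is not an XML name -->
    <!-- Invalid: a second element with id="A1112" repeats an existing ID -->
    <!-- Invalid in the DTD: <!ATTLIST employee id ID "A0000">, because an
         ID attribute must be declared #REQUIRED or #IMPLIED -->
</document>
```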
The IDREF Attribute Type
DTDs let you do more than specify ID values by using attributes. We can also use IDREF (which stands for ID reference) attributes to tie an element to another element, using the other element's ID value as a reference. For example, if we wanted to store genealogical data in an XML document, we could store a child's data by using an IDREF attribute to hold the ID value of a parent's data.
The following example gives each employee an id attribute and also creates an optional supervisor attribute of type IDREF, which will store the ID value of an employee's supervisor:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST employee
    id ID #REQUIRED
    supervisor IDREF #IMPLIED>
]>
<document>
    <employee id="A1112" supervisor="A1114">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        . . .
    </employee>
    <employee id="A1114">
        <name>
            <lastname>Grant</lastname>
            <firstname>Cary</firstname>
        </name>
        <hiredate>October 20, 2005</hiredate>
        . . .
    </employee>
</document>
Note that attributes of the ID and IDREF types are allowed in XML, but they don't have any more special meaning than is discussed here: a validating processor checks that ID values are unique and that each IDREF value matches an ID somewhere in the document. If you want to do more with these attributes, it's up to you to create or use an XML processor that can handle ID and IDREF data as you want it handled.
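The one check a validating parser does make on IDREF values can be sketched like this. The employee element is again simplified to an empty element, and the ID value A9999 is a made-up example of a dangling reference.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified for illustration: employee is empty here -->
<!ELEMENT employee EMPTY>
<!ATTLIST employee
    id ID #REQUIRED
    supervisor IDREF #IMPLIED>
]>
<document>
    <!-- Valid: A1114 is the ID of another employee, even though it appears later -->
    <employee id="A1112" supervisor="A1114"/>
    <employee id="A1114"/>
    <!-- Invalid: supervisor="A9999" would point at an ID that no element declares -->
</document>
```

Nothing in the DTD says that a supervisor reference must point at an employee element rather than some other element with an ID; enforcing that kind of rule is left to your own software.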
The ENTITY Attribute Type
The ENTITY type lets you assign to an attribute the name of an entity you've declared. Later on today we'll talk about how to handle entities; the idea is that we can handle data, such as an external image file, in an XML document by using an <!ENTITY> declaration. The following example gives the entity name PHOTO1221 to the image file 1221.gif and the entity name PHOTO1222 to the image file 1222.gif:
<!ENTITY PHOTO1221 SYSTEM "1221.gif">
<!ENTITY PHOTO1222 SYSTEM "1222.gif">
Now you can use these entity names, PHOTO1221 and PHOTO1222, as attribute values in attributes of type ENTITY. For example, if 1221.gif and 1222.gif held the photos of various employees, you could indicate that this is the case by using an ENTITY attribute named photo, like this (note that you don't have to use ENTITY attributes to do this—you could just set a CDATA attribute to 1221.gif, for example):
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ENTITY PHOTO1221 SYSTEM "1221.gif">
<!ENTITY PHOTO1222 SYSTEM "1222.gif">
<!ATTLIST employee photo ENTITY #IMPLIED>
]>
<document>
    <employee photo="PHOTO1221">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        . . .
    </employee>
    <employee photo="PHOTO1222">
        <name>
            <lastname>Grant</lastname>
            <firstname>Cary</firstname>
        </name>
        <hiredate>October 20, 2005</hiredate>
        . . .
    </employee>
</document>
Using ENTITY attributes is a good way of working with entities, and we'll talk about how that works later today. As part of that discussion, we'll talk about how to indicate to an XML processor what the format of the external data is; for instance, we'll elaborate on this example to indicate that the external entity uses the GIF image format. That step matters for validation: a validating parser requires the entity named in an ENTITY attribute to be an unparsed entity, which means its declaration must name a notation by using the NDATA keyword.
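As a preview of that elaboration, the following sketch shows one way the entity declarations might look once they name a data format. It assumes a notation called gif whose system identifier is the MIME type image/gif, and it reduces employee to an empty element for brevity; treat it as an illustration rather than the final form of the example.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!-- Simplified for illustration: employee is empty here -->
<!ELEMENT employee EMPTY>
<!-- The notation names the data format; the identifier here is a MIME type -->
<!NOTATION gif SYSTEM "image/gif">
<!-- NDATA marks each entity as unparsed, so the parser does not try to
     read the GIF file as XML -->
<!ENTITY PHOTO1221 SYSTEM "1221.gif" NDATA gif>
<!ENTITY PHOTO1222 SYSTEM "1222.gif" NDATA gif>
<!ATTLIST employee photo ENTITY #IMPLIED>
]>
<document>
    <employee photo="PHOTO1221"/>
    <employee photo="PHOTO1222"/>
</document>
```

The parser does not load or display the image; it only reports the entity's name, system identifier, and notation to the application, which decides what to do with them.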
The ENTITIES Attribute Type
Like the NMTOKEN attribute type, which has a plural type, NMTOKENS, the ENTITY attribute type also has a plural type, ENTITIES. Attributes of this type can hold lists of entity names, separated by whitespace. For example, to associate not just one photo but multiple photos with an employee, you could change the ENTITY attribute photo created in the previous example to an ENTITIES attribute named photos, like this:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ENTITY PHOTO1221 SYSTEM "1221.gif">
<!ENTITY PHOTO1222 SYSTEM "1222.gif">
<!ATTLIST employee photos ENTITIES #IMPLIED>
]>
<document>
    <employee photos="PHOTO1221 PHOTO1222">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        . . .
    </employee>
</document>
The NOTATION Attribute Type
The last legal attribute type is NOTATION. You can assign to a NOTATION attribute only values that you have declared as notations. Notations specify the format of non-XML data, and they're typically used to describe the storage format of external entities such as image files. For example, one popular kind of notation is the Multipurpose Internet Mail Extension (MIME) type, such as application/xml, text/html, image/jpeg, and so forth, which is often used to specify data storage formats.
When you want to declare a notation, you use a <!NOTATION> declaration in a DTD like this:
<!NOTATION name SYSTEM "external_id">
Here, name is the name of the notation and external_id is the identification you want to use for the notation, such as a MIME type.
You can also use the PUBLIC keyword for public notations if you supply a formal public identifier (FPI; see Day 4, "Creating Valid XML Documents: DTDs," for the rules on constructing FPIs), like this:
<!NOTATION name PUBLIC "FPI" "external_id">
The following example declares three standard notations—jpg, gif, and text, which stand for the MIME types image/jpeg, image/gif, and text/plain:
<?xml version = "1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!NOTATION jpg SYSTEM "image/jpeg">
<!NOTATION gif SYSTEM "image/gif">
<!NOTATION text SYSTEM "text/plain">
. . .
Now you can create an attribute named, say, imagetype, of type NOTATION. You can then assign either the gif or the jpg notation to imagetype:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!NOTATION jpg SYSTEM "image/jpeg">
<!NOTATION gif SYSTEM "image/gif">
<!NOTATION text SYSTEM "text/plain">
<!ATTLIST employee
    photo NMTOKEN #IMPLIED
    imagetype NOTATION (jpg | gif) #IMPLIED>
]>
. . .
Now that you have declared a new attribute, imagetype, of the NOTATION type, you can put this attribute to work, like this:
<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (employee)*>
<!ELEMENT employee (name, hiredate, projects)>
<!ELEMENT name (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT hiredate (#PCDATA)>
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!NOTATION jpg SYSTEM "image/jpeg">
<!NOTATION gif SYSTEM "image/gif">
<!NOTATION text SYSTEM "text/plain">
<!ATTLIST employee
    photo NMTOKEN #IMPLIED
    imagetype NOTATION (jpg | gif) #IMPLIED>
]>
<document>
    <employee photo="1221.gif" imagetype="gif">
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
</document>