- Sams Teach Yourself XML in 21 Days, Third Edition
Writing XHTML Documents
As an XML author, there are a few rules you need to know when it comes to writing XHTML documents. The following are the requirements a document must meet to be an XHTML document, according to the W3C:
- The document element must be <html>.
- The XHTML document must validate against one of the W3C XHTML DTDs.
- The document element, <html>, must use the http://www.w3.org/1999/xhtml namespace, using the xmlns attribute.
- The document must have a <!DOCTYPE> declaration, and it must appear before the document element.
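The structural requirements in this list can be checked mechanically. What follows is a minimal sketch in Python that uses the standard library's xml.dom.minidom module; the helper name check_xhtml_basics is hypothetical, and the sketch does not cover the second requirement (validating against a W3C DTD), which needs a validating parser.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_xhtml_basics(source: bytes) -> list:
    """Return a list of problems; an empty list means the basic checks passed.

    Hypothetical helper for illustration only. It does NOT validate
    against a DTD.
    """
    problems = []
    # parseString raises an ExpatError if the document is not well-formed.
    # In well-formed XML a <!DOCTYPE> declaration can only appear before
    # the document element, so its position needs no separate check.
    doc = minidom.parseString(source)
    root = doc.documentElement
    if root.tagName != "html":
        problems.append("document element is not <html>")
    if root.namespaceURI != XHTML_NS:
        problems.append("document element is not in the XHTML namespace")
    if doc.doctype is None:
        problems.append("no <!DOCTYPE> declaration")
    return problems


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>An XHTML Document</title></head>
<body><h1>Long Live XHTML!</h1></body>
</html>"""

print(check_xhtml_basics(SAMPLE))      # no problems reported
print(check_xhtml_basics(b"<html/>"))  # missing namespace and DOCTYPE
```

The parser used here does not fetch the DTD named in the <!DOCTYPE> declaration, so the check works without network access.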
Here's a list of some of the main things you, as HTML authors, need to watch out for when creating XHTML documents:
- Element and attribute names have to be in lowercase.
- Attribute values must be in quotes.
- Non-empty elements need end tags. While you can sometimes omit end tags for non-empty elements in HTML, you can't in XHTML.
- You cannot use standalone attributes (that is, attributes that are not assigned values) in XHTML. If you have to, you can assign a dummy value to an attribute (for example, noborder = "noborder").
- An empty element needs to be ended with />. The HTML browsers don't have a problem with this ending (as opposed to just >).
- The <a> element may not contain other <a> elements.
- The <button> element may not contain the <input>, <select>, <textarea>, <label>, <button>, <form>, <fieldset>, <iframe>, or <isindex> elements.
- The <form> element may not contain other <form> elements.
- The <label> element may not contain other <label> elements.
- The <pre> element may not contain <img>, <object>, <big>, <small>, <sub>, or <sup> elements.
- Use the id attribute rather than the name attribute to identify elements. In XHTML 1.0, the name attribute of the <a>, <applet>, <form>, <frame>, <iframe>, <img>, and <map> elements has been deprecated. This can be a problem because browsers such as Netscape Navigator support name but not id (in which case the best solution is to use both attributes in the same element, with the same value, even though name is deprecated).
- You must escape sensitive characters. For example, when an attribute value contains an ampersand (&), the ampersand should be given as the entity reference &amp;.
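Several of the rules in this list are really XML well-formedness rules, so an ordinary XML parser will reject markup that breaks them. Here is a small sketch in Python that pairs HTML-style fragments with XHTML-style equivalents; the fragments and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Each pair is (HTML-style markup, XHTML-style equivalent).
PAIRS = [
    # Empty elements must end with />.
    ('<p>one<br>two</p>', '<p>one<br />two</p>'),
    # Attribute values must be in quotes.
    ('<a href=page.html>link</a>', '<a href="page.html">link</a>'),
    # Standalone attributes are not allowed.
    ('<input checked />', '<input checked="checked" />'),
    # Ampersands in attribute values must be escaped.
    ('<a href="a.cgi?x=1&y=2">go</a>', '<a href="a.cgi?x=1&amp;y=2">go</a>'),
    # Non-empty elements need end tags.
    ('<ul><li>one<li>two</ul>', '<ul><li>one</li><li>two</li></ul>'),
]

for html_style, xhtml_style in PAIRS:
    print(is_well_formed(html_style), is_well_formed(xhtml_style))
```

Each line of output should read False True. Note that a well-formedness check does not catch everything in the list: uppercase names such as <P> and the nesting restrictions on <a>, <form>, and so on are well-formed XML, so only validation against an XHTML DTD will flag them.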
Tomorrow we'll talk about a few more requirements (for example, if you use < characters in <script> elements, you should either escape such characters as &lt; or, if the browser can't handle that, place the script in an external file).
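If you generate XHTML from a program, the escaping can be left to a library. As a sketch, Python's standard xml.sax.saxutils module provides escape (for element content) and quoteattr (for attribute values); the script text and URL below are invented examples.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical script body containing characters that XML treats as markup.
script_body = "if (a < b && b > c) { total = a & b; }"

# escape() replaces &, <, and > with entity references.
print(escape(script_body))
# prints: if (a &lt; b &amp;&amp; b &gt; c) { total = a &amp; b; }

# quoteattr() escapes the value and wraps it in quotes for use as an attribute.
print(quoteattr("search.cgi?q=xml&page=2"))
# prints: "search.cgi?q=xml&amp;page=2"
```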
Dissecting the Example
Now let's start taking apart the XHTML document ch11_02.html to see what makes XHTML tick.
You start as you would in any XML document, with an XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
. . .
Next comes the <!DOCTYPE> declaration, which indicates which XHTML DTD you're using. In this case, it's XHTML 1.0 Transitional (which is the closest to the version of HTML in general use):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
. . .
This is a standard <!DOCTYPE> declaration, and it indicates that the document element in the XHTML document is <html>. Remember that there is a different DTD for each version of XHTML, and they're all public DTDs, created by the W3C. The formal public identifier (FPI) for this DTD is "-//W3C//DTD XHTML 1.0 Transitional//EN", which identifies the XHTML 1.0 Transitional DTD. You also list the URI for this DTD, for the benefit of XML processors:
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
These are the <!DOCTYPE> declarations you should use in XHTML 1.0 for the Strict, Transitional, and Frameset DTDs, respectively:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Note that if you're validating XHTML documents against these DTDs, you can download them and store them locally for faster access. For example, if you store these DTDs in a directory named storage, the declarations might look like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "storage/xhtml1-strict.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "storage/xhtml1-transitional.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "storage/xhtml1-frameset.dtd">
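If you maintain many documents, you may prefer to derive the local system identifier from the W3C URI rather than typing it by hand. The following Python sketch assumes the DTD files keep their original names inside the storage directory; the helper name local_system_id is hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def local_system_id(uri: str, directory: str = "storage") -> str:
    """Map a DTD URI to a relative path inside a local directory.

    Hypothetical helper: assumes the local copy keeps the file name
    used in the original URI.
    """
    filename = posixpath.basename(urlparse(uri).path)
    return posixpath.join(directory, filename)


public_id = "-//W3C//DTD XHTML 1.0 Strict//EN"
w3c_uri = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"

# Build the declaration with the local system identifier.
print(f'<!DOCTYPE html PUBLIC "{public_id}" "{local_system_id(w3c_uri)}">')
```

The formal public identifier stays the same; only the system identifier changes to point at the local copy.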
Here's the <!DOCTYPE> declaration for XHTML 1.1 (there's only one XHTML 1.1 DTD, not three, as in XHTML 1.0, because XHTML 1.1 is based on XHTML 1.0 Strict and doesn't have any Transitional or Frameset forms):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
Here's the <!DOCTYPE> declaration for XHTML Basic:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
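To keep these identifiers straight, it can help to hold them in one lookup table. The sketch below, in Python, collects the FPIs and URIs listed so far; the dictionary keys and the function name doctype_for are labels chosen for this example, not official names.

```python
# (formal public identifier, system identifier URI) for each XHTML flavor
# listed in the text. The dictionary keys are informal labels.
XHTML_DOCTYPES = {
    "1.0 Strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "1.0 Transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "1.0 Frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
    "1.1": (
        "-//W3C//DTD XHTML 1.1//EN",
        "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd",
    ),
    "Basic 1.0": (
        "-//W3C//DTD XHTML Basic 1.0//EN",
        "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd",
    ),
}


def doctype_for(flavor: str) -> str:
    """Build the <!DOCTYPE> declaration for one of the flavors above."""
    public_id, system_id = XHTML_DOCTYPES[flavor]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{system_id}">'


print(doctype_for("1.1"))
```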
And here's the <!DOCTYPE> declaration for XHTML 2.0 (note that the XHTML 2.0 DTD hasn't been posted yet, so the W3C lists the URI as to-be-determined, "TBD"):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN" "TBD">
The DTDs at the URIs given by these <!DOCTYPE> declarations are real DTDs and will work in XML processors. If possible, however, you should download them and use them locally. Imagine the bottleneck that would result from a million browsers all trying to download these DTDs at once.
Following the <!DOCTYPE> declaration is the <html> element, which is the document element for all XHTML documents. Note the lowercase here: <html>, not <HTML>. All element names in XHTML are lowercase (the uppercase <!DOCTYPE> is a declaration, not an element). That's the way XHTML works, and if you're used to using uppercase HTML tag names, XHTML tags will take a little adjustment. Here's what the <html> element looks like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
. . .
In this case, you're putting the entire document into the http://www.w3.org/1999/xhtml namespace, which is the official W3C namespace for XHTML documents. The <html> element can also take an xml:lang attribute, to set the language for the document when it's interpreted as XML, and the standard HTML attribute lang, to set the language when the document is treated as HTML (for example, xml:lang="en" lang="en").
The rest of this XHTML example is very much like its HTML counterpart, with the exceptions that all element names are in lowercase and the <BR> element has become the more proper <br/> XHTML element:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>
            An XHTML Document
        </title>
    </head>
    <body>
        <h1>
            Long Live XHTML!
        </h1>
        This is an XHTML document.
        <br/>
        Pretty good, eh?
    </body>
</html>
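Because this document is XML, any XML parser can read it. As a rough sketch, the following Python code loads the example with the standard xml.etree.ElementTree module, lists the element names in document order, and confirms that they are all lowercase; the helper name local_name is an assumption of this example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>An XHTML Document</title>
    </head>
    <body>
        <h1>Long Live XHTML!</h1>
        This is an XHTML document.
        <br/>
        Pretty good, eh?
    </body>
</html>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


root = ET.fromstring(DOCUMENT)

# Every element inherits the default XHTML namespace declared on <html>.
print(root.tag)

names = [local_name(element.tag) for element in root.iter()]
print(names)
print(all(name == name.lower() for name in names))
```

This only confirms that the document is well-formed and uses lowercase names; it does not validate the document against the Transitional DTD.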
That's your first XHTML document. So how about validating it?