- Sams Teach Yourself XML in 21 Days, Third Edition
Using Java to Read XML Data
You're going to create a Java application to read in and extract all the data from the XML document you worked with yesterday, which you can see in Listing 16.1.
Example 16.1. A Sample XML Document for Today's Work (ch16_01.xml)
<?xml version="1.0" encoding="UTF-8"?>
<session>
    <committee type="monetary">
        <title>Finance</title>
        <number>17</number>
        <subject>Donut Costs</subject>
        <date>7/15/2005</date>
        <attendees>
            <senator status="present">
                <firstName>Thomas</firstName>
                <lastName>Smith</lastName>
            </senator>
            <senator status="absent">
                <firstName>Frank</firstName>
                <lastName>McCoy</lastName>
            </senator>
            <senator status="present">
                <firstName>Jay</firstName>
                <lastName>Jones</lastName>
            </senator>
        </attendees>
    </committee>
</session>
The Java application you'll create will show how to read in the entire XML document and display it, much as we did in JavaScript yesterday. You start by creating a DocumentBuilderFactory object, which you'll use to create a DocumentBuilder object, and that object will actually read in the XML document. Here's how to use DocumentBuilderFactory in the new application's main method, which is called when the application starts:
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class ch16_02
{
    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            .
            .
            .
    }
Table 16.1 lists the methods for the DocumentBuilderFactory class.
Table 16.1. Methods of the javax.xml.parsers.DocumentBuilderFactory Class
| Method | What It Does |
| --- | --- |
| protected DocumentBuilderFactory() | Acts as the default DocumentBuilderFactory constructor. |
| abstract Object getAttribute(String name) | Returns the value of the named attribute on the underlying parser implementation. |
| boolean isCoalescing() | Returns true if the factory is configured to produce parsers that convert CDATA nodes to text nodes. |
| boolean isExpandEntityReferences() | Returns true if the factory is configured to produce parsers that expand XML entity reference nodes. |
| boolean isIgnoringComments() | Returns true if the factory is configured to produce parsers that ignore comments. |
| boolean isIgnoringElementContentWhitespace() | Returns true if the factory is configured to produce parsers that ignore ignorable whitespace (such as that used to indent elements) in element content. |
| boolean isNamespaceAware() | Returns true if the factory is configured to produce parsers that can use XML namespaces. |
| boolean isValidating() | Returns true if the factory is configured to produce parsers that validate the XML content during parsing operations. |
| abstract DocumentBuilder newDocumentBuilder() | Returns a new DocumentBuilder object. |
| static DocumentBuilderFactory newInstance() | Returns a new DocumentBuilderFactory object. |
| abstract void setAttribute(String name, Object value) | Sets the value of the named attribute on the underlying parser implementation. |
| void setCoalescing(boolean coalescing) | Specifies that the parser produced will convert CDATA nodes to text nodes. |
| void setExpandEntityReferences(boolean expandEntityRef) | Specifies that the parser produced will expand XML entity reference nodes. |
| void setIgnoringComments(boolean ignoreComments) | Specifies that the parser produced will ignore comments. |
| void setIgnoringElementContentWhitespace(boolean whitespace) | Specifies that the parser produced must eliminate ignorable whitespace. |
| void setNamespaceAware(boolean awareness) | Specifies that the parser produced will provide support for XML namespaces. |
| void setValidating(boolean validating) | Specifies that the parser produced will validate documents as they are parsed. |
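To see how the setter and "is" methods in Table 16.1 pair up, here is a minimal sketch. The class name FactoryConfigSketch and its two helper methods are made up for this illustration and are not part of the chapter's application; the factory calls themselves are the ones listed in the table.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical class, used only to illustrate Table 16.1.
class FactoryConfigSketch {

    // Creates a factory and applies a few of the setters from Table 16.1.
    static DocumentBuilderFactory configuredFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);    // parsers will support namespaces
        factory.setIgnoringComments(true);  // comments are left out of the tree
        factory.setCoalescing(true);        // CDATA sections become text nodes
        return factory;
    }

    // Reads the settings back through the matching "is" methods.
    static String describe(DocumentBuilderFactory factory) {
        return "namespaceAware=" + factory.isNamespaceAware()
            + " ignoringComments=" + factory.isIgnoringComments()
            + " coalescing=" + factory.isCoalescing()
            + " validating=" + factory.isValidating();
    }

    public static void main(String[] args) {
        System.out.println(describe(configuredFactory()));
    }
}
```

The settings apply to every DocumentBuilder object the factory creates afterward, so you would normally configure the factory before calling newDocumentBuilder.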
To actually parse the XML document and extract data from it, we need a DocumentBuilder object, which is created by the DocumentBuilderFactory object. Here's what that looks like in code:
public class ch16_02
{
    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();

            DocumentBuilder builder = null;
            try {
                builder = factory.newDocumentBuilder();
            }
            catch (ParserConfigurationException e) {}
            .
            .
            .
    }
Table 16.2 lists the methods of the DocumentBuilder class.
Table 16.2. Methods of the javax.xml.parsers.DocumentBuilder Class
| Method | What It Does |
| --- | --- |
| protected DocumentBuilder() | Acts as the default DocumentBuilder constructor. |
| abstract boolean isNamespaceAware() | Returns true if this parser is configured to understand namespaces. |
| abstract boolean isValidating() | Returns true if this parser is configured to validate XML documents. |
| abstract Document newDocument() | Returns a new instance of a DOM Document object to build a DOM tree with. |
| Document parse(File f) | Parses the content of the given file as an XML document and returns a new DOM Document object. |
| Document parse(InputStream is) | Parses the content of the given InputStream object as an XML document and returns a new DOM Document object. |
| Document parse(InputStream is, String systemId) | Parses the content of the given InputStream object as an XML document, using systemId as the base for resolving relative URIs, and returns a new DOM Document object. |
| Document parse(String uri) | Parses the content of the given URI as an XML document and returns a new DOM Document object. |
| abstract void setErrorHandler(ErrorHandler eh) | Sets the ErrorHandler object to be used to report errors. |
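The parse(InputStream is) method in Table 16.2 is handy when the XML is already in memory rather than in a file. The following is a small sketch under that assumption; the class name ParseStreamSketch and the helper rootName are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical class, used only to illustrate parse(InputStream).
class ParseStreamSketch {

    // Parses XML text held in a string and returns the root element's name.
    static String rootName(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return document.getDocumentElement().getNodeName();
        } catch (Exception e) {
            // Parser configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName("<session><committee type=\"monetary\"/></session>"));
    }
}
```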
When the user starts the Java application ch16_02.class, he or she will type the name of the XML document to read, as we do in the following example, where we want to read and display the data in ch16_01.xml:
%java ch16_02 ch16_01.xml
You can access the name of the XML document the user wants to read as args[0]. Here's how to create a Java Document object that corresponds to the XML document, using the DocumentBuilder object:
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class ch16_02
{
    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();

            DocumentBuilder builder = null;
            try {
                builder = factory.newDocumentBuilder();
            }
            catch (ParserConfigurationException e) {}

            Document document = null;
            document = builder.parse(args[0]);
            .
            .
            .
    }
At this point, then, we have a Java Document object (actually an org.w3c.dom.Document object) that corresponds to the XML document, and we can use the various methods of that object to work with the XML document. Table 16.3 lists the methods of Document objects.
Table 16.3. Methods of the org.w3c.dom.Document Interface
| Method | What It Does |
| --- | --- |
| Attr createAttribute(String name) | Returns a new attribute object. |
| Attr createAttributeNS(String namespaceURI, String qualifiedName) | Returns a new attribute that has the given name and namespace. |
| CDATASection createCDATASection(String data) | Returns a new CDATASection node whose value is the given string. |
| Comment createComment(String data) | Returns a new Comment node created using the given string. |
| DocumentFragment createDocumentFragment() | Returns a new empty DocumentFragment object. |
| Element createElement(String tagName) | Returns a new element of the type given. |
| Element createElementNS(String namespaceURI, String qualifiedName) | Returns a new element of the given qualified name and namespace URI. |
| ProcessingInstruction createProcessingInstruction(String target, String data) | Returns a new ProcessingInstruction node. |
| Text createTextNode(String data) | Returns a new text node, given the specified string. |
| DocumentType getDoctype() | Returns the DTD for this document. |
| Element getDocumentElement() | Returns the document element (the root element of the document). |
| Element getElementById(String elementId) | Returns the element whose ID is given. |
| NodeList getElementsByTagName(String tagname) | Returns all elements with a given tag name. |
| NodeList getElementsByTagNameNS(String namespaceURI, String localName) | Returns all elements with a given name and namespace. |
| Node importNode(Node importedNode, boolean deep) | Imports a node from another document. |
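The "create" methods in Table 16.3 can also be tried out without parsing anything. The following sketch builds a tiny document in memory and then reads it back; the class name BuildDocumentSketch is hypothetical, and the element names are borrowed from ch16_01.xml only for familiarity.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class, used only to illustrate Table 16.3.
class BuildDocumentSketch {

    // Builds <session><committee type="monetary"><title>Finance</title></committee></session>
    // and returns a one-line summary of what was built.
    static String summary() {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser is available", e);
        }

        Element session = document.createElement("session");
        document.appendChild(session);              // becomes the document element

        Element committee = document.createElement("committee");
        committee.setAttribute("type", "monetary");
        session.appendChild(committee);

        Element title = document.createElement("title");
        title.appendChild(document.createTextNode("Finance"));
        committee.appendChild(title);

        return document.getDocumentElement().getNodeName()
            + "/" + committee.getNodeName()
            + "[type=" + committee.getAttribute("type") + "]"
            + "/" + title.getNodeName()
            + "=" + title.getFirstChild().getNodeValue()
            + " titles=" + document.getElementsByTagName("title").getLength();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```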
The next step is to work through the XML document recursively, as you did with JavaScript yesterday. You'll do that in a method named childLoop that you can call recursively. Just as you did with JavaScript, you'll also pass an indentation string to this method, which will be increased for each successive generation of a node's children. This method will fill an array of strings, displayText, with the XML data from the document and store the total number of strings in the array in a variable named numberLines. When childLoop is done filling the array of strings, you'll display them, like this:
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class ch16_02
{
    static String displayText[] = new String[1000];
    static int numberLines = 0;

    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();

            DocumentBuilder builder = null;
            try {
                builder = factory.newDocumentBuilder();
            }
            catch (ParserConfigurationException e) {}

            Document document = null;
            document = builder.parse(args[0]);

            childLoop(document, "");

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }

        for(int loopIndex = 0; loopIndex < numberLines; loopIndex++){
            System.out.println(displayText[loopIndex]);
        }
    }
The next order of business is to write childLoop, the method that will loop over all nodes in the XML document and store their data in the displayText array.
Looping Over Nodes
As in JavaScript, you're using the W3C DOM in Java today, so you're treating the XML document as a tree of nodes. Table 16.4 lists the fields of Java org.w3c.dom.Node objects, and Table 16.5 lists the methods of Java org.w3c.dom.Node objects.
Table 16.4. The Fields of the org.w3c.dom.Node Object
| Field | Stands For |
| --- | --- |
| static short ATTRIBUTE_NODE | An attribute |
| static short CDATA_SECTION_NODE | A CDATA section |
| static short COMMENT_NODE | A comment |
| static short DOCUMENT_FRAGMENT_NODE | A document fragment |
| static short DOCUMENT_NODE | A document |
| static short DOCUMENT_TYPE_NODE | A DTD |
| static short ELEMENT_NODE | An element |
| static short ENTITY_NODE | An entity |
| static short ENTITY_REFERENCE_NODE | An entity reference |
| static short NOTATION_NODE | A notation |
| static short PROCESSING_INSTRUCTION_NODE | A processing instruction |
| static short TEXT_NODE | A text node |
Table 16.5. Methods of the org.w3c.dom.Node Interface
| Method | What It Does |
| --- | --- |
| Node appendChild(Node newChild) | Appends the given node to the end of the children of the current node. |
| NamedNodeMap getAttributes() | Returns the attributes of an element node. |
| NodeList getChildNodes() | Gets the children of this node. |
| Node getFirstChild() | Gets the first child of this node. |
| Node getLastChild() | Gets the last child of this node. |
| String getLocalName() | Gets the local part of the full name of this node. |
| String getNamespaceURI() | Gets the namespace URI of this node. |
| Node getNextSibling() | Gets the node following this node. |
| String getNodeName() | Gets the name of this node. |
| short getNodeType() | Gets the type of this node. |
| String getNodeValue() | Gets the value of this node. |
| Document getOwnerDocument() | Gets the Document object for this node. |
| Node getParentNode() | Gets the parent of this node. |
| String getPrefix() | Gets the namespace prefix of this node. |
| Node getPreviousSibling() | Gets the node preceding the current node. |
| boolean hasAttributes() | Returns true if this node has any attributes. |
| boolean hasChildNodes() | Returns true if this node has any children. |
| Node insertBefore(Node newChild, Node refChild) | Inserts the new node before a reference child node. |
| void normalize() | Puts the text nodes under this node into normal form by merging adjacent text nodes and removing empty ones. |
| Node removeChild(Node oldChild) | Removes a child node and returns it. |
| Node replaceChild(Node newChild, Node oldChild) | Replaces the child node. |
| void setNodeValue(String nodeValue) | Sets the node's value. |
| void setPrefix(String prefix) | Sets the namespace prefix of the node. |
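As a quick illustration of the navigation methods in Table 16.5, the following sketch walks a tree with getFirstChild and getNextSibling and counts the element nodes it meets. The class SiblingWalkSketch and its methods are hypothetical; this is one possible way to traverse the tree, not the method this chapter's application uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical class, used only to illustrate Table 16.5.
class SiblingWalkSketch {

    // Counts this node (if it is an element) plus all element descendants.
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    // Parses XML text and counts every element in the document.
    static int countElementsIn(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return countElements(document);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElementsIn(
            "<attendees><senator><firstName>Thomas</firstName>"
            + "<lastName>Smith</lastName></senator><senator/></attendees>"));
    }
}
```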
The childLoop method has a node and an indentation string passed to it. To handle the current node and store its data in the displayText array, first check whether the current node is valid, and if it is, get its type by using the getNodeType method:
public static void childLoop(Node node, String indentation)
{
    if (node == null) {
        return;
    }

    int type = node.getNodeType();
    .
    .
    .
}
Now that you know the type of the node you've been passed, how do you handle it and store its data in the array of strings that will be printed out? You have to handle different types of nodes in different ways, and in this case, you'll use a Java switch statement to work with different node types, starting with the document node itself.
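Before filling in the real cases, it may help to see the switch idea in isolation. This sketch maps a node's type to a readable name using the constants from Table 16.4; the class NodeTypeSketch and its methods are invented for the example and only cover the node types this chapter handles.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical class, used only to illustrate switching on getNodeType().
class NodeTypeSketch {

    // Translates the numeric node type into a short label.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:               return "document";
            case Node.ELEMENT_NODE:                return "element";
            case Node.TEXT_NODE:                   return "text";
            case Node.CDATA_SECTION_NODE:          return "cdata";
            case Node.PROCESSING_INSTRUCTION_NODE: return "processing-instruction";
            default:                               return "other";
        }
    }

    // Labels the document, its root element, and the root's first child.
    static String firstThree(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node root = document.getDocumentElement();
            return typeName(document) + "," + typeName(root)
                + "," + typeName(root.getFirstChild());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstThree("<title>Finance</title>"));
    }
}
```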
Handling Document Nodes
You can compare the type of the current node to the fields listed in Table 16.4 to determine what kind of node you're dealing with. For example, if the current node is a document node, you'll just store a generic XML declaration in the displayText array and increment the array's index value, numberLines, before calling childLoop on the document element, like this:
public static void childLoop(Node node, String indentation)
{
    if (node == null) {
        return;
    }

    int type = node.getNodeType();

    switch (type) {
        case Node.DOCUMENT_NODE: {
            displayText[numberLines] = indentation;
            displayText[numberLines] += "<?xml version=\"1.0\" encoding=\""+
                "UTF-8" + "\"?>";
            numberLines++;
            childLoop(((Document)node).getDocumentElement(), "");
            break;
        }
        .
        .
        .
Now you've displayed a generic XML declaration for the start of the XML document. Next, you'll handle elements.
Handling Elements
Elements have the type Node.ELEMENT_NODE, and you can get the name of the element by using the W3C DOM method getNodeName. Here's what it looks like in the childLoop method's switch statement, which lets you handle the various node types:
        case Node.ELEMENT_NODE: {
            displayText[numberLines] = indentation;
            displayText[numberLines] += "<";
            displayText[numberLines] += node.getNodeName();
            .
            .
            .
This gives us the name of the current element, but what if it has attributes? You'll check that next.
Handling Attributes
To see whether the element you're working on has any attributes, you can use the getAttributes method, which returns a NamedNodeMap object that contains the element's attributes. If there are any attributes, you'll store them in an array and then use the getNodeName method to get each attribute's name and the getNodeValue method to get its value:
            int length = (node.getAttributes() != null) ?
                node.getAttributes().getLength() : 0;
            Attr attributes[] = new Attr[length];
            for (int loopIndex = 0; loopIndex < length; loopIndex++) {
                attributes[loopIndex] =
                    (Attr)node.getAttributes().item(loopIndex);
            }

            for (int loopIndex = 0; loopIndex < attributes.length;
                loopIndex++) {
                Attr attribute = attributes[loopIndex];
                displayText[numberLines] += " ";
                displayText[numberLines] += attribute.getNodeName();
                displayText[numberLines] += "=\"";
                displayText[numberLines] += attribute.getNodeValue();
                displayText[numberLines] += "\"";
            }
            displayText[numberLines] += ">";

            numberLines++;
Table 16.6 lists the methods of NamedNodeMap.
Table 16.6. NamedNodeMap Methods
| Method | What It Does |
| --- | --- |
| int getLength() | Returns the number of nodes. |
| Node getNamedItem(java.lang.String name) | Gets a node specified by the name. |
| Node getNamedItemNS(java.lang.String namespaceURI, java.lang.String localName) | Gets a node specified by the local name and namespace URI. |
| Node item(int index) | Gets an item in the map by index. |
| Node removeNamedItem(java.lang.String name) | Removes a node. |
| Node removeNamedItemNS(java.lang.String namespaceURI, java.lang.String localName) | Removes the given node with a local name and namespace URI. |
Table 16.7 lists the methods of Attr objects, which hold attributes.
Table 16.7. Attr Interface Methods
| Method | What It Does |
| --- | --- |
| java.lang.String getName() | Gets the name of this attribute. |
| Element getOwnerElement() | Gets this attribute's element node. |
| boolean getSpecified() | Returns true if this attribute was given a value in the original document. |
| java.lang.String getValue() | Gets the value of the attribute. |
| void setValue(String value) | Sets the value of the attribute. |
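When you want one particular attribute rather than all of them, the getNamedItem method of NamedNodeMap is a shorter route than looping by index. Here is a minimal sketch; the class AttrLookupSketch and the helper attributeSummary are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical class, used only to illustrate NamedNodeMap and Attr.
class AttrLookupSketch {

    // Looks up one attribute of the root element by name.
    // Returns null when the root element has no such attribute.
    static String attributeSummary(String xml, String name) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            NamedNodeMap attributes = root.getAttributes();
            Node item = attributes.getNamedItem(name);
            if (item == null) {
                return null;
            }
            Attr attribute = (Attr) item;
            return attribute.getName() + "=" + attribute.getValue()
                + " specified=" + attribute.getSpecified();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attributeSummary("<senator status=\"present\"/>", "status"));
    }
}
```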
Now you've handled the current element's name and attributes. But what if the element has child nodes, such as text nodes or child elements? That's coming up next.
Handling Child Nodes
Elements can have child nodes, so before you finish up with elements, you'll also loop over those child nodes by calling childLoop recursively. The following code gets a NodeList of the child nodes by using the getChildNodes method, increases the indentation level by four spaces, and calls childLoop for each child node:
            NodeList childNodes = node.getChildNodes();
            if (childNodes != null) {
                length = childNodes.getLength();
                indentation += "    ";    // four spaces per level
                for (int loopIndex = 0; loopIndex < length; loopIndex++ ) {
                    childLoop(childNodes.item(loopIndex), indentation);
                }
            }
            break;
        }
The NodeList interface supports an ordered collection of nodes. Table 16.8 lists the methods of the NodeList interface.
Table 16.8. NodeList Methods
| Method | What It Does |
| --- | --- |
| int getLength() | Returns the number of nodes. |
| Node item(int index) | Gets the item at a specified index. |
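One thing worth knowing when you loop over a NodeList is that the whitespace used to indent a document shows up as text nodes between the child elements. The following sketch makes that visible by counting all child nodes and then only the element children; the class ChildListSketch is a hypothetical name, and the sketch assumes a default, nonvalidating parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class, used only to illustrate NodeList.
class ChildListSketch {

    // Reports how many children the root element has and how many are elements.
    static String counts(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            NodeList children = root.getChildNodes();
            int elements = 0;
            for (int index = 0; index < children.getLength(); index++) {
                if (children.item(index).getNodeType() == Node.ELEMENT_NODE) {
                    elements++;
                }
            }
            return children.getLength() + " child nodes, " + elements + " elements";
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        // The line breaks and indentation become three whitespace-only text nodes.
        System.out.println(counts("<a>\n  <b/>\n  <c/>\n</a>"));
    }
}
```

This is why the text-node handling in this chapter trims each text value and skips the ones that turn out to be empty.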
Now that you have handled elements, attributes, and child nodes, you'll take a look at how to work with text nodes.
Handling Text Nodes
Text nodes are of type Node.TEXT_NODE, and after you check to make sure a node is a valid text node, you can trim extra spaces (such as indentation text) from the text node's value and add it to the displayText array, like this:
        case Node.TEXT_NODE: {
            displayText[numberLines] = indentation;
            String trimmedText = node.getNodeValue().trim();
            if(trimmedText.indexOf("\n") < 0 && trimmedText.length() > 0) {
                displayText[numberLines] += trimmedText;
                numberLines++;
            }
            break;
        }
Handling Processing Instructions
Handling processing instructions is not difficult; you just use the getNodeName method to get the processing instruction's target and the getNodeValue method to get the processing instruction's data. Here's how that works in the childLoop method's switch statement:
        case Node.PROCESSING_INSTRUCTION_NODE: {
            displayText[numberLines] = indentation;
            displayText[numberLines] += "<?";
            displayText[numberLines] += node.getNodeName();
            String text = node.getNodeValue();
            if (text != null && text.length() > 0) {
                displayText[numberLines] += " ";    // separate the target from its data
                displayText[numberLines] += text;
            }
            displayText[numberLines] += "?>";
            numberLines++;
            break;
        }
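Note that the document node case shown earlier recurses only into the document element, so a processing instruction that appears before the root element is not visited by childLoop. If you need those instructions too, one option is to loop over the children of the Document object itself. The following is a sketch of that idea; the class PrologPiSketch is a hypothetical name, and it uses the ProcessingInstruction interface's getTarget and getData methods, which return the same strings as getNodeName and getNodeValue for these nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;

// Hypothetical class, used only to illustrate processing instruction nodes.
class PrologPiSketch {

    // Rebuilds the text of every processing instruction that is a direct
    // child of the document (that is, outside the root element).
    static String prologInstructions(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder result = new StringBuilder();
            NodeList children = document.getChildNodes();
            for (int index = 0; index < children.getLength(); index++) {
                Node child = children.item(index);
                if (child.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                    ProcessingInstruction instruction = (ProcessingInstruction) child;
                    result.append("<?").append(instruction.getTarget())
                          .append(' ').append(instruction.getData()).append("?>");
                }
            }
            return result.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prologInstructions(
            "<?xml-stylesheet type=\"text/css\" href=\"style.css\"?><session/>"));
    }
}
```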
Handling CDATA Sections
Handling CDATA sections is just as easy as handling other nodes: You just use the getNodeValue method to get the CDATA section's data. Here's what that looks like in the childLoop method's switch statement:
        case Node.CDATA_SECTION_NODE: {
            displayText[numberLines] = indentation;
            displayText[numberLines] += "<![CDATA[";
            displayText[numberLines] += node.getNodeValue();
            displayText[numberLines] += "]]>";
            numberLines++;
            break;
        }
    }
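Whether you see a CDATA section node at all depends on how the factory was configured. The following sketch parses the same text twice, once with the default setting and once after calling setCoalescing(true), and reports the kind of node the parser produced. The class CdataSketch and its method are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;

// Hypothetical class, used only to illustrate setCoalescing().
class CdataSketch {

    // Returns "cdata" or "text" depending on the type of the root's first child.
    static String firstChildKind(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(coalescing);
            Node first = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement()
                .getFirstChild();
            return (first.getNodeType() == Node.CDATA_SECTION_NODE) ? "cdata" : "text";
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><![CDATA[a < b]]></note>";
        System.out.println(firstChildKind(xml, false));
        System.out.println(firstChildKind(xml, true));
    }
}
```

With coalescing turned on, the CDATA case in the switch statement is simply never reached, and the content arrives through the text node case instead.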
Ending Elements
Our last task is to add a closing tag for element nodes. Up to this point, you've only displayed an opening tag for each element and its attributes, but no closing tag. Here's how to add the closing tag with some code at the end of the childLoop method:
    if (type == Node.ELEMENT_NODE) {
        // Step back the four spaces that were added for this element's children
        displayText[numberLines] = indentation.substring(0,
            indentation.length() - 4);
        displayText[numberLines] += "</";
        displayText[numberLines] += node.getNodeName();
        displayText[numberLines] += ">";
        numberLines++;
        indentation += "    ";
    }
}
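The opening tag, children, closing tag pattern can also be written so that no substring call is needed: pass a longer indentation string down to the children and leave the current one untouched. The following is a simplified sketch of that alternative, not the chapter's listing. The class IndentSketch is a hypothetical name, it collects lines in a List instead of a fixed-size array, and it leaves out attributes, processing instructions, and CDATA sections to stay short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class, used only to illustrate opening and closing tags.
class IndentSketch {

    // Adds one line per opening tag, text value, and closing tag.
    static void walk(Node node, String indentation, List<String> lines) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            lines.add(indentation + "<" + node.getNodeName() + ">");
            NodeList children = node.getChildNodes();
            for (int index = 0; index < children.getLength(); index++) {
                // Children get four more spaces; this call's string is unchanged.
                walk(children.item(index), indentation + "    ", lines);
            }
            lines.add(indentation + "</" + node.getNodeName() + ">");
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                lines.add(indentation + text);
            }
        }
    }

    // Parses XML text and returns the indented lines for its root element.
    static List<String> linesOf(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            List<String> lines = new ArrayList<>();
            walk(root, "", lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        for (String line : linesOf("<a><b>hi</b></a>")) {
            System.out.println(line);
        }
    }
}
```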
That's it; Listing 16.2 shows all the code in ch16_02.java.
Example 16.2. Parsing XML Documents by Using Java (ch16_02.java)
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class ch16_02
{
    static String displayText[] = new String[1000];
    static int numberLines = 0;

    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();

            DocumentBuilder builder = null;
            try {
                builder = factory.newDocumentBuilder();
            }
            catch (ParserConfigurationException e) {}

            Document document = null;
            document = builder.parse(args[0]);

            childLoop(document, "");

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }

        for(int loopIndex = 0; loopIndex < numberLines; loopIndex++){
            System.out.println(displayText[loopIndex]);
        }
    }

    public static void childLoop(Node node, String indentation)
    {
        if (node == null) {
            return;
        }

        int type = node.getNodeType();

        switch (type) {
            case Node.DOCUMENT_NODE: {
                displayText[numberLines] = indentation;
                displayText[numberLines] +=
                    "<?xml version=\"1.0\" encoding=\""+
                    "UTF-8" + "\"?>";
                numberLines++;
                childLoop(((Document)node).getDocumentElement(), "");
                break;
            }

            case Node.ELEMENT_NODE: {
                displayText[numberLines] = indentation;
                displayText[numberLines] += "<";
                displayText[numberLines] += node.getNodeName();

                int length = (node.getAttributes() != null) ?
                    node.getAttributes().getLength() : 0;
                Attr attributes[] = new Attr[length];
                for (int loopIndex = 0; loopIndex < length; loopIndex++) {
                    attributes[loopIndex] =
                        (Attr)node.getAttributes().item(loopIndex);
                }

                for (int loopIndex = 0; loopIndex < attributes.length;
                    loopIndex++) {
                    Attr attribute = attributes[loopIndex];
                    displayText[numberLines] += " ";
                    displayText[numberLines] += attribute.getNodeName();
                    displayText[numberLines] += "=\"";
                    displayText[numberLines] += attribute.getNodeValue();
                    displayText[numberLines] += "\"";
                }
                displayText[numberLines] += ">";

                numberLines++;

                NodeList childNodes = node.getChildNodes();
                if (childNodes != null) {
                    length = childNodes.getLength();
                    indentation += "    ";    // four spaces per level
                    for (int loopIndex = 0; loopIndex < length; loopIndex++ ) {
                        childLoop(childNodes.item(loopIndex), indentation);
                    }
                }
                break;
            }

            case Node.TEXT_NODE: {
                displayText[numberLines] = indentation;
                String trimmedText = node.getNodeValue().trim();
                if(trimmedText.indexOf("\n") < 0 && trimmedText.length() > 0){
                    displayText[numberLines] += trimmedText;
                    numberLines++;
                }
                break;
            }

            case Node.PROCESSING_INSTRUCTION_NODE: {
                displayText[numberLines] = indentation;
                displayText[numberLines] += "<?";
                displayText[numberLines] += node.getNodeName();
                String text = node.getNodeValue();
                if (text != null && text.length() > 0) {
                    displayText[numberLines] += " ";    // separate the target from its data
                    displayText[numberLines] += text;
                }
                displayText[numberLines] += "?>";
                numberLines++;
                break;
            }

            case Node.CDATA_SECTION_NODE: {
                displayText[numberLines] = indentation;
                displayText[numberLines] += "<![CDATA[";
                displayText[numberLines] += node.getNodeValue();
                displayText[numberLines] += "]]>";
                numberLines++;
                break;
            }
        }

        if (type == Node.ELEMENT_NODE) {
            // Step back the four spaces that were added for this element's children
            displayText[numberLines] = indentation.substring(0,
                indentation.length() - 4);
            displayText[numberLines] += "</";
            displayText[numberLines] += node.getNodeName();
            displayText[numberLines] += ">";
            numberLines++;
            indentation += "    ";
        }
    }
}
Now compile ch16_02.java by using javac, the Java compiler:
%javac ch16_02.java
This creates ch16_02.class, which is ready to be run. Use the following to run this .class file to extract all the data from ch16_01.xml:
%java ch16_02 ch16_01.xml
Figure 16.1 shows the results of this example in a Windows MS-DOS window. So far today, you've been able to extract all the data in an XML document, format it, and display it.
Figure 16.1 Parsing an XML document by using Java.