Sams Teach Yourself XML in 21 Days

Sep 16, 2005

Part IV. In Review

In Part IV we took a look at programming with XML, beginning with JavaScript. We saw that you can use JavaScript with the W3C DOM, and we saw that there are various levels of the DOM available.

When you load an XML document, you can use JavaScript properties such as nextSibling and previousSibling to move through the document. It's also common to loop over nodes and search for the data you want. Let's look at an example that illustrates looping over nodes. Say you have the following XML document, which contains data about some of your clients and the programming applications you're writing for them:

<?xml version = "1.0" standalone="yes"?>
<document>
    <client>
        <name>
            <lastname>Kirk</lastname>
            <firstname>James</firstname>
        </name>
        <contractDate>September 5, 2092</contractDate>
        <contracts>
            <contract>
                <app>Comm</app>
                <id>111</id>
                <fee>$111.00</fee>
            </contract>
            <contract>
                <app>Accounting</app>
                <id>222</id>
                <fee>$989.00</fee>
            </contract>
        </contracts>
    </client>
    <client>
        <name>
            <lastname>McCoy</lastname>
            <firstname>Leonard</firstname>
        </name>
        <contractDate>September 7, 2092</contractDate>
        <contracts>
            <contract>
                <app>Stocker</app>
                <id>333</id>
                <fee>$2995.00</fee>
            </contract>
            <contract>
                <app>Dialer</app>
                <id>444</id>
                <fee>$200.00</fee>
            </contract>
        </contracts>
    </client>
    <client>
        <name>
            <lastname>Spock</lastname>
            <firstname>Mr.</firstname>
        </name>
        <contractDate>September 9, 2092</contractDate>
        <contracts>
            <contract>
                <app>WinHook</app>
                <id>555</id>
                <fee>$129.00</fee>
            </contract>
            <contract>
                <app>MouseApp</app>
                <id>666</id>
                <fee>$25.00</fee>
            </contract>
        </contracts>
    </client>
</document>

You can use JavaScript to extract the data you want from documents like this. For example, if you're interested in the last names of your clients, you might want to catch all <lastname> elements. When you catch each element, you could set a Boolean flag to true to indicate that you want to catch the text node that follows, which holds the last name:

if(currentNode.nodeName == "lastname") {
    catchNext = true
}

Then you would loop over all child nodes of the current node, passing the flag along:

if (currentNode.childNodes.length > 0) {
    for (var loopIndex = 0; loopIndex <
        currentNode.childNodes.length; loopIndex++) {
        text += childLoop(currentNode.childNodes(loopIndex), catchNext)
    }
}

If catchNext was true when dealing with a child node, you would know that you were dealing with a text node whose text you need, so you could save that text this way:

if(catchNext) {
    text = currentNode.nodeValue + "<BR>"
    catchNext = false
}

Here's what the whole HTML page, including the needed JavaScript, looks like (in this case, we've named the XML document we're working with projects.xml):

<HTML>
    <HEAD>
        <TITLE>
            Getting the last names
        </TITLE>

        <SCRIPT LANGUAGE="JavaScript">

            function readXMLData()
            {
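                xmlDocumentObject = new ActiveXObject("Microsoft.XMLDOM")
                // Load synchronously so the document is fully parsed
                // before childLoop walks it.
                xmlDocumentObject.async = false
                xmlDocumentObject.load("projects.xml")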

                displayDIV.innerHTML = childLoop(xmlDocumentObject, false)
            }

            function childLoop(currentNode, catchNext)
            {
                var text = ""

                if(catchNext) {
                    text = currentNode.nodeValue + "<BR>"
                    catchNext = false
                }

                if(currentNode.nodeName == "lastname") {
                    catchNext = true
                }

                if (currentNode.childNodes.length > 0) {
                    for (var loopIndex = 0; loopIndex <
                        currentNode.childNodes.length; loopIndex++) {
                        text += childLoop(currentNode.childNodes(loopIndex),
                            catchNext)
                    }
                }
                return text
            }

        </SCRIPT>
    </HEAD>

    <BODY>
        <H1>
            Getting the last names
        </H1>

        <INPUT TYPE="BUTTON" VALUE="Get last names"
            onClick = "readXMLData()">
        <DIV ID="displayDIV"></DIV>
    </BODY>
</HTML>

This example displays the last names of your clients in a Web page, like this:

Kirk
McCoy
Spock

In Part IV we also looked at how to use Java with the XML DOM. Java 1.4 and later include built-in support for XML DOM handling in the javax.xml.parsers and org.w3c.dom packages. You can use a Java DocumentBuilderFactory object to create a DocumentBuilder object, and you can use the DocumentBuilder object's parse method to parse an XML document and create a Java Document object.

The Document object corresponds to the top node of the document tree. You can move from node to node by using methods such as getChildNodes. You can check the type of a node by using the getNodeType method, a node's name by using the getNodeName method, and a node's value by using the getNodeValue method. And you can get an element's attribute nodes by using the getAttributes method.

For instance, here's the JavaScript example we just saw converted to Java. The logic is the same; only the implementation language changes:

import javax.xml.parsers.*;
import org.w3c.dom.*;

public class t
{
    static String displayText[] = new String[1000];
    static int numberLines = 0;

    public static void main(String args[])
    {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();

            // A ParserConfigurationException is reported by the outer
            // catch block rather than being silently swallowed.
            DocumentBuilder builder = factory.newDocumentBuilder();

            // args[0] is the name of the XML file to parse.
            Document document = builder.parse(args[0]);

            childLoop(document, false);

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }

        for(int loopIndex = 0; loopIndex < numberLines; loopIndex++){
            System.out.println(displayText[loopIndex]);
        }
    }

    public static void childLoop(Node node, boolean catchNext)
    {
        if (node == null) {
            return;
        }

        int type = node.getNodeType();

        switch (type) {

            case Node.DOCUMENT_NODE: {
                childLoop(((Document)node).getDocumentElement(), false);
                break;
             }

             case Node.ELEMENT_NODE: {
                 if(node.getNodeName().equals("lastname")) {
                     catchNext = true;
                 }

                 NodeList childNodes = node.getChildNodes();
                 if (childNodes != null) {
                     int length = childNodes.getLength();
                     for (int loopIndex = 0; loopIndex < length;
                         loopIndex++ ) {
                        childLoop(childNodes.item(loopIndex), catchNext);
                     }
                 }
                 break;
             }

             case Node.TEXT_NODE: {
                 if(catchNext){
                     String trimmedText = node.getNodeValue().trim();
                     if(trimmedText.indexOf("\n") < 0 && trimmedText.length()
                         > 0) {
                         displayText[numberLines] = trimmedText;
                         numberLines++;
                     }
                     catchNext = false;
                 }
                 break;
             }
        }
    }
}

This application gives you the same result as the previous example:

Kirk
McCoy
Spock

By using the DOM and Java, you can also search for specific elements by using the getElementsByTagName method or move through an XML document by using methods such as getNextSibling, getPreviousSibling, getFirstChild, getLastChild, and getParentNode. You can even edit the contents of an XML document by using methods such as appendChild, insertBefore, removeChild, and replaceChild.
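
As a rough illustration of searching and editing, here is a minimal sketch that uses getElementsByTagName to collect last names and then removeChild and appendChild to swap one client for another. The class name LastNameSearch, its helper methods, and the shortened inline document are assumptions made for this sketch; they stand in for projects.xml and are not part of the earlier listings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LastNameSearch {
    // Shortened stand-in for projects.xml (assumption: same element names).
    static final String SAMPLE =
        "<document>"
        + "<client><name><lastname>Kirk</lastname>"
        + "<firstname>James</firstname></name></client>"
        + "<client><name><lastname>McCoy</lastname>"
        + "<firstname>Leonard</firstname></name></client>"
        + "</document>";

    // Parse an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Collect the text of every <lastname> element, in document order.
    static List<String> lastNames(Document document) {
        List<String> names = new ArrayList<>();
        NodeList matches = document.getElementsByTagName("lastname");
        for (int i = 0; i < matches.getLength(); i++) {
            names.add(matches.item(i).getTextContent().trim());
        }
        return names;
    }

    // Remove the first <client> and append a new one at the end.
    static void replaceFirstClient(Document document, String lastName) {
        Element root = document.getDocumentElement();
        Node firstClient = document.getElementsByTagName("client").item(0);
        root.removeChild(firstClient);

        Element client = document.createElement("client");
        Element name = document.createElement("name");
        Element last = document.createElement("lastname");
        last.appendChild(document.createTextNode(lastName));
        name.appendChild(last);
        client.appendChild(name);
        root.appendChild(client);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(SAMPLE);
        System.out.println(lastNames(document));   // [Kirk, McCoy]
        replaceFirstClient(document, "Spock");
        System.out.println(lastNames(document));   // [McCoy, Spock]
    }
}
```

Compared with the recursive childLoop approach, getElementsByTagName lets the parser's DOM implementation do the tree walk, so the code only has to deal with the matching elements.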

Besides using the DOM in Java, you can also work with SAX to parse XML documents. A SAX parser is event driven: it parses an XML document and calls your code when it finds the beginning of a document, the start of an element, a text node, and so on.

When you register a handler with a SAX parser and parse a document, the startElement method is called when the beginning of an element is encountered, the characters method is called when a text node is encountered, the processingInstruction method is called when a processing instruction is encountered, the endElement method is called when the end of an element is encountered, and so forth. These SAX methods are called with the data you need from the document you're parsing.
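
The callback sequence described above can be sketched with a small handler that extends DefaultHandler and collects the text of each <lastname> element. The class name SaxLastNames and the inline sample document are assumptions for this sketch, not code from the earlier listings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxLastNames extends DefaultHandler {
    private final List<String> names = new ArrayList<>();
    private StringBuilder current;   // non-null only while inside <lastname>

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        if (qName.equals("lastname")) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one text node in several chunks,
        // so accumulate instead of assuming a single call.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("lastname")) {
            names.add(current.toString().trim());
            current = null;
        }
    }

    // Parse an XML string and return the last names in document order.
    static List<String> lastNames(String xml) throws Exception {
        SaxLastNames handler = new SaxLastNames();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<document>"
            + "<client><name><lastname>Kirk</lastname></name></client>"
            + "<client><name><lastname>McCoy</lastname></name></client>"
            + "</document>";
        System.out.println(lastNames(xml));   // [Kirk, McCoy]
    }
}
```

Because SAX never builds a tree in memory, the handler has to keep its own state (here, the current buffer) between callbacks.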

We ended Part IV with a look at two important XML applications: SOAP and RDF. SOAP enables applications to communicate by working with objects across programming boundaries. A SOAP message is made up of three parts: an envelope that contains the message, an optional header that holds data about the message, and a body that holds the actual message. SOAP messages can also have attachments, and we took a look at an example of a SOAP message that did.
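
To make the envelope, header, and body structure concrete, here is a minimal sketch that parses a small SOAP 1.1 message with a namespace-aware DOM parser and lists its SOAP-level parts. The message content (the app namespace and the requestId, getFee, and id elements) and the class name SoapParts are hypothetical and exist only for illustration; attachments are not covered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SoapParts {
    // Namespace URI of the SOAP 1.1 envelope.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Hypothetical message: the app:* elements are invented for this sketch.
    static final String MESSAGE =
        "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\""
        + " xmlns:app=\"urn:example:contracts\">"
        + "<soap:Header><app:requestId>42</app:requestId></soap:Header>"
        + "<soap:Body><app:getFee><app:id>111</app:id></app:getFee></soap:Body>"
        + "</soap:Envelope>";

    // Return the local names of the envelope and its SOAP-namespace children.
    static List<String> partNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // needed for getLocalName()
        Document document = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        Element envelope = document.getDocumentElement();
        List<String> parts = new ArrayList<>();
        parts.add(envelope.getLocalName());
        for (Node child = envelope.getFirstChild(); child != null;
             child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && SOAP_NS.equals(child.getNamespaceURI())) {
                parts.add(child.getLocalName());
            }
        }
        return parts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(partNames(MESSAGE));   // [Envelope, Header, Body]
    }
}
```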

RDF lets you describe resources. In theory, RDF can be used to describe any resource that you can describe in words. However, it's used mostly to describe Web resources. RDF gives search engines easy and uniform access to information on Web resources. RDF is not yet widely implemented, but it's gaining ground.

There are usually three parts to an RDF statement: the resource itself, which you point to with a URI; a name that identifies the property of the resource you want to describe; and the description itself, which is the value of that property.
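
One way to picture the three parts is as a simple value object. The sketch below is an assumption-laden illustration, not an RDF library: the class RdfStatement and the resource URI are hypothetical, the property URI is the Dublin Core creator property, and the output uses the N-Triples style of writing a statement on one line.

```java
class RdfStatement {
    final String resource;   // URI of the resource being described
    final String property;   // URI naming the property
    final String value;      // the description itself (a plain literal here)

    RdfStatement(String resource, String property, String value) {
        this.resource = resource;
        this.property = property;
        this.value = value;
    }

    // Write the statement in N-Triples style.
    // Simplification: special characters in the value are not escaped.
    String toNTriples() {
        return "<" + resource + "> <" + property + "> \"" + value + "\" .";
    }

    public static void main(String[] args) {
        RdfStatement statement = new RdfStatement(
            "http://www.example.com/projects.xml",        // hypothetical resource
            "http://purl.org/dc/elements/1.1/creator",    // Dublin Core "creator"
            "James Kirk");
        System.out.println(statement.toNTriples());
    }
}
```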

That's it for Part IV. You have a great deal of power when you write programming code to work with XML. Although working with XSLT and CSS to handle XML is fine up to a point, to really get into your data, extract what you want, and process it, you need to write your own code. And now that you have the fundamentals down and have seen examples, it's not all that difficult. In Part V you're going to work with another popular XML topic—using XML and databases.
