Node Interface
The root of the inheritance structure for DOM objects is the Node. All other interfaces are descended from Node and inherit a number of its methods. The Node interface provides three basic areas of functionality that allow the developer to do the following:
Find information about the Node, such as its value, name, and type.
Read, update, and delete information.
Find children, parent, and sibling Node information.
What we consider the tips of an XML document tree are typically of the following node types: TEXT_NODE, CDATA_SECTION_NODE, and COMMENT_NODE. Most of the other nodes are normally found as special purpose children of a document, element, or tip node.
As we saw in Figure 3.4, nodes represent the branches, sub-branches, and leaf nodes of an XML tree. At first glance, these nodes appear to be alike. The key to telling them apart is the getNodeType() method, which returns the underlying node's numeric type. With getNodeType(), we can walk the DOM tree, casting nodes to more specific interfaces where appropriate.
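The type-based walk described above can be sketched as follows. This is a minimal illustration, not code from the DOM specification: the NodeTypeWalker class and its parse, walk, and describe methods are hypothetical names, and it assumes the JAXP parser that ships with the JDK is used to build the tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; only the org.w3c.dom calls belong to the DOM API.
final class NodeTypeWalker {

    // Parses an XML string with the JAXP parser bundled with the JDK (an assumption).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Appends one indented line per node, dispatching on getNodeType().
    static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("DOCUMENT");
                break;
            case Node.ELEMENT_NODE:
                // The type check makes this cast to Element safe.
                Element element = (Element) node;
                out.append("ELEMENT ").append(element.getTagName());
                break;
            case Node.TEXT_NODE:
                out.append("TEXT ").append(node.getNodeValue().trim());
                break;
            case Node.CDATA_SECTION_NODE:
                out.append("CDATA ").append(node.getNodeValue());
                break;
            case Node.COMMENT_NODE:
                out.append("COMMENT ").append(node.getNodeValue());
                break;
            default:
                out.append("OTHER type=").append(node.getNodeType());
        }
        out.append('\n');
        // Recurse into the children, one level deeper.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    // Convenience wrapper: parse the XML and return the printed tree.
    static String describe(String xml) {
        StringBuilder out = new StringBuilder();
        walk(parse(xml), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe("<entry><isbn>0101010123</isbn></entry>"));
    }
}
```

Run against the small input in main, the walk reports a document node, two nested element nodes, and a single text node holding the ISBN.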
If we look closely at any given DOM tree, we see that branches and sub-branches are all of type ELEMENT_NODE, with the various leaf nodes being of type TEXT_NODE, CDATA_SECTION_NODE, COMMENT_NODE, or one of the other terminal node types. Under normal circumstances, every node in the DOM hierarchy returns one of the node types listed in Table 3.1.
Perhaps the most important of the methods in the Node interface are the accessor methods for returning a node's children. The original DOM specification named a number of methods for accessing an object's children, with the final recommendation supporting four important methods (note that the DOM Level II specification provides several more). We will now examine the Node interface in general.
Note
The DOM is specified using CORBA IDL. Unfortunately, both CORBA and XML use the term attribute. In the CORBA case, an attribute is really a member variable within an IDL definition of an interface. In the XML case, an attribute is a property of a data item. For example, the currency of a monetary value is a property of a data item. The section "The DOM Core Defined" will discuss IDL in more detail. In cases where it is not clear from the context, we will state specifically which we are discussing.
In addition, when a CORBA Interface Definition Language (IDL) file is processed by an IDL compiler, many of the attributes result in accessor and mutator (get/set) methods being generated. Although these methods are generated, they are methods nonetheless, and we will treat them the same way as developer-written methods.
The Node interface is the primary interface in the DOM Core. In DOM Level I it contains 11 member variables (CORBA attributes) and 6 methods (although most of the CORBA attributes result in one or two methods being generated). I've broken down the methods into groups based on properties of the object.
Methods that Return Information About a Node
Each of the following descriptions refers to this XML snippet:
<entry>
  <title>Better Living Thru Chemistry</title>
  <author>I. W. Books</author>
  <publisher>Books R Us</publisher>
  <price discount="retail" cur="us">9.95</price>
  <price discount="wholesale" cur="us">7.95</price>
  <isbn>0101010123</isbn>
</entry>
short getNodeType() Returns the underlying type of the node, as one of the constants defined in org.w3c.dom.Node. See Table 3.1 for a complete listing. Calling getNodeType() on <isbn>...</isbn> returns ELEMENT_NODE; that element has one child, of type TEXT_NODE.
String getNodeName() Returns the name of the node. For example, on <isbn>...</isbn> it returns isbn.
String getNodeValue(), void setNodeValue(String) Return or set the value of the node. getNodeValue() returns null if a value is not applicable to the node type. For example, on <author> it returns null, but on the text child of <author> it returns I. W. Books.
NamedNodeMap getAttributes() Returns a NamedNodeMap of the attributes associated with this node, or null if the node is not an Element. We will examine the NamedNodeMap interface shortly. getAttributes() on <price discount="wholesale" cur="us">7.95</price> returns a map of two attributes.
boolean hasChildNodes() Returns true or false depending on whether this node has children.
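As a rough sketch of how these information methods behave on the <entry> snippet, the following example parses the snippet and queries a few nodes. The NodeInfoDemo class and its parseEntry and find helpers are hypothetical names introduced for illustration, the JAXP parser bundled with the JDK is assumed, and the XML is written without whitespace between tags so that no whitespace-only text nodes appear.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the org.w3c.dom calls belong to the DOM API.
final class NodeInfoDemo {

    // The <entry> snippet, with inter-element whitespace removed.
    static final String ENTRY_XML =
            "<entry>"
            + "<title>Better Living Thru Chemistry</title>"
            + "<author>I. W. Books</author>"
            + "<publisher>Books R Us</publisher>"
            + "<price discount=\"retail\" cur=\"us\">9.95</price>"
            + "<price discount=\"wholesale\" cur=\"us\">7.95</price>"
            + "<isbn>0101010123</isbn>"
            + "</entry>";

    // Parses the snippet and returns its root <entry> element.
    static Element parseEntry() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(ENTRY_XML)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the index-th descendant element with the given tag name.
    static Node find(Element root, String tag, int index) {
        return root.getElementsByTagName(tag).item(index);
    }

    public static void main(String[] args) {
        Element entry = parseEntry();

        Node isbn = find(entry, "isbn", 0);
        System.out.println(isbn.getNodeType() == Node.ELEMENT_NODE);
        System.out.println(isbn.getNodeName());

        // An element has no value of its own; its text child carries the data.
        Node author = find(entry, "author", 0);
        System.out.println(author.getNodeValue());
        System.out.println(author.getFirstChild().getNodeValue());

        // The second <price> element carries two attributes.
        NamedNodeMap attrs = find(entry, "price", 1).getAttributes();
        System.out.println(attrs.getLength());
        System.out.println(attrs.getNamedItem("discount").getNodeValue());

        // An element without attributes yields an empty map; a text node yields null.
        System.out.println(find(entry, "title", 0).getAttributes().getLength());
        System.out.println(author.getFirstChild().getAttributes());

        System.out.println(entry.hasChildNodes());
        System.out.println(author.getFirstChild().hasChildNodes());
    }
}
```

The getElementsByTagName() call used by the find helper belongs to the Element interface rather than Node; it is used here only as a convenient way to reach the nodes being inspected.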
Methods that Return Information About the Children of a Node
There are a number of methods that can be used to return information about the children of a Node object. The most commonly used methods are as follows:
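Node getFirstChild() Returns the first child of the node, or null if there is none. This method is useful when we are looking for the actual data of a node. On <entry>...</entry>, it returns the ELEMENT_NODE representing <title>...</title>, assuming the parser discards the whitespace between elements; otherwise the first child is a TEXT_NODE containing that whitespace.
Node getLastChild() Returns the last child of the node, or null if there is none. On <entry>...</entry>, under the same assumption, it returns the ELEMENT_NODE representing <isbn>...</isbn>.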
NodeList getChildNodes() Returns a NodeList of the children of the current node. Depending on the node type, this list may be empty, but the specification requires it to be returned. Readers should also note that this is a "live" list. That is, it is required to change as the underlying object changes. If the XML document changes, for example when a new child is added to a given node, any NodeList that represents that node's children must reflect the change. getChildNodes() on <entry> returns a NodeList of ELEMENT_NODEs representing its 6 children.
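The child accessors and the "live" behavior of NodeList can be sketched as shown below. The ChildNodesDemo class, its helper methods, and the <pages> element are hypothetical and exist only for this illustration; the JAXP parser bundled with the JDK is assumed, and a shortened three-child version of the <entry> snippet is used to keep the example small.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the org.w3c.dom calls belong to the DOM API.
final class ChildNodesDemo {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Joins the names of a node's children, in document order.
    static String childNames(Node parent) {
        NodeList children = parent.getChildNodes();
        StringBuilder names = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            if (i > 0) {
                names.append(',');
            }
            names.append(children.item(i).getNodeName());
        }
        return names.toString();
    }

    // Reads the length of one NodeList object before and after a child is
    // appended to the underlying element.
    static int[] lengthsAroundAppend(Document doc) {
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();
        int before = children.getLength();
        root.appendChild(doc.createElement("pages")); // invented element name
        int after = children.getLength();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<entry><title>T</title><author>A</author><isbn>0101010123</isbn></entry>");
        Element entry = doc.getDocumentElement();
        System.out.println(entry.getFirstChild().getNodeName());
        System.out.println(entry.getLastChild().getNodeName());
        System.out.println(childNames(entry));

        // The same NodeList reports the new child without being fetched again.
        int[] lengths = lengthsAroundAppend(doc);
        System.out.println(lengths[0] + " -> " + lengths[1]);

        // With whitespace between tags and no DTD, the first child is a text node.
        Document spaced = parse("<entry>\n  <title>T</title>\n</entry>");
        Node first = spaced.getDocumentElement().getFirstChild();
        System.out.println(first.getNodeType() == Node.TEXT_NODE);
    }
}
```

The last check is worth keeping in mind when reading the descriptions above: whether getFirstChild() on an element returns a child element or a whitespace text node depends on how the document was written and parsed.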
Methods Related to the Parent or Siblings
Each of the following descriptions refers to this XML snippet. Note that the snippet is well-formed but not technically valid, because no DTD is given.
<catalog>
  <entry>
    <title>Better Living Thru Chemistry</title>
  </entry>
  <entry>
    <title>Conversational French</title>
  </entry>
  <entry>
    <title>Special Edition:Using Java 2 and XML</title>
  </entry>
</catalog>
Node getParentNode() Returns the parent of this node, or null in the case of Document, DocumentFragment, or Attr objects. In addition, nodes that have been created on the fly but not yet inserted into a document return null. The parent of any <entry> node is <catalog>.
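Node getPreviousSibling() Returns the node immediately preceding the current node, or null if the current node is the first child or has no siblings. The previous sibling of the <entry> for Conversational French is the <entry> for Better Living Thru Chemistry.
Node getNextSibling() Returns the node immediately following the current node, or null if the current node is the last child or has no siblings.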
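The parent and sibling accessors can be illustrated against the <catalog> snippet as follows. This is a sketch under stated assumptions: the SiblingDemo class and its parseCatalog and titleOf helpers are hypothetical, the JAXP parser bundled with the JDK is assumed, and the XML is written without whitespace between tags so that each sibling of an <entry> is another <entry> rather than a whitespace text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the org.w3c.dom calls belong to the DOM API.
final class SiblingDemo {

    // The <catalog> snippet, with inter-element whitespace removed.
    static final String CATALOG_XML =
            "<catalog>"
            + "<entry><title>Better Living Thru Chemistry</title></entry>"
            + "<entry><title>Conversational French</title></entry>"
            + "<entry><title>Special Edition:Using Java 2 and XML</title></entry>"
            + "</catalog>";

    static Document parseCatalog() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(CATALOG_XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the title text of an <entry>, or null when there is no entry.
    static String titleOf(Node entry) {
        if (entry == null) {
            return null;
        }
        // entry -> <title> element -> its text node
        return entry.getFirstChild().getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = parseCatalog();
        Node catalog = doc.getDocumentElement();
        Node first = catalog.getFirstChild();
        Node second = catalog.getChildNodes().item(1);

        System.out.println(second.getParentNode().getNodeName());
        System.out.println(titleOf(second.getPreviousSibling()));
        System.out.println(titleOf(second.getNextSibling()));

        // The first entry has no previous sibling, and a Document has no parent.
        System.out.println(first.getPreviousSibling());
        System.out.println(doc.getParentNode());

        // A node created but not yet inserted has no parent either.
        System.out.println(doc.createElement("entry").getParentNode());
    }
}
```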
Methods that Return Information About the Document a Node Is Contained Within
Only a single method returns the DOM document a node is contained within, but it is important because it gives access to the owning Document from any given Node.
Document getOwnerDocument() Returns the Document this Node is contained within, or null if the node is itself a Document.
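A short sketch of getOwnerDocument() follows. The OwnerDocumentDemo class and its newDocument helper are hypothetical names, and the empty document is created through JAXP's DocumentBuilder, which is an assumption about the parser in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; only the org.w3c.dom calls belong to the DOM API.
final class OwnerDocumentDemo {

    // Creates an empty Document through JAXP (an assumption about the parser in use).
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("could not create document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element catalog = doc.createElement("catalog");
        doc.appendChild(catalog);

        // Created from doc, but never inserted into the tree.
        Element detached = doc.createElement("entry");

        System.out.println(doc.getOwnerDocument() == null);
        System.out.println(catalog.getOwnerDocument() == doc);
        System.out.println(detached.getOwnerDocument() == doc);
        System.out.println(detached.getParentNode() == null);
    }
}
```

Note the difference between ownership and position: the detached element already belongs to the document that created it, even though it has no parent yet.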
Methods for Manipulating the Children of a Node
There are a number of methods for manipulating the contents of a node.
Node insertBefore(Node newChild, Node refChild) Inserts newChild before the reference node refChild, or at the end of the child list if refChild is null. Returns the inserted node.
Node replaceChild(Node newChild, Node oldChild) Replaces the child oldChild with newChild. Returns the replaced node.
Node removeChild(Node oldChild) Removes the child oldChild. Returns the removed node.
Node appendChild(Node newChild) Adds a child node to the end of the list of child nodes of the current node. Returns the node appended.
Node cloneNode(boolean deep) Produces a duplicate of the node. The duplicate has no parent. If deep is true, all the children of this node are also duplicated recursively. If deep is false, the node's children are not copied. Note that in both cases all attributes of the node are copied.
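The manipulation methods above can be sketched in sequence as follows. The ChildEditDemo class and its helpers are hypothetical names used only for illustration, and the empty document is created through JAXP's DocumentBuilder, which is an assumption about the parser in use. The example relies on the method names defined by org.w3c.dom.Node: insertBefore, replaceChild, removeChild, appendChild, and cloneNode.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class; only the org.w3c.dom calls belong to the DOM API.
final class ChildEditDemo {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("could not create document", e);
        }
    }

    // Joins the names of a node's children, in document order.
    static String childNames(Node parent) {
        StringBuilder names = new StringBuilder();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(child.getNodeName());
        }
        return names.toString();
    }

    // Applies each manipulation method in turn and records the child list after every step.
    static List<String> editSteps() {
        Document doc = newDocument();
        Element entry = doc.createElement("entry");
        doc.appendChild(entry);
        List<String> steps = new ArrayList<String>();

        Element title = doc.createElement("title");
        Element isbn = doc.createElement("isbn");
        entry.appendChild(title);
        entry.appendChild(isbn);
        steps.add(childNames(entry));

        // Insert <author> in front of <isbn>.
        Element author = doc.createElement("author");
        entry.insertBefore(author, isbn);
        steps.add(childNames(entry));

        // Swap <author> for <publisher>; the old child is returned.
        Node replaced = entry.replaceChild(doc.createElement("publisher"), author);
        steps.add(childNames(entry) + " replaced=" + replaced.getNodeName());

        // Remove <title>; the removed child is returned.
        Node removed = entry.removeChild(title);
        steps.add(childNames(entry) + " removed=" + removed.getNodeName());
        return steps;
    }

    // Builds <price cur="us">9.95</price> and returns a shallow or deep clone of it.
    static Node clonePrice(boolean deep) {
        Document doc = newDocument();
        Element price = doc.createElement("price");
        price.setAttribute("cur", "us");
        price.appendChild(doc.createTextNode("9.95"));
        doc.appendChild(price);
        return price.cloneNode(deep);
    }

    public static void main(String[] args) {
        for (String step : editSteps()) {
            System.out.println(step);
        }
        Node shallow = clonePrice(false);
        Node deep = clonePrice(true);
        System.out.println(shallow.getAttributes().getLength() + " " + shallow.hasChildNodes());
        System.out.println(deep.getAttributes().getLength() + " " + deep.hasChildNodes());
        System.out.println(deep.getParentNode());
    }
}
```

Both clones keep the cur attribute, but only the deep clone keeps the text child, and neither clone has a parent until it is inserted somewhere.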