Additional Interfaces
There are nine additional interfaces that we will not examine in low-level detail but that many applications require. The first four deal with actual content:
CharacterData The CharacterData interface extends Node. No DOM objects of type CharacterData exist; instead, the interface collects the common methods inherited by other DOM interfaces. It contains eight methods (getData, setData, getLength, substringData, appendData, insertData, deleteData, and replaceData) for getting, setting, inserting, appending, and otherwise manipulating character data.
Comment The Comment interface extends CharacterData and represents a comment in an XML document. No additional methods are defined on this interface.
Text The Text interface extends CharacterData. It defines a single method, splitText(int offset), which splits the text node at the given offset into two adjacent text nodes. The two nodes can later be recombined by calling normalize() on the parent element.
CDATASection The CDATASection interface extends Text. It lets a DOM node hold a block of text containing characters that would otherwise be treated as markup, without the need to escape those characters. No additional methods are defined in this interface.
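The content interfaces above can be exercised in a few lines. The following is a minimal sketch, not part of the original listings: it assumes the standard JAXP DocumentBuilderFactory rather than the Oracle parser used later in this section, and the class name CharacterDataSketch and the element name note are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// Hypothetical demo class; assumes a JAXP implementation is available.
class CharacterDataSketch {

    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }

        Element note = doc.createElement("note");
        doc.appendChild(note);
        Text text = doc.createTextNode("Hello World");
        note.appendChild(text);

        // CharacterData methods inherited by Text
        text.insertData(5, ",");        // "Hello, World"
        text.appendData("!");           // "Hello, World!"
        text.replaceData(7, 5, "DOM");  // "Hello, DOM!"
        String edited = text.getData();

        // splitText keeps the first 6 characters in 'text' and
        // inserts the remainder as a new sibling Text node
        text.splitText(6);
        int afterSplit = note.getChildNodes().getLength();

        // normalize() on the parent merges the adjacent Text nodes again
        note.normalize();
        int afterNormalize = note.getChildNodes().getLength();
        String merged = note.getFirstChild().getNodeValue();

        // A CDATASection holds markup characters without escaping
        CDATASection cdata = doc.createCDATASection("<b>not markup</b>");
        note.appendChild(cdata);

        return edited + "|" + afterSplit + "|" + afterNormalize
                + "|" + merged + "|" + cdata.getData();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Comment nodes are created the same way with createComment(String) and inherit the same CharacterData methods.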
The next five interfaces represent other constructs in an XML document.
Entity Extends Node. Represents an entity declared in an XML document. Depending on whether the parser validates, entities may or may not be expanded, so an Entity object may have children (not expanded, as with non-validating parsers) or may have none (expanded). Defines three methods, getPublicId(), getSystemId(), and getNotationName(), for accessing entity information.
Notation Extends Node. Represents a notation declared in the DTD of an XML document. Defines two methods, getPublicId() and getSystemId(), for accessing information about a Notation.
DocumentType Extends Node. The DocumentType interface is used to gather information about the Document itself and has three methods for accessing Entities, Notations, and the name of the DTD itself.
ProcessingInstruction Extends Node. The ProcessingInstruction interface represents processing instruction information. This interface specifies three methods for accessing and manipulating processing instruction information.
DocumentFragment Extends Node. Contains no additional methods beyond the Node interface. The DocumentFragment interface is designed to let developers build portions of an XML document that need not be well formed on their own. A document fragment can represent data at any level in an XML tree. When a fragment is inserted into a Document object, its children are inserted and not the DocumentFragment object itself. For these reasons, DocumentFragments can be thought of as lightweight Documents.
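A short sketch may help make these five interfaces concrete. It is illustrative only: it assumes a JAXP parser instead of the Oracle parser used in Listing 3.7, and the class name OtherInterfacesSketch, the inline sample document, and the sort processing instruction are hypothetical. The entity name PublisherInfo is borrowed from the sample output in Listing 3.8.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Element;
import org.w3c.dom.Entity;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical demo class; assumes a JAXP implementation is available.
class OtherInterfacesSketch {

    // Hypothetical sample document with an internal DTD subset
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE catalog [\n"
        + "  <!ENTITY PublisherInfo \"MCP\">\n"
        + "]>\n"
        + "<?sort order=\"ascending\"?>\n"
        + "<catalog/>";

    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }

        // DocumentType: name of the DTD and its declared entities
        DocumentType docType = doc.getDoctype();
        Entity entity =
            (Entity) docType.getEntities().getNamedItem("PublisherInfo");

        // ProcessingInstruction: find the first one among the document's children
        ProcessingInstruction pi = null;
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                pi = (ProcessingInstruction) n;
                break;
            }
        }

        // DocumentFragment: two sibling elements with no single root
        DocumentFragment fragment = doc.createDocumentFragment();
        fragment.appendChild(doc.createElement("entry"));
        fragment.appendChild(doc.createElement("entry"));

        // Appending the fragment moves its children, not the fragment itself
        Element root = doc.getDocumentElement();
        root.appendChild(fragment);

        return docType.getName() + "|" + entity.getNodeName()
                + "|" + pi.getTarget() + "|" + pi.getData()
                + "|" + root.getChildNodes().getLength()
                + "|" + fragment.hasChildNodes();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

After the append, the catalog element has two entry children and the fragment is empty, which is the behavior described for DocumentFragment above.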
Listing 3.7, DumpXMLOracle.java, uses the Oracle DOM and wraps up all of the DOM Core Level I interfaces into one neat package that displays, in semi human-readable format, any given XML document. The output is shown in Listing 3.8.
Listing 3.7 DumpXMLOracle.java
/*
 * @(#)DumpXMLOracle.java 1.0 99/05/28
 *
 * Copyright (c) 1999 Sams Publishing. All Rights Reserved.
 *
 */
package sams.chp3;

import java.io.*;
import java.net.*;

import oracle.xml.parser.XMLParser;

import org.w3c.dom.*;

public class DumpXMLOracle
{
    // Map the numeric node type to a readable string
    static String mapNodeTypeToString(short nodeType)
    {
        switch(nodeType)
        {
            case org.w3c.dom.Node.ATTRIBUTE_NODE: return "ATTRIBUTE_NODE";
            case org.w3c.dom.Node.CDATA_SECTION_NODE: return "CDATA_SECTION_NODE";
            case org.w3c.dom.Node.COMMENT_NODE: return "COMMENT_NODE";
            case org.w3c.dom.Node.DOCUMENT_FRAGMENT_NODE: return "DOCUMENT_FRAGMENT_NODE";
            case org.w3c.dom.Node.DOCUMENT_NODE: return "DOCUMENT_NODE";
            case org.w3c.dom.Node.DOCUMENT_TYPE_NODE: return "DOCUMENT_TYPE_NODE";
            case org.w3c.dom.Node.ELEMENT_NODE: return "ELEMENT_NODE";
            case org.w3c.dom.Node.ENTITY_NODE: return "ENTITY_NODE";
            case org.w3c.dom.Node.ENTITY_REFERENCE_NODE: return "ENTITY_REFERENCE_NODE";
            case org.w3c.dom.Node.NOTATION_NODE: return "NOTATION_NODE";
            case org.w3c.dom.Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case org.w3c.dom.Node.TEXT_NODE: return "TEXT_NODE";
        }
        return "Unknown";
    }

    // Display attribute information
    static void displayAttributeInfo(String prefix, Node node)
    {
        // only elements have attributes
        if (node.getNodeType() != org.w3c.dom.Node.ELEMENT_NODE) return;

        NamedNodeMap attributes = node.getAttributes();
        if ( null == attributes || attributes.getLength() == 0)
        {
            return;
        }

        System.out.println(prefix +"has " + attributes.getLength() + " attributes");
        System.out.println(prefix + attributes.toString());
        for (int i = 0; i < attributes.getLength(); i++)
        {
            Node attribute = attributes.item(i);
            System.out.print(prefix+"["+i+"] " + attribute.getNodeName());
            System.out.println(" = " + attribute.getNodeValue());
        }
    }

    // Display generalized node properties
    static void displayNodeInfo(String prefix, Node node)
    {
        System.out.println(prefix+ "----------------");
        System.out.println(prefix + "name:"+node.getNodeName());
        System.out.println(prefix + "type:("+node.getNodeType()+ ")" +mapNodeTypeToString(node.getNodeType()));
        System.out.println(prefix + "value:"+ node.getNodeValue());
        displayAttributeInfo(prefix,node);
        // Text nodes never have children, so skip the child summary for them
        if (node.getNodeType() != org.w3c.dom.Node.TEXT_NODE)
        {
            NodeList children = node.getChildNodes();
            System.out.println(prefix + "Children("+ children.getLength()+"):");
            if ( children.getLength() > 0)
            {
                System.out.print(prefix+ " ");
                for (int i = 0; i < children.getLength(); i++)
                {
                    Node child = children.item(i);
                    System.out.print(" ["+i+"] " + child.getNodeName());
                }
                System.out.println();
            }
        }
    }

    // Display Entity Information
    static void displayEntityInfo(String prefix, Entity entity)
    {
        System.out.println(prefix + "Entity information");
        System.out.println(prefix + " public id:"+ entity.getPublicId());
        System.out.println(prefix + " system id:"+ entity.getSystemId());
        System.out.println(prefix + " notation name:"+ entity.getNotationName());
        displayNodeInfo(prefix,entity);

        NodeList children = entity.getChildNodes();
        if ( children.getLength() == 0)
            System.out.println(prefix + " Has 0 children");
        else
            System.out.println(prefix + " Children(" + children.getLength()+ ")");

        for (int i = 0; i < children.getLength(); i++)
        {
            // The children of an Entity are ordinary nodes, not Entity objects,
            // so no cast to Entity is made here.
            Node child = children.item(i);
            System.out.println(" child(" + i + ")");
            displayNodeInfo(" ",child);
        }
    }

    // Display Document information
    static void displayDocumentInfo(Document document)
    {
        DocumentType docTypeInfo = document.getDoctype();
        System.out.println(" ");
        System.out.println("Document Type Information");
        System.out.println("----------------");
        System.out.println("name:"+document.getNodeName());
        System.out.println("type:("+document.getNodeType()+ ")" +mapNodeTypeToString(document.getNodeType()));
        System.out.println("value:"+ document.getNodeValue());

        // getDoctype() returns null when the document has no DOCTYPE declaration
        if ( docTypeInfo == null)
        {
            System.out.println(" (no document type declaration)");
            return;
        }

        System.out.println(" Properties");
        System.out.println(" Name Property:"+docTypeInfo.getName());
        NamedNodeMap entities = docTypeInfo.getEntities();
        if ( entities != null)
            System.out.println(" contains " + entities.getLength() + " entities");
        else
            System.out.println(" contains (null) entities");
        NamedNodeMap notations = docTypeInfo.getNotations();
        if ( notations != null)
            System.out.println(" contains " + notations.getLength() + " notations");
        else
            System.out.println(" contains (null) notations");

        if ( entities != null && entities.getLength() > 0)
        {
            System.out.println(" Entities");
            for (int i = 0; i < entities.getLength(); i++)
            {
                //
                // Note that in the SUN implementation this works as expected
                // The IBM implementation causes a class cast exception.
                try
                {
                    Entity entity = (Entity)entities.item(i);
                    System.out.println(" Entity(" + i + ")");
                    displayEntityInfo(" ",entity);
                }
                catch (Exception e) { System.out.println("exception! " + e); }
            }
        }

        if ( notations != null && notations.getLength() > 0)
        {
            System.out.println(" Notations");
            for (int i = 0; i < notations.getLength(); i++)
            {
                Node node = notations.item(i);
                System.out.println(" Notation(" + i + ")");
                displayNodeInfo(" ",node);
            }
        }
    }

    // Process all the children of a node, depth first.
    public static void displayChildren(String prefix, Node parent)
    {
        NodeList children = parent.getChildNodes();
        if ( children == null || children.getLength() == 0) return;
        for (int i = 0; i < children.getLength(); i++)
        {
            try
            {
                Node node = children.item(i);
                displayNodeInfo(prefix,node);
                displayChildren(prefix + " ",node);
            }
            catch (Exception e)
            {
                // Report the problem and continue with the next sibling
                System.err.println("exception while displaying child " + i + ": " + e);
            }
        }
    }

    public static void main (String argv [])
    {
        if (argv.length != 1)
        {
            System.err.println( "Usage: java sams.chp3.DumpXMLOracle filename");
            System.exit(1);
        }

        try
        {
            XMLParser parser = new XMLParser();
            FileInputStream inStream = new FileInputStream(argv[0]);
            parser.setErrorStream(System.err);
            parser.setValidationMode(true);
            parser.showWarnings(true);

            parser.parse(inStream);

            Document document = parser.getDocument();
            System.out.println("Sucessfully created document on " + argv[0]);

            //
            // Print relevant info about the document type
            //
            displayDocumentInfo(document);
            displayNodeInfo("",document);

            //
            // Now walk the document itself displaying data
            //
            displayChildren(" ",document);
        }
        catch (Exception e)
        {
            System.err.println("Unexpected exception reading document! " + e);
            System.exit(1);
        }
    }
}
Listing 3.8 Abbreviated Output of DumpXMLOracle.java
C:\>java sams.chp3.DumpXMLOracle catalog.xml
Sucessfully created document on catalog.xml

Document Type Information
----------------
name:#document
type:(9)DOCUMENT_NODE
value:null
Properties
Name Property:catalog
contains 2 entities
contains (null) notations
Entities
Entity(0)
Entity information
public id:null
system id:null
notation name:null
----------------
name:PublisherInfo
type:(6)ENTITY_NODE
value:MCP
Children(0):
Has 0 children
Entity(1)
Entity information
public id:null
system id:null
notation name:null
----------------
name:AuthorName
type:(6)ENTITY_NODE
value:Albert J. Saganich Jr
Children(0):
Has 0 children
----------------
name:#document
type:(9)DOCUMENT_NODE
value:null
Children(4):
[0] xml [1] #comment [2] catalog [3] catalog
----------------
name:xml
type:(7)PROCESSING_INSTRUCTION_NODE
value: version = '1.0'encoding = 'UTF-8'
Children(0):
----------------
name:#comment
type:(8)COMMENT_NODE
value:

A Simple catalog of books a bookstore might carry

Al Saganich for Macmillan Computer Publishing

Children(0):
----------------
name:catalog
type:(10)DOCUMENT_TYPE_NODE
value:null
Children(0):
----------------
name:catalog
type:(1)ELEMENT_NODE
value:null
Children(7):
[0] #comment [1] catheader [2] entry [3] entry [4] entry [5] entry [6] cattrailer
----------------
name:#comment
type:(8)COMMENT_NODE
value:

This is a comment.

It follows after <catalog> entry

Children(0):
----------------
name:catheader
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:This is the catalog header only one instance of this guy
----------------
. . .
name:entry
type:(1)ELEMENT_NODE
value:null
Children(7):
[0] title [1] author [2] author [3] publisher [4] price [5] price [6] isbn
----------------
name:title
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:Special Edition:Using XML and Java 2.0
----------------
name:author
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:Al Saganich
----------------
name:author
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:Mike Daconta
----------------
name:publisher
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:Sams Publishing
----------------
name:price
type:(1)ELEMENT_NODE
value:null
has 2 attributes
[oracle.xml.parser.XMLAttr@29be915b, oracle.xml.parser.XMLAttr@2b06915b]
[0] discount = retail
[1] cur = us
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:9.95
----------------
name:price
type:(1)ELEMENT_NODE
value:null
has 2 attributes
[oracle.xml.parser.XMLAttr@28ae915b, oracle.xml.parser.XMLAttr@2872915b]
[0] discount = wholesale
[1] cur = us
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:7.95
----------------
name:isbn
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:0101010124
----------------
. . .
name:entry
type:(1)ELEMENT_NODE
value:null
Children(6):
[0] title [1] author [2] publisher [3] price [4] price [5] isbn
. . .
----------------
name:cattrailer
type:(1)ELEMENT_NODE
value:null
Children(1):
[0] #text
----------------
name:#text
type:(3)TEXT_NODE
value:This is the catalog trailer only one instance of this guy as well
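The recursive walk at the heart of Listing 3.7 depends only on the org.w3c.dom interfaces, so it can be tried without the Oracle parser. The following is a minimal sketch under that assumption: it swaps in the standard JAXP DocumentBuilderFactory to obtain the Document, and the class name DumpXmlSketch and the inline sample XML are hypothetical. Other parsers may build slightly different trees for the same input, as the output above suggests.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; assumes a JAXP implementation is available.
class DumpXmlSketch {

    // Append one line per node (name and numeric type), indenting by depth
    static void dump(Node node, String prefix, StringBuilder out) {
        out.append(prefix).append(node.getNodeName())
           .append(" (").append(node.getNodeType()).append(")\n");
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(children.item(i), prefix + "  ", out);
        }
    }

    // Parse an XML string and return the dump of the whole tree
    static String dumpString(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            dump(doc, "", out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(dumpString("<catalog><entry>text</entry></catalog>"));
    }
}
```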