Implementation Anomalies
This section records the anomalies observed while using each of the DOM implementations discussed earlier. Each implementation has its own warts and idiosyncrasies. Keep in mind that the implementations comply with the specifications to varying degrees, and any or all of these issues may be addressed in newer releases.
Processing Instructions
<?xml version="1.0" encoding="UTF-8" ?>
Only the Oracle implementation returned anything for this line. It exposed the line as a ProcessingInstruction node with the appropriate contents.
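To check how a given parser treats this line, it is enough to list the direct children of the Document node. The following is a minimal sketch, assuming the standard JAXP API (javax.xml.parsers) and whatever parser the platform supplies by default rather than the implementations tested above. The class name, the <?render?> instruction, and the order element are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TopLevelNodes {

    // Parses an XML string with the platform's default JAXP parser (an assumption).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Counts ProcessingInstruction nodes that are direct children of the document.
    static int countTopLevelPIs(Document doc) {
        int count = 0;
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // The first line is the one quoted in the text; the rest is hypothetical.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
                + "<?render mode=\"draft\"?><order/>";
        Document doc = parse(xml);

        // Show every top-level node so the parser's behavior is visible.
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            System.out.println("type=" + child.getNodeType() + " name=" + child.getNodeName());
        }
        System.out.println("Top-level processing instructions: " + countTopLevelPIs(doc));
    }
}
```

Whether the first line shows up in this listing depends on the parser, which is exactly the difference reported above.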
Unexpected Child Nodes
The IBM implementation returned a number of child nodes under the DOCUMENT_TYPE_NODE object, whereas both the Sun and Oracle implementations returned 0 children for this node. The IBM implementation listed the entities of the XML document as children of this node, along with a number of other nodes that appear to represent the structure of the DTD.
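The difference is easy to observe by asking the DocumentType node for its children directly. This is a sketch under the same assumption as before (the standard JAXP API with the platform's default parser); the class name and the tiny inline DTD are invented for the example, and no particular child count is claimed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

public class DoctypeChildren {

    // Parses the XML and returns its DocumentType node (null if there is no DOCTYPE).
    static DocumentType parseDoctype(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDoctype();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document with a small internal DTD subset.
        String xml = "<!DOCTYPE order [\n"
                + "  <!ELEMENT order (#PCDATA)>\n"
                + "  <!ENTITY co \"Acme\">\n"
                + "]>\n"
                + "<order>&co;</order>";
        DocumentType doctype = parseDoctype(xml);

        System.out.println("Doctype name: " + doctype.getName());
        // The count printed here is the value that differed between implementations.
        System.out.println("Child nodes:  " + doctype.getChildNodes().getLength());
        // Entities are reachable through getEntities() regardless of the child count.
        System.out.println("Entities:     " + doctype.getEntities().getLength());
    }
}
```

Code that needs the declared entities is safer going through getEntities() than walking the children of the DocumentType node.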
Results Using toString
Many Java developers, myself included, use the toString method to examine object contents. Various results were obtained by using this method on different objects in the DOM hierarchy. It is strongly recommended that you not depend on the results of this method because the DOM Core does not specify what it should return. With that said, the following results were observed.
node.getAttributes().toString() returned different results in each implementation.
Sun returned 'discount="wholesale" cur="us"'.
IBM returned '[retail, us]'.
Oracle returned what appears to be the result of calling toString on the underlying attribute objects.
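A portable alternative to toString is to walk the NamedNodeMap and format each attribute yourself, using only methods the DOM Core defines. The sketch below assumes the standard JAXP API for parsing. The helper describeAttributes, the class name, and the price element are hypothetical; the attribute names come from the output shown above. Names are sorted because the DOM does not promise any particular order for a NamedNodeMap.

```java
import java.io.StringReader;
import java.util.StringJoiner;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AttributeDump {

    // Parses the XML and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Formats attributes as name="value" pairs, sorted by name for stable output.
    static String describeAttributes(Element element) {
        NamedNodeMap attrs = element.getAttributes();
        TreeMap<String, String> sorted = new TreeMap<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            sorted.put(attr.getNodeName(), attr.getNodeValue());
        }
        StringJoiner out = new StringJoiner(" ");
        sorted.forEach((name, value) -> out.add(name + "=\"" + value + "\""));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical element carrying the two attributes named in the text.
        Element price = parseRoot("<price discount=\"wholesale\" cur=\"us\">10</price>");
        System.out.println(describeAttributes(price));
        // prints: cur="us" discount="wholesale"
    }
}
```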
CR/LF in XML Document Text
One of the requirements of the DOM is that it reports structurally isomorphic results. That is, two documents are considered identical from a processing perspective when they differ only in formatting that makes no structural difference, such as whitespace outside real content. In plain English, what goes in should come out. However, different implementations handle Carriage Return/Line Feed pairs differently. Specifically, the CR/LF pair between lines in the XML document is discarded by the Sun parser but returned as a text node by the IBM parser. The Oracle implementation returned CR/LF pairs where expected.
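One way to see how a parser treats the line breaks between elements is to count the whitespace-only text nodes it reports. The following sketch assumes the standard JAXP API with the platform's default, non-validating parser; the class name, the helper, and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceNodes {

    // Parses the XML and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Counts direct child text nodes that contain nothing but whitespace.
    static int countWhitespaceTextChildren(Element parent) {
        int count = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document with a CR/LF pair before and after the item element.
        Element order = parseRoot("<order>\r\n  <item/>\r\n</order>");
        // A parser that keeps the line breaks reports one text node on each side of item;
        // a parser that discards them, as the Sun parser did, reports none.
        System.out.println("Whitespace-only text nodes: " + countWhitespaceTextChildren(order));
    }
}
```

Code that walks child nodes by index should skip such text nodes rather than assume they are present or absent.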
Comments
Comments are another area where implementations differed significantly.
Sun: Lost comments
IBM: Shown in appropriate places as comment nodes
Oracle: Shown in appropriate places as comment nodes
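Whether comments survive parsing can be checked by counting Comment nodes in the tree. The sketch below assumes the standard JAXP API, whose DocumentBuilderFactory also lets the caller drop comments on purpose through setIgnoringComments. The class name, the helpers, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CommentCount {

    // Parses the XML, optionally asking the factory to discard comments.
    static Document parse(String xml, boolean ignoreComments) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(ignoreComments);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Recursively counts Comment nodes at or below the given node.
    static int countComments(Node node) {
        int count = node.getNodeType() == Node.COMMENT_NODE ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countComments(child);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document: one comment before the root and one inside it.
        String xml = "<!-- header --><order><!-- inside --><item/></order>";
        System.out.println("Comments kept:    " + countComments(parse(xml, false)));
        System.out.println("Comments ignored: " + countComments(parse(xml, true)));
    }
}
```

Making the choice explicit on the factory avoids depending on a parser's default, which is where the implementations above disagreed.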
Entities
Because the DOM Core allows for validating and non-validating parsers, entities can be expected to be handled slightly differently between implementations. The following results were observed.
Sun: Entities returned but values shown always as null
IBM: Entity class cast exception when casting to Entity
Oracle: Returned as expected
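Given these differences, code that reads entities is better off checking each node's type before casting. The following is a defensive sketch, assuming the standard JAXP API and the platform's default parser; the class name, the entityNames helper, and the co entity are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntityListing {

    // Parses an XML string with the platform's default JAXP parser (an assumption).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the names of declared entities, skipping anything that is not an Entity.
    static List<String> entityNames(Document doc) {
        List<String> names = new ArrayList<>();
        DocumentType doctype = doc.getDoctype();
        if (doctype == null) {
            return names; // no DOCTYPE, so no declared entities
        }
        NamedNodeMap entities = doctype.getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            Node node = entities.item(i);
            // instanceof guards against the class cast failure described in the text.
            if (node instanceof Entity) {
                names.add(node.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document declaring one internal entity and using it once.
        String xml = "<!DOCTYPE order [<!ENTITY co \"Acme\">]><order>&co;</order>";
        Document doc = parse(xml);
        System.out.println("Declared entities: " + entityNames(doc));
        System.out.println("Element text:      " + doc.getDocumentElement().getTextContent());
    }
}
```

Reading the expanded text from the element, rather than from the Entity node itself, also sidesteps parsers that report entity values as null.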
As we can see, the implementations differ. However, as of this writing all of the implementations are in beta, and we can assume that many of these issues will be addressed.