7.6 Why Databases and XML?
Relational databases are good for storing data in a controlled, administered manner. They have built-in support for fast concurrent access and optimized set-based query capabilities. However, the protocol and packet formats are database-specific. XML is an almost universally supported method for passing data around. It is supported by heterogeneous architectures; for example, a big-endian Sun workstation can easily parse an XML document created on a little-endian Intel architecture machine. Given that each XML document must have a single root element, it must be somewhat hierarchical by default.
Some database servers have built-in XML features, and the underlying APIs also have built-in integration features. An ASP application can facilitate the sending and receiving of SQL results over HTTP and the formatting of output as XML, optionally adding stylesheets.
A rectangular resultset can easily be stored in XML format for transmission to any platform. Because an XML document can be a hierarchical representation of a complete graph of data, there must be a method to decompose this data when it is stored into the database. Conversely, to serve an XML document as output, it is often useful to compose information from two or more tables into a hierarchy.
7.6.1 XML as a Distinct Type
Object-relational databases and extensions to relational databases let you use an XML document as a distinct type. To do this, you use an XML DataBlade in Informix 9, or an XML Extender in DB2 6.0 and later.
When you use XML as a distinct type, you store the entire XML document as a CLOB column. Special user-defined functions, schemas, tables, and API extensions allow optimization of the XML type. For example, when a DB2 database is defined as XML Extender-aware, an XML distinct type is added at the database level. This equates to a CLOB. In addition, a series of user-defined functions (UDFs) and stored procedures is added to the database. These database objects take care of the addition and maintenance of the XML user-defined type (UDT) and keep optional tables of information (called sidetables) up-to-date when new XML column instances are added.
7.6.2 Document Composition and Decomposition
XML documents are organic types, meaning that they most closely represent a graph of objects in an ODBMS or a network DBMS. You can decompose the information contained in a document into multiple relational tables. Document decomposition can serve to reduce database round-trips because you can pass in the entire document at once and parse it into multiple relational tables.
Going the other way, when you need to present data as an XML document, composition of multiple tables is required. The easiest approach is to use extensions to SQL that know how to produce an XML hierarchy based on the individual tables in a join. Some databases provide extension functions that enable document decomposition. Special logic in stored procedures can be used to store extra data that is provided in the document but does not correspond to a specific relational table.
Document composition is often combined with services provided by APIs and XSLT stylesheets to enable direct output of XML-based HTTP pages and XML-based input formatting through Web browsers. This strategy is used by XML for SQL Server's Internet Services API (ISAPI) application and Oracle's XTK (XML toolkit).