Storing XML Data in a Relational Database
- Creating the Database Structure
- Inserting the Data?
- Retrieving the Data?
- Querying the Database?
- Updating the Records?
- Summary?
Unless you work in a highly rigid, well-structured environment, you’ve probably had situations in which you have to store objects with widely different descriptions (properties, attributes, fields... whatever you call them) into the same relational database table.
A very common example is an on-line department store, in which you care about the color of some products, the weight of some other products, and the sizes of a third subset of products (not to mention fabrics, expiration dates, mileage per gallon, number of pages...).
Quite commonly, ad hoc database users would define a single table with a very large number of columns, most of them being NULL most of the time.
Database designers with more experience in database normalization would quickly recognize that you need three tables to represent the same requirements (Listing 1).
- The PRODUCTS table that would hold the basic product information
- The ATTRIBUTES table that would hold the attribute names (weight, color, size, and so on)
- The ATTRIBUTEVALUES table that would link the two and specify the known attributes of a product
Listing 1 Normalized product-attribute tables
CREATE TABLE Products ( ProductID INT IDENTITY(1,1) NOT NULL PRIMARY KEY, ProductName VARCHAR(50) NOT NULL) CREATE TABLE Attributes ( AttributeID INT IDENTITY(1,1) NOT NULL PRIMARY KEY, AttributeName VARCHAR(50) NOT NULL) CREATE TABLE AttributeValues( ProductID INT, AttributeID INT, AttributeValue VARCHAR(500), CONSTRAINT [AttributeValues_Primary] PRIMARY KEY CLUSTERED (ProductID,AttributeID) ) ALTER TABLE AttributeValues ADD CONSTRAINT Values_ProductID FOREIGN KEY(ProductID) REFERENCES Products(ProductID) ON UPDATE CASCADE ON DELETE CASCADE ALTER TABLE AttributeValues ADD CONSTRAINT Attributes_ProductID FOREIGN KEY(AttributeID) REFERENCES Attributes(AttributeID) ON UPDATE CASCADE ON DELETE CASCADE
With the widespread acceptance of XML, we are commonly considering a third solution: representing rarely used object properties in an XML-formatted string that is stored in the same table as the rest of the data.
You might also view the XML data as yet another Binary Large Object (BLOB) that has to be stored in your database. For example, the Frequently Asked Questions database (described in my "Serve SQL Data in XML Format" and "Introduction to Hijax" articles) stores HTML markup in a database field. That’s a perfectly legal requirement that will not hurt you as long as you’re not trying to extract individual values from the XML data in SQL queries.
A more troublesome approach is to misuse the XML format to represent hierarchical data structure within a relational table. Don’t do it; this battle has been fought years ago and the relational model is clearly won.