XML Reference Guide

Mar 14, 2003

␡

⎙ Print

< Back Page 115 of 278 Next >

It is a truism that any significantly large application almost inevitably has a database of some sort behind it, so a large part of working with XML involves working with data that comes from databases. Sometimes this involves complex programming. Sometimes this involves middleware. In the case of Microsoft SQL Server 2000, it sometimes involves carefully crafted SQL SELECT statements.

As you may know, Microsoft has been focusing heavily on XML in its latest products, so it's no surprise that SQL Server 2000 comes "XML ready". In fact, there are several ways to directly access XML data from SQL Server, including SELECT statements, Transact-SQL's OPENXML, and template files. In this reference guide section, I'm going to focus on using SELECT statements with the FOR XML clause.

Before we do anything, I should say a few words about how you can use these SELECT statements. I'm going to use HTTP access as an example, but you can also run these statements directly using SQL Server's Query analyzer. You can also build them into your applicatons.

To get started, make sure that you have the following installed:

Microsoft SQL Server 2000 SP 3
IIS 5.0
SQLXML 3.0

Follow the directions in the SQLXML documentation (look for "Creating the nwind Virtual Directory") to set up the environment for running SQL queries over HTTP. For the purpose of this discussion, I'll assume that you're using the obligatory Northwind database.

Once that's done, open the browser and you're ready.

What you've done is to create a virtual directory that runs SQL queries against the database and returns the data to the browser. For example, point your browser to the following URL:

http://yourserver/nwind?sql=SELECT+CustomerID,+ContactName+FROM+Customers+
FOR+XML+RAW&root=ROOT

(I've split it onto two lines just to fit it on the page. It should be one line in the browser.)

Let's take this one piece at a time. The SQL query we're executing is

SELECT CustomerID, ContactName FROM Customers FOR XML RAW

If we were to execute it in Query Analyzer, we'd get the following XML:

<row CustomerID="ALFKI" ContactName="Maria Anders" /> 
<row CustomerID="ANATR" ContactName="Ana Trujillo" /> 
<row CustomerID="ANTON" ContactName="Antonio Moreno" /> 
<row CustomerID="AROUT" ContactName="Thomas Hardy" />

The trouble with this result is that because it lacks a root element, it's not a well-formed XML document, so the browser can't render it. By adding

root=ROOT

we turn the result into

<ROOT>
   <row CustomerID="ALFKI" ContactName="Maria Anders" /> 
   <row CustomerID="ANATR" ContactName="Ana Trujillo" /> 
   <row CustomerID="ANTON" ContactName="Antonio Moreno" /> 
   <row CustomerID="AROUT" ContactName="Thomas Hardy" /> 
</ROOT>

The choice of ROOT is completely arbitrary. You can use any element name you want.

Now let's look at the query itself. Because I used FOR XML RAW, the data is returned with each row in the result set encoded as a single row element, with each column specified as an attribute. The advantage of this behavior is that it's completely predictable. The disadvantage is that it's not particularly human-friendly when you start to get more than one result set. Also, FOR XML RAW doesn't support multiple-table queries.

Then there's FOR XML AUTO. Let's stay, for example, that we executed the same query, but as

http://yourserver/nwind?sql=SELECT+CustomerID,+ContactName+FROM+Customers+
FOR+XML+AUTO&root=ROOT

At first glance, the result is very similar:

<ROOT> 
   <Customers CustomerID="ALFKI" ContactName="Maria Anders" /> 
   <Customers CustomerID="ANATR" ContactName="Ana Trujillo" /> 
   <Customers CustomerID="ANTON" ContactName="Antonio Moreno" /> 
   <Customers CustomerID="AROUT" ContactName="Thomas Hardy" /> 
</ROOT>

The major obvious difference is that rows are now identified by the table name, which is certainly an improvement. We can also start using multiple table queries, or joins, as in:

http://yourserver/nwind?sql=SELECT+Customers.CustomerID,OrderID,
OrderDate+FROM+Customers,+Orders+WHERE+Customers.CustomerID=Orders.CustomerID+
Order+by+Customers.CustomerID,OrderID+FOR+XML+AUTO&root=Data

This executes the query

SELECT Customers.CustomerID, OrderID, OrderDate 
    FROM Customers, Orders 
    WHERE Customers.CustomerID = Orders.CustomerID 
    ORDER by Customers.CustomerID, OrderID 
    FOR XML AUTO

which gives us something along the lines of

<Data> 
   <Customers CustomerID="ALFKI"> 
      <Orders OrderID="10643" OrderDate="1997-08-25T00:00:00" /> 
   </Customers> 
   <Customers CustomerID="ANATR"> 
      <Orders OrderID="10308" OrderDate="1996-09-18T00:00:00" /> 
      <Orders OrderID="10625" OrderDate="1997-08-08T00:00:00" /> 
   </Customers> 
</Data>

Notice that the data is grouped based on the join condition.

Here we also see the data as attributes, but you can also tell SQL Server to structure it as elements, with a query of

SELECT Customers.CustomerID, OrderID, OrderDate 
    FROM Customers, Orders 
    WHERE Customers.CustomerID = Orders.CustomerID 
    ORDER by Customers.CustomerID, OrderID 
    FOR XML AUTO, ELEMENTS

giving a result such as

<Data> 
   <Customers> 
      <CustomerID>ALFKI</CustomerID> 
      <Orders> 
         <OrderID>10643</OrderID> 
         <OrderDate>1997-08-25T00:00:00</OrderDate> 
      </Orders> 
   </Customers> 
   <Customers> 
      <CustomerID>ANATR</CustomerID> 
      <Orders> 
         <OrderID>10308</OrderID> 
         <OrderDate>1996-09-18T00:00:00</OrderDate> 
      </Orders> 
      <Orders> 
         <OrderID>10625</OrderID> 
         <OrderDate>1997-08-08T00:00:00</OrderDate> 
      </Orders> 
   </Customers>
</Data>

You can also control the names of the various elements (or attributes) using aliases within the SQL statement, as in

http://yourserver/nwind?sql=SELECT+customer.CustomerID+as+custid,
OrderID+as+oid,OrderDate+as+odate+FROM+Customers+as+customer,+
Orders+as+ord+WHERE+customer.CustomerID=ord.CustomerID+
Order+by+customer.CustomerID,OrderID+FOR+XML+AUTO,ELEMENTS&root=customers

which executes the query

SELECT customer.CustomerID as custid, OrderID as oid, OrderDate as odate 
   FROM Customers as customer, Orders as ord 
   WHERE customer.CustomerID = ord.CustomerID 
   ORDER by customer.CustomerID, OrderID 
   FOR XML AUTO,ELEMENTS

The result is something like

<customers> 
   <customer> 
      <custid>ALFKI</custid> 
      <ord> 
         <oid>10643</oid> 
         <odate>1997-08-25T00:00:00</odate> 
      </ord> 
   </customer> 
   <customer> 
      <custid>ANATR</custid> 
         <ord> 
            <oid>10308</oid> 
            <odate>1996-09-18T00:00:00</odate> 
         </ord> 
         <ord> 
            <oid>10625</oid> 
            <odate>1997-08-08T00:00:00</odate> 
         </ord> 
   </customer>
</customers>

This does give us some degree of control, but we're still at the mercy of the "all-or-nothing" nature of FOR XML AUTO. We've got all attributes, or all elements, and nothing in between. Fortunately, we have one more option.

The FOR XML EXPLICIT clause enables us to directly control the structure of the resulting XML. Joining tables requires you to jump through a few hoops (and UNION statements) but consider, for example, the URL:

http://yourserver/nwind?sql=SELECT+1+as+Tag,+NULL+as+Parent,+
CustomerID+as+[customer!1!custid],+Phone+as+[customer!1!phone],+
CompanyName+as+[customer!1!company!element],+
ContactName+as+[customer!1!name!element],+
ContactTitle+as+[customer!1!title!element]+FROM+Customers+
ORDER+by+[customer!1!custid]+FOR+XML+EXPLICIT&root=customers

and its associated query:

SELECT 1 as Tag, NULL as Parent, 
       CustomerID as [customer!1!custid], 
       Phone as [customer!1!phone], 
       CompanyName as [customer!1!company!element], 
       ContactName as [customer!1!name!element], 
       ContactTitle as [customer!1!title!element] 
   FROM Customers 
   ORDER by [customer!1!custid] 
   FOR XML EXPLICIT

This special format enables you to specify the name of the element for which each element is a child, the level of that element, and whether it's an attribute or element. (There's actually much more you can specify, but we'll keep it at that level for this discussion.)

For example, we've specified that level 1 is a Tag, and the CustomerID column should be represented as the custid attribute of that level 1 tag, which is named customer. Similarly, the CompanyName column is a company element underneath that level 1 customer element. The result looks something like this:

<customers> 
   <customer custid="ALFKI" phone="030-0074321"> 
      <company>Alfreds Futterkiste</company> 
      <name>Maria Anders</name> 
      <title>Sales Representative</title> 
   </customer> 
   <customer custid="ANATR" phone="(5) 555-4729"> 
      <company>Ana Trujillo Emparedados y helados</company> 
      <name>Ana Trujillo</name> 
      <title>Owner</title> 
   </customer>
</customers>

This discussion only scratches the surface of what you can do with SQL Server and XML. In future sections, we'll talk about using XSL Transformations, template files, mapping schemas, and even selecting SQL data using XPath.

< Back Page 115 of 278 Next >

🔖 Save To Your Account

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Email Address