The EXPLICIT Mode of FOR XML
- The EXPLICIT Mode of FOR XML
- More than One Child Tag
- Choosing the Columns for the ORDER BY Clause
- Parent Tag Not Open Error
- Summary
My previous article gave you a quick introduction to XML support in SQL Server 2000, and examined the details of AUTO and RAW modes of FOR XML clause. Recall from that article that neither of those modes offers you full control over your XML output. The EXPLICIT mode, on the other hand, allows you to customize your output according to your needs. At first glance, FOR XML EXPLICIT might seem extremely difficult to learn and master. I won't say that it is trivial, but it's not rocket science, either. This article will teach you how to customize your output using FOR XML EXPLICIT.
The EXPLICIT mode is implemented through UNION ALL queries. If you're not familiar with the UNION ALL clause of the SELECT statement, it simply combines the results of two or more queries. Each query combined, with the UNION ALL clause, has to contain the same number of columns. The corresponding columns in each query need to have compatible data types. In other words, you cannot UNION an integer and a string (unless you explicitly convert one of them first). For instance, I could combine the names of customer contacts and employee names in the Northwind database with the following query:
SELECT ContactName FROM customers UNION ALL SELECT FirstName + ' ' + LastName AS FullName FROM Employees
Results (abbreviated):
ContactName ------------------------------- Maria Anders Ana Trujillo Antonio Moreno Thomas Hardy Christina Berglund Hanna Moos Frédérique Citeaux Martín Sommer
Notice that even though I combined two queries with different column names, the output has a heading of the top query (ContactName). In fact, the UNION ALL clause could not care less how many queries are involved: You only get the column names from the top query.
The EXPLICIT mode works similarly; you define your XML hierarchy in the top query and then you grab data for each of the XML nodes from the queries that follow. Keep in mind though, that each query will have to contain the same number of columns with compatible data types.
The XML structure to be returned is stored in what is referred to as the "universal table." The universal table contains information about the XML tag names, as well as how the tags need to be nested. Let's look at a quick example to make things a bit clearer.
The following query returns titles written by the author Green and respective royalty percentages from the PUBS database:
SELECT 1 AS TAG, NULL AS PARENT, authors.au_fname AS [authors!1!au_fname], authors.au_lname AS [authors!1!au_lname], NULL AS [titleauthor!2!royaltyper], NULL AS [titles!3!title] FROM authors WHERE au_lname = 'green' UNION ALL SELECT 2 AS TAG, 1 AS PARENT, au_fname, au_lname, royaltyper, NULL FROM authors INNER JOIN titleauthor ON authors.au_id= titleauthor.au_id WHERE au_lname ='green' UNION ALL SELECT 3 AS TAG, 2 AS PARENT, au_fname, au_lname, royaltyper, title FROM authors INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id INNER JOIN titles ON titles.title_id = titleauthor.title_id WHERE au_lname ='green' ORDER BY [authors!1!au_fname], [authors!1!au_lname], [titleauthor!2!royaltyper] FOR XML EXPLICIT
Results:
<authors au_fname="Marjorie" au_lname="green"> <titleauthor royaltyper="40"> <titles title="The Busy Executive's Database Guide"/> </titleauthor> <titleauthor royaltyper="100"> <titles title="You Can Combat Computer Stress!"/> </titleauthor> </authors>
I know that if you haven't used FOR XML EXPLICIT before, the previous query might be overwhelming. Don't worryI'll dissect this query in great detail.
As I said earlier, you have to specify the XML structure to be returned in the top query. The topmost tag in the XML hierarchy has no parent, so in XML the chicken comes before the egg. That's why every query using the EXPLICIT mode has to start with the following:
SELECT 1 AS TAG, NULL AS PARENT
The rest of the first SELECT statement constructs the XML hierarchy I want to see in the output. I'd like <authors> to be the outermost tag, followed by <titleauthor> and <titles>. The <authors> tag should contain the first and last names of the author. The <titleauthor> tag, which is the child of <authors>, should contain a single attribute of royalty percentages. Finally, <titles> tag is the child of tag number 2<titleauthor>and should contain the title name. Hence, the hierarchy defined in the topmost query looks as follows:
authors.au_fname AS [authors!1!au_fname], authors.au_lname AS [authors!1!au_lname], NULL AS [titleauthor!2!royaltyper], NULL AS [titles!3!title]
Now, things should be starting to make sense. However, what if you want to change the column names or table names in the output? The answer is simple: Just change the portion of the top query that is placed in brackets. So I could rewrite the top SELECT statement of the same query as follows:
SELECT 1 AS TAG, NULL AS PARENT, authors.au_fname AS [Author!1!first_name], authors.au_lname AS [Author!1!last_name], NULL AS [TitleAuthor!2!royalty_percentage], NULL AS [Title!3!title_name]
Then my output would look like following:
<Author first_name="Marjorie" last_name="green"> <TitleAuthor royalty_percentage="40"> <Title title_name="The Busy Executive's Database Guide"/> </TitleAuthor> <TitleAuthor royalty_percentage="100"> <Title title_name="You Can Combat Computer Stress!"/> </TitleAuthor> </Author>
Great! That part was fairly simple, but what's with those NULLs in the query? Recall that each SELECT statement participating in the UNION query needs to contain the same number of columns. Could you join all three tables in each query? Yes, you could, but your performance would suffer. The PUBS database has a handful of records, and the query I just executed returns only a couple of rows. But if you have three tables with thousands of rows in each, joining the three tables for each SELECT would make your query rather slow.
NOTE
At times, you can't help but join at least two tables in the topmost query if you have to limit the results of the top query. However, if you can get away with not mentioning all participating tables in the top SELECT statement, your performance can be much better.
Another part of FOR XML EXPLICIT syntax that might catch your attention is the ORDER BY clause. You should be aware that the ORDER BY clause is used to sort the result set. With the EXPLICIT mode, the ORDER BY clause serves the same purpose, except it sorts the XML hierarchy instead of the result set. Let's see what happens if I remove the ORDER BY clause:
<authors au_fname="Marjorie" au_lname="green"> <titleauthor royaltyper="40"/> <titleauthor royaltyper="100"> <titles title="The Busy Executive's Database Guide"/> <titles title="You Can Combat Computer Stress!"/> </titleauthor> </authors>
Well, the world hasn't crashed, but now you have to wonder which title is earning the author Green 40% of royalties and which one earns 100%. If you only have a couple of nodes in your XML hierarchy you might be able to get away without the ORDER BY clause. However, as a rule of thumb, be sure to include ORDER BY in all queries using the EXPLICIT mode.
XML Explicit Tips and Tricks
Now that you're familiar with the basics of the Explicit mode it's time to learn a few tips that will save you much time when implementing your own solutions.