OPENXML is a built-in Transact-SQL function that can return an XML document as a rowset. In conjunction with sp_xml_preparedocument and sp_xml_removedocument, OPENXML allows you to break down (or shred) nonrelational XML documents into relational pieces that can be inserted into tables.
I suppose we should begin the investigation of how OPENXML works by determining where it's implemented. Does it reside in a separate DLL (SQLXMLn.DLL, perhaps?) or is it implemented completely within the SQL Server executable?
The most expedient way to determine this is to run SQL Server under a debugger, stop it in the middle of an OPENXML call, and inspect the call stack. That would tell us in what module it was implemented. Since we don't know the name of the classes or functions that implement OPENXML, we can't easily set a breakpoint to accomplish this. Instead, we will have to just be quick and/or lucky enough to stop the debugger in the right place if we want to use this approach to find out the module in which OPENXML is implemented. This is really easier said than done. Even with complicated documents, OPENXML returns fairly quickly, so breaking in with a debugger while it's in progress could prove pretty elusive.
Another way to accomplish the same thing would be to force OPENXML to error and have a breakpoint set up in advance to stop in SQL Server's standard error reporting routine. From years of working with the product and seeing my share of access violations and stack dumps, I know that ex_raise is a central error-reporting routine for the server. Not all errors go through ex_raise, but many of them do, so it's worth setting a breakpoint in ex_raise and forcing OPENXML to error to see whether we can get a call stack and ascertain where OPENXML is implemented. Exercise 18.3 will take you through the process of doing exactly that.
Exercise 18.3 Determining Where OPENXML Is Implemented
-
Restart your SQL Server, preferably from a console since we will be attaching to it with WinDbg. This should be a test or development system, and, ideally, you should be its only user.
-
Start Query Analyzer and connect to your SQL Server.
-
Attach to SQL Server using WinDbg. (Press F6 and select sqlservr.exe from the list of running tasks; if you have multiple instances, be sure to select the right one.)
-
Once the WinDbg command prompt appears, set a breakpoint in ex_raise:
bp sqlservr!ex_raise
-
Type g and press Enter so that SQL Server can continue to run.
-
Back in Query Analyzer, run this query:
declare @hDoc int set @hdoc=8675309 -- force a bogus handle select * from openxml(@hdoc,'/',1)
-
Query Analyzer should appear to hang because the breakpoint you set in WinDbg has been hit. Switch back to WinDbg and type kv at the command prompt and press Enter. This will dump the call stack. Your stack should look something like this (I've removed everything but the function names):
sqlservr!ex_raise sqlservr!CXMLDocsList::XMLMapFromHandle+0x3f sqlservr!COpenXMLRange::GetRowset+0x14d sqlservr!CQScanRmtScan::OpenConnection+0x141 sqlservr!CQScanRmtBase::Open+0x18 sqlservr!CQueryScan::Startup+0x10d sqlservr!CStmtQuery::ErsqExecuteQuery+0x26b sqlservr!CStmtSelect::XretExecute+0x229 sqlservr!CMsqlExecContext::ExecuteStmts+0x3b9 sqlservr!CMsqlExecContext::Execute+0x1b6 sqlservr!CSQLSource::Execute+0x357 sqlservr!language_exec+0x3e1 sqlservr!process_commands+0x10e UMS!ProcessWorkRequests+0x272 UMS!ThreadStartRoutine+0x98 (FPO: [EBP 0x00bd6878] [1,0,4]) MSVCRT!_beginthread+0xce KERNEL32!BaseThreadStart+0x52 (FPO: [Non-Fpo])
-
This call stack tells us a couple of things. First, it tells us that OPENXML is implemented directly by the server itself. It resides in sqlservr.exe, SQL Server's executable. Second, it tells us that a class named COpenXMLRange is responsible for producing the rowset that the T-SQL OPENXML function returns.
-
Type q and hit Enter to stop debugging. You will need to restart your SQL Server.
By reviewing this call stack, we can deduce how OPENXML works. It comes into the server via a language or RPC event (our code obviously came into the server as a language event—note the language_exec entry in the call stack) and eventually results in a call to the GetRowset method of the COpenXMLRange class. We can assume that GetRowset accesses the DOM document previously created via the call to sp_xml_preparedocument and turns it into a two-dimensional matrix that can be returned as a rowset, thus finishing up the work of the OPENXML function.
Now that we know the name of the class and method behind OPENXML, we could set a new breakpoint in COpenXMLRange::GetRowset, pass a valid document handle into OPENXML, and step through the disassembly for the method when the breakpoint is hit. However, we've got a pretty good idea of how OPENXML works; there's little to be learned about OPENXML's architecture from stepping through the disassembly at this point.