Generating XML Documents from Databases
One of the most popular uses of XML is as a data storage facility. Because XML is based on a hierarchical data format, it's suitable to store almost any information, and it has the capability to mimic some of the capabilities found in relational database management systems (RDBMSs). However, since XML is stored in a text format, data cannot be retrieved as quickly or stored as efficiently when compared to the binary formats used by modern RDBMSs.
XML is often used to transfer data between disparate systems. A large number of systems today exchange data stored in XML even though they were never originally designed to do so. An example of such a system is an e-commerce business. Sometimes, orders may be collected by one company and the merchandise is shipped by another company. In this type of business arrangement, the companies could use XML to exchange order information. Usually, the order data is stored in an underlying database, so the key for this type of transaction is to convert order data stored in a database into an XML document.
This article discusses how you can use Perl to produce XML documents using a database as your data source. We'll focus on the XML::Generator::DBI.
NOTE
You'll need to install the following Perl modules to test the code examples in this article:
XML::Generator::DBI
DBI
DBD::CSV
XML::Generator::DBI Perl Module
One of the most frequently performed tasks while integrating XML and RDBMS formats is transforming the query results returned by the database engine into XML data (either in memory, or written out in the form of an XML document). The Perl DBI is a database independent interface for querying and inserting data into a large number of databases. It enables you to run SQL queries to extract data from a DBMS and returns this data in a Perl data structure. You can then easily convert the results of an SQL query into XML data by using the Perl XML::DBI::Generator module that was written by Matt Sergeant.
This module enables you to generate Simple API for XML (SAX) events that are created as the result of an SQL query to a RDBMS. Basically, this facilitates the transformation of the results of an SQL RDBMS query directly to XML.
The XML output can easily be customized by setting attributes when you instantiate an XML::DBI::Generator object. After the desired attributes are set, you can make the SQL query, and using a handler module (for example, XML::Handler::YAWriter), you can generate the SQL query results in XML. The module's wide range of supported attributes provides a number of different options. However, if this module still doesn't support something you need to do, you can easily write your own handler and customize the behavior. This demonstrates the power and flexibility of SAX-based XML parsing.
Let's now look at a scenario where XML::Generator::DBI can be put to work. We purposely avoided using a RDBMS in this first example; this scenario demonstrates the utility of the XML::Generator::DBI Perl module without requiring you to install a RDBMS. Depending on the RDBMS that you select, setup can range anywhere from trivial to challenging. This first example uses a standard Comma Separated Value (CSV) file as our flat file database system. DBI has a driver for CSV that can manipulate the file as if it were a real RDBMS. This example only requires installation of the Perl DBI and DBD::CSV modules.
XML::DBI::Generator Perl Module and CSV Example
In this example, we want to develop a Perl program that performs the following steps:
Create a table in the database.
Insert data into the table.
Query the database.
Convert the results of the query into XML.
Sounds like a real task requiring some code writing, right? In reality, this program is a lot shorter than you might think. This example requires less than one-half of a page of source code. This example demonstrates how quickly and easily you can create a real application by taking advantage of existing Perl modules. The Perl modules handle the low-level details, allowing you to focus on other aspects of the application.
XML::DBI::Generator Perl Program
Now, let's take a closer look at the Perl program shown in Listing 1 that performs these tasks. In this example, we'll be using DBI and the DBD::CSV driver to create a database populated with some sample data that queries the data and generates a result set in XML format.
Listing 1Perl application using the XML::Generator::DBI module and DBI Perl modules. (Filename: Listing 1)
1. use strict; 2. use XML::Generator::DBI; 3. use XML::Handler::YAWriter; 4. use DBI; 5. 6. # Instantiate a new XML::Handler::YAWriter object. 7. my $ya = XML::Handler::YAWriter->new(AsFile => "-", 8. Pretty => {PrettyWhiteNewline => 1, 9. PrettyWhiteIndent => 1}); 10. 11. # Create a DBI connection. 12. my $dbh = DBI->connect("dbi:CSV:f_dir=./"); 13. $dbh->{RaiseError} = 1; 14. 15. # Create a database table with columns named id and name. 16. $dbh->do("CREATE TABLE USERS (ID INTEGER, NAME CHAR(10))"); 17. 18. # Insert data into the database table. 19. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (1, 'Larry Wall')"); 20. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (2, 'Tim Bunce')"); 21. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (3, 'Matt Sergeant')"); 22. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (4, 'Ilya Sterin')"); 23. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (5, 'Robin Berjon')"); 24. 25. # Instantiate a new XML::Generator::DBI object. 26. my $generator = XML::Generator::DBI->new( 27. Handler => $ya, 28. dbh => $dbh); 29. 30. # Execute the enclosed SQL query. 31. $generator->execute("SELECT ID, NAME FROM USERS"); 32. 33. # Remove the USERS table from the database. 34. $dbh->do("DROP TABLE USERS"); 35. 36. # Disconnect from the CSV database. 37. $dbh->disconnect();
19This program starts with the standard use strict pragma. Because we're using the XML::Generator::DBI, XML::Handler::YAWriter, and the DBI Perl modules, we need to include their respective use pragmas to load the modules. We're going to use the XML::Handler::YAWriter Perl module to serve as our event handler.
1. use strict; 2. use XML::Generator::DBI; 3. use XML::Handler::YAWriter; 4. use DBI; 5. 6. # Instantiate a new XML::Handler::YAWriter object. 7. my $ya = XML::Handler::YAWriter->new(AsFile => "-", 8. Pretty => {PrettyWhiteNewline => 1, 9. PrettyWhiteIndent => 1});1113In this section, we call the DBI module's connect() function to connect a DBI object. The RaiseError attribute is then set to true; this will cause any DBI errors to terminate the program with an error message.
11. # Create a DBI connection 12. my $dbh = DBI->connect("dbi:CSV:f_dir=./"); 13. $dbh->{RaiseError} = 1;1516We call the DBI do() function, which allows us to immediately execute a query that does not require us to return a result set (for example, a CREATE or INSERT SQL statement). In this case, we execute the CREATE TABLE SQL statement that will create a new table named USERS with two fields, ID and NAME. Because we're using the DBD::CSV DBI driver case, this function will create a new file named USERS, which will act as our table.
15. # Create a database table with columns named id and name. 16. $dbh->do("CREATE TABLE USERS (ID INTEGER, NAME CHAR(10))");1823Now we insert some sample data into our newly created USERS table.
18. # Insert data into the database table. 19. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (1, 'Larry Wall')"); 20. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (2, 'Tim Bunce')"); 21. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (3, 'Matt Sergeant')"); 22. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (4, 'Ilya Sterin')"); 23. $dbh->do("INSERT INTO USERS (ID, NAME) VALUES (5, 'Robin Berjon')");
It just happens that I listed some of the major contributors in the Perl community. The contents of the database are show in Table 6.2.
Table 6.2Contents of the database table.
ID |
Name |
1 |
Larry Wall |
2 |
Tim Bunce |
3 |
Matt Sergeant |
4 |
Ilya Sterin |
5 |
Robin Berjon |
2528Instantiate a XML::Generator::DBI object, setting its Handler and dbh attributes. Here we set XML::Handler::YAWriter as the SAX event handler that will receive events generated by the XML::Generator::DBI Perl module. In addition, we set the DBI database handle object ($dbh) to dbh. This is the database handle that the XML::Generator::DBI Perl module uses to execute database queries.
25. # Instantiate a new XML::Generator::DBI object. 26. my $generator = XML::Generator::DBI->new( 27. Handler => $ya, 28. dbh => $dbh);3031Here we call the XML::Generator::DBI execute() method, which actually executes our query and starts the transformation process from a result set to XML. Because XML::Handler::YAWriter's AsFile attribute was set to '-' in the XML::Handler::YAWriter constructor, the generated XML will be printed to STDOUT. If you need to generate an XML document instead of having it scroll by on STDOUT, supply an output filename to the XML::Handler::YAWriter constructor instead of the '-'.
30. # Execute the enclosed SQL query. 31. $generator->execute("SELECT ID, NAME FROM USERS");3337After making our queries and generating, we can now delete our USERS table and disconnect from the database. We execute the DROP TABLE function and then call the disconnect() function on the database handle.
33. # Remove the USERS table from the database. 34. $dbh->do("DROP TABLE USERS"); 35. 36. # Disconnect from the CSV database 37. $dbh->disconnect();
The output of our XML::Generator::DBI Perl program is shown in Listing 2. As you can see, each <row> element in the XML document directly maps to a row from the database table. Note that the row elements and the root element in the generated XML document can be changed by using attributes from the XML::Generator::DBI module constructor.
Listing 2Output of the XML::Generator::DBI Perl program. (Filename: Listing 2)
<?xml version="1.0" encoding="UTF-8"?> <database> <select query="SELECT ID, NAME FROM USERS"> <row> <ID>1</ID> <NAME>Larry Wall</NAME> </row> <row> <ID>2</ID> <NAME>Tim Bunce</NAME> </row> <row> <ID>3</ID> <NAME>Matt Sergeant</NAME> </row> <row> <ID>4</ID> <NAME>Robin Berjon</NAME> </row> <row> <ID>5</ID> <NAME>Ilya Sterin</NAME> </row> </select> </database>