Relational Databases
Relational databases are another common source of data for XML documents. Some databases such as FileMaker Pro have built-in support for outputting tables as XML documents. Others do not. However, even if your database can export tables as XML documents, its XML format may not be the XML format you want. Fortunately, as long as there's a JDBC driver for your database of choice, it's not hard to extract the information from it and write that information into an XML document in the desired form.
For this example, I'll use the same budget data previously read out of a CSV file in a single relational table that reflects the original flat structure of the files distributed by the Office of Management and Budget. Doubtless they have the data in their own relational databases, probably divided up into multiple tables; but they don't publish it that way. They do state, “If you plan to use these data in a relational database, you should designate the following fields as 'primary' to uniquely identify each row of data: agency code, bureau code, account code, subfunction code, BEA category, Grant/Nongrant, and On- Off-budget field.” The SQL CREATE TABLE statement that initializes this table is as follows:
CREATE TABLE BudgetAuthorizationTable ( AgencyCode CHAR(3), AgencyName VARCHAR(89), BureauCode CHAR(2), BureauName VARCHAR(89), AccountCode VARCHAR(6), AccountName VARCHAR(160), TreasuryAgencyCode CHAR(2), SubfunctionCode CHAR(3), SubfunctionTitle VARCHAR(72), BEACategory VARCHAR(13), On-Off-BudgetIndicator VARCHAR(10), FY1976 INTEGER, TransitionQuarter INTEGER, FY1977 INTEGER, FY1978 INTEGER, FY1979 INTEGER, FY1980 INTEGER, FY1981 INTEGER, FY1982 INTEGER, FY1983 INTEGER, FY1984 INTEGER, FY1985 INTEGER, FY1986 INTEGER, FY1987 INTEGER, FY1988 INTEGER, FY1989 INTEGER, FY1990 INTEGER, FY1991 INTEGER, FY1992 INTEGER, FY1993 INTEGER, FY1994 INTEGER, FY1995 INTEGER, FY1996 INTEGER, FY1997 INTEGER, FY1998 INTEGER, FY1999 INTEGER, FY2000 INTEGER, FY2001 INTEGER, FY2002 INTEGER, FY2003 INTEGER, FY2004 INTEGER, FY2005 INTEGER, FY2006 INTEGER, PRIMARY KEY (AgencyCode, BureauCode, AccountCode, SubfunctionCode, BEACategory, On-Off-BudgetIndicator) );
The specific database I chose for this example is Microsoft Excel, mostly because it could very easily read the CSV files with which I started. Excel isn't the best example of a relational database. In fact, it isn't a relational database at all. But Excel does allow you to define a range of cells as a table, and then associate that table with an ODBC data source. This data source can then be read with SQL using JDBC through the JdbcOdbcDriver, which is all I really want to show here. Aside from the choice of JDBC driver, all statements will be completely database independent. The name of the ODBC data source is budauth. The name of the table in that source is BudgetAuthorizationTable.
Extracting the necessary data from the database is just a matter of SQL, which as a Java developer, you access via JDBC. To some extent you can let the database do the hard work for you by executing the right sequence of SQL commands. In this case, this simply consists of a long sequence of nested SELECT statements. SQL 's DISTINCT operator will be particularly helpful. The contortions of the Muenchian method in Example 4.12 were a roundabout way of providing a distinct operation in XSLT. Example 4.14 demonstrates.
Example 4.14 A Program That Connects to a Relational Database Using JDBC and Converts the Table to Hierarchical XML
import java.sql.*; import java.io.*; public class SQLToXML { public static void main(String[] args ) { // Load the ODBC driver try { Class.forName( "sun.jdbc.odbc.JdbcOdbcDriver" ); } catch (ClassNotFoundException e) { System.err.println("Could not load the JDBC-ODBC Bridge"); return; } try { Writer out = new OutputStreamWriter(System.out, "UTF8"); out.write("<?xml version=\"1.0\"?>\r\n"); out.write("<Budget>\r\n"); writeAgencies(out); out.write("</Budget>\r\n"); out.close(); } catch (IOException e) { System.err.println(e); } } private static void writeAgencies(Writer out) throws IOException { Connection conn = null; Statement stmnt = null; try { conn = DriverManager.getConnection( "jdbc:odbc:budauth", "", ""); stmnt = conn.createStatement(); String query = "SELECT DISTINCT AgencyName, AgencyCode" + " FROM BudgetAuthorizationTable;"; ResultSet agencies = stmnt.executeQuery( query ); while( agencies.next() ) { String agencyName = agencies.getString("AgencyName"); agencyName = escapeText(agencyName); String agencyCode = agencies.getString("AgencyCode"); out.write(" <Agency>\r\n"); out.write(" <Name>" + agencyName + "</Name>\r\n"); out.write(" <Code>" + agencyCode + "</Code>\r\n"); writeBureaus(out, conn, agencyCode); out.write(" </Agency>\r\n"); } } catch (SQLException e) { System.err.println(e); e.printStackTrace(); } finally { try { stmnt.close(); conn.close(); } catch(SQLException e) { System.err.println(e); } } } private static void writeBureaus(Writer out, Connection conn, String agencyCode) throws IOException, SQLException { String query = "SELECT DISTINCT BureauName, BureauCode " + "FROM BudgetAuthorizationTable WHERE AgencyCode='" + agencyCode + "';"; Statement stmnt = conn.createStatement(); ResultSet bureaus = stmnt.executeQuery(query); while( bureaus.next() ) { String bureauName = bureaus.getString("BureauName"); bureauName = escapeText(bureauName); String bureauCode = bureaus.getString("BureauCode"); out.write(" <Bureau>\r\n"); out.write(" <Name>" + bureauName + "</Name>\r\n"); out.write(" <Code>" + bureauCode + "</Code>\r\n"); writeAccounts(out, conn, agencyCode, bureauCode); out.write(" </Bureau>\r\n"); out.flush(); } } private static void writeAccounts(Writer out, Connection conn, String agencyCode, String bureauCode) throws IOException, SQLException { String query = "SELECT DISTINCT AccountName, AccountCode " + "FROM BudgetAuthorizationTable WHERE AgencyCode='" + agencyCode + "' AND BureauCode='" + bureauCode + "';"; Statement stmnt = conn.createStatement(); ResultSet accounts = stmnt.executeQuery(query); while( accounts.next() ) { String accountName = accounts.getString("AccountName"); accountName = escapeText(accountName); String accountCode = accounts.getString("AccountCode"); out.write(" <Account>\r\n"); out.write(" <Name>" + accountName + "</Name>\r\n"); out.write(" <Code>" + accountCode + "</Code>\r\n"); writeSubfunctions( out, conn, agencyCode, bureauCode, accountCode ); out.write(" </Account>\r\n"); out.flush(); } } private static void writeSubfunctions(Writer out, Connection conn, String agencyCode, String bureauCode, String accountCode) throws IOException, SQLException { String query = "SELECT * FROM BudgetAuthorizationTable" + " WHERE AgencyCode='" + agencyCode + "' AND BureauCode='" + bureauCode + "' AND AccountCode='" + accountCode + "';"; Statement stmnt = conn.createStatement(); ResultSet subfunctions = stmnt.executeQuery(query); while( subfunctions.next() ) { String subfunctionTitle = subfunctions.getString("SubfunctionTitle"); subfunctionTitle = escapeText(subfunctionTitle); String subfunctionCode = subfunctions.getString("SubfunctionCode"); out.write(" <Subfunction>\r\n"); out.write(" <Name>"); out.write(subfunctionTitle); out.write("</Name>\r\n"); out.write(" <Code>"); out.write(subfunctionCode); out.write("</Code>\r\n"); out.write(" <Amount year='TransitionQuarter'>"); out.write(subfunctions.getInt("TransitionQuarter") + "</Amount>\r\n"); for (int year = 1976; year <= 2006; year++) { String name = "FY" + year; long amt = subfunctions.getInt(name) * 1000L; out.write(" <Amount year='" + year + "'>"); out.write(amt + "</Amount>\r\n"); } out.write(" </Subfunction>\r\n"); out.flush(); } } public static String escapeText(String s) { if (s.indexOf('&') != -1 || s.indexOf('<') != -1 || s.indexOf('>') != -1 || s.indexOf('"') != -1 || s.indexOf('\'') != -1 ) { StringBuffer result = new StringBuffer(s.length() + 6); for (int i = 0; i < s.length(); i++) { char c = s.charAt(i); if (c == '&') result.append("&"); else if (c == '<') result.append("<"); else if (c == '"') result.append("""); else if (c == '\'') result.append("'"); else if (c == '>') result.append(">"); else result.append(c); } return result.toString(); } else { return s; } } }
The basic approach here should be quite familiar by now. Tags are stored in string literals. These tags are written onto a Writer along with element content and attribute values that have been read from the input. The difference in this case is that because the input comes from a relational database, the program can use SQL to get the input it wants when it wants it. The Java program does not need to put itself out to accommodate the order of the input data. In essence this program is nothing more than four nested loops. The outermost loop iterates over the different agencies. This contains a loop that iterates over the bureaus within that agency. This contains a loop that iterates over the accounts within that bureau. This contains the innermost loop, which iterates over the subfunctions within the account. Compare this to the contortions the other programs had to perform. Only the XQuery solution was as straightforward as this—not surprising, because it also allows the client to specify which data it wants to receive.