Generating XML Documents
Where do XML documents come from? It's a bit like the famous question of which came first, the chicken or the egg: you can't parse an XML document until you build one. The source data for an XML document can come from almost anywhere. For example, an XML document can be generated from a CSV file, a tab-delimited text file, the results of a database query, a web form that collects user input through the Common Gateway Interface (CGI), a web service, and many other sources.
Let's take a look at two methods of generating XML documents. First, an XML document can be created in a standard text editor or an XML-specific What You See Is What You Get (WYSIWYG) editor. This is called static generation because the XML document is created by typing it into an editor. The data in the file remains the same unless someone or something (that is, an application) modifies it. There isn't too much to it: you basically just type the contents of the XML file as you would type any other document. This is an option for very small XML files (for example, configuration files); however, it isn't practical for larger files. The larger and more complicated an XML document is, the greater the chance of error if the file is edited by hand.
XML documents can also be generated dynamically by an application. Dynamic generation is more interesting and more widely applicable than static generation, so let's take a more detailed look at it in the following section.
Dynamically Generating XML Documents
Another method of generating an XML document is to dynamically generate the contents using an application (preferably Perl based). Because XML is just plain text, an entire XML document can easily be held in a Perl scalar and printed to any filehandle with a simple print statement. Of course, plenty of modules are available that generate XML, which takes away some of the complexities of doing everything yourself. We'll discuss these modules in Part III of this book, "Generating XML Documents Using Perl Modules." Some of the topics discussed include generating XML documents from text files, databases, and even other XML documents.
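To make the point about scalars and filehandles concrete, here is a minimal sketch. The element names are hypothetical, and an in-memory filehandle stands in for a real file so that the example leaves nothing behind; a handle opened on a file for writing would be used in exactly the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hold a complete (tiny) XML document in a single scalar.
# The element names are hypothetical, chosen only for illustration.
my $xml = <<'END_XML';
<?xml version="1.0"?>
<greeting>
  <message>Hello, XML</message>
</greeting>
END_XML

# Print the document to STDOUT ...
print $xml;

# ... or to any other filehandle. Here the handle writes into the
# scalar $buffer; a handle opened on a file behaves the same way.
open(my $fh, '>', \my $buffer) or die "Cannot open in-memory handle: $!";
print {$fh} $xml;
close($fh) or die "Cannot close in-memory handle: $!";
```

Nothing here validates the XML; the scalar is ordinary text, which is why hand-built documents are best kept small.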
A good example to demonstrate dynamic XML file generation is a web form that is filled out by a user and then submitted to a Perl application running on the web server. This is certainly dynamic because we don't know what the user will enter as data. Assume that the web form shown in Figure 2.3 allows the user to fill out certain information and then submit the collected information to the server. The information provided by the user will be processed by a Perl script and used as input or source information for an XML document. An XML document will be generated containing the information provided by the user.
Figure 2.3 Web-based HTML submission form used to collect data for our XML document.
We won't get into too much detail here, but we will revisit this example in Chapter 5, "Generating XML Documents from Text Files." The purpose here is to provide a few high-level examples to illustrate the concepts instead of diving right into source code.
When the Perl application receives the information from the form, it comes in a certain format. The Perl application can then parse this format to separate each field individually. Let's say the user fills in the required information and then submits the form. Now, inside our Perl application, we have the following data:
First Name: John
Last Name: Smith
Address: 1155 Perl Drive
City: Los Angeles
State: California
Zip: 90102
Phone Number: 818-555-1212
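As a rough illustration of how a script might separate the submitted fields, the sketch below assumes the form data arrives as a URL-encoded query string, which is a common encoding for HTML forms but is not stated in the text above. The field names and the helper subroutines url_decode and parse_form are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' stands for a space and
# %XX is a hex-encoded byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split a query string into a hash of field name => value.
sub parse_form {
    my ($query) = @_;
    my %field;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $field{ url_decode($name) } = url_decode(defined $value ? $value : '');
    }
    return %field;
}

# Assumed input: the sample submission, URL-encoded by the browser.
my $query = 'first_name=John&last_name=Smith&address=1155+Perl+Drive'
          . '&city=Los+Angeles&state=California&zip=90102'
          . '&phone_number=818-555-1212';

my %form = parse_form($query);
print "$_: $form{$_}\n" for sort keys %form;
```

A production script would normally rely on a well-tested CGI or web framework module for this step rather than parsing the input by hand.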
The Perl script can now process and modify the data and then generate an XML document containing this information. Listing 2.2 shows the submitted data in an XML document. Now that the information submitted by the user is in XML, you have a variety of options. The XML document can be transformed to another format (using XSLT), searched (using XPath), or sent to another application as a document that is based on an agreed-to format.
Listing 2.2 Sample XML file generated from the web-based form input. (Filename: ch2_webform_input.xml)
<?xml version="1.0"?>
<submission>
   <first_name>John</first_name>
   <last_name>Smith</last_name>
   <address>1155 Perl Drive.</address>
   <city>Los Angeles</city>
   <state>CA</state>
   <zip>90102</zip>
   <phone_number>818-555-1212</phone_number>
</submission>
This wasn't a very complicated example. As you can see, we've mapped the web form names (for example, first name, last name, and so forth) to element names inside the XML document for consistency. This is a good example of when you would dynamically generate XML.
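The mapping from form fields to elements can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not the approach developed later in the book: the hash of field values mirrors Listing 2.2, and the subroutines xml_escape and submission_to_xml are hypothetical helpers. The escaping step matters because user input may contain characters such as & or < that are not allowed unescaped in XML text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that have special meaning in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build a <submission> document from a hash reference of form fields.
# @order fixes the element order, because Perl hashes are unordered.
sub submission_to_xml {
    my ($form, @order) = @_;
    my $xml = qq{<?xml version="1.0"?>\n<submission>\n};
    for my $name (@order) {
        my $value = defined $form->{$name} ? $form->{$name} : '';
        $xml .= "   <$name>" . xml_escape($value) . "</$name>\n";
    }
    $xml .= "</submission>\n";
    return $xml;
}

# Field values taken from Listing 2.2.
my %form = (
    first_name   => 'John',
    last_name    => 'Smith',
    address      => '1155 Perl Drive.',
    city         => 'Los Angeles',
    state        => 'CA',
    zip          => '90102',
    phone_number => '818-555-1212',
);
my @order = qw(first_name last_name address city state zip phone_number);

my $xml = submission_to_xml(\%form, @order);
print $xml;
```

The sketch trusts that every field name is a legal XML element name, which holds for the fixed list used here but would need checking if the names came from user input.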
Now that we have discussed XML file generation, we can proceed to more complex issues, such as searching XML data and transforming XML into other data formats.