In this section of the XML and Web Services Guide, we are building a simple RSS feed reader using Ajax. In the previous entry, we created a basic page that uses asynchronous JavaScript to load new information, such as subcategories for a particular category or feeds for a particular subcategory. The page requests the information using an HTTP request, and then adds it to the page. When we last left our project, we had brought it to the point at which we were requesting the actual RSS feed and displaying it, raw, on the page.
Of course, as just a jumble of text, the information isn't very useful. Instead, we want to take the raw XML and turn it into HTML. Now, you might think that this is a task for XSL Transformations. You'd be right. But we're not going to perform the transformation on the server. Instead, we're going to perform the transformation right in the browser.
Here's how the process is going to work:
- First, load the stylesheet when the browser originally loads the page.
- Drill down to the feed level.
- Download the feed.
- Create a DOM Document out of the feed.
- Use the stylesheet to transform the Document.
- Display the transformed Document on the page.
Let me start by admitting that I'm going to cheat here, just a little. There are something like eight different RSS and RSS-like feed formats out there in the wild, and I could spend a large amount of time talking about the specifics of the actual XSLT stylesheet, but that's not what this entry is about. It's about performing a transformation -- any transformation -- in the browser. So instead, we'll create a simple HTML document using a simple stylesheet that pulls only the most basic of information from the most common of formats. (We'll leave the creation of a more comprehensive solution as an exercise for the reader.)
Let's start by taking a quick look at two of the most common formats, RSS 0.9x and RSS 1.0. A sample RSS 0.91 feed looks something like this:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <rss version="0.91">
      <channel>
        <title>The Vanguard Science Fiction Report</title>
        <link>http://www.vanguardreport.com</link>
        <description>The Vanguard Science Fiction Report</description>
        <language>en-us</language>
        <item>
          <title>Still here...</title>
          <link>http://www.vanguardreport.com/phpnuke/modules.php?name=News&amp;file=rssArticle&amp;sid=857</link>
          <description>No, I haven't abandoned this site, I've just been overwhelmed lately. (Check out my personal blog if ...</description>
        </item>
        <item>
          <title>Serenity trailer hits the web</title>
          <link>http://www.vanguardreport.com/phpnuke/modules.php?name=News&amp;file=rssArticle&amp;sid=856</link>
          <description>The trailer for the film version of Firefly, Serenity, is now available on the web. I'm hoping they ...</description>
        </item>
        ...
      </channel>
    </rss>
Both basic information about the feed and a set of item elements are contained in the channel element, which is itself contained in the root element. An RSS 1.0 feed is similar, with, among other things, three important differences:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             ...
             xmlns="http://purl.org/rss/1.0/">
      <channel rdf:about="http://www.chaosmagnet.com/blog/">
        <title>Chaos Magnet</title>
        <link>http://www.chaosmagnet.com/blog/</link>
        <description>The personal and professional ramblings of Nicholas Chase.</description>
        ...
      </channel>
      <item rdf:about="http://www.chaosmagnet.com/blog/archives/000649.html">
        <title>Musings on life ... and veterans</title>
        <link>http://www.chaosmagnet.com/blog/archives/000649.html</link>
        <description>It's a weekend for closure after Ray's crossing, and I think I've pretty much settled things in my own head. Let me warn you that this is a long post -- at least for me -- and that unlike most...</description>
        ...
      </item>
      <item rdf:about="http://www.chaosmagnet.com/blog/archives/000648.html">
        <title>The blog is complete: The Darth Side</title>
        <link>http://www.chaosmagnet.com/blog/archives/000648.html</link>
        <description>I'm still kind of reeling here, trying to finish funeral arrangements, but I took a break and found that The Darth Side: Memoirs of a Monster has come to the end of its run. I don't usually gush about blogs,...</description>
        ...
      </item>
      ...
    </rdf:RDF>
In this case, the overall structure is similar, but the three important exceptions are the presence of namespaces, the fact that the root element is RDF instead of rss, and the fact that the item elements are children of the root element and not the channel element. So what we need to do is create an XSLT style sheet that applies to both structures:
    <?xml version='1.0'?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/*[1]">
        <html>
          <body>
            <h2><xsl:value-of select="*[1]/*[local-name()='title']" /></h2>
            <table>
              <xsl:for-each select="*[local-name()='item']">
                <tr>
                  <td>
                    <xsl:element name="a">
                      <xsl:attribute name="href"><xsl:value-of select="*[local-name()='link']" /></xsl:attribute>
                      <xsl:attribute name="target">_blank</xsl:attribute>
                      <b><xsl:value-of select="*[local-name()='title']"/></b>
                    </xsl:element>
                    <br />
                    <xsl:value-of select="*[local-name()='description']"/>
                  </td>
                </tr>
              </xsl:for-each>
              <xsl:for-each select="*[1]/*[local-name()='item']">
                <tr>
                  <td>
                    <xsl:element name="a">
                      <xsl:attribute name="href"><xsl:value-of select="*[local-name()='link']" /></xsl:attribute>
                      <xsl:attribute name="target">_blank</xsl:attribute>
                      <b><xsl:value-of select="*[local-name()='title']"/></b>
                    </xsl:element>
                    <br />
                    <xsl:value-of select="*[local-name()='description']"/>
                  </td>
                </tr>
              </xsl:for-each>
            </table>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>
Notice that because we're dealing with potentially different structures, rather than selecting the root element by name, I'm selecting it by position. The asterisk (*) selects all of the child elements, and the predicate (the part in brackets ([])) indicates the position within the list. So our main template selects the first child element of the document root, which is the root element whatever its name. From there, I'm selecting elements based on their local-name(), which is the same whether we're using namespaces or not.
Finally, I'm displaying any item elements that are children of the channel element or the root element. In any given feed, only one set will be present, so we can use this stylesheet for both structures.
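To make the selection logic concrete, here is a rough JavaScript equivalent of what the two for-each loops pick out. This is only a sketch: localNameOf() and collectItems() are names invented for this illustration, and the el() helper builds plain objects that stand in for DOM elements so the code can run outside a browser.

```javascript
// Return the local part of a (possibly prefixed) element name,
// e.g. "rdf:RDF" -> "RDF". Real DOM elements expose this as `localName`.
function localNameOf(node) {
  if (node.localName) return node.localName;
  const name = node.nodeName || "";
  return name.includes(":") ? name.split(":")[1] : name;
}

// Hypothetical helper: gather item elements whether they sit directly under
// the root (RSS 1.0) or under the root's first child, channel (RSS 0.9x).
function collectItems(root) {
  const isItem = (node) => localNameOf(node) === "item";
  const children = Array.from(root.children || []);
  const direct = children.filter(isItem);
  const first = children[0];
  const nested = first ? Array.from(first.children || []).filter(isItem) : [];
  return direct.concat(nested);
}

// Plain-object stand-ins for DOM elements (an assumption of this sketch).
const el = (nodeName, children = []) => ({ nodeName, children });

const rss091 = el("rss", [el("channel", [el("title"), el("item"), el("item")])]);
const rss10 = el("rdf:RDF", [el("channel", [el("title")]), el("item"), el("item"), el("item")]);

console.log(collectItems(rss091).length); // 2
console.log(collectItems(rss10).length);  // 3
```

As in the stylesheet, only one of the two searches finds anything for a given feed, so concatenating the results is safe.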
Is this an exhaustive stylesheet for any and all syndicated feeds? Of course not. But that's not what we're here to discuss today. We're here to explain how to run the transformation in the browser.
When last we left our document, we had implemented code that would asynchronously request an HTML file (or any other file, for that matter) and display it on the page. With that in place, it seems natural that if we request any other files, such as an XSLT style sheet, we should probably do it asynchronously. So let's start with that.
Because we only have a single stylesheet to load, it would be silly to load it every time we load a new feed, so let's go ahead and load it asynchronously when we load the page. First we'll create the request:
    <script type="text/javascript">
    var req;
    var styleReq;
    var dest;
    ...
    function loadStylesheet(){
      if (window.XMLHttpRequest){
        url = "http://www.nicholaschase.com/ajaxdemo/rss1.xsl";
        styleReq = new XMLHttpRequest();
        styleReq.onreadystatechange = processStylesheetChange;
        styleReq.open("GET", url, true);
        styleReq.send(null);
      } else if (window.ActiveXObject) {
        url = "http://www.nicholaschase.com/ajaxdemo/rss1ie.xsl";
        styleReq = new ActiveXObject("Microsoft.XMLHTTP");
        if (styleReq) {
          styleReq.onreadystatechange = processStylesheetChange;
          styleReq.open("GET", url, true);
          styleReq.send();
        }
      }
    }
    </script>
    </head>
    <body onload="loadStylesheet()">
    <table width="100%" border="0">
    ...
The loadStylesheet() function should look familiar, because it's virtually identical to the loadHTML() function we created to load the content in the first place. The differences here are that a) we don't need a destination div, and b) we're not passing in a URL. No, in this case, we're specifically setting the URL for the style sheet within the function, based on which browser we're using. The XSL transformation engine in Internet Explorer doesn't do well with namespaces, so here we have a chance to create a separate style sheet to get around that problem.
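If you'd like to keep the browser decision separate from the request itself, one option is to pull it into a small function. The sketch below follows the same feature tests as loadStylesheet(); the chooseStylesheetUrl() name and the env parameter (a stand-in for window, so the function can be exercised outside a browser) are assumptions made for this illustration.

```javascript
// The two stylesheet URLs used in loadStylesheet().
const STANDARD_STYLESHEET = "http://www.nicholaschase.com/ajaxdemo/rss1.xsl";
const IE_STYLESHEET = "http://www.nicholaschase.com/ajaxdemo/rss1ie.xsl";

// Hypothetical helper: pick a stylesheet using the same checks as the article.
// `env` is a window-like object; in a page you would pass `window`.
function chooseStylesheetUrl(env) {
  if (env.XMLHttpRequest) return STANDARD_STYLESHEET;
  if (env.ActiveXObject) return IE_STYLESHEET;
  return null; // neither request mechanism is available
}

// Stand-in objects instead of a real window:
console.log(chooseStylesheetUrl({ XMLHttpRequest: function () {} }));
console.log(chooseStylesheetUrl({ ActiveXObject: function () {} }));
```

The first call prints the standard stylesheet URL and the second prints the Internet Explorer one.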
In either case, we're creating a new request, styleReq, which loads asynchronously. Because of that, just as we did with the HTML requests, we need an event handler to actually process the data. In this case, it's processStylesheetChange():
    var req;
    var styleReq;
    var stylesheetDoc;
    var dest;
    ...
    function processStylesheetChange(){
      if (styleReq.readyState == 4){
        if (styleReq.status == 200){
          if (window.XMLHttpRequest){
            var dp = new DOMParser();
            stylesheetDoc = dp.parseFromString(styleReq.responseText, "text/xml");
          } else if (window.ActiveXObject) {
            stylesheetDoc = new ActiveXObject("Microsoft.XMLDOM");
            stylesheetDoc.async = false;
            stylesheetDoc.loadXML(styleReq.responseText);
          }
        } else {
          alert("Can't load stylesheet:"+styleReq.status);
        }
      }
    }
    ...
Here's where things get interesting. The whole point of this exercise is to use this stylesheet to transform any XML data we load, so we need to get the stylesheet into a DOM Document. In an ideal world, we could simply assign it by requesting responseXML instead of responseText, but that makes the assumption that the target web server is set up to send the proper MIME type for XML files. Unfortunately, many aren't, and that includes some of the largest web hosting companies on the planet. So we get around that by actually parsing the text returned by the styleReq request.
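If you'd rather use responseXML when the server does cooperate, and fall back to parsing the text only when it doesn't, something like the following could work. It's a sketch, not part of the reader's code: documentFromResponse() and the parseText callback are hypothetical names, and the objects at the bottom are stand-ins for a real request.

```javascript
// Hypothetical helper: prefer responseXML, fall back to parsing responseText.
// `request` is an XMLHttpRequest-like object; `parseText` is a caller-supplied
// function (an assumption of this sketch) that turns XML text into a Document.
function documentFromResponse(request, parseText) {
  const xml = request.responseXML;
  // responseXML is typically null (or lacks a root element) when the server
  // did not label the response as XML.
  if (xml && xml.documentElement) {
    return xml;
  }
  return parseText(request.responseText);
}

// Stand-ins for a mislabeled response and a parser:
const fakeParse = (text) => ({ documentElement: { nodeName: "parsed" }, source: text });
const mislabeled = { responseXML: null, responseText: "<rss/>" };

console.log(documentFromResponse(mislabeled, fakeParse).source); // <rss/>
```

The article's code always takes the parsing path, which is the simpler choice when you can't count on the server's configuration.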
For Mozilla-based browsers, this means using the built-in DOMParser object. First we instantiate it, and then we use it to parse the string data of the request as though it came in as an HTTP response with the MIME type text/xml.
For Internet Explorer, we take a different tack. First, we create a new XMLDOM ActiveX object. Because the data is already present, we'll make our lives easier by performing the parsing synchronously. From there, we simply load the XML text.
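Because we parse XML text the same way for the stylesheet and, later, for the feed, the two branches could be folded into one helper. The sketch below is one way to do that. The parseXml() name and the env parameter (a stand-in for window) are assumptions of this example, and unlike the article's code it tests for DOMParser directly rather than for XMLHttpRequest.

```javascript
// Hypothetical helper: parse XML text with whichever mechanism is available.
// `env` is a window-like object; in a page you would pass `window`.
function parseXml(text, env) {
  if (env.DOMParser) {
    // Mozilla-style: parse the string as though it arrived as text/xml.
    return new env.DOMParser().parseFromString(text, "text/xml");
  }
  if (env.ActiveXObject) {
    // Internet Explorer: load the text synchronously into an XMLDOM object.
    const doc = new env.ActiveXObject("Microsoft.XMLDOM");
    doc.async = false;
    doc.loadXML(text);
    return doc;
  }
  throw new Error("No XML parser available");
}

// Minimal stand-in so the sketch can run outside a browser.
class FakeDOMParser {
  parseFromString(text, mimeType) {
    return { text, mimeType };
  }
}

console.log(parseXml("<rss/>", { DOMParser: FakeDOMParser }).mimeType); // text/xml
```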
Now we have the style sheet in a DOM Document, ready for use when we load a feed. Let's look at how to actually use it:
    ...
    function processStateChange(){
      statusDiv = document.getElementById("status");
      ...
      if (req.readyState == 4){
        if (req.status == 200){
          response = req.responseText;
          if (dest == "feed"){
            if (window.XMLHttpRequest){
              var parser = new DOMParser();
              theDocument = parser.parseFromString(req.responseText, "text/xml");
              var xsltProcessor = new XSLTProcessor();
              xsltProcessor.importStylesheet(stylesheetDoc);
              response = xsltProcessor.transformToFragment(theDocument, document);
              destinationDiv = document.getElementById(dest);
              destinationDiv.innerHTML = "";
              destinationDiv.appendChild(response);
            } else if (window.ActiveXObject) {
              ...
            }
          } else {
            destinationDiv = document.getElementById(dest);
            destinationDiv.innerHTML = response;
          }
        } else {
          statusDiv.innerHTML = "Error: Status "+req.status;
        }
      }
    }
    ...
First off, we'll check to see whether we even need to perform the transformation. We'll know that by the destination of our content; only an RSS feed goes in the feed div. From there, it's a simple matter of performing the transformation and adding the results to the feed div.
It probably comes as no surprise that the way in which we accomplish that depends on the browser we're using. For Mozilla, we'll first make a DOM Document out of the actual content, using the DOMParser, as we did with the style sheet. Next, we'll create a new XSLTProcessor object and import the style sheet it should use for any transformations it performs. Next, we perform the actual transformation. In this case, we're using the transformToFragment() function, passing in the node to transform (theDocument) and the owner Document for the resulting DocumentFragment object. (Remember, nodes don't just float out there in the ether; they need to have an owner Document, even if they aren't actually attached to it in a specific location.) Mozilla's XSLTProcessor also provides transformToDocument(), which produces a complete Document rather than a fragment.
Once we have the transformed DocumentFragment, we're ready to add it to the page. To do that, we'll get a reference to the feed div, clear its contents, and then append the actual fragment (and thus, all of its children) to the div.
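One thing to watch for: the stylesheet loads asynchronously, so a feed could arrive before stylesheetDoc has been set. The sketch below wraps the Mozilla steps in a function that checks for that case first. The renderFeed() name, its parameters, and the stand-in objects at the bottom are all assumptions of this example; in a page you would pass window as env and the feed div as targetDiv.

```javascript
// Hypothetical helper wrapping the transform-and-append steps.
// Returns false if the stylesheet has not finished loading yet.
function renderFeed(feedDoc, stylesheetDoc, targetDiv, env) {
  if (!stylesheetDoc) {
    return false; // nothing to transform with yet
  }
  const processor = new env.XSLTProcessor();
  processor.importStylesheet(stylesheetDoc);
  // The second argument is the Document that will own the resulting fragment.
  const fragment = processor.transformToFragment(feedDoc, env.document);
  targetDiv.innerHTML = "";        // clear whatever was there before
  targetDiv.appendChild(fragment); // the fragment's children move into the div
  return true;
}

// Stand-ins so the sketch can run outside a browser.
class FakeProcessor {
  importStylesheet(sheet) { this.sheet = sheet; }
  transformToFragment(source, owner) { return { source, owner, sheet: this.sheet }; }
}
const fakeDiv = {
  innerHTML: "old content",
  appended: [],
  appendChild(node) { this.appended.push(node); },
};
const fakeEnv = { XSLTProcessor: FakeProcessor, document: {} };

console.log(renderFeed({}, null, fakeDiv, fakeEnv));          // false
console.log(renderFeed({}, { xsl: true }, fakeDiv, fakeEnv)); // true
```

What to do when the function returns false (retry, show a message, or load the stylesheet on demand) is a design decision the article leaves open.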
The overall process -- create Document, transform, add to the page -- is the same for Internet Explorer, but we'll handle it a little differently:
    ...
    function processStateChange(){
      statusDiv = document.getElementById("status");
      ...
      if (req.readyState == 4){
        if (req.status == 200){
          response = req.responseText;
          if (dest == "feed"){
            if (window.XMLHttpRequest){
              ...
            } else if (window.ActiveXObject) {
              var theDocument = new ActiveXObject("Microsoft.XMLDOM");
              theDocument.async = false;
              theDocument.loadXML(req.responseText);
              destinationDiv = document.getElementById(dest);
              destinationDiv.innerHTML = theDocument.transformNode(stylesheetDoc);
            }
          } else {
            destinationDiv = document.getElementById(dest);
            destinationDiv.innerHTML = response;
          }
        } else {
          statusDiv.innerHTML = "Error: Status "+req.status;
        }
      }
    }
    ...
As before, with the style sheet, we'll create the Document as a Microsoft.XMLDOM object, loading it with the text of the response. In this case, however, we don't need to create an XSLTProcessor; the ability to transform a node based on a stylesheet is built in to the Document itself. The transformNode() function returns the transformed text, making it simple to add it to the page.
The result is a page that displays the transformed XML, with each headline ready to be clicked.