Summary
In this article, you’ve seen how to use XSL transformations to generate clean, simple HTML output from the XML-based WordProcessingML format generated by Office 2003 Professional Edition. Office 2007 also stores its documents in an XML-based format by default (Office Open XML), whose word-processing markup is a successor to, but not identical with, the Office 2003 schema.
Throughout the article, we’ve progressed step by step from a simple transformation that produces only a sequence of paragraphs within a basic HTML framework, to a more complex paragraph structure that includes paragraph styles and headings, and finally to the transformation of Word text ranges into SPAN, STRONG, and EM HTML tags. On my web site, you’ll find further XSL transformations that process tables, images, and hyperlinks.
While WordProcessingML and XSL transformations might present a steep learning curve for someone unfamiliar with XML technologies, the rewards in a large environment that uses Microsoft Word could be enormous. This is even more true if you use the same set of technologies to generate Word-compliant formatted text from other sources capable of producing XML output (for example, data in SQL databases or external RSS feeds).