Markup Languages: Why XML?
Chapter 1: Why XML?
In which it is revealed where my personal experience of markup languages began.
In this chapter, I take you through some of my initial experiences with markup languages, experiences that led me to be such an advocate of information standards in general and markup languages in particular. We discuss a simple example of the power of markup, and throughout the chapter, I cover some basic definitions and concepts.
The Lesson of SGML
In early 1995, I helped start a company, E-Doc, with a subversive business plan based on the premise that big publishing companies (in this case, in the scientific-technical-medical arena) might want to publish on the World Wide Web. I say "subversive" because at the time it was just that: the very companies we were targeting with our services were the old guard of the publishing world, and they had every reason in the world to suppress and reject these new technologies. A revolution was already occurring, especially in the world of scientific publishing. Through the Internet, scientists were beginning to share papers with other scientists. While the publishing companies weren't embracing this new medium, the scientists themselves were, and in the process they were bypassing traditional journal publication entirely and threatening decades of entrenched academic practice. Remember, the Internet wasn't seen as a viable commercial medium back then; it was largely used by academics, although we were starting to hear about the so-called "information superhighway." Despite the assurance of all my friends that I was off my rocker, I left my secure career in the client/server software industry to follow my nose into the unknown. In my two years at E-Doc, I learned a great deal about technology, media, business, and the publishing industry, but one lesson that stands out is the power of SGML.
An international standard since 1986, SGML (Standard Generalized Markup Language) is the foundation on which modern markup languages, such as HTML (Hypertext Markup Language, the language of the Web), are based. SGML defines a structure through which markup languages can be built. HTML is a flavor of SGML, but it is only one markup language (and not even a particularly complex one) that derives from SGML. Since its inception, SGML has been in use in publishing, as well as in industry and governments throughout the world.
Because many of the companies we were dealing with at E-Doc had been using flavors of SGML to encode material such as books and journal articles since the late 1980s, they had developed vast storehouses of SGML data that was just waiting for the Internet revolution. Setting up full-text Web publishing systems became a matter of simply translating these already existing SGML files. It's not that the decision makers at these companies were so forward-thinking that they knew a global network that would redefine the way we think about information would soon develop. The lesson of SGML was precisely that these decision makers did not know what the future would hold. Using SGML "future-proofed" their data so that when the Web came around, they could easily repurpose it for their changing needs.
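To give a rough sense of why that translation step was so manageable, here is a minimal Python sketch. It uses a small XML fragment as a stand-in for SGML, since Python's standard library has no SGML parser. The element names, the sample text, and the function article_to_html are all invented for this illustration; they are not taken from any publisher's actual markup or from the systems we built at E-Doc.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article record. The tags describe what each piece of
# content *is* (title, author, abstract), not how it should look.
SOURCE = """<article>
  <title>Protein Folding &amp; Kinetics</title>
  <author>A. Researcher</author>
  <abstract>We measure folding rates.</abstract>
</article>"""


def article_to_html(source: str) -> str:
    """Translate the structured record into a simple HTML fragment."""
    root = ET.fromstring(source)
    # findtext returns the decoded text of the first matching child,
    # so we escape it again before placing it into HTML.
    title = escape(root.findtext("title", default=""))
    author = escape(root.findtext("author", default=""))
    abstract = escape(root.findtext("abstract", default=""))
    return (
        f"<h1>{title}</h1>\n"
        f'<p class="author">{author}</p>\n'
        f"<p>{abstract}</p>"
    )


HTML_PAGE = article_to_html(SOURCE)
print(HTML_PAGE)
```

Because the source records what the content means rather than how it is displayed, the same file could just as easily feed a second function that targets print, a search index, or some format that does not exist yet. Only the translation function changes; the data stays as it is.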
It's been a wild ride over the past six years, but as we begin a new century and a new millennium, that idea of future-proofing data seems more potent and relevant than ever. The publishing industry will continue to transform and accelerate into new areas, new platforms, and new paradigms. As technology professionals, we have to start thinking about future-proofing now, while we're still at the beginning of this revolution.