A First Look at XML
The idea behind XML is deceptively simple. It aims to reconcile the conflicting demands that the W3C receives about the future of HTML.
On the one hand, people need more tags, and these new tags are increasingly specialized. For example, mathematicians want tags for formulas. Chemists also want tags for formulas, but not the same ones.
On the other hand, authors and developers want fewer tags. HTML is already so complex! As handheld devices gain in popularity, the need for a simpler markup language is also becoming apparent, because small devices like the PalmPilot are not powerful enough to process HTML pages.
How can you have both more tags and fewer tags in a single language? To resolve this dilemma, XML makes essentially two changes to HTML:
- It predefines no tags.
- It is stricter.
No Predefined Tags
Because there are no predefined tags in XML, you, the author, can create the tags that you need. Do you need a tag for price? Do you need a tag for a bold hyperlink that floats on the right side of the screen? Make them:
<price currency="usd">499.00</price>
<toc xlink:href="/newsletter">Pineapplesoft Link</toc>
The <price> tag has no equivalent in HTML, although you could simulate the <toc> tag through a combination of table, hyperlink, and bold:
<TABLE>
  <TR>
    <TD><!-- main text here --></TD>
    <TD><A HREF="/newsletter"><B>Pineapplesoft Link</B></A></TD>
  </TR>
</TABLE>
This is the X in XML. XML is extensible because it predefines no tags but lets authors create the tags that their applications need.
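To see what author-defined tags look like from a program's point of view, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. It reads the two elements shown above without knowing their names in advance. The wrapping <doc> element and the describe function are hypothetical names introduced for illustration. The sketch also assumes that the xlink prefix is bound to the W3C XLink namespace, since a namespace-aware parser needs that declaration somewhere in the document.

```python
import xml.etree.ElementTree as ET

# Assumption: the xlink prefix is bound to the W3C XLink namespace.
XLINK_NS = "http://www.w3.org/1999/xlink"

# The <doc> wrapper is hypothetical; it only carries the namespace declaration
# and gives the two author-defined elements a single root.
DOCUMENT = f"""
<doc xmlns:xlink="{XLINK_NS}">
  <price currency="usd">499.00</price>
  <toc xlink:href="/newsletter">Pineapplesoft Link</toc>
</doc>
""".strip()


def describe(xml_text):
    """Return (tag, attributes, text) for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, dict(child.attrib), child.text) for child in root]


for tag, attributes, text in describe(DOCUMENT):
    print(tag, attributes, text)
```

The parser reports <price> and <toc> like any other element. Nothing in the code lists the tags it expects, which is the point of having no predefined tags.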
This is simple, but it raises many questions, such as:
- How does the browser know that <toc> is equivalent to this combination of table, hyperlink, and bold?
- Can you compare different prices?
- What about the current generation of browsers?
- How does this simplify Web site maintenance?
For detailed answers to these questions, you should read an introductory XML book such as XML by Example. Briefly, the answers are:
- The browser uses a style sheet.
- You can compare prices.
- XML can be made compatible with the current generation of browsers.
- XML enables you to concentrate on the more stable aspects of your document.
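As an illustration of the second answer, the sketch below shows one way a program might compare prices once they are marked up. Only the <price> element and its currency attribute come from the example earlier in this section. The <catalog> and <product> elements, the product names, and the cheapest function are hypothetical. The sketch compares only prices expressed in the same currency and does not attempt any conversion.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical catalog; only <price currency="..."> comes from the text.
CATALOG = """
<catalog>
  <product name="basic"><price currency="usd">499.00</price></product>
  <product name="deluxe"><price currency="usd">1250.50</price></product>
  <product name="import"><price currency="eur">450.00</price></product>
</catalog>
""".strip()


def cheapest(xml_text, currency):
    """Return (name, amount) of the cheapest product priced in `currency`.

    Returns None when no product is priced in that currency.
    """
    root = ET.fromstring(xml_text)
    candidates = []
    for product in root.iter("product"):
        price = product.find("price")
        if price is not None and price.get("currency") == currency:
            # Decimal avoids floating-point rounding on monetary amounts.
            candidates.append((Decimal(price.text), product.get("name")))
    if not candidates:
        return None
    amount, name = min(candidates)
    return name, amount


print(cheapest(CATALOG, "usd"))
```

Because the markup states what each number is and which currency it uses, the program can pick out the prices reliably. In HTML the same figures would be ordinary text inside presentation tags.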
Stricter
HTML has a very forgiving syntax. This is great for authors, who can be as lazy as they want, but it also makes Web browsers more complex. According to some estimates, more than 50% of the code in a browser handles errors or sloppiness on the author's part.
However, authors increasingly use HTML editors, so they don't really care how simple and forgiving the syntax is.
Yet browsers are growing in size and are becoming generally slower. The speed factor is a problem for every surfer. The size factor is a problem for owners of handheld devices, who cannot afford to download 10 MB browsers.
Therefore, it was decided that XML would adopt a very strict syntax. A strict syntax results in smaller, faster, and lighter browsers.
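The strictness is easy to observe with any XML parser. The following sketch, again using Python's standard xml.etree.ElementTree module, feeds a few snippets to the parser and reports which ones it accepts. The is_well_formed helper and the sample snippets are hypothetical and written for this illustration. The rejected samples show kinds of sloppiness that HTML browsers typically tolerate.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets: one correct document and three common kinds of
# sloppiness that an XML parser refuses.
samples = {
    "properly nested": "<p>Some <b>bold</b> text</p>",
    "missing end tag": "<p>Some <b>bold text</p>",
    "overlapping tags": "<p>Some <b>bold <i>text</b></i></p>",
    "unquoted attribute": "<price currency=usd>499.00</price>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```

Only the first sample is accepted. The parser does not try to guess what the author meant, so it needs no error-recovery code for these cases.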