XML Schema Replacing DTD
XML documents today are defined through DTDs. Although limited, DTDs have history on their side: their roots are in SGML, and they are broadly used. DTDs have served us well in some respects, but they have a number of fundamental problems. Here are some limitations, in no particular order:
- DTDs are non-XML. It's ironic that the definition of an XML document is itself not expressed in XML syntax. This means that a DTD cannot itself be validated with ordinary XML tools, a flaw that XML Schema corrects.
- DTDs are not truly extensible. XML can be extended via DTDs only in a very limited fashion (string substitutions).
- DTDs have limited data type expression. New data types can be defined by using fixed attributes, but it's no simple task.
- DTDs do not support data relationships. This could be regarded as another limit on extensibility through DTDs, but the point is that the ability to express relationships between data elements in a data model is an important aspect of EAI transformations.
- DTDs do not support namespaces. A DTD treats a prefixed name as nothing more than a literal name, which makes it hard to combine vocabularies from different sources.
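To make a couple of these limitations concrete, here is a minimal sketch of a document with an internal DTD. The element and attribute names (employee, hireDate, salary, id) are hypothetical and chosen only for illustration. Note that the declarations are not XML elements, and that every leaf is plain character data.

```xml
<?xml version="1.0"?>
<!DOCTYPE employee [
  <!-- Declarations use DTD syntax, not XML elements -->
  <!ELEMENT employee (name, hireDate, salary)>
  <!ATTLIST employee id ID #REQUIRED>

  <!-- Leaf content can only be declared as character data:
       there is no way to say "date" or "integer" here -->
  <!ELEMENT name     (#PCDATA)>
  <!ELEMENT hireDate (#PCDATA)>
  <!ELEMENT salary   (#PCDATA)>
]>
<employee id="e100">
  <name>Pat Lee</name>
  <hireDate>not a date</hireDate>
  <salary>lots</salary>
</employee>
```

A validating parser should accept this document, because as far as the DTD is concerned hireDate and salary are just text.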
Besides addressing most of the key limitations of DTDs, XML Schema is first and foremost expressed in XML syntax. Although DTDs are widely used, their non-XML syntax can actually be very confusing to newbies. With XML Schema, anyone who understands XML can construct a schema and, perhaps more importantly, the definition of the document can itself be validated.
XML Schema also provides a richer set of data types, supporting dates, integers, and Booleans. This is important because enterprise data is rich and expressed in various data types. For XML to be effectively used, these data types must be supported. This also means improvements in the semantic validation of XML data. For instance, a date can be validated as such instead of being treated as simply a string.
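As a rough illustration, the following sketch declares an employee record with typed content. It assumes the syntax and namespace of the published W3C XML Schema Recommendation, which may differ in detail from the draft this text describes; the element names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The schema is itself an XML document -->
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name"     type="xs:string"/>
        <!-- Built-in types: a date, an integer, and a Boolean -->
        <xs:element name="hireDate" type="xs:date"/>
        <xs:element name="salary"   type="xs:integer"/>
        <xs:element name="fullTime" type="xs:boolean"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this schema, an instance containing `<hireDate>2001-03-15</hireDate>` satisfies the date type, whereas `<hireDate>not a date</hireDate>` would be reported as a validation error rather than accepted as a string.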
The notion of user-defined type inheritance is another cool feature available through XML Schema. Instead of simple string substitutions, a concept known as archetypes is introduced. Archetypes are like abstract base classes in C++. You can define a base type Employee from which you can define Manager and LineWorker types.
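The Employee example might be sketched as follows. This assumes the syntax of the published W3C XML Schema Recommendation, where this kind of derivation is written with named complex types and `xs:extension`; the child elements (directReports, shift) are hypothetical additions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Abstract base type: it cannot be used directly in an instance -->
  <xs:complexType name="Employee" abstract="true">
    <xs:sequence>
      <xs:element name="name"     type="xs:string"/>
      <xs:element name="hireDate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Manager inherits name and hireDate, then adds its own content -->
  <xs:complexType name="Manager">
    <xs:complexContent>
      <xs:extension base="Employee">
        <xs:sequence>
          <xs:element name="directReports" type="xs:nonNegativeInteger"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- LineWorker extends the same base type in a different direction -->
  <xs:complexType name="LineWorker">
    <xs:complexContent>
      <xs:extension base="Employee">
        <xs:sequence>
          <xs:element name="shift" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="manager"    type="Manager"/>
  <xs:element name="lineWorker" type="LineWorker"/>
</xs:schema>
```

A change to the Employee base type is picked up by both derived types, which is the practical benefit over copying declarations by string substitution.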
Finally, with XML Schema, you can define data relationships through the use of what is known as attribute groups. This allows for the definition of common attributes that apply to a given set of data elements. Although something similar can be accomplished with DTDs through the use of parameter entities, a parameter entity is only textual substitution: the processor does not treat the shared attributes as a named unit the way it does an attribute group.
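A minimal sketch of an attribute group, again assuming the syntax of the published W3C XML Schema Recommendation. The group name commonAttrs and the employee and department elements are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Attributes shared by several elements, defined once under a name -->
  <xs:attributeGroup name="commonAttrs">
    <xs:attribute name="id"           type="xs:ID" use="required"/>
    <xs:attribute name="lastModified" type="xs:date"/>
  </xs:attributeGroup>

  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
      <!-- Reference the group instead of repeating the attributes -->
      <xs:attributeGroup ref="commonAttrs"/>
    </xs:complexType>
  </xs:element>

  <xs:element name="department">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
      <xs:attributeGroup ref="commonAttrs"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because the group is a named schema component rather than pasted text, a schema-aware tool can see that employee and department share the same attribute definitions.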