- Why Should You Care About Valid XML?
- Data Type Validation--Another Reason for Validity
- How Do You Recognize Data Structure?
- You Can Parse for Validity
- Is Validity All That You Need?
- Summary
- Q&A
Data Type Validation--Another Reason for Validity
The need for document validity goes well beyond the simple needs shown by the preceding book markup examples. Imagine that you are conducting business-to-business (B2B) transactions over the Internet with a business partner. One of the principal advantages of XML for B2B is that it is platform independent: you can do business with someone over the Web regardless of the operating system they run or the programming and database languages they support. This is effective, however, only if you can guarantee (or perhaps enforce) the structure of the documents you exchange. You might need to validate not only the structure of the transaction, such as element order and construction, but also the data types of the values it carries.
Element and attribute values as simple as dates can be prone to error without careful validation. If your business partners are located in another country, they might, by convention, represent date values in a format that is different from your own. All of the date formats shown in Table 3.1 are used somewhere on the planet to represent October 9, 2001. Even though four-digit years (including the century) are now more typical, two-digit years are shown here to further highlight the potential for confusion.
Table 3.1 Sample Date Formats from Around the World for October 9, 2001
| Format | Value |
|---|---|
| mmddyy | 100901 |
| ddmmyy | 091001 |
| yyddmm | 010910 |
| yymmdd | 011009 |
To ensure the integrity of data transactions with business partners, you might need to enforce one and only one format in a schema. The numeric values alone carry very little information and could easily be misinterpreted, resulting in inaccurate processing.
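To make the risk concrete, the following Python sketch reads a single six-digit value under each of the four layouts from Table 3.1. The `FORMATS` mapping and the `interpretations` function are hypothetical names introduced only for this illustration, and the handling of two-digit years follows the convention of Python's `strptime` (values 00 through 68 are taken as 2000 through 2068), which a business partner's system might not share.

```python
from datetime import datetime

# The four layouts from Table 3.1, written as strptime patterns.
# (Hypothetical helper names; not part of any XML toolkit.)
FORMATS = {
    "mmddyy": "%m%d%y",
    "ddmmyy": "%d%m%y",
    "yyddmm": "%y%d%m",
    "yymmdd": "%y%m%d",
}


def interpretations(value):
    """Return every calendar date the six-digit value could represent."""
    results = {}
    for name, pattern in FORMATS.items():
        try:
            parsed = datetime.strptime(value, pattern)
        except ValueError:
            # The digits do not form a real date under this layout.
            continue
        results[name] = parsed.date().isoformat()
    return results


readings = interpretations("100901")
for name, iso_date in readings.items():
    print(name, "->", iso_date)
```

Under these assumptions, the value 100901 is a legal date in all four layouts, and each layout yields a different day. Nothing in the digits themselves tells the receiver which reading the sender intended, which is why the format has to be fixed by agreement and enforced by a schema.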
You will learn to control data types with some of the schema approaches presented in this book, such as XDR and XSD. The DTD schema methodology, on the other hand, does not provide easy mechanisms for validating data types. This is one of the differences between DTD schemata and other approaches. You'll read about other differences on Days 5 and 6. Then, when creating your own schema, you will be able to choose the methodology that best suits your needs.
NOTE
It is possible to describe data types in a limited fashion using notations in conjunction with DTD schemata; however, this approach is not ideal.
Table 3.2 provides a few examples of data types that might require validation, depending upon the nature of information being exchanged. You have probably used data type validation in other forms of programming.
Table 3.2 Examples of Data Types
| Data Type | Description |
|---|---|
| Boolean | 0 (false) or 1 (true) |
| Char | Single character (for example, "C") |
| String | Series of characters (for example, "CIST 1463") |
| Float | Real number (for example, 123.4567890) |
| Int | Whole number (for example, 5) |
| Date | Formatted as YYYY-MM-DD (for example, 2001-10-09) |
| Time | Formatted as HH:MM:SS (for example, 18:45:00) |
| Id | Text that uniquely identifies an element or attribute |
| Idref | Reference to an id |
| Enumeration | Series of values from which one can be chosen |
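As a rough illustration of what data type validation involves, the following Python sketch checks text values against several of the types in Table 3.2. It is a minimal sketch, not the rule set of XDR, XSD, or any particular parser: the function names are hypothetical, and the accepted forms simply follow the descriptions in the table.

```python
import re
from datetime import date, time

# Hypothetical checkers modeled on the descriptions in Table 3.2.
# A schema language defines its own, usually more detailed, rules.


def is_boolean(text):
    """Accept only the two forms listed in the table: 0 or 1."""
    return text in ("0", "1")


def is_char(text):
    """Accept exactly one character."""
    return len(text) == 1


def is_int(text):
    """Accept an optionally signed whole number."""
    return re.fullmatch(r"[+-]?[0-9]+", text) is not None


def is_float(text):
    """Accept a simple decimal number such as 123.4567890."""
    return re.fullmatch(r"[+-]?[0-9]+(\.[0-9]+)?", text) is not None


def is_date(text):
    """Accept YYYY-MM-DD, and only if it names a real calendar day."""
    match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", text)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:
        return False
    return True


def is_time(text):
    """Accept HH:MM:SS, and only if it names a real time of day."""
    match = re.fullmatch(r"([0-9]{2}):([0-9]{2}):([0-9]{2})", text)
    if match is None:
        return False
    try:
        time(*(int(part) for part in match.groups()))
    except ValueError:
        return False
    return True


def is_enumeration(text, allowed):
    """Accept a value only if it is one of the permitted choices."""
    return text in allowed
```

With checks like these, a value such as 2001-10-09 passes the date test, whereas 100901 or 2001-02-30 does not. The point of a schema language with data typing is that the parser applies comparable checks for you, so the rules live in the schema rather than in each application.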
Data typing qualifies the nature of particular data elements. Structure in a document is a separate consideration: it is a scheme for organizing related pieces of information. The order and placement of elements in XML is often part of the defined structure of the XML instance document. In the next section, you will read about some of the issues related to structure in XML.