Customizing the PO Schema Design Model
Consider the following sample XML document, which is a fragment from an example in the XSD Primer:
<ipo:purchaseOrder
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ipo="http://www.example.com/IPO"
    orderDate="1999-12-01">
  <ipo:shipTo exportCode="1" xsi:type="ipo:UKAddress">
    <ipo:name>Helen Zoe</ipo:name>
    <ipo:street>47 Eden Street</ipo:street>
    <ipo:city>Cambridge</ipo:city>
    <ipo:postcode>CB1 1JR</ipo:postcode>
  </ipo:shipTo>
  . . .
</ipo:purchaseOrder>
We'll use this instance to derive requirements for refining the UML model. These design requirements are divided into four categories:
Should the attributes of a UML class be produced as XML attributes or child elements in the schema?
Which kind of model group (all, sequence, or choice) should be used to validate an element's content?
Should the XML document include or omit the element tags that represent class names and association roles from the UML model?
How do we map UML class names to XML element names?
The UML class diagram shown in Figure 1 includes profile extensions that resolve all of these design choices. This purchase order model should be very familiar by now. It was presented as a conceptual model of the vocabulary in the first two articles in this series, and is now refined to include stereotypes and properties that specify the XML schema design model. Note that this is the same structure shown in the previous diagrams; only a few labels have been added.
Figure 1 Design model of purchase order vocabulary.
After applying these profile extensions, the following schema is produced for the PurchaseOrder class and its associations:
<xs:element name="purchaseOrder" type="ipo:PurchaseOrder"/>
<xs:complexType name="PurchaseOrder">
  <xs:sequence>
    <xs:element name="shipTo" type="ipo:Address"/>
    <xs:element name="billTo" type="ipo:Address"/>
    <xs:element name="comment" type="xs:string" minOccurs="0" maxOccurs="1"/>
    <xs:element name="items" minOccurs="0" maxOccurs="1">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="ipo:item" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
  <xs:attribute name="orderDate" type="xs:date"/>
</xs:complexType>
The sample purchase order instance document includes two XML attributes: orderDate on the purchaseOrder element, and exportCode on the shipTo element. By assigning an <<XSDattribute>> stereotype to the orderDate attribute in UML, we specify that it should be represented as an attribute in XML. The exportCode attribute is similarly stereotyped on the UKAddress class, although it's not shown here. The comment UML attribute in the PurchaseOrder class follows the default mapping to an element in the schema.
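To make the contrast concrete, here is a minimal sketch of the two mappings for orderDate. The first type repeats the attribute declaration from the schema above, trimmed to the parts that matter here. The second shows what the default element mapping would presumably produce if the stereotype were removed; the type name PurchaseOrderDefault, the minOccurs value, and the elementFormDefault setting are assumptions made for illustration, not output copied from the tool.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ipo="http://www.example.com/IPO"
           targetNamespace="http://www.example.com/IPO"
           elementFormDefault="qualified">

  <!-- With the XSDattribute stereotype: orderDate is an XML attribute -->
  <xs:complexType name="PurchaseOrder">
    <xs:sequence>
      <xs:element name="comment" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="orderDate" type="xs:date"/>
  </xs:complexType>

  <!-- Assumed default mapping (hypothetical type name): orderDate is a child element -->
  <xs:complexType name="PurchaseOrderDefault">
    <xs:sequence>
      <xs:element name="comment" type="xs:string" minOccurs="0"/>
      <xs:element name="orderDate" type="xs:date" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

In an instance document, the first form is written as orderDate="1999-12-01" on the purchaseOrder start tag, while the second would appear as a nested orderDate element.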
The XSD Primer uses a <sequence> model group for all complexType content, whereas the default UML mapping uses an <all> unordered content model. To modify this mapping, we assign the <<XSDcomplexType>> stereotype to the PurchaseOrder class and set the modelGroup property to 'sequence'.
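As a rough illustration, the default unordered mapping for PurchaseOrder might look like the fragment below, with an all group in place of the sequence group. This is a sketch under assumptions: the items element is left out for brevity, and the Address type is an abbreviated stand-in built from the child elements seen in the sample instance.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ipo="http://www.example.com/IPO"
           targetNamespace="http://www.example.com/IPO"
           elementFormDefault="qualified">

  <!-- Assumed default output: children may appear in any order -->
  <xs:complexType name="PurchaseOrder">
    <xs:all>
      <xs:element name="shipTo" type="ipo:Address"/>
      <xs:element name="billTo" type="ipo:Address"/>
      <xs:element name="comment" type="xs:string" minOccurs="0"/>
    </xs:all>
    <xs:attribute name="orderDate" type="xs:date"/>
  </xs:complexType>

  <!-- Abbreviated stand-in for the Address type (assumption) -->
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

One constraint to keep in mind: in XML Schema 1.0, every element inside an all group may occur at most once, so a repeating child cannot be placed directly in such a group.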
But the use of a sequence model group raises a new issue when mapping from UML to XML schemas. UML attributes and associations are inherently unordered within their owning class. So each UML attribute and association end that is part of a sequence group must be annotated with a profile property that specifies its position. These position property values are shown as annotations in Figure 1. The procedure for adding profile stereotypes and property values is different in each UML tool, although any tool that claims compliance with the UML specification must provide some means for adding them.
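The practical effect of the position values shows up when validating instances. The hypothetical document below places billTo before shipTo, with the address content replaced by comments. Under the default all model group this ordering would be accepted, but a validator using the sequence-based PurchaseOrder type shown earlier reports an error at billTo, because shipTo is expected first.

```xml
<ipo:purchaseOrder xmlns:ipo="http://www.example.com/IPO"
                   orderDate="1999-12-01">
  <!-- Out of order for the sequence group: billTo appears before shipTo -->
  <ipo:billTo>
    <!-- address content omitted -->
  </ipo:billTo>
  <ipo:shipTo>
    <!-- address content omitted -->
  </ipo:shipTo>
</ipo:purchaseOrder>
```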
Under the default mapping rules, an Address element (or an element for one of its subclasses) is contained within the association role elements shipTo and billTo (see Part 2 of this series). The required instance document, however, omits the Address tag and embeds the address's element and attribute content directly within the role tag. To specify this design choice, the <<XSDelement>> stereotype is assigned to the association ends connected to the Address class, and the anonymousType property is set to 'true'. The stereotype label is omitted from the diagram to minimize clutter, but the tagged value properties are listed within curly braces ({}).
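For comparison, here is a sketch of how the shipTo content from the sample document might look under the default mapping, where the class element is nested inside the role element. The exact shape is an assumption based on the description above; the rest of the order is omitted, and only this one design choice is varied.

```xml
<ipo:purchaseOrder xmlns:ipo="http://www.example.com/IPO"
                   orderDate="1999-12-01">
  <ipo:shipTo>
    <!-- Assumed default mapping: the class name appears as its own element -->
    <ipo:UKAddress exportCode="1">
      <ipo:name>Helen Zoe</ipo:name>
      <ipo:street>47 Eden Street</ipo:street>
      <ipo:city>Cambridge</ipo:city>
      <ipo:postcode>CB1 1JR</ipo:postcode>
    </ipo:UKAddress>
  </ipo:shipTo>
  <!-- remaining content omitted -->
</ipo:purchaseOrder>
```

With anonymousType set to 'true', the UKAddress wrapper disappears, and the subclass is identified by the xsi:type attribute on shipTo instead, as in the sample document at the start of this section.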
Because the items role on the association to the Item class is not specified as an anonymousType, its definition in the schema shown above retains the role's container element to hold elements for the related class. The document instance for purchase order items looks like this:
<ipo:purchaseOrder>
  <ipo:items>
    <ipo:item partNum="833-AA">
      <ipo:productName>Lapis necklace</ipo:productName>
      <ipo:quantity>1</ipo:quantity>
      <ipo:USPrice>99.95</ipo:USPrice>
      <ipo:comment>Want this for the holidays!</ipo:comment>
      <ipo:shipDate>1999-12-05</ipo:shipDate>
    </ipo:item>
  </ipo:items>
</ipo:purchaseOrder>
In this situation I use the phrase anonymous type with a slightly different, more general meaning than is used in the W3C XML Schema specification. It's easiest to understand the meaning I intend by looking at the UML class diagram in Figure 1 rather than at the XSD Schema document. In the class diagram, if an association end is marked as an anonymousType, the name of the associated class is anonymous when its instances appear in XML documents, regardless of which schema language is actually used to define those documents. The concept of anonymous types is realized differently in different schema languages.
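For readers who want the narrower W3C meaning side by side, the sketch below shows it: a complexType is anonymous when it has no name attribute and is defined inline within an element declaration. The items element in the generated schema is anonymous in this W3C sense, even though its association end is not marked as an anonymousType in the UML model. The item stand-in declaration and the names Items and itemList are hypothetical, added only so the fragment is complete.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ipo="http://www.example.com/IPO"
           targetNamespace="http://www.example.com/IPO"
           elementFormDefault="qualified">

  <!-- Stand-in declaration (assumption); the real item type is richer -->
  <xs:element name="item" type="xs:string"/>

  <!-- W3C sense: the complexType is anonymous because it has no name -->
  <xs:element name="items">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="ipo:item" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Named alternative (hypothetical names): the type can be reused -->
  <xs:complexType name="Items">
    <xs:sequence>
      <xs:element ref="ipo:item" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="itemList" type="ipo:Items"/>

</xs:schema>
```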
You may have noticed that the XML document elements for purchaseOrder and item begin with a lowercase letter; this is often called "lower camel case" format. By default, however, the mapping from UML gives each element the same name as its class, and the class names begin with uppercase letters. The "upper camel case" convention used in the UML diagram is common in object-oriented models and languages, whereas current XML schema vocabularies follow a variety of conventions.
This issue is resolved by adding an elementNameMapping property to a UML class along with the <<XSDcomplexType>> stereotype. This profile property allows an XML schema designer to choose a preferred naming convention when modeling the schema details. Like many other profile properties, this value can be set as a default for the entire model so that all class names will be mapped to XML element names in the same way.
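The effect on the generated schema can be sketched as follows. The type name stays the same in both cases; only the global element name changes. The PurchaseOrder type is trimmed to a single attribute here, and the two element declarations are shown together purely for comparison, since a real schema would contain one or the other.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ipo="http://www.example.com/IPO"
           targetNamespace="http://www.example.com/IPO"
           elementFormDefault="qualified">

  <!-- Trimmed version of the PurchaseOrder type -->
  <xs:complexType name="PurchaseOrder">
    <xs:attribute name="orderDate" type="xs:date"/>
  </xs:complexType>

  <!-- Default mapping: the element name equals the UML class name -->
  <xs:element name="PurchaseOrder" type="ipo:PurchaseOrder"/>

  <!-- With elementNameMapping set to a lower camel case convention -->
  <xs:element name="purchaseOrder" type="ipo:PurchaseOrder"/>

</xs:schema>
```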