2.3 Defining the Model
Let's put aside the philosophy for now and take a closer look at what we're really describing with an EMF model. We saw in Section 2.1 that our conceptual model could be defined in several different ways; that is, in Java, UML, or XML Schema. But, what exactly are the common concepts we're talking about when describing a model? Let's look at our purchase order example again. Recall that our simple model included the following:
- PurchaseOrder and Item, which in UML and Java map to class definitions, but in XML Schema map to complex type definitions.
- shipTo, billTo, productName, quantity, and price, which map to attributes in UML, get()/set() method pairs (or bean properties, if you want to look at it that way) in Java, and in the XML Schema are nested element declarations.
- items, which is a UML association end or reference, a get() method in Java, and in XML Schema, a nested element declaration of another complex type.
As you can see, a model is described using concepts that are at a higher level than simple classes and methods. Attributes, for example, represent pairs of methods, and as you'll see when we look deeper into the EMF implementation, they also have the ability to notify observers (e.g., UI views) and be saved to, and loaded from, persistent storage. References are more powerful yet, because they can be bidirectional, in which case referential integrity is maintained. References can also be persisted across multiple resources (documents), where demand load and proxy resolution come into play.
To define a model using these kinds of "model parts" we need a common terminology to describe them. More important, to implement the EMF tools and generator, we also need a model for the information. We need a model for describing EMF models; that is, a metamodel.
2.3.1 The Ecore (Meta) Model
The model used to represent models in EMF is called Ecore. Ecore is itself an EMF model, and thus is its own metamodel. You could say that makes it a meta-metamodel. People often get confused when talking about meta-metamodels (metamodels in general, for that matter), but the concept is actually quite simple. A metamodel is simply the model of a model, and if that model is itself a metamodel, then the metamodel is in fact a meta-metamodel.4 Got it? If not, don't worry about it, as it's really just an academic issue anyway.
A simplified subset of the Ecore metamodel is shown in Figure 2.3. This diagram only shows the parts of Ecore needed to describe our purchase order example, and we've taken the liberty of simplifying it a bit to avoid showing base classes. For example, in the real Ecore metamodel the classes EClass, EAttribute, and EReference share a common base class, ENamedElement, which defines the name attribute that here we've shown explicitly in the classes themselves.
Figure 2.3 A simplified subset of the Ecore metamodel.
As you can see, there are four Ecore classes needed to represent our model:
- EClass is used to represent a modeled class. It has a name, zero or more attributes, and zero or more references.
- EAttribute is used to represent a modeled attribute. Attributes have a name and a type.
- EReference is used to represent one end of an association between classes. It has a name, a boolean flag to indicate if it represents containment, and a reference (target) type, which is another class.
- EDataType is used to represent the type of an attribute. A data type can be a primitive type like int or float or an object type like java.util.Date.
Notice that the names of the classes correspond most closely to the UML terms. This is not surprising because UML stands for Unified Modeling Language. In fact, you might be wondering why UML isn't "the" EMF model. Why does EMF need its own model? Well, the answer is quite simply that Ecore is a small and simplified subset of full UML. Full UML supports much more ambitious modeling than the core support in EMF. UML, for example, allows you to model the behavior of an application, as well as its class structure. We'll talk more about the relationship of EMF to UML and other standards in Section 2.6.
We can now use instances of the classes defined in Ecore to describe the class structure of our application models. For example, we describe the purchase order class as an instance of EClass named "PurchaseOrder". It contains two attributes (instances of EAttribute that are accessed via eAttributes) named "shipTo" and "billTo", and one reference (an instance of EReference that is accessed via eReferences) named "items", for which eReferenceType (its target type) is equal to another EClass instance named "Item". These instances are shown in Figure 2.4.
Figure 2.4 The purchase order Ecore instances.
When we instantiate the classes defined in the Ecore metamodel to define the model for our own application, we are creating what we call an Ecore model.
2.3.2 Creating and Editing the Model
Now that we have these Ecore objects to represent a model in memory, EMF can read from them to, among other things, generate implementation code. You might be wondering, though, how do you create the model in the first place? The answer is that you need to build it from whatever input form you start with. If you start with Java interfaces, EMF will introspect them and build the Ecore model. If, instead, you start with an XML Schema, then the model will be built from that. If you start with UML, there are three possibilities:
- Direct Ecore editing. EMF includes a simple tree-based sample editor for Ecore. If you'd rather use a graphical tool, the Ecore Tools project5 provides a graphical Ecore editor based on UML notation. Third-party options are also available, including Topcased's Ecore Editor (http://www.topcased.org/), Omondo's EclipseUML (http://www.omondo.com/) and Soyatec's eUML (http://www.soyatec.com/).
- Import from UML. The EMF Project and EMF Model wizards provide an extensible framework, into which model importers can be plugged, supporting different model formats. EMF provides support for Rational Rose (.mdl files) only. The reason Rose has this special status is because it's the tool that was used to "bootstrap" the implementation of EMF itself. The UML2 project6 also provides a model importer for standard UML 2.x models.
- Export from UML. This is similar to the second option, but the conversion support is provided exclusively by the UML tool. It is invoked from within the UML tool, instead of from an EMF wizard.
As you might imagine, the first option is the most desirable. With it, there is no import or export step in the development process. You simply edit the model and then generate. Also, unlike the other options, you don't need to worry about the Ecore model being out of sync with the tool's own native model. The other two approaches require an explicit reimport or reexport step whenever the UML model changes.
The advantage of the second and third options is that you can use the UML tool to do more than just your EMF modeling. You can use the full power of UML and whatever fancy features the particular tool has to offer. If it supports its own code generation, for example, you can use the tool to define your Ecore model, and also to both define and generate other parts of your application. As long as a mechanism for conversion to Ecore is provided, that tool will also be usable as an input source for EMF and its generator.
2.3.3 XMI Serialization
By now you might be wondering what the serialized form of an Ecore model is. Previously, we've observed that the "conceptual" model is represented in at least three physical places: Java code, XML Schema, or a UML diagram. Should there be just one form that we use as the primary, or standard, representation? If so, which one should it be?
Believe it or not, we actually have yet another (i.e., a fourth) persistent form that we use as the canonical representation: XML Metadata Interchange (XMI). Why did we need another one? We weren't exactly short of ways to represent the model persistently.
For starters, Java code, XML Schema, and UML all carry additional information beyond what is captured in an Ecore model. Moreover, none of these forms is required in every scenario in which EMF can be used. Java code was the only one required in our running example, but as we will soon see, even it is optional in some cases. So, what we need is a direct serialization of Ecore, which doesn't add any extra information. XMI fits the bill here, as it is a standard for serializing metadata concisely using XML.
Serialized as an Ecore XMI file, our purchase order model looks something like this:
<?xml version="1.0" encoding="UTF-8"?> <ecore:EPackage xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore" name="po" nsURI="http://www.example.com/SimplePO" nsPrefix="po"> <eClassifiers xsi:type="ecore:EClass" name="PurchaseOrder"> <eStructuralFeatures xsi:type="ecore:EAttribute" name="shipTo" eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EString"/> <eStructuralFeatures xsi:type="ecore:EAttribute" name="billTo" eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EString"/> <eStructuralFeatures xsi:type="ecore:EReference" name="items" upperBound="-1" eType="#//Item" containment="true"/> </eClassifiers> <eClassifiers xsi:type="ecore:EClass" name="Item"> <eStructuralFeatures xsi:type="ecore:EAttribute" name="productName" eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EString"/> <eStructuralFeatures xsi:type="ecore:EAttribute" name="quantity" eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EInt"/> <eStructuralFeatures xsi:type="ecore:EAttribute" name="price" eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EFloat"/> </eClassifiers> </ecore:EPackage>
Notice that the XML elements correspond directly to the Ecore instances back in Figure 2.4, which makes perfect sense because this is a serialization of exactly those objects. Here we've hit an important point: because Ecore meta-data is not the same as, for example, UML metadata, XMI serializations of the two are not the same either.7
2.3.4 Java Annotations
Let's revisit the issue of defining an Ecore model using Java interfaces. Previously we implied that when provided with ordinary Java interfaces, EMF "would" introspect them and deduce the model properties. That's not exactly the case. The truth is that given interfaces containing standard get() methods,8 EMF could deduce the model attributes and references. EMF does not, however, blindly assume that every interface and method in it is part of the model. The reason for this is that the EMF generator is a code-merging generator. It generates code that not only is capable of being merged with user-written code, it's expected to be.
Because of this, our PurchaseOrder interface isn't quite right for use as a model definition. First of all, the parts of the interface that correspond to model elements whose implementation should be generated need to be indicated. Unless explicitly marked with an @model annotation in the Javadoc comment, a method is not considered to be part of the model definition. For example, interface PurchaseOrder needs the following annotations:
/** * @model */ public interface PurchaseOrder { /** * @model */ String getShipTo(); /** * @model */ String getBillTo(); /** * @model type="Item" containment="true" */ List getItems(); }
Here, the @model tags identify PurhaseOrder as a modeled class, with two attributes, shipTo and billTo, and a single reference, items. Notice that both attributes, shipTo and billTo, have all their model information available through Java introspection; that is, they are simple attributes of type String. No additional model information appears after their @model tags, because only information that is different from the default needs to be specified.
There is some non-default model information needed for the items reference. Because the reference is multiplicity-many, indicated by the fact that getItems() returns a List, we need to specify the target type of the reference as type="Item".9 We also need to specify containment="true" to indicate that we want purchase orders to be a container for their items and serialize them as children.
Notice that the setShipTo() and setBillTo() methods are not required in the annotated interface. With the annotations present on the get() method, we don't need to include them; once we've identified the attributes (which are settable by default), the set() methods will be generated and merged into the interface if they're not already there.
2.3.5 The Ecore "Big Picture"
Let's recap what we've covered so far.
- Ecore, and its XMI serialization, is the center of the EMF world.
- An Ecore model can be created from any of at least three sources: a UML model, an XML Schema, or annotated Java interfaces.
- Java implementation code and, optionally, other forms of the model can be generated from an Ecore model.
We haven't talked about it yet, but there is one important advantage to using XML Schema to define a model: given the schema, instances of the model can be serialized to conform to it. Not surprisingly, in addition to simply defining the model, the XML Schema approach is also specifying something about the persistent form of the instances.
One question that comes to mind is whether there are other persistent model forms possible. Couldn't we, for example, provide a relational database (RDB) Schema and produce an Ecore model from it? Couldn't this RDB Schema also be used to specify the persistent format, similar to the way XML Schema does? The answer is, quite simply, yes. This is one type of function that EMF is intended to support, and certainly not the only kind. The "big picture" is shown in Figure 2.5.
Figure 2.5 An Ecore model and its sources.