The Lost Opportunity
Unfortunately, the Unified Modeling Language has failed in one of its main objectives. In fact, I would argue that it fails in the very area that should hold its greatest promise: notation.
Graphical Notations: Why the Big Deal?
A design notation is intended as a means of communication. It is a lingua franca between engineers, ensuring unambiguous transfer of design ideas. Graphical notations are ideal for this in a number of ways:
- Graphical notations transcend language boundaries. If an English designer draws a circuit diagram, a non–English-speaking electronics engineer can read that diagram and understand it.
- Graphical notations scale. You can cram an awful lot of elements into a small space using well-designed graphical notations. For example, it's not uncommon to find electronic designs with 50 or more components in a diagram on an A4-sized piece of paper. One key reason for this is that "a picture paints a thousand words": you can put an awful lot of semantic meaning into a well-designed icon. To express the same amount of information in words takes up much more space.
- Graphical notations aid the identification of patterns. The eye/brain combination is very good at identifying visual patterns. Even in a complex diagram, the eye can recognize familiar graphical combinations.
Why UML Is Not a Graphical Notation
But UML is a graphical notation, I hear you cry! It's true that UML has many graphical elements, but I would argue that in some key areas it falls short:
- Reliance on textual annotations. Many adornments rely on textual descriptions, for example the include/extend relationships on use case diagrams or the {ordered} constraint on an association. There is no reason why graphical icons could not have been used for these. It's not just a case of not being able to read the words because of language differences, although that is an extra potential hazard. It is about the ability to scan a diagram and quickly comprehend what it is trying to communicate.
- Reliance on stereotypes. This reliance on text really starts to strike home when we consider the different types of classes that all use the same icon. Compare transistors in electronic engineering: there are at least 20 different, although similar, icons for a transistor, reflecting their different characteristics. Just a few of these are shown in Figure 1.
Figure 1: Some icons for a transistor
Why not do something similar for the different types of classes? For example, why not distinguish graphically between:
- Meta classes
- Abstract classes
- Factory classes
- Class utilities
- Mix-ins
- Nonstandard graphical icons via stereotypes. Ironically, one of the features of UML often held to be a strength is in fact a potential weakness: the ability to use stereotypes to change the icon representing an entity. This is most usually (and most sensibly) applied to physical deployment diagrams, where the default "node" icon is replaced by a clip art representation of the node, such as a Cisco router or a Sun server. This makes some sense. However, taking this further, there is nothing to stop a designer from supplying his or her own icons for things such as classes, states, and use cases. This would render the notation completely unreadable to anyone unfamiliar with the changes, defeating the purpose of a standard notation. (The reason for allowing these changes is to permit extension of the notation to cover new requirements, a laudable but flawed goal.)
- Overloading of symbols within a single diagram type. Components are probably the best (or should that be worst?) example of this. In a single diagram, a component symbol can represent a header file, an implementation file, a library, a folder containing multiple libraries, and so on. In another diagram context, the same symbol can represent a COM object (with interface adornments), a logical subsystem, or a class category. (Many of these are better represented by packages than by components, but the local confusion remains.)
Nodes in a deployment diagram suffer the same fate. Everything from a mainframe to a PC to a router gets the same single square-box node icon. Here stereotypes are recommended to differentiate, but once again we must rely on reading text to understand exactly what we are looking at. The alternative involves defining our own set of icons, which others may or may not understand. Worse still, they may think that they understand them but in fact make the wrong assumptions, which is exactly what a formal design notation is supposed to prevent!
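To make the earlier list of class kinds concrete, here is a minimal Python sketch of what those kinds can look like in code. Every name in it (`Shape`, `DescribeMixin`, `Square`, `ShapeFactory`, `Geometry`, `RegisteringMeta`, `Plugin`) is hypothetical and chosen only for illustration. The point is that these are structurally different things, yet a class diagram would draw each of them as the same rectangle, told apart at best by a text stereotype or a typographic convention.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: defines an interface and cannot be instantiated."""

    @abstractmethod
    def area(self) -> float:
        ...


class DescribeMixin:
    """Mix-in: adds behaviour to other classes; not meant to stand alone."""

    def describe(self) -> str:
        # Relies on the host class providing area().
        return f"{type(self).__name__} with area {self.area():.1f}"


class Square(DescribeMixin, Shape):
    """Ordinary concrete class combining the mix-in and the abstract base."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class ShapeFactory:
    """Factory class: its only job is to create other objects."""

    @staticmethod
    def create(kind: str, size: float) -> Shape:
        if kind == "square":
            return Square(size)
        raise ValueError(f"unknown shape kind: {kind}")


class Geometry:
    """Class utility: a namespace of functions with no instance state."""

    @staticmethod
    def perimeter_of_square(side: float) -> float:
        return 4 * side


class RegisteringMeta(type):
    """Metaclass: a class whose instances are themselves classes."""

    registry: list = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.registry.append(name)  # record every class built by this metaclass
        return cls


class Plugin(metaclass=RegisteringMeta):
    """A class created by RegisteringMeta, so it is recorded in the registry."""
```

Six classes, five distinct roles, and nothing in the standard class icon to tell a reader which is which without reading the labels.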
What's Missing?
The deficiencies described are not the only areas where the UML is weak. UML is one of the few software design notations to even consider the physical aspects of the design process, so it might seem unfair to criticize this aspect, yet it is another case of being so close and yet not quite hitting the target. Two notable gaps exist in the design notation for describing the physical aspects of a system, as discussed next.
Process Diagrams
Grady Booch, in his original Object-Oriented Design book, illustrated the different aspects of design with a box cut into four quadrants. The quadrants represented logical and physical design, each with static and dynamic elements. While UML provides notational elements for static and dynamic aspects of the logical design, and some support for static physical design, the notation is almost completely lacking in any form of representation for dynamic physical design. In particular, the most dynamic of physical elements, the process, is missing entirely from the notation.
It is possible to work around this in various ways, but nonetheless, it is a significant lapse. How does one describe interactions with the operating system services or daemons? What about replication of processes as an element of load balancing? Indeed, what about interprocess messaging both between nodes and within a single node? It is possible to represent these things using objects in an interaction diagram, but these rightly belong to the logical domain and indeed are often not even recognizable objects within the design proper. Finally, there are different types of processes to be represented, such as daemons, scheduled batch jobs, one-shot utilities, and so on. Even using objects to represent processes does not really address these kinds of environmental differences because the same executable could be run as either a batch job or a one-shot process or both, all within the same system.
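One way to picture what a process notation would need to capture is a small data model. The Python sketch below is an illustration only, not a proposal for UML itself: `RunMode`, `Executable`, `Process`, and `modes_by_executable` are hypothetical names, as are the example executables and nodes. It keeps the executable (a static physical artifact) separate from the processes that run it, so the same executable can show up as both a scheduled batch job and a one-shot utility, and a daemon can be replicated for load balancing.

```python
from dataclasses import dataclass
from enum import Enum


class RunMode(Enum):
    DAEMON = "daemon"
    BATCH = "scheduled batch job"
    ONE_SHOT = "one-shot utility"


@dataclass(frozen=True)
class Executable:
    """Static physical element: the program on disk."""
    name: str


@dataclass(frozen=True)
class Process:
    """Dynamic physical element: an executable running somewhere, somehow."""
    executable: Executable
    mode: RunMode
    node: str
    replicas: int = 1  # more than 1 models replication for load balancing


def modes_by_executable(processes):
    """Map each executable name to the set of run modes it is used in."""
    result = {}
    for p in processes:
        result.setdefault(p.executable.name, set()).add(p.mode)
    return result


if __name__ == "__main__":
    report = Executable("report_gen")   # hypothetical executable
    server = Executable("order_server") # hypothetical executable
    deployment = [
        Process(report, RunMode.BATCH, node="batch-host"),
        Process(report, RunMode.ONE_SHOT, node="admin-pc"),
        Process(server, RunMode.DAEMON, node="app-host", replicas=3),
    ]
    for name, modes in modes_by_executable(deployment).items():
        print(name, sorted(m.value for m in modes))
```

An object in an interaction diagram has no natural place for `mode`, `node`, or `replicas`, which is the gap described above.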
Network Variations
Most software engineers represent networks as lines connecting two nodes or as a fluffy cloud. In the real world, especially in the sphere of systems integration, networks come in many forms with distinct characteristics:
- Circuit-switched
- Packet-switched
- Virtual circuits
- Analog
- Digital
These characteristics make a big difference in the way that a large system operates and is constructed. Indeed, the very fact that multiple network types exist within a system and that these networks are not natively compatible is an issue worth highlighting in a design. It would be nice if both point-to-point network links and whole network "clouds" had distinct representations in UML showing the nature of the link. Whether an Ethernet LAN is using a hub or a switch is likely to make a significant difference to how the system will perform under heavy load, probably a much bigger difference than how the code is written! At present, we have only a single node icon and some lines to represent all aspects of the hardware and network deployment.
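A rough sketch can show why these distinctions deserve a place in the design. The Python below is illustrative and rests on stated assumptions: all names (`Switching`, `Signal`, `LanDevice`, `Network`, `natively_compatible`, `best_case_share_mbps`) are hypothetical, splitting the list above into a switching axis and a signal axis is a modelling choice made here, and the bandwidth function is an idealized best case. It treats a hub as one shared medium divided among the active hosts and a switch as giving each port the full link rate, ignoring collisions, backplane limits, and uplink contention.

```python
from dataclasses import dataclass
from enum import Enum


class Switching(Enum):
    CIRCUIT = "circuit-switched"
    PACKET = "packet-switched"
    VIRTUAL_CIRCUIT = "virtual circuit"


class Signal(Enum):
    ANALOG = "analog"
    DIGITAL = "digital"


class LanDevice(Enum):
    HUB = "hub"
    SWITCH = "switch"


@dataclass(frozen=True)
class Network:
    name: str
    switching: Switching
    signal: Signal


def natively_compatible(a: Network, b: Network) -> bool:
    """Simplifying assumption: networks interoperate directly only
    when both characteristics match; otherwise a gateway is needed."""
    return a.switching is b.switching and a.signal is b.signal


def best_case_share_mbps(link_rate_mbps: float, active_hosts: int,
                         device: LanDevice) -> float:
    """Idealized per-host bandwidth when every active host transmits."""
    if active_hosts < 1:
        raise ValueError("active_hosts must be at least 1")
    if device is LanDevice.HUB:
        # A hub is a single shared medium: hosts divide the link rate.
        return link_rate_mbps / active_hosts
    # A switch gives each port its own full-rate link (best case).
    return link_rate_mbps


if __name__ == "__main__":
    lan = Network("office LAN", Switching.PACKET, Signal.DIGITAL)
    voice = Network("voice lines", Switching.CIRCUIT, Signal.ANALOG)
    print(natively_compatible(lan, voice))
    print(best_case_share_mbps(100.0, 20, LanDevice.HUB))
    print(best_case_share_mbps(100.0, 20, LanDevice.SWITCH))
```

Even in this crude model, the same 20 hosts on the same link rate differ by a factor of 20 in best-case share depending on the device, a fact that a plain line between two node icons cannot convey.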