1.3 The Changing Nature of Software Languages
When you want to create software languages, you need to have a clear picture of their character. Over the past two decades, the nature of software languages has changed in at least two important ways. First, software is increasingly being built using graphical (visual) languages instead of textual ones. Second, more and more languages have multiple syntaxes.
1.3.1 Graphical versus Textual Languages
There is an important difference between graphical and textual languages. In the area of sound, a parallel to this difference is clearly expressed in the following quote, which contrasts the spoken word with music.
- It is helpful to compare the linear structure of text with the flow of musical sounds. The mouth as the organ of speech has rather limited abilities. It can utter only one sound at a time, and the flow of these sounds can be additionally modulated only in a very restricted manner, e.g., by stress, intonation, etc. On the contrary, a set of musical instruments can produce several sounds synchronously, forming harmonies or several melodies going in parallel. This parallelism can be considered as nonlinear structuring. The human had to be satisfied with the instrument of speech given to him by nature. This is why we use while speaking a linear and rather slow method of acoustic coding of the information we want to communicate to somebody else. [Bolshakov and Gelbukh 2004]
In the same manner, we can distinguish between textual and graphical software languages. An expression in a textual language has a linear structure, whereas an expression in a graphical language has a more complex, parallel, nonlinear structure. Each sentence in a textual language is a series of juxtaposed symbols, or tokens as they are called in the field of parser design. In graphical languages, symbols, such as a rectangle or an arrow, are also the basic building blocks of an expression. The essential difference between textual and graphical languages is that in graphical languages, the symbols can be connected in more than one way.
To exemplify this, I have recreated a typical nonlinear expression in a linear manner. Figure 1-2(a) shows what a Unified Modeling Language (UML) class diagram would need to look like if it were expressed in a linear fashion. The normal, nonlinear way of expressing the same meaning is shown in Figure 1-2(b). The problem with the linear expression is that the same object (Company) needs to appear twice because, in a linear sequence, it cannot be connected to more than one other element at the same time. This means that you need to reconcile the two occurrences of that object, because both occurrences represent the same thing.
Figure 1-2 A linear and a nonlinear expression
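The reconciliation step can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class names Person and Project and the association names are hypothetical (only Company is taken from the text), and the triple-per-sentence encoding is one possible linear form, not the notation used in Figure 1-2.

```python
# Linear form: every sentence is a flat sequence of three tokens.
# Person, Project, worksFor, and ownedBy are hypothetical names;
# only Company comes from the text.
linear_sentences = [
    ["Person", "worksFor", "Company"],
    ["Project", "ownedBy", "Company"],
]


def occurrences(sentences, name):
    """Count how often a name is written down in the linear form."""
    return sum(sentence.count(name) for sentence in sentences)


def reconcile(sentences):
    """Merge same-named occurrences into one node.

    Returns a mapping from node name to its outgoing (association, target)
    edges, which is a simple nonlinear (graph) representation.
    """
    graph = {}
    for source, association, target in sentences:
        graph.setdefault(source, []).append((association, target))
        graph.setdefault(target, [])  # make sure the target exists as a node
    return graph


graph = reconcile(linear_sentences)

# After reconciliation, the single Company node is connected to two elements.
incoming = sorted(
    (source, association)
    for source, edges in graph.items()
    for association, target in edges
    if target == "Company"
)
```

In the linear form, Company is written twice. After reconciliation, there is one Company node with two incoming connections, which is what the graphical notation expresses directly.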
Note that many languages have a hybrid textual/graphical syntax. For instance, the notation for attributes and operations in a UML class diagram is a textual syntax embedded in a graphical one.
The traditional theory of computer languages, namely compiler technology, is focused on textual languages. Therefore, without losing the valuable knowledge gained in this area, we need to explore other paths that lead toward the creation of graphical languages.
1.3.2 Multiple Syntaxes
Another aspect of current-day software languages is the fact that they often have multiple (concrete) syntaxes. The mere fact that many languages have a separate interchange format (often XML based) means that they have both a normal syntax and an interchange syntax. At the same time, there is a growing need for languages that have both a graphical and a textual syntax, as shown by such tools as TogetherJ, which uses UML as a graphical notation for Java. The UML itself is a good example of a multisyntax language. There is the well-known UML diagram notation [OMG-UML Superstructure 2005], which is combined with the Human-Usable Textual Notation (HUTN) [OMG-HUTN 2004] and the interchange format called XMI [OMG-XMI 2005].
A natural consequence of multiple syntaxes is that (concrete) syntax cannot be the focus of language design. Every language must have a common representation of a language expression independent of the outward appearance in which it is entered by or shown to the language user. The focus of language design should be on this common representation, which we call abstract syntax.
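To illustrate the idea, the following sketch keeps one common representation and derives two concrete syntaxes from it. Everything in it is an assumption made for illustration: the ClassDecl and Attribute types, the brace-based textual form, and the XML element names are hypothetical and are not HUTN, XMI, or any other standardized notation.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


# Abstract syntax: the common representation, independent of any notation.
# Both types are hypothetical and deliberately minimal.
@dataclass
class Attribute:
    name: str
    type: str


@dataclass
class ClassDecl:
    name: str
    attributes: list = field(default_factory=list)


def to_text(decl):
    """Render the abstract syntax in a hypothetical textual syntax."""
    lines = [f"class {decl.name} {{"]
    lines += [f"  {a.name}: {a.type}" for a in decl.attributes]
    lines.append("}")
    return "\n".join(lines)


def to_xml(decl):
    """Render the same abstract syntax in a hypothetical interchange syntax."""
    root = ET.Element("class", name=decl.name)
    for a in decl.attributes:
        ET.SubElement(root, "attribute", name=a.name, type=a.type)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Read the interchange syntax back into the abstract syntax."""
    root = ET.fromstring(text)
    attributes = [
        Attribute(e.get("name"), e.get("type")) for e in root.findall("attribute")
    ]
    return ClassDecl(root.get("name"), attributes)


company = ClassDecl("Company", [Attribute("name", "String")])
textual_form = to_text(company)
interchange_form = to_xml(company)
round_tripped = from_xml(interchange_form)
```

The two renderings look nothing alike, yet both are produced from, and can be read back into, the same ClassDecl value. Adding a third syntax, such as a diagram renderer, would not require any change to the abstract syntax types.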