The Oslo Modeling Language: An Introduction to "M"
The “Oslo” Modeling Language (M) is a modern, declarative language for working with data. M lets users write down how they want to structure and query their data using a textual syntax that is convenient to both author and read.
M does not mandate how data is stored or accessed, nor does it mandate a specific implementation technology. Rather, M was designed to allow users to write down what they want from their data without having to specify how those desires are met against a given technology or platform. That stated, M in no way prohibits implementations from providing rich declarative or imperative support for controlling how M constructs are represented and executed in a given environment.
M builds on three basic concepts: values, types, and extents. Here’s how M defines these three concepts:
- A value is simply data that conforms to the rules of the M language.
- A type describes a set of values.
- An extent provides dynamic storage for values.
In general, M separates the typing of data from the storage/extent of the data. A given type can be used to describe data from multiple extents as well as to describe the results of a calculation. This allows users to start writing down types first and decide where to put or calculate the corresponding values later.
On the topic of determining where to put values, the M language does not specify how an implementation maps a declared extent to an external store such as an RDBMS. However, M was designed to make such implementations possible and is compatible with the relational model.
Another important aspect of data management that M does not address is that of update. M is a functional language that does not have constructs for changing the contents of an extent. How data changes is outside the scope of the language. That said, M anticipates that the contents of an extent can change via external (to M) stimuli. Subsequent versions of M are expected to provide declarative constructs for updating data.
This chapter provides a non-normative introduction to the fundamental concepts in M. Chapters 2-6 provide the normative definition of the language.
1.1 Values
The easiest way to get started with M is to look at some values. M has intrinsic support for constructing values. The following is a legal value in M:
"Hello, world"
The quotation marks tell M that this is the text value Hello, world. M literals can also be numbers. The following literal:
1
is the numeric value one. Finally, there are two literals that represent logical values:
true
false
We’ve just seen examples of using literals to write down textual, numeric, and logical values. We can also use expressions to write down values that are computed.
An M expression applies an operator to zero or more operands to produce a result. An operator is either a built-in operator (e.g., +) or a user-defined function (which we look at in Section 1.2.5). An operand is a value that is used by the operator to calculate the result of the expression, which is itself a value. Expressions nest, so the operands themselves can be expressions.
M defines two equality operators: equals, ==, and not equals, !=, both of which result in either true or false based on the equivalence/nonequivalence of the two operands. Here are some expressions that use the equality operators:
1 == 1
"Hello" != "hELLO"
true != false
All of these expressions will yield the value true when evaluated.
M defines the standard four relational operators, less-than <, greater-than >, less-than-or-equal <=, and greater-than-or-equal >=, which work over numeric and textual values. M also defines the standard three logical operators: and &&, or ||, and not ! that combine logical values.
The following expressions show these operators in action:
1 < 4
1 == 1
1 < 4 != 1 > 4
!(1 + 1 == 3)
(1 + 1 == 3) || (2 + 2 < 10)
(1 + 1 == 2) && (2 + 2 < 10)
Again, all of these expressions yield the value true when evaluated.
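For readers who want to check these results mechanically, the six expressions can be mirrored in Python. This is only a rough analogue, not M. The grouping of 1 < 4 != 1 > 4 as (1 < 4) != (1 > 4) is inferred from the stated result of true, and the parentheses are required because Python chains comparison operators. The name results is illustrative.

```python
# Python analogue of the six M expressions above (not M itself).
# Python would read `1 < 4 != 1 > 4` as a chained comparison, so the
# grouping that yields true is written out explicitly.
results = [
    1 < 4,
    1 == 1,
    (1 < 4) != (1 > 4),             # true != false
    not (1 + 1 == 3),               # M: !(1 + 1 == 3)
    (1 + 1 == 3) or (2 + 2 < 10),   # M: ||
    (1 + 1 == 2) and (2 + 2 < 10),  # M: &&
]
all_true = all(results)
```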
1.1.1 Collections
All of the values we saw in the previous section were simple values. In M, a simple value is a value that has no uniform way to be decomposed into constituent parts. While there are textual operators that allow you to extract substrings from a text value, those operators are specific to textual data and don’t work on numeric data. Similarly, any “bit-level” operations on binary values don’t apply to text or numeric data.
An M collection is a value that groups together zero or more elements that themselves are values. We can write down collections in expressions using an initializer, { }.
The following expressions each use an initializer to yield a collection value:
{ 1, 2 }
{ 1 }
{ }
As with simple values, the equivalence operators == and != are defined over collections. In M, two collections are considered equivalent if and only if each element has a distinct equivalent element in the other collection. That allows us to write the following equivalence expressions:
{ 1, 2 } == { 1, 2 }
{ 1, 2 } != { 1 }
both of which are true.
The elements of a collection can consist of different kinds of values:
{ true, "Hello" }
and these values can be the result of arbitrary calculation:
{ 1 + 2, 99 - 3, 4 < 9 }
which is equivalent to the collection:
{ 3, 96, true }.
The order of elements in a collection is not significant. That means that the following expression is also true:
{ 1, 2 } == { 2, 1 }
Finally, collections can contain duplicate elements, which are significant. That makes the following expression:
{ 1, 2, 2 } != { 1, 2 }
also true.
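The equality rules just described (order ignored, duplicates counted) behave like multiset comparison. A minimal sketch in Python, assuming hashable elements; m_equal is a hypothetical helper used only for illustration and is not part of M:

```python
from collections import Counter

def m_equal(a, b):
    """Multiset comparison: element order is ignored, duplicates are counted."""
    # Caveat: Python treats True and 1 as equal, which need not match M.
    return Counter(a) == Counter(b)

same_elements = m_equal([1, 2], [1, 2])            # { 1, 2 } == { 1, 2 }
shorter_differs = not m_equal([1, 2], [1])         # { 1, 2 } != { 1 }
order_ignored = m_equal([1, 2], [2, 1])            # { 1, 2 } == { 2, 1 }
duplicates_count = not m_equal([1, 2, 2], [1, 2])  # { 1, 2, 2 } != { 1, 2 }
```

Python lists stand in for M collections here because Python's built-in set would silently drop the duplicates that M treats as significant.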
M defines a set of built-in operators that are specific to collections. The most important is the in operator, which tests whether a given value is an element of the collection. The result of the in operator is a logical value that indicates whether the value is or is not an element of the collection. For example, these expressions:
1 in { 1, 2, 3 }
!(1 in { "Hello", 9 })
both result in true.
M defines a Count member on collections that calculates the number of elements in a collection. This use of that member:
{ 1, 2, 2, 3 }.Count
results in the value 4. The postfix # operator returns the count of a collection, so
{ 1, 2, 2, 3 }# == { 1, 2, 2, 3 }.Count
returns true.
As noted earlier, M collections may contain duplicates. You can apply the Distinct member to get a version of the collection with any duplicates removed:
{ 1, 2, 3, 1 }.Distinct == { 1, 2, 3 }
The result of Distinct is not just a collection but is also a set, that is, a collection of distinct elements.
M also defines set union "|" and set intersection "&" operators, which also yield sets:
({ 1, 2, 3, 1 } | { 1, 2, 4 }) == { 1, 2, 3, 4 }
({ 1, 2, 3, 1 } & { 1, 2, 4 }) == { 1, 2 }
Note that union and intersection always return collections that are sets, even when applied to collections that contain duplicates.
M defines subset and superset tests using the <= and >= operators. Again, these operations convert collections to sets. The following expressions evaluate to true:
{ 1, 2 } <= { 1, 2, 3 }
{ "Hello", "World" } >= { "World" }
{ 1, 2, 1 } <= { 1, 2, 3 }
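The membership, Count, Distinct, union, intersection, and subset examples above can be approximated in Python by converting to sets wherever the text says the result is a set. This is a sketch under that assumption; the helper names distinct, union, intersection, and is_subset are hypothetical, and Python lists stand in for M collections.

```python
def distinct(c):
    """Analogue of the Distinct member: duplicates removed, result is a set."""
    return set(c)

def union(a, b):
    """Analogue of M's | operator: always yields a set."""
    return set(a) | set(b)

def intersection(a, b):
    """Analogue of M's & operator: always yields a set."""
    return set(a) & set(b)

def is_subset(a, b):
    """Analogue of M's <= on collections: both sides are treated as sets."""
    return set(a) <= set(b)

member = 1 in [1, 2, 3]         # 1 in { 1, 2, 3 }
count = len([1, 2, 2, 3])       # { 1, 2, 2, 3 }.Count
unique = distinct([1, 2, 3, 1])
both = union([1, 2, 3, 1], [1, 2, 4])
common = intersection([1, 2, 3, 1], [1, 2, 4])
subset = is_subset([1, 2, 1], [1, 2, 3])
```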
Arguably the most commonly used collection operator is the where operator. The where operator applies a logical expression (called the predicate) to each element in a collection (called the domain) and results in a new collection that consists of only the elements for which the predicate holds true. To allow the element to be used in the predicate, the where operator introduces the symbol value to stand in for the specific element being tested.
For example, consider this expression that uses a where operator:
{ 1, 2, 3, 4, 5, 6 } where value > 3
In this example, the domain is the collection { 1, 2, 3, 4, 5, 6 } and the predicate is the expression value > 3. Note that the identifier value is available only within the scope of the predicate expression. The result of this expression is the collection { 4, 5, 6 }.
M supports a richer set of query comprehensions using a syntax similar to that of Language Integrated Query (LINQ). For example, the where example just shown can be written in long form as follows:
from value in { 1, 2, 3, 4, 5, 6 }
where value > 3
select value
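As a point of comparison, both forms of the where example correspond closely to a filter in Python. The sketch below is an analogue rather than M: the function where is a hypothetical helper, and Python lists are ordered whereas M collections are not.

```python
def where(domain, predicate):
    """Keep the elements of domain for which predicate(value) is true."""
    return [value for value in domain if predicate(value)]

domain = [1, 2, 3, 4, 5, 6]

# Short form: { 1, 2, 3, 4, 5, 6 } where value > 3
short_form = where(domain, lambda value: value > 3)

# Long form: from value in ... where value > 3 select value
long_form = [value for value in domain if value > 3]
```

As in M, the name value is bound only inside the predicate (the lambda body or the comprehension) and is not visible outside it.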
In general, M supports the LINQ operators with these significant exceptions:
- ElementAt/First/Last/Range/Skip are not supported: M collections are unordered and do not support positional access to elements.
- Reverse is not supported: again, position is not significant in M collections.
- Take/TakeWhile/Single: these operators do not exist in M.
- Choose: this selects an arbitrary element.
- ToArray/ToDictionary/ToList: there are no corresponding CLR types in M.
- Cast: typing works differently in M. You can achieve the same effect using a where operator.
While the where operator allows elements to be accessed based on a calculation over the values of each element, there are situations where it would be much more convenient to simply assign a name to each element and then access each element's value by its assigned name. M defines a distinct kind of value called an entity for just this purpose.
1.1.2 Entities
An entity consists of zero or more name-value pairs called fields. Entities can be constructed in M using an initializer. Here’s a simple entity value:
{ X = 100, Y = 200 }
This entity has two fields: one named X with the value of 100, the other named Y with the value of 200.
Entity initializers can use arbitrary expressions as field values:
{ X = 50 + 50, Y = 300 - 100 }
And the names of members can be arbitrary Unicode text:
{ [Horizontal Coordinate] = 100, [Vertical Coordinate] = 200 }
If the member name matches the Identifier pattern, it can be written without the surrounding [ ]. An identifier must begin with an upper or lowercase letter or "_" and be followed by a sequence of letters, digits, "_", and "$".
Here are a few examples:
HelloWorld = 1      // matches the Identifier pattern
[Hello World] = 1   // doesn't match the Identifier pattern
_HelloWorld = 1     // matches the Identifier pattern
A = 1               // matches the Identifier pattern
[1] = 1             // doesn't match the Identifier pattern
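The Identifier rule described above can be approximated with a regular expression. The Python sketch below is an ASCII-only approximation based solely on the prose in this section (the normative grammar may admit other letters), and the names IDENTIFIER, is_identifier, and field_name are hypothetical.

```python
import re

# ASCII-only approximation: a letter or "_" first, then letters,
# digits, "_", or "$". The normative rule may be broader.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")

def is_identifier(name):
    """True if name can be written without the surrounding [ ]."""
    return IDENTIFIER.fullmatch(name) is not None

def field_name(name):
    """Write a member name as in an initializer, escaping with [ ] if needed."""
    # Assumption: names containing "]" are not handled by this sketch.
    return name if is_identifier(name) else "[" + name + "]"

names = ["HelloWorld", "Hello World", "_HelloWorld", "A", "1"]
written = [field_name(n) for n in names]
```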
It is always legal to use [ ] to escape symbolic names; however, most of the examples in this document use names that don’t require escaping and, for readability, omit the escaping syntax.
M imposes no limitations on the values of entity members. It is legal for the value of an entity member to refer to another entity:
{ TopLeft = { X = 100, Y = 200 }, BottomRight = { X = 400, Y = 100 } }
or a collection:
{ LotteryPicks = { 1, 18, 25, 32, 55, 61 }, Odds = 0.00000001 }
or a collection of entities:
{
  Color = "Red",
  Path = {
    { X = 100, Y = 100 },
    { X = 200, Y = 200 },
    { X = 300, Y = 100 },
    { X = 300, Y = 100 },
  }
}
This last example illustrates that entity values are legal for use as elements in collections.
Entity initializers are useful for constructing new entity values. M defines the dot, “.”, operator over entities for accessing the value of a given member. For example, this expression:
{ X = 100, Y = 200 }.X
yields the value of the X member, which in this case is 100. The result of the dot operator is just a value that is subject to subsequent operations. For example, this expression:
{ Center = { X = 100, Y = 200 }, Radius = 3 }.Center.Y
yields the value 200.
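Nested member access of this kind resembles key lookup on nested dictionaries. The Python sketch below is only an analogue: dictionaries stand in for M entities, string keys stand in for field names, and the variable names are illustrative.

```python
# A dictionary stands in for the entity
# { Center = { X = 100, Y = 200 }, Radius = 3 }.
circle = {"Center": {"X": 100, "Y": 200}, "Radius": 3}

# Analogue of .Center.Y: each lookup yields a value that can be used further.
center = circle["Center"]
center_y = circle["Center"]["Y"]

# Field names that would need [ ] escaping in M are ordinary string keys here.
point = {"Horizontal Coordinate": 100, "Vertical Coordinate": 200}
horizontal = point["Horizontal Coordinate"]
```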