1.7 Values vs. References
Let's run a simple experiment:
import std.stdio;

struct MyStruct {
    int data;
}

class MyClass {
    int data;
}

void main() {
    // play with a MyStruct object
    MyStruct s1;
    MyStruct s2 = s1;
    ++s2.data;
    writeln(s1.data); // prints 0
    // play with a MyClass object
    MyClass c1 = new MyClass;
    MyClass c2 = c1;
    ++c2.data;
    writeln(c1.data); // prints 1
}
It seems that playing with a MyStruct is quite a different game from playing with a MyClass. In both cases we create a variable, copy it into another variable, and then modify the copy (assuming '++' is modifying). The experiment reveals that after the copy, c1 and c2 refer to the same underlying storage, whereas s1 and s2 lead independent lives.
The behavior of MyStruct obeys value semantics: each variable refers to exactly one value, and assigning one variable to another really means copying the state of the source variable over the state of the destination variable. The source of the copy is unchanged, and the two variables continue to evolve independently. The behavior of MyClass obeys reference semantics: values are created explicitly (in our case by invoking new MyClass), and assigning one class variable to another simply means that the two variables refer to the same value.
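The same distinction shows up when a variable is passed to a function, because passing an argument is also a copy. The following is a minimal sketch rather than a definitive illustration; the helper functions bumpStruct and bumpClass are hypothetical names introduced only for this example.

```d
import std.stdio;

struct MyStruct {
    int data;
}

class MyClass {
    int data;
}

// Hypothetical helper: receives a copy of the caller's struct
void bumpStruct(MyStruct s) {
    ++s.data; // changes only the local copy
}

// Hypothetical helper: receives a copy of the reference,
// which still leads to the caller's object
void bumpClass(MyClass c) {
    ++c.data; // changes the shared object
}

void main() {
    MyStruct s;
    bumpStruct(s);
    writeln(s.data); // prints 0

    MyClass c = new MyClass;
    bumpClass(c);
    writeln(c.data); // prints 1

    // 'is' compares identity: do both variables refer to the same object?
    MyClass d = c;
    writeln(c is d); // prints true
}
```

The 'is' operator is a handy way to check sharing directly instead of inferring it from a mutation.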
Value semantics are easy to deal with, simple to reason about, and allow efficient implementation for small sizes. On the other hand, nontrivial programs are difficult to implement without some means to refer to a value without copying it. Value semantics alone preclude, for example, forming self-referential types (lists or trees), or mutually referential structures such as a child window knowing about its parent window. Any serious language implements some sort of reference semantics; it could be argued that it all depends on where the default is. C has value semantics exclusively and allows forming references explicitly, by means of pointers. In addition to pointers, C++ also defines reference types. Interestingly, pure functional languages are free to use reference or value semantics as they see fit, because user code cannot tell the difference. This is because pure functional languages don't allow mutation, so you can't tell whether they snuck in a copy of a value or just a reference to it: the value is frozen anyway, so you couldn't detect sharing by changing it. In contrast, pure object-oriented languages are traditionally mutation-intensive and employ reference semantics exclusively, some to the extent of allowing a disconcerting amount of flexibility, such as changing system-wide constants dynamically. Finally, some languages take a hybrid approach, embracing both value and reference types, with various levels of commitment.
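To see why self-referential types need some form of reference, consider a list node. A struct cannot contain a member of its own type by value (it would have to be infinitely large), but it can hold a pointer to one. The sketch below assumes a hypothetical Node type and sum function invented for this example.

```d
import std.stdio;

// Hypothetical singly linked list node.
// Declaring 'Node next;' instead would not compile:
// a struct cannot contain itself by value.
struct Node {
    int value;
    Node* next; // refers to another Node without copying it
}

// Walks the list by following the pointers and adds up the values
int sum(const(Node)* head) {
    int total = 0;
    for (auto p = head; p !is null; p = p.next) {
        total += p.value;
    }
    return total;
}

void main() {
    auto third = new Node(3, null);
    auto second = new Node(2, third);
    auto first = new Node(1, second);
    writeln(sum(first)); // prints 6
}
```

Here the struct keeps value semantics for its own fields, while the explicit pointer supplies the reference needed to link nodes together.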
D takes a systematic approach to the hybrid method. To define reference types you use class. To define value types or hybrid types you use struct. As chapters ?? and ?? (respectively) describe in detail, each of these type constructors is endowed with amenities specific to this fundamental design choice. For example, structs do not support dynamic inheritance and polymorphism (the kind we've shown in the statistical program above), as such behaviors are not compatible with value semantics. Dynamic polymorphism of objects needs reference semantics, and any attempt to mess with that can only lead to terrible accidents. (For example, a common danger to watch for in C++ is slicing, i.e., suddenly stripping the polymorphic abilities of an object by inadvertently using it as a value. In D, slicing can never occur.)
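A small sketch may make the connection between polymorphism and reference semantics concrete. The Shape and Circle classes are hypothetical names chosen for this example only.

```d
import std.stdio;

// Hypothetical class hierarchy
class Shape {
    string name() { return "shape"; }
}

class Circle : Shape {
    override string name() { return "circle"; }
}

void main() {
    // Assignment copies the reference, so the Circle object stays whole
    Shape s = new Circle;
    writeln(s.name()); // prints circle

    // The struct counterpart is rejected by the compiler,
    // because structs cannot inherit from one another:
    // struct Base { }
    // struct Derived : Base { }
}
```

Because a variable of type Shape holds only a reference, assigning a Circle to it never copies, and therefore never truncates, the object itself.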
A closing thought is that structs are arguably a more flexible design choice. By defining a struct, you can tap into any semantics that you want, be it eager-copy value, lazy copying à la copy-on-write or reference counting, or anything in between. You can even define reference semantics by using class objects or pointers inside your struct object. On the other hand, some of these stunts may require quite advanced technical savvy; in contrast, using classes offers simplicity and uniformity across the board.
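As one illustration of that flexibility, here is a minimal sketch of a struct that behaves like a reference because it stores a pointer to its payload. The name SharedCounter and its members are assumptions made for this example, not types from any library.

```d
import std.stdio;

// Hypothetical struct with reference-like behavior
struct SharedCounter {
    private int* count; // the payload lives outside the struct

    // Allocates a fresh payload, initialized to 0
    static SharedCounter create() {
        return SharedCounter(new int);
    }

    void increment() { ++*count; }

    int value() const { return *count; }
}

void main() {
    auto a = SharedCounter.create();
    auto b = a; // copies the pointer, so a and b share one int
    b.increment();
    writeln(a.value()); // prints 1
}
```

Copying the struct still copies its fields, as value semantics dictate, but since the only field is a pointer, both copies observe the same counter.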