Interview with Andrei Alexandrescu (Part 2 of 3)

Aug 18, 2010

Part 2 of this interview about the D programming language finds Eric Niebler and Andrei Alexandrescu deep in discussion about structs versus classes, the difficulties of copy semantics, rvalue references, the intricacies of garbage collection, and Andrei's occasional failure in serving as the standard-bearer for policy-based design.

See Part 1 and Part 3 of this interview.

Eric: D supports both value semantics (structs) and reference semantics (classes). One surprising thing about D's struct/class dichotomy is that the same syntax, t = u, does such radically different things for the two of them: for structs, it copies; for classes, it aliases. Does it hurt generic code and readability in general not to know whether t = u creates independent copies or aliases?

Andrei: Ha—good question! I'd actually rephrase your insight a bit. On the face of it, indeed, "copying" a class variable is really copying a reference, just like in Java, C#, and many other languages, whereas copying a struct variable in D copies the actual value—that is, each and every field. But this apparent dichotomy between reference versus value semantics is really a harmonious relationship between reference and custom semantics.

Structs are very flexible; they can be made to have reference semantics, value semantics, and everything in between. Consider:

// A bona fide D class
class X {
     private int x;
     int method(int y) { return x + y; }
}

// A struct with reference semantics
struct S {
     private X payload;
     ...
}

In the case above, S has de jure value semantics but de facto reference semantics: Copying S objects around really copies references to X objects. This is because S has exactly one field, and that field has reference semantics.
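
As a rough illustration of the point above, the following sketch contrasts a plain value struct with a struct whose only field is a class reference. The names Counter, Plain, and Handle are invented for this example and do not come from the interview.

```d
// Hypothetical types, for illustration only.
class Counter {
    int count;
}

// A struct with ordinary value semantics: copying duplicates the field.
struct Plain {
    int count;
}

// A struct whose single field is a class reference: copying the struct
// copies the reference, so both copies share one Counter object.
struct Handle {
    Counter payload;
}

// Returns true if mutating a copy of a Plain leaves the original untouched.
bool plainCopiesAreIndependent() {
    Plain a = Plain(1);
    Plain b = a;          // field-by-field copy
    b.count = 2;
    return a.count == 1;
}

// Returns true if a change made through a copy of a Handle is visible
// through the original.
bool handleCopiesShareState() {
    auto h1 = Handle(new Counter);
    auto h2 = h1;         // copies the reference, not the Counter
    h2.payload.count = 42;
    return h1.payload.count == 42 && h1.payload is h2.payload;
}

void main() {
    assert(plainCopiesAreIndependent());
    assert(handleCopiesShareState());
}
```

The same t = u syntax is used in both functions; what differs is what the struct's fields mean when copied.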

Seen from that perspective, no conflict exists. For some generic algorithm, you specify the expected meaning of each operation (notably that of copying objects around). If you clearly can't expect to work with reference semantics, it's very easy to eliminate classes from the matched subset of the type universe:

void willNotWorkForClasses(T)(T value) if (!is(T == class)) {
     ...
}
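
To make the constraint concrete, here is a minimal sketch that checks at compile time which types a constrained template accepts. The function snapshot, the helper accepts, and the types Point and Node are hypothetical names chosen for this example.

```d
// Hypothetical function: accepts any type except class types.
T snapshot(T)(T value) if (!is(T == class)) {
    return value; // returned by value
}

struct Point { int x, y; }
class Node { int value; }

// True when a call to snapshot with an argument of type T compiles.
enum bool accepts(T) = __traits(compiles, snapshot(T.init));

static assert(accepts!int);
static assert(accepts!Point);
static assert(!accepts!Node);   // rejected by the template constraint

void main() {
    auto p = Point(1, 2);
    auto q = snapshot(p);
    q.x = 10;
    assert(p.x == 1);           // the struct argument was copied
}
```

A call with a class argument fails to match the template at all, so the error surfaces at the call site rather than deep inside the function body.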

The if clause introduces a so-called template constraint. I proposed template constraints about a year ago as a lightweight alternative to concepts, and now I'm happy I did; they've solved a ton of difficult problems very elegantly.

To answer your question: Classes simplify things a fair amount, but structs offer a lot of semantic flexibility (not to mention efficiency). I see their coexistence not as a source of conflict and confusion, but instead as complementary harmony. You must have both in a multi-paradigm language.

Eric: With Mojo in C++[1], you addressed the issue of value semantics with efficient move. C++0x puts move right in the core of the language with rvalue references. D doesn't have them. What alternatives does it offer?

Andrei: The copy semantics as defined by C++98 aged really badly. I'm not saying that as a criticism—it's a very difficult problem! Mojo and other similar mini-frameworks are only palliative solutions to this nuisance. Most other languages chose to stay away from allowing user-definable copy semantics. Case in point: D version 1 chose to allow structs with (shallow) value semantics but not with hookable copying, just like C#.

When we designed D version 2, we were extremely careful to mark a net improvement from C++ in terms of copy semantics. We came up with a very simple and effective system:

1. All objects are "relocatable"; that is, they can be moved through memory by using bitwise copying, à la memcpy.
2. As a consequence of (1), the compiler never copies rvalues; it just moves them.
3. Returning a stack-allocated parameter or a value parameter from a function is a move, not a copy.
4. The library provides a simple convenience function move() such that move(value) reads value destructively and returns it as a (moved) rvalue.
5. A struct can define a special hook function called this(this), also known as the postblit function, that the compiler calls against the target immediately after creating a bitwise duplicate of an object. The postblit scales better than the C++ copy constructors because you can add new fields without needing to adjust the postblit.
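
The list above can be sketched in a few lines. This is an illustrative example rather than code from the interview: the Buffer type and its postblits counter are hypothetical, and move is assumed here to be the one in Phobos' std.algorithm.mutation module.

```d
import std.algorithm.mutation : move;

// Hypothetical struct that owns an array and deep-copies it on copy.
struct Buffer {
    int[] data;
    static int postblits;   // counts how often the postblit runs

    // Postblit: runs on the new copy right after the bitwise duplicate.
    this(this) {
        ++postblits;
        data = data.dup;    // give the copy its own array
    }
}

// Copying an lvalue triggers the postblit exactly once and yields
// an independent array.
bool copyRunsPostblit() {
    Buffer.postblits = 0;
    auto a = Buffer([1, 2, 3]);   // built from an rvalue: no postblit
    auto b = a;                   // copy: bitwise blit, then postblit on b
    b.data[0] = 99;
    return Buffer.postblits == 1 && a.data[0] == 1;
}

// Moving transfers the contents without running the postblit and
// leaves the source in its initial (empty) state.
bool moveSkipsPostblit() {
    Buffer.postblits = 0;
    auto a = Buffer([1, 2, 3]);
    auto c = move(a);
    return Buffer.postblits == 0 && c.data == [1, 2, 3] && a.data.length == 0;
}

void main() {
    assert(copyRunsPostblit());
    assert(moveSkipsPostblit());
}
```

Note that the postblit only touches the field that needs special treatment; any other field added to Buffer later would already have been duplicated by the bitwise copy.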

This system obviates the need for the devilishly difficult copy elision rules in C++ (one of my least favorite parts in that language's definition), and of course doesn't incur the hit of a new type constructor with its own rules and quirks, as rvalue references are. And, boy, are they quirky! Did you know that… bah, I'll save that for another day.

Again, this is not to criticize; I followed the rvalue proposal for C++ quite closely, and at one point I suggested a simpler scheme that Howard Hinnant proved to me would have broken compatibility (which is a no-no). To the credit of their creators, rvalue references achieve most of what they need to do, within an extraordinarily constrained setup. This trick reminds me of a scene in the movie Apollo 13, in which astronauts must build a sui generis air filter out of a plastic bag, a hose, a sock, and whatnot—except that, in the case of rvalues, only some unmentionables were available.

Anyway, getting back to D, the approach has a few more details, but the five points above convey the gist. The system works pretty darned well, is efficient, and requires no intervention (or only minimal intervention) from the user. The only disadvantage is ruling out internal pointers. But exceedingly few objects point inside themselves, and why hurt most for the doubtful benefit of a few?

Stacking D's uniform bitblitting of objects plus the postblit hook against C++'s copy-elision rules and rvalue references, I think D is making significant progress.

Eric: Rightly or wrongly, C and C++ performance hot-rodders are wary of garbage collection and its perceived performance penalty. What would you say to placate their fears?

Andrei: First, allow me to clarify the purpose of garbage collection (GC). GC is for writing safe programs with non-scoped allocation. If scoped allocation is all we need, we know how to typecheck programs for safety (Cyclone's regions and real-time Java are good examples); conversely, an unsafe program can perform unrestricted manual memory management. So again GC is for programs that a) need to be safe and b) use non-scoped allocation.

I'm rehashing this point because it's too often forgotten in arguments that frame GC as an indulgence, as the "easy way out" for languages and programmers alike. True, the infinite-memory model made practical by GC is easier to use, but the whole point of it all is memory safety.

Second, let's clarify that we're talking about a real cost, although the dimension of the cost isn't obvious. A 2005 study by Hertz and Berger, "Quantifying the performance of garbage collection vs. explicit memory management," has shown that real-world GC programs run about as fast as—and sometimes faster than—their deterministically deallocated exact equivalents, as long as they're allowed to occupy about 2–5 times more memory. When a program runs low on memory, relies on data locality, or must compete with other processes for memory, the overhead of GC grows and could become catastrophic. GC technology has improved since 2005, but with nothing earth-shattering, so I think that the above baseline holds.

Third, I should point out that many programs don't really care about garbage collection. RAM is plentiful, and few programs' core performance depends on data locality.

Where does D stand? D offers a garbage-collected heap used by default for class objects and built-in containers. If you want to write a safe program, just use new and you're there.

But you're asking about hot-rodding. If you want to fine-tune allocation, D's GC has a low-level interface that allows you to do unsafe things like freeing and resizing memory blocks. Furthermore, you can use malloc() and free(), along with the rest of C's standard library, and even the nonstandard alloca(), without any overhead. Then, a D primitive function called emplace allows you to construct objects at specified memory locations (à la C++'s placement new operator).
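
A minimal sketch of the manual route described above, assuming the C bindings in core.stdc.stdlib and the emplace function in std.conv. The Point type and the function name are hypothetical.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical plain-data struct.
struct Point {
    int x, y;
}

// Allocates raw memory with C's malloc, constructs a Point in it with
// emplace, reads the fields back, and frees the block on scope exit.
int sumOfMallocedPoint(int x, int y) {
    void* raw = malloc(Point.sizeof);
    if (raw is null) assert(0, "out of memory");
    scope(exit) free(raw);            // runs after the return value is computed

    Point* p = emplace(cast(Point*) raw, x, y);
    return p.x + p.y;
}

void main() {
    assert(sumOfMallocedPoint(3, 4) == 7);
}
```

Nothing in this function touches the garbage-collected heap, but nothing prevents a dangling pointer either; that is the trade-off being discussed.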

Most interestingly, D allows implementing memory-safe containers that internally use deterministic, unsafe allocation methods (for example, malloc() and free()), yet are encapsulated strongly enough to make any unsafe use impossible. I discuss that technique in depth in my InformIT article "Sealed Containers."

D applications still link in the garbage collector, and operations such as using new or concatenating arrays will use it silently. This is inconvenient for applications that need to make sure there is absolutely no use of garbage collection. Such applications can tweak settings to avoid linking in the garbage collector (which has a pluggable architecture). All uses of GC operations would then translate into link-time errors, which is okay but not ideal.

Walter Bright is considering adding a compile-time flag that would banish all constructs that make implicit use of the GC, in which case you'll know at compile time where the culprits are, and you can change your code accordingly. Specialized library support à la boost::shared_ptr would be necessary. All that work hasn't been done yet, but it's well-trodden ground, so I don't foresee any difficulties.

Eric: Every garbage-collected language must grapple with the thorny issue of resource reclamation: Some resources need to be released deterministically, and GC is inherently non-deterministic. How does D deal with this issue?

Andrei: If there's a magic bullet, we haven't found it. Classes go on the garbage-collected heap; structs could go anywhere, but most often have a scoped lifetime. (As we just discussed, you could put anything anywhere with some effort; I'm talking about the path of least resistance.)

The most difficult scenario here is a class that has a struct as a member. If the struct has a destructor, it will be run non-deterministically—or possibly not at all. Currently the D garbage collector calls all class destructors; but, as we know from other languages, it's best not to count on that possibility. If you need timely resource release for such embedded structs, you'd best do it manually.

All that being said, D takes certain measures that aim at simplifying matters:

The clear distinction between class and struct objects frames a design from day one, and it's a good statement of intent from the designer: "I define a class here, so I'm expecting an infinite lifetime model."
D distinguishes between destruction and deallocation. C++ conflates the two notions, which confuses a lot of people in a lot of ways. You see, memory isn't about just any resource; it's a very special resource. Unlike file handles, sockets, and mutexes, memory is the bedrock of the type system—everything that the language ever guarantees sits in memory. Close a socket, and you'll have errors reading from it—but no real harm done. Use a dangling pointer, and anything could happen—the type system is unable to hold any guarantee.

D defines for every object a primeval state in which it allocates no extra resources. You can put any object into such a state by evaluating clear(obj), which is a sort of "operator delete without the dangers." The universal availability of such a primitive makes it easy for generic client code to deallocate resources safely.

Finally, you don't need to use Java's awkward try/finally statements or the equally awkward C# using statement to clear resources in an orderly manner. To be brutally honest, I believe that both constructs are missing the point by a mile; to be brutally narcissistic, I believe that D's scope statement is a game changer. If you want to execute code upon a scope's termination, all you need to do is this:

auto wbdc = new WhizBangDatabaseConnection("wbdb://meh");
scope(exit) clear(wbdc);

It's lightweight, it's safe, it's deterministic. And it's a heck of a boon for reviewers, who won't need to follow complicated control flows. (You also get to execute code conditionally by replacing exit with success or failure.) I've been using this feature for years, and it scales phenomenally well.
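
To show how the three variants interact, here is a small sketch that records the order in which scope guards fire. The trace function and the log strings are made up for this example.

```d
// Runs a body guarded by all three kinds of scope statement and
// returns the order in which things happened.
string[] trace(bool fail) {
    string[] log;

    void run() {
        scope(exit)    log ~= "exit";     // always runs
        scope(success) log ~= "success";  // runs only on normal exit
        scope(failure) log ~= "failure";  // runs only when an exception escapes
        log ~= "body";
        if (fail) throw new Exception("simulated error");
    }

    try {
        run();
    } catch (Exception e) {
        log ~= "caught";
    }
    return log;
}

void main() {
    // Guards run in reverse order of declaration when the scope ends.
    assert(trace(false) == ["body", "success", "exit"]);
    assert(trace(true)  == ["body", "failure", "exit", "caught"]);
}
```

The cleanup code sits next to the acquisition it belongs to, which is what makes the control flow easy to review.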

To summarize my answer: D doesn't have a foolproof integration of GC and deterministic resource reclamation. However, it does offer a coherent framework facilitating resource control, and a statement that makes manual reclamation robust and effective.

Eric: I was surprised to hear that structs don't need inheritance—and from you, of all people, the standard-bearer for policy-based design! In many of your C++ designs, you use parameterized inheritance to customize the behavior of value types. Why don't you need this feature in D?

Andrei: Inheritance has at least two important purposes:

The classic case is subtyping: "I want to inherit Button and tweak its behavior to allow an animated background."
The other use is commonly called "inheritance of implementation": "I want to define Pool to offer what Factory offers, plus some other things." But I'd call this "symbol table acquisition," because sometimes no implementation is involved—think std::binary_function.

In C++, you'd want to use inheritance in both cases, mostly for practical reasons—you want the benefit of the empty base optimization (EBO), and you wouldn't want to write a bunch of forwarding functions. In D, you'd use classes in the first case and structs in the second.

Getting the benefits of EBO in D is very simple because of the static if construct. You see, if scope(exit) is a game changer, static if is a game enabler. I was very bummed that C++0x, for all its size, doesn't include anything as mighty as that. Here's how you avoid storing a useless member of type Factory inside an object of type Pool:

struct Pool(Factory)
{
     static if (Factory.tupleof.length == 0) {
         // Factory has no per-instance state
         alias Factory theFactory;
     } else {
         Factory theFactory;
     }
     ...
     Object create() { return theFactory.create(); }
}

tupleof yields the direct data members of a struct, so for an empty struct the corresponding tuple would have length zero. In that case, the code just creates a symbolic alias—theFactory is the same as Factory. Otherwise, Factory holds state, so you define an actual member. From here on, using theFactory.create() dispatches either to a static or a full-fledged member function.
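
For readers who want to try this, a self-contained variant of the technique might look as follows. The two factory types are hypothetical, create returns an int instead of an Object to keep the example short, and the alias uses the newer alias name = target; spelling.

```d
// Hypothetical factory with no per-instance state.
struct StatelessFactory {
    static int create() { return 7; }
}

// Hypothetical factory that keeps a counter as state.
struct CountingFactory {
    int next;
    int create() { return next++; }
}

struct Pool(Factory) {
    static if (Factory.tupleof.length == 0) {
        // No fields in Factory: refer to the type itself, store nothing.
        alias theFactory = Factory;
    } else {
        // Factory has state: store an instance as a regular member.
        Factory theFactory;
    }

    auto create() { return theFactory.create(); }
}

// The stateless pool has no data members; the stateful pool has one
// and is exactly as large as its factory.
static assert(Pool!StatelessFactory.tupleof.length == 0);
static assert(Pool!CountingFactory.tupleof.length == 1);
static assert(Pool!CountingFactory.sizeof == CountingFactory.sizeof);

void main() {
    Pool!StatelessFactory a;
    assert(a.create() == 7);      // dispatches to the static function

    Pool!CountingFactory b;
    assert(b.create() == 0);      // dispatches to the member function
    assert(b.create() == 1);
}
```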

If you want all symbols inside Factory to percolate through Pool's interface, you use a feature known as alias this:

struct Pool(Factory)
{
     ... as above ...
     alias theFactory this;
}

This feature works as you'd expect—if the compiler looks up a symbol inside Pool and doesn't find it, it continues down theFactory's symbol table. (Notice how I combined static if and alias this for compounded effect!) This behavior is really what you want.
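
A brief sketch of the lookup behavior just described, using a stateful member only. The Factory and Pool types here are simplified, hypothetical stand-ins and not the template from the earlier listing.

```d
// Hypothetical factory with one field and one method.
struct Factory {
    int made;
    int create() { return ++made; }
}

// Hypothetical pool that forwards unknown symbols to its factory.
struct Pool {
    Factory theFactory;
    alias theFactory this;   // failed lookups in Pool continue in theFactory

    int capacity = 8;        // Pool's own members are found first
}

// Calls a method that Pool itself does not define.
int createThroughPool(ref Pool p) {
    return p.create();       // resolved as p.theFactory.create()
}

void main() {
    Pool p;
    assert(createThroughPool(p) == 1);
    assert(p.made == 1);        // the field is reachable the same way
    assert(p.capacity == 8);    // found directly in Pool
}
```

No forwarding functions are written by hand, which is the practical benefit that inheritance of implementation provides in C++.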

And then there's general static reflection, with which people have done crazy things—search online for whitehole d language, and you'll find a class WhiteHole that takes another class and implements all of its abstract member functions to throw. Great for mockup testing and partial implementations!

Defining inheritance for structs (which is possible in a sound manner) might simplify certain scenarios, but it isn't an enabler. I'm not sure whether such a feature would pull its own weight.

Eric and Andrei wrap up their discussion of D in "Interview with Andrei Alexandrescu (Part 3 of 3)."
