Interview with Andrei Alexandrescu (Part 1 of 3)
My name is Eric Niebler. I'm a C++ developer by day. Andrei Alexandrescu and I have been friends for years, and, in full disclosure, I've long shared Andrei's curiosity about the D programming language. I even sat in on the D design meetings in Seattle with Andrei, Walter Bright, Bartosz Milewski, and others; in these highly technical meetings about language design, Andrei would gleefully "destroy" Walter about this or that issue, and then he'd "destroy" Bartosz, and me, and the girl behind the counter, and anyone else in range. So much destruction! If Andrei had a reality TV show, his catchphrase would be, "You're destroyed," which would issue out of the side of his smirking mouth in his characteristic sotto voce. Had he been a count in medieval Romania, his moniker might have been Andrei the Destructor. (A little programming language pun there.)
It's my pleasure to interview Andrei for InformIT about his new book, The D Programming Language, and have a chance to destroy him with some devastating questions of my own. "En garde!" I shout (donning my flame-resistant suit of armor first).
Eric: Hi, Andrei. First, congrats on becoming a father and a Ph.D., and for releasing a new book as well. You've certainly been busy! Second, as a hardcore C++ developer, I must say I am very impressed with D. Nevertheless, I have some nits to pick.
D has been around for quite a while now. Why is the time right for a book? And why you? Aren't you afraid they'll revoke your "C++ guru" card?
Andrei: Whoa, what a great question. Allow me to hijack it for some catharsis.
It wasn't easy getting used to this "C++ tricks guy" status. I was thinking I'd be at most "a design guy who uses C++ as his mechanism." It's been very surprising to me that many focused on the C++ tricks of my work more than on the underlying need to express designs in a simple and generic manner. A few years ago, in a talk at an ACCU conference in Oxford, I said these exact words: "I am not a C++ guru, and I never was." The audience burst into laughter, and I was flabbergasted. It was like one of those sitcom moments in which there's a complete misunderstanding. I was being honest, and they thought it was all a ruse! (I must have seemed quite a pompous jerk to at least some.)
So I was unhappy about this typecasting. Remember the late Michael Landon from Little House on the Prairie? Nobody ever cast him as a villain! I didn't want to be a "Little Template on the C++ Prairie" guy for the rest of my career. I wanted to be a villain, too, so that's why I did research in machine learning and natural language processing. Plenty of mean things to learn.
Around 2006, my doctoral research was going pretty well, and I had decided to start defining my own language as a side project. There were just too many things I wanted to express that no language could model reasonably well. There were programming languages plastered all over the landscape, yet I was standing in this Sahara with no language oasis anywhere near.
That language, called Enki, had some quite ambitious goals. I was toying with Enki when a fateful discussion with Walter Bright at the Flowers Bar in Seattle disabused me of the notion of starting a language from scratch.
Eric: Was I there for that? I was, right? At the raised table in the corner on the left. Or was that a different meeting?
Andrei: Yes, the odd table with those old seats that must have served Jimi Hendrix's butt.
Eric: I remember when your vapor-language was called Weasel and you kept trying to get people to read your manifesto, like some geeky Unabomber.
Andrei: Yeah, Weasel was Enki 0.1. I remember how I did a search-and-replace changing Weasel to Enki in the whitepaper. Heck, it's still online.
Any language definition needs to deal with so much stuff, it ain't even funny. And as it's all subjective and difficult to quantify, it's inevitable that some decisions that seem good at one time look really crappy at others. Walter pointed out some aspects of Enki that he knew from experience would age badly, and in fact were already starting to suffer from conceptual rot. Instead of defining Enki from scratch and probably straight into oblivion, Walter suggested, why not use D as a basis, a starting point for a "better D" that I could help define?
At that time I didn't like D; I found it unprincipled and derivative, in addition to having serious expressiveness problems for certain designs. We agreed to work together on these issues. And that's how D2 was born out of D1's rib: in two worn-out chairs in a smoky bar (they still smoked in that bar back then!) over unbearable noise. Soon thereafter, we started those legendary day-long meetings that often featured Bartosz Milewski, you, and others.
In order to shake off D1's mistakes, Walter decided to make D2 an incompatible branch while putting D1 in maintenance mode. That gambit caused understandable protests within the community, but it has paid off immensely. Today, there are so many aspects of D2 that make me think, "Wow, we really did what had to be done here," most of which would have been impossible had compatibility with D1 been a requirement.
The book grew organically together with D2. I wrote what I thought was the right thing; Walter would review draft chapters as I wrote them; and whenever the book and the language disagreed over something, we'd negotiate changing one or the other. Oddly enough, we consistently saw eye to eye on all aspects that mattered. In the book, there's a table cell with a skull-and-crossbones; that's the outcome of our largest disagreement.
Today, D2 embodies some ideas that I think are really cool and disruptive, and yet it has a level-headedness, a balance that's rare in this design space so prone to extrema. I am very happy to be a part of all this.
So, why is this the right time for a book? Because D2 is now mature enough to be shown to the world. Sure, it has some cilantro in its teeth and might emit a burp together with "Hello, world," but it really has something to say. Why me? Because I've designed some of D2, I know the rest as well as if I'd designed it myself, and because I believe all of that is worthwhile to share. About the guru card? Honestly: I'm not a guru, and I never was.
Eric: As a garbage-collected, natively compiled language, it seems to me that D stakes out a very small patch of real estate on the computing landscape: not as fast as a language with explicit memory management (C or C++), and not as flexible or portable as a JIT-ed or interpreted language (C# or Java). In your opinion, is there enough room in the niche D has chosen, especially considering that the niche is shrinking as JIT performance improves and GC can be added to C++ as a library?
Andrei: The question assumes two separate and definite choices: D is garbage-collected, and D is natively compiled. In reality, they are both options: D also supports manual memory management (for example, you can call good old malloc() and free() without any intervening translation layer), and a definite subset of D is JIT-able (although at this time no JIT has been written for it). So D isn't restricted to a small patch between other languages; instead, it invades the territory claimed by each. Let me substantiate that claim below.
Let's see what a language needs in order to be JIT-able:
- It must be memory-safe. Virtually all JIT infrastructures today want to be able to validate and sandbox code.
- The language needs some means of dynamic binding and evaluation. This could be done entirely as a library, as Java and C# libraries have shown.
That leaves us with safety as the one core language feature necessary for JIT-ability. But D isn't safe; it has unrestricted pointers (per my reference to malloc() and free() above). D allows you the decadent joys of adult life: you can cast a pointer to an integer and then back, if you really want to do that. If you want to punch a few electrons down unusual backstreets, D will be your trusty sidekick.
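To make that concrete, here's a minimal sketch of the kind of unchecked code D lets you write, assuming the C bindings in core.stdc.stdlib (the names and values are made up for illustration):

    import core.stdc.stdlib : malloc, free;

    void main()
    {
        // A manually managed block, invisible to the garbage collector.
        int* p = cast(int*) malloc(100 * int.sizeof);
        scope(exit) free(p);               // released deterministically, no GC involved
        p[0] = 42;

        // The decadent joys: pointer to integer and back again.
        size_t bits = cast(size_t) p;
        int* q = cast(int*) bits;
        assert(q == p);
    }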
So where does that leave us? The cool trick is that D has a well-defined safe subset known as SafeD, and annotations that guide the compiler to distinguish safe from unsafe code. This isn't a new idea, but D integrated it in style. (We're talking condom, not chastity belt.)
D offers unrestricted pointers, but the vast majority of D programs never use them. (I almost forgot to insert a section in the book dedicated to pointers; true story.) This is because D offers built-in arrays and because all polymorphic objects are garbage-collected and manipulated through restricted references (just as in Java and C#). As a powerful consequence, a meaningful, well-defined subset of D is seamlessly JIT-able. You can write entire applications in SafeD.
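As a rough sketch of how the annotations read in practice (the function names here are invented; only the @safe/@system attributes and the slice and class-reference rules described above are assumed):

    // Safe code: slices and class references only, no pointer arithmetic.
    @safe int sum(int[] xs)
    {
        int total = 0;
        foreach (x; xs)
            total += x;                    // bounds-checked access
        return total;
    }

    // Unsafe code must be marked as such.
    @system void poke(size_t address)
    {
        auto p = cast(int*) address;       // forging a pointer: not allowed in SafeD
        *p = 0;
    }

    @safe void main()
    {
        sum([1, 2, 3]);
        // poke(0xDEAD);                   // rejected: @system call from @safe code
    }

The compiler, not a coding guideline, draws the line: inside @safe code, the unsafe operations simply don't compile.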
At the same time, inevitably some code will need access to unrestricted pointers. The JIT engine itself, the garbage collector: they need to mess with untyped memory and at some point claim it became typed, or vice versa. Also, use of unrestricted pointers can make a good number of programs faster and tighter, and many important programs don't have an upper bound on performance needs; better performance means trying more hypotheses, analyzing more data, wrecking the stock market more effectively, recognizing speech and images better, or, as is the case with my employer, Facebook, offering more cool features to more users with relatively fewer machines. That's why D also supports unrestricted pointers, in situ allocation, stack-allocated arrays, and other features that are not provably safe.
Regarding adding garbage collection to C and C++, that's feasible (and has been done), but it's not very appealing. One of the major advantages of GC is provable safety, but C and C++ don't offer a coherent safe subset to use in conjunction with a garbage collector for a solid offering. You can't write a meaningful C or C++ program without using unrestricted pointers. C++ needs pointers for arrays, polymorphism (an odd breeding, giving birth to monsters such as "array of polymorphic values"), all self-referential data structures (such as lists, trees, and graphs), etc. Even if you use higher-level abstractions on top of pointers, such as std::vector or boost::shared_ptr, and even if you ignore the fact that they use unrestricted pointers in their implementation, their interface isn't sealed; it still must use pointer semantics. Unrestricted pointers pervade C and C++ to the core. I'm not saying that's bad! (Though, heck, I kinda think it. Don't tell.) I'm just saying that erodes the appeal of GC and also works against JIT-ability.
It's interesting that D can serve such broad needs. I get to run a high-level, safe script; the core library it uses; and the underlying low-level garbage collector, all written in D. That's why D is of potential interest not only for C++ programmers, but also for users of languages such as Java or Python. The markets will decide whether there will be JIT offerings for the safe subset of D; the language itself is prepared.
Eric: Some would say that the true strength of platforms like Java and .NET is their enormous class libraries. Does D have an equivalent?
Andrei: D doesn't currently have libraries of the size of Java's and .NET's. There are some obvious reasons, the simplest being D's youth and relatively low adoption.
However, I'm very hopeful about the future prospects in library space for D. Now the language is stable, which is good: library design is parallelizable, while language design is much less so. I'm already seeing some strong trends; recently there has been a surge of participation in the standard library. Until a few months ago, Don Clugston, Sean Kelly, and I (and of course sometimes Walter) would maintain and improve Phobos (D's standard library). Lately, several very strong contributors have shown up from all over the world, and they all "get it." It's a weird feeling: I've been working on fundamentals like std.algorithm for years, and developed a specific style and a number of idioms and abstractions, always insecure about whether anyone would care. And now these people come along and not only understand that stuff, they also take it to cool places that I hadn't imagined.
With D, you can get very close to the generic ideal, "This is the last implementation of linear search I'll ever need to write." Or binary search. Or stable matching. Or Levenshtein distance. I searched around, and it looks like D is the only language offering Levenshtein distance on UTF strings without copying the input and without special-casing per encoding. D just has the abstraction power to deal with variable-width entities efficiently and within a unified definition (that works with many other inputs, such as linked lists of floating point numbers, if you want). It's really the last Levenshtein implementation I need to write. Now, you may not care about that particular algorithm, but there are benefits of that power for many structures, algorithms, and design artifacts that might be important to you.
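For a taste of what that looks like with Phobos's std.algorithm (a small sketch; module layout and exact names may vary between releases):

    import std.algorithm : find, levenshteinDistance;
    import std.stdio : writeln;

    void main()
    {
        // One generic linear search, here over a slice of doubles.
        writeln(find([1.5, 2.5, 3.5], 2.5));                 // [2.5, 3.5]

        // Levenshtein distance directly on UTF strings, no copies made...
        writeln(levenshteinDistance("kitten", "sitting"));   // 3

        // ...and the very same definition on other element types.
        writeln(levenshteinDistance([1, 2, 3], [2, 3, 4]));  // 2
    }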
Stepanov tried exactly that with the STL, and succeeded to a degree sufficient to offer a taste of what true programming is like. Unfortunately, the STL is currently marred by the fact that getting any of it going in C++ was a tooth-and-nail fight: no lambdas, no static if, no template dispatch, and overall so much syntactic noise that STL today is incomprehensible outside of C++. With D, all of that will change.
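To give an idea of what "static if" and template constraints buy you, here's a tiny hypothetical helper (not part of Phobos; written purely for illustration):

    import std.range : isInputRange, hasLength;

    // Counts the elements of any input range, using .length when available.
    size_t countItems(R)(R r)
        if (isInputRange!R)                // constraint: accept any input range
    {
        static if (hasLength!R)
        {
            return r.length;               // O(1) branch, chosen at compile time
        }
        else
        {
            size_t n = 0;
            for (; !r.empty; r.popFront())
                ++n;
            return n;                      // O(n) fallback for lazier ranges
        }
    }

The choice between the two branches happens entirely at compile time; there's no runtime dispatch and no syntactic contortion to get there.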
Other languages may seem to offer similar genericity, but they most often don't. The typical argument starts like this: "Of course we've taken care of generic search. First, you need to assume that everything is a singly linked list…" (or an object, or a hash table). The only civilized retort to that is, "What do you mean, 'all cows are spherical'?" Other languages are so confused that at some point they tried to talk the need for genericity out of existence; don't even get me going.
Why am I saying this? Because many libraries are like many fish; a good language is like a good fishnet. It will be interesting to see what happens in library space now that D is stable.
Eric: Has anybody worked out a way to access Java's or .NET's class libraries from D? That would be interesting, and lessen the need for a vast D class library in the short term.
Andrei: Not that I've heard. There's only one .NET proof-of-concept implementation of a D compiler. There's also talk about translating Java code into D automatically or semi-automatically. Again, such things are not easy but are eminently feasible; market forces will decide what happens.
Eric and Andrei continue this discussion in "Interview with Andrei Alexandrescu (Part 2 of 3)."