From Mathematics to Generic Programming: An Interview with Alexander Stepanov and Daniel Rose
Introduction by John Lakos
This is not your typical “fluffy” book interview. Both Alexander Stepanov, inventor of the STL and author of Elements of Programming, and I have been programmers since the early 1970s. We believe in certain fundamental truths about programming, yet come from very different backgrounds: his, formal mathematics; mine, electrical engineering and computer science. We have each discovered for ourselves the importance of reuse, achieved by insisting on fine-grained modularity to allow more powerful, efficient solutions to be built from well-factored, generally usable piece parts. For example, for every C++ type that is intended to represent a value, there are certain intrinsic properties that we both feel must hold. Together with Dr. Daniel Rose, co-author of From Mathematics to Generic Programming, we explore some important nuances surrounding value as well as other contemporary technical issues related to their new book.
Overview
John Lakos: Alex, after all these many years of hearing about your prodigious contributions, it is a pleasure to finally make your acquaintance. It is likewise a pleasure to have the opportunity to talk to you, Dan. I have the honor of interviewing you both upon the publication, this past November, of your new book: From Mathematics to Generic Programming (FM2GP). I know firsthand how good that must feel — congratulations!
Alex, let’s begin with you. It’s been over five years since you and Paul McJones published Elements of Programming (EoP). What have you been up to of late at A9.com (Amazon’s search-technology subsidiary)?
Alexander Stepanov: In November 2009 I joined A9, a subsidiary of Amazon that does several algorithmically interesting activities: A9 represents the word “ALGORITHMS,” that is, “A” followed by 9 other letters. It develops product search, visual search, cloud search, and ad placement algorithms. Every time you search on Amazon.com, the query is answered by A9’s search service.
I have been involved in a variety of projects at A9. Together with Dan and my other collaborators, Ryan Ernst, Paramjit Oberoi, and Anil Gangolli, we worked on SIMD algorithms for posting list compression (http://www.stepanovpapers.com/CIKM_2011.pdf), finding and eliminating data structure bottlenecks in the search engine, and refactoring the code of our associative storage. (Dan became my manager in order to be able to work with me.)
Eventually, management asked me to teach a course for A9 engineers. It’s great that there is a company in Silicon Valley that supports teaching a course covering mathematical and historical roots of generic programming. When Dan and I suggested several potential subjects for the course to be taught at A9, the president of A9, Bill Stasior, chose the least applied and the most mathematical course of the ones we suggested. Bill, and after he left for Apple, his successor Brian Pinkerton, not only supported the course, but even attended the lectures. Then Brian strongly encouraged Dan and me to write the book, which is the subject of the interview.
John: Dan, tell us a bit about you: What are your interests and responsibilities at A9.com?
Daniel Rose: The most recent role I had at A9 was as Chief Scientist for Search. In that capacity I wore several hats: I initiated strategic projects involving new search technologies, I served as a kind of internal consultant to A9 and Amazon on a variety of search-related issues, and I led a very small team called Fundamental Data Structures and Algorithms for Search, which we referred to as “F4.” Alex and I created the team together.
I actually left A9 about a year ago. Since then, besides finishing the book, I’ve been catching up on lots of reading, and thinking about what I want to do next.
John: So let’s dig into this new book of yours (FM2GP), which comprises many interesting and, as your book seeks to demonstrate, interrelated topics:
- A carefully curated history of relevant mathematics
- A corresponding collection of succinct biographical sidebars of many of the outstanding contributing mathematicians
- A gentle exposure to several cooperating areas of mathematics — e.g., geometry, abstract and linear algebra, and number theory
- How a solid appreciation of mathematics leads to increasingly good programing — specifically, generic programming — through the effective use of concepts
- How it is possible to build upon maximally reusable fine-grained components — i.e., algorithms and value types — to achieve robust solutions to highly complex problems, such as in cryptology (chapter 13), without any loss in efficiency.
What would you add or subtract from this summary?
Dan: These are certainly all topics we touch on in the book, though I’m not sure they’re all equally important. #4 might be closest to what I think of as the main theme, while some of the others are the methods we use to tell that story.
Perhaps the only thing I’d add is that the book also contains some of what we call “laws” of programming — basically, principles to follow when writing generic code.
Alex: I would say that the book is more than the sum of these parts. It is the connections between these topics that are fascinating. I also believe that insights that programming brought into the world will affect all other human endeavors. In my opinion, the book should be judged on how successful it is in showing the unity of the different threads.
One of the things that I believe good programmers need is a sense of what is beautiful. Whether they refine this sense by looking at art or listening to music or reading literature, they need to learn to apply this to programming, to see when an algorithm or a piece of code is beautiful. This sense allows us to recognize well-balanced, well-designed large systems. Studying mathematics is one of the ways to develop this aesthetic sense. When we see Euclid combining 48 propositions together in a beautiful way, we learn how one might build up a complex piece of software from 48 small, easily understood components. It’s not the facts about triangles that we care about, but the skill in constructing a framework from which these facts emerge.
John: Fascinating. Alex, it’s no secret that you are passionate about mathematics; tell us what inspired you to write this book.
Alex: I have been trying for many years to defend a view of programming as a mathematical discipline. This book is another attempt to spread this view. Also, I take a long view on the development of fundamental computational ideas. Somehow the intellectual struggles of Pythagoreans are more relevant to me than the discussions on the C++ standards committee that appear outdated even before they start. I care more about issues that thinkers have struggled with for centuries, rather than current debates which may soon be forgotten.
John: Dan, how is it that you came to be a co-author of this book?
Dan: Alex and I had been working together for a couple of years on the F4 team that I mentioned earlier. One of the benefits of working with Alex is getting to hear his insights on programming, math, and many other topics. Members of our team and a few others at the company spent many enjoyable hours discussing and debating these ideas in Alex’s office. Eventually, we and our management realized that he could have a greater impact by presenting some of the material in the form of a course, much as he had done earlier in his career at Adobe.
As Alex developed the course material, he enlisted me, as well as our F4 colleagues, to give him feedback, refine the examples, and so on. I don’t have a strong math background, so a lot of the material was new to me, and I often pushed Alex to refine his explanations or provide concrete examples so I could understand it better.
Early in our working relationship, Alex and I had discussed the idea of writing a book together based on our project at the time, writing a search engine from first principles. Although that project got shelved, the book idea stayed in the back of our minds, and Alex mentioned it again as he was developing the course.
So when he started giving the actual lectures, I took copious notes and started turning them into prose. After a few weeks, I showed them to Alex, who agreed that they had the potential to be part of a book. Of course, there was another year of work after this, but that’s how it started.
John: Dan, I think many of your readers will thank you for your influence — especially because you come from a background outside of mathematics.
Your book begins (p. 1) with the sentence, “This book is about programming, but it is different from most programming books.” What unique perspectives do you bring to programming that would entice people to read this book, and who is your intended audience?
Alex: Most books on programming try to deepen the understanding of some particular subject: using C++ templates, writing functional data structures, etc. This book is trying to broaden the intellectual outlook. I always think of young programmers as my audience. Hopefully, the book will get to some young programmers and inspire them to look at the fundamentals of their profession, not just only at specific tools or techniques.
Dan: A big difference is that our book is not competing with what we might call “instructional” programming books, particularly those that systematically go through the capabilities of a language or a set of problem situations and solutions. There are some great books like that, and I love some of them. I still have the first edition of Kernighan and Ritchie on my shelf. So this is not in any way a criticism of those books; I think ours complements them. But our book is more about some underlying principles of programming, some ways of thinking. Perhaps another way to think of it is that most programming books tell you what to do, while ours tells you why it should be done that way.
I believe the audience for our book is primarily professional programmers who are interested in getting a different perspective on their work and learning how they could do it better. It could also be a textbook for an advanced undergraduate class.
John: Professional programmers today are constrained by deadlines and other time commitments, forcing them to make hard choices. What will typical programmers learn from your book that makes it essential reading for them to be more successful, given the host of other newly minted programming books by prominent repeat authors such as Josuttis, Meyers, and Stroustrup?
Alex: Our book is not about some particular programming language or technique. It is an attempt to teach people to look for mathematical abstractions that underlie their programming tasks. Discover the mathematics first, then map it into zeros and ones, and only then worry about implementing in a particular language.
John: I enjoyed learning about the history of the careers and even the personal peccadilloes of several prominent mathematicians. In particular, you write that Carl Gauss, “the Prince of Mathematicians” (p. 137), refused to credit published work of the son of one of his colleagues (though Gauss himself privately called the son “brilliant”) because, though independently conceived, it closely paralleled his own unpublished work (p. 166). What inspired you to interleave such anecdotes with the more technical material throughout your book?
Dan: Alex’s enthusiasm for sharing the stories of the great mathematicians was infectious. In the lectures that the book is based on, he often talked about the historical context in which these people lived, and how the scientific conventions of the day affected the work, and so on. I became convinced that this history was as much a part of the story as the theorems and algorithms.
Alex: Science is a human activity. You cannot understand it without knowing the human aspects of it.
Dan: In addition, we deliberately made an attempt to interleave mathematical, programming, and historical material, so that a reader more familiar with one approach than the other would not feel overwhelmed or intimidated.
John: Overall, I found that your exposition of the mathematics impressively accessible. In particular, the way you presented the hierarchy of abstractions in mathematics (chapter 6) by starting with the definition of a group (i.e., a set of elements having an associative operation, an identity, and an inverse) and then successively removing axioms, yielding first a monoid (no inverse), then a semi-group (no identity), and finally — for completeness — a magma (no associative operation) was delightfully effective and memorable. Anticipating section 9.6, tell us how you came to present that material in this rather different, “top-down” fashion.
Alex: The exposition is historically informed. A good analogy comes from the world of classical music, where in the late 20th century scholars began to re-imagine how music should be performed based on historical practices. This approach had a broad effect on how classical music is performed and perceived today.
A similar thing has happened in mathematics, where historically informed texts have begun to change the way concepts are introduced and understood. My own view is that you cannot fully grasp mathematics until you understand its historical context. I spent a lot of time studying primary sources: Euclid, Archimedes, Pascal, Euler, Gauss, etc. Many of the proofs in the book are borrowed from their works.
While in mid-20th century texts, semigroups came before groups, historically it was the other way around.
Dan: I’m glad you found the material accessible; this was something we worked on consciously. For me, the order of presentation was less important than giving the reader a conceptual framework that shows how the pieces fit together. For example, once we had created the algebraic structure diagrams at the end of chapters 6 and 8, the whole picture became much clearer in my mind. “Oh, a monoid is just a semigroup with an identity element.” Remember, Alex’s course was my first real exposure to this material.
John: At the end of chapter 11, you write, “Theory and practice are two sides of the same coin; good programmers depend on knowledge of both.” There are a substantial number of lemmas and proofs in your book, many of which seem somewhat removed from what most programmers need to know day to day — for example, the proof of “If 2n - 1 is prime, then 2n-1 (2n - 1) is perfect” (p. 32). Why is proving such theorems in number theory relevant for typical programmers?
Alex: These theorems are part of the story. Fermat’s interest in number theory was inspired by the study of perfect numbers, and without Fermat’s work there would be no modern number theory on which much of abstract algebra is based. And it is my experience that without at least a smattering of abstract algebra, programmers have difficulty with concepts such as iterators. One has to learn some non-practical things to be able to use mathematics.
Dan: In addition, Fermat’s results are central to modern cryptography, which we discuss later in the book.
John: Modern CPUs, with their complex caches, can, as you suggest (p. 211), make it hard to predict the performance effects of our design choices. How have such practical hardware advances affected your approach to programming?
Alex: I started writing benchmarks 40 years ago, and I never stopped. One has to write benchmarks to understand the ramifications of algorithms and data structures. Algorithmic benchmarks need to be published. They were published in the early ‘60s when ACM started its Collected Algorithms, but they quickly disappeared. The same should be done for algorithmic validation suites. They must be published.
Dan: When we worked on variable-length integer encoding, we discovered that many commonly accepted practices about the best way to do something were wrong, because processors have changed. In particular, we found that it was often more efficient to read a bunch of unwanted data rather than test to see if it was needed, because the cost of branch misprediction was so high.
Alex: We discovered that unaligned reads were an important technique. We also found that SIMD instructions were now standard enough that they could be useful for many applications beyond 3-D graphics. Another example is the impact of caches on data structures. Node-based structures make much less sense today, because the impact of cache misses is so great. I look much less kindly on linked lists.
Dan: The lesson for me was that just because a technique is widely used or cited in textbooks, doesn’t mean it’s the best. And even if it is the best according to some standard measure, that doesn’t mean it’s the best for your problem.
John: This sounds like excellent practicable advice from engineers who practice it. At the very end of the last chapter before Conclusions (chapter 14), you write (p. 248), “… it is impossible to know which theoretical ideas are going to have practical applications.” What is the take-away message here? What should your readers strive to do differently, and what do you foresee would be the resulting benefits?
Alex: The point is that you don’t know the benefits in advance. You need to prepare your mind with the foundations. Many practical techniques of today will be obsolete in the future; at the same time, many practical techniques of the future will never be discovered without a sound theoretical basis. Programmers don’t just need training; they also need liberal education.
Generic Programming
John: On the first page of your book you define generic programming as “an approach to programming that focuses on designing algorithms and data structures so that they work in the most general setting without loss of efficiency." Alex, according to the interview you gave to InformIT upon publishing your first book back in 2009, you said that the term generic programming, coined by you and David Musser in the late 1980s (inspired by Aristotle’s idea of genus (FM2GP-p. 180)), had come to mean something less than what you intended, and therefore you deliberately chose not to use it in EoP. What, in your mind, has changed in this regard over the past five years?
Dan: When Alex proposed the course that the book was based on, he described it as “math that programmers should know.” Later, when we were trying to reorganize the material into a book, we were discussing how to cast it into a coherent narrative. Was it just a bunch of interesting theorems and programming ideas? What was the story we were trying to tell? Then Alex came up with something like what we say at the very end of chapter 1, and it was like a light bulb came on for me. It sounds silly, but I hadn’t realized that the story of these mathematical developments was the story of generic programming. So I said that we had to make that connection explicit. The book reflects that decision in many ways, not least of which is the title.
As far as using the actual term “generic programming,” I felt it was time to stand up and reclaim it. I don’t like the idea of tiptoeing around it just because someone else has misunderstood or misused it. Alex is strongly associated with STL and generic programming, and if people have that association, we should at least give them a chance to learn what the term actually means.
Alex: Dave Musser and I introduced the term in our 1988 paper to describe a process of finding the most general setting for an algorithm. Since then people started using it to refer to certain mysterious programming techniques that I find mostly counterproductive. For this reason Paul McJones and I decided not to use the term in EoP, but Dan recently convinced me that it is an important term and we should reclaim it. We’ll find out if we were successful in a decade or two.
John: In section 8.1, you profess that the essence of generic programming is that “an algorithm in one domain can be applied in another similar domain.” In section 8.7, you assert, “To make something generic, you don’t add extra mechanisms. Rather, you remove constraints and strip down the algorithm to its essentials.” What advice or caveats can you give us (a la section 7.1) regarding how best to implement generic algorithms in C++ today?
Alex: My first piece of advice would be that before you try to design your own algorithms, learn which ones are already in the library and how to use them. If you look at a lot of code in industry — even at top companies who hire top programmers — you see a lot of examples where someone implemented something like lower bound from scratch, inefficiently and often incorrectly, without realizing that there is a highly tuned function in the library (in this case, std::lower_bound) that could be invoked with a single line of code.
If you actually do need a new algorithm, you need to figure out what it should do mathematically — what is the correct abstract interface? The C++ implementation should be the last part.
Dan: That’s not entirely true, Alex. In fact, you’ve often said to me — and we say in the book — that the correct interface usually isn’t clear until you’ve tried to write an implementation and use the algorithm. Only then do you discover, “Oh, I should have returned this additional value.” It takes multiple iterations to see clearly what needs to be done.
Alex: You’re right. In fact, many STL algorithms went through dozens and dozens of iterations, and a few are still not the way they should be.
I should also mention that writing good generic algorithms can be done in many languages. Many STL algorithms survived through multiple language incarnations: Scheme, Ada, C++, Java. Some of the code for FM2GP was first written in Python.
Dan: Some people think FM2GP is a C++ book, but it isn’t. (In fact, the whole reason we provide an appendix describing a few C++ features is so that programmers who work in other languages can read the book.)
John: At the beginning of section 8.5, Matrix Multiplication and Semirings, you indicated that this section and the next one require some knowledge of linear algebra and may “be skipped without impacting the reader’s understanding.” Yet I found section 8.6, Application: Social Networks and Shortest Paths, to be quite interesting and relevant. In this short section, just barely over two pages, you show how to repurpose the fast algorithm developed for powers of integers (chapter 2), and later genericized for semirings (chapter 7), to do transitive closure on Boolean matrices. How would you advise us as to when to reuse a generic algorithm, such as fast powers, versus writing a special-purpose one — in this case, Warshall's?
Dan: Perhaps we should have been clearer; the comment was intended to convey that it was not necessary to understand the linear algebra applications in order to understand what comes later in the book. Again, the goal was to make sure we weren’t limiting our audience. If someone doesn’t know any linear algebra, we don’t want them to look at the book and say, “Oh, this isn’t for me.” But I agree that the applications in section 8.6 are some of the most compelling in the book.
Alex: The issue isn’t “generic algorithm versus some other algorithm.” There are lots of good algorithms, but they should all be as generic as possible. Therefore, I would recommend developing a generic version of Warshall’s algorithm. Depending on the situation (for example, the graph representation), the generic power algorithm might be better or worse than the generic Warshall’s algorithm.
In fact, in the mid-1980s, Aaron Kershenbaum and I started working on a library of generic graph algorithms. Sadly, we were interrupted and I never managed to get back to this activity.
John: In chapter 12, you advocate using a simple int to represent a small non-negative integer (p. 221). Please tell us why we should not instead use unsigned int or, for that matter, auto?
Alex: The language guarantees that int is a natural word size and, therefore, fastest to use. Although in this section of the book we were not intending to focus on which integer type is best in different situations, this is a case where int is actually preferable to, say, uint8_t, even though the values are all positive and would fit in 8 bits. Nothing more profound than that was intended.
John: Alex, you may not have consciously intended more, but I think there is much more here to be discussed that ties directly into what you advocate in your use of concepts in generic programming. For example, many programmers have been taught that, if a parameter or return value is known not to be allowed to be negative, it should be made unsigned to make that property known in the code itself, and yet you explicitly chose int over unsigned int. As you know, the concepts that these two types model are not the same. For example, int models the general concept of integers, which admits negative numbers, whereas unsigned int models a different concept, which does not. Separately, some would advocate the use of unsigned int in a functional interface merely to extend the useful domain (FM2GP-p. 113, EoP-p. 10) of the machine-word-size integer type up a bit (literally) — even though, by doing so, they are changing the semantics of the underlying algebra of that integer (risking all manner of woe with conversions — e.g., with integer literals, all of which are signed). Please tell us under what circumstances you would deliberately opt for an unsigned int over an int to model an integral type in a functional interface of your design.
Alex: That’s a complex question, to which I do not have an immediate answer. But I should point out that in the example, the type was not part of the function’s interface; we could have used unsigned int, but that would require initializing the value to Ou.
Dan: I am personally not a fan of auto. When I choose to program in a language where types are important and where decisions about which types to use impact efficiency, I want to make those decisions explicit and visible. I think auto decreases readability of the code.
Alex: I agree with Dan on the use of auto.
John: I’m sure that many practicing professional programmers would agree that using auto wherever it’s syntactically legal in C++ — e.g., auto i = 0; — is abuse, myself included. I am, however, a bit surprised that you folks in particular are not in favor of the use of auto for those occasional cases where the specific type is not relevant, but the concept that it models is. The oft-cited poster child for the “appropriate” use of auto is when one is returning an object (whose type satisfies the iterator concept) from a member function such as begin or end. What matters here are the well-known operations you can perform on this object, governed by the appropriate iterator concept (FM2GP section 10.5, pp. 185-188); its C++ type is all but irrelevant, and might even change over time. Sure, one could always create a short alias using, e.g., a nested typedef (achieving almost the same effect, explicitly), but tell our audience why you both feel — even in this specific case — that auto is still not the best answer.
Dan: I did not mean to imply that there are never appropriate uses for auto, and your example may be a good one.
Alex: Also, in the book we tried to be as language neutral as possible; we wanted a Java programmer to understand the examples as well as a C++ programmer. In general, I try to be very conservative in using new language features. Since programs will be read by other people, and many people know only the basic language features, I tend to stick to these basic features.
Values and Value Types
John: There seems to be an essential underpinning of your overall message that you deferred until chapter 10, and even then addressed only relatively quickly. I’m talking about the definitions and subtle notions surrounding value, what it means, and why it is important for programming — especially generic programming. How important is it for good programmers to have a strong intuitive, yet precise, notion of value and why?
Alex: The idea of value can be traced back to Pythagoras, who stated that everything in the world could be represented as integers. The integers together with their interpretations are the values. The same integer can mean a shoe size in one context and the number of search results in another; without the interpretation, it’s just meaningless (meaning-less!) data.
We still use Pythagoras’s idea. For us, a value is simply a sequence of bits (a binary integer) together with its interpretation.
Regular Types and Value Semantics
John: You define what you call a concept as “a set of requirements on types” (p. 182). Why are concepts important?
Dan: When Alex first explained concepts to me, he said something to the effect that concepts are to types what types are to values. If I write a function foo(x), and you want to use the function, it’s important for you to know what values x might take. In the very same way, if I write a generic function and you want to use it, it’s important for you to know what types it’s going to work for.
John: You define a particular concept that you call regular (sections 7.2 and 10.3). What is it about this concept that makes it special?
Alex: Regular types are types than can be operated on by STL algorithms and reside in STL containers. STL containers themselves are specifically made regular so that they can in turn be operated on by STL algorithms and reside in other STL containers.
Dan: For me it’s helpful to think of “regular” as evoking the English usage of the word. Let’s say I decide a want to buy a new chair, and you ask what kind of chair I want. A stool? A beanbag chair? Finally I answer that I just want a regular chair. What I mean by this is, “I want an example of the class of chairs that behaves in the way I normally expect chairs to behave.” The concept regular gives us a kind of formal way to say “these types behave in the way we normally expect types to behave in a program.”
John: Dan, I know what you mean. Non-regular C++ types are, however, designed and used routinely by successful developers — especially when they are not attempting to approximate some theoretical mathematical type, but instead are providing a useful device or service (e.g., a socket, scoped_guard, or thread_pool) within the confines of a running process. I think the way Alex put it, however, is spot on.
You write (p. 183), “Having a copy constructor implies having a default constructor, since T a(b) should be equivalent to T a; a = b;.” Some object types, such as Date, have no obvious default value, and leaving a user-defined object in an uninitialized state can sometimes lead to other grief (e.g., uninitialized memory reads). Eliminating the default constructor entirely (although it is required, by definition, in EoP) would seem to relieve the copy constructor from this semantic requirement. I didn’t notice any uses of default constructors (even compiler-generated ones) in your book. How does omitting a default constructor interfere with the essential nature of a C++ object type?
Alex: As we write in EoP, the default constructor constructs a partially formed object. The only things you can do with it are: assign to it, and destroy it. It is true that we do not use them, but other people do. So it is good to formalize the notion.
John: Alex, not to put too fine a point on it, but if you don’t make use of default constructors, then certainly an otherwise regular type without a default constructor is quite useful indeed! The only additional motivation I noticed in your books for incorporating default construction in the regular concept was (EoP-p. 7, FM2GP-p. 14,) that it allows us to describe regular as being modeled after the built-in types in C/C++. In the spirit of removing as many requirements as possible (in order to accommodate the widest possible audience) while still preserving the essential capabilities of useful algorithms (FM2GP-pp. 1-2, 127, 141, 249-250), might it not have been better to define a concept such as regular that does not require a default constructor (for use with most algorithms), and then provide a specialized, more restrictive concept that does (for those few who actually need it)?
Alex: The role of a default constructor is that it constructs the object so it can be assigned to or destroyed. Nothing more. EoP calls these “partially formed” objects. Default construction does not guarantee that the initial value is meaningful, or even the same across invocations. Of course, if you write your own default constructor for your own class, you may choose to initialize it with a useful value, but this is not part of the requirement.
In particular, I want a programmer to be able to write
T a; // value of a is undefined if (test) a = T(1); else a = T(2);
This is a very common case, and I don’t want a programmer to struggle with conditional expressions in initialization. Moreover, the idea of partially formed objects is an important one. One could, for example, have an array of partially formed objects generated by a default constructor if one writes
T a[5]; // values of a are undefined
So far, I never encountered a case where it caused difficulty, so I do not see a need for a special concept where default constructor is not available.
The concept “regular” represents an important bundle of constraints that are found in built-in types. Of course, one can define other concepts with different sets of constraints (and we have an example of one in the book, semiregular), but in general you should be very cautious about introducing new concepts. Otherwise you get things like “minusable” — a concept for types that support subtraction, but not necessarily addition — which make little sense. (Subtraction is, after all, the inverse of addition.)
Concepts and C++
John: In your book, you sidestep the current lack of language support for expressing properties on type parameters in C++ templates by using the preprocessor to #define concept names, such as noncommutativeAdditiveSemigroup (section 7.3, p. 116), to the keyword typename, so that they can be used instead, even though they serve only as documentation to the reader. How important is it for compilers to enforce constraints on type parameters?
Alex: Concepts already exist, even if the language does not provide mechanisms to check for them. I think it’s much more important for programmers to understand and use concepts than it is for the compiler to check them. Having said that, I have been strongly advocating inclusion of concepts into C++ since the early 1990s.
Dan: I know Alex will say that thinking about concepts is more important than having them in the language, and that may be true. But the same could be said for types. Sure, you could have a compiler that treats types as mere documentation, but if you are choosing to work in a typed language, why would you want that? It’s just leaving you open to a whole class of error that you wouldn’t otherwise make. The same holds for concepts.
More generally, if the programmer is willing to give the compiler useful information, shouldn’t the compiler use it? Similarly, if the compiler has information that would benefit the programmer, such as the number of fields in a struct, shouldn’t that be exposed?
John: No argument here. There is a proposal for concepts lite (N4377) based on the Palo Alto proposal (N3351), to which Alex contributed, working its way through the standards committee. What impact will concepts being an integral part of the C++ language have on programming practice?
Alex: The most immediate impact will be the dramatic improvement in error messages while using STL. As far as the long-term impact, I reiterate what I said above: Programmers need to learn how to think about concepts in order to benefit from them. Since many people are not used to thinking this way, it may be years before programming practices change significantly. Hopefully books like ours will help people make the transition.
John: What features not already part of the concepts-lite TS would you like to see added in order to support the style of programming you espouse?
Alex: I worked very hard to get STL included in the C++98 standard. It takes a particular set of skills to work through the standards process — not only technical knowledge, but also patience, diplomacy, and the willingness to compromise. I greatly admire people like Bjarne Stroustrup who possess these abilities, and indeed, Bjarne's consistent advocacy to steadily improve a single language over a period of decades is unmatched in the computing world. But those are not my strengths, and for that reason, I have mostly stayed away from standards work for several years, and have not followed the proposals closely.
As you know, the attempt to get concepts into C++11 collapsed, leaving several of the participants in the process feeling bitter and disengaged. It was in the wake of this failed process that I organized the workshop that created the Palo Alto proposal. Our goal was to come up with a bare bones framework for concepts that most of the participants could agree on. Much of my involvement actually happened beforehand, drafting a straw-man proposal to provide a starting point for discussion, getting competing factions to agree to sit down together and have the discussion, and then moderating those discussions during the workshop. I should mention that Dan and several other people at A9 (Anil Gangolli, Ryan Ernst, and Jon Kalb) also participated in the workshop and contributed to the proposal.
I feel that our work was successful in the sense that it got people talking again, and hopefully will lead to some form of concepts being in the language. But beyond that, I have not been, nor am I planning to be, involved in specific proposals. To paraphrase President Kennedy, the torch should be passed to a new generation of generic programmers.
John: Would you consider releasing a new edition of your book once concepts finally become part of the C++ programming language?
Dan: We’d certainly consider it if our publisher were interested. Hopefully the changes would be relatively minor. Of course, there’s a slippery slope: Once you start revising, you think of all the other things you might have written differently, or topics you should have included, and next thing you know you’re rewriting the whole book. Since FM2GP was just published, we’re not ready for a major rewrite.
Contract Assertions
John: In this new book (just as in EoP), you document preconditions, e.g., that a number is positive (section 4.5, p. 56), differently from constraints on type parameters, e.g., that the value_type of each of two distinct iterator types be the same (section 11.2, p. 201). Preconditions consistently appear in the implementation (at the top of your function bodies), whereas type constraints (albeit currently as comments) are always located within the function declaration itself. Tell us why you make such a crisp distinction between concepts and preconditions given that both serve as constraints on valid usage.
Alex: Concepts are requirements on types; preconditions are requirements on values. A concept might indicate that a type of a value is some kind of integer. A precondition might state that a value is a prime number.
John: Yes, and you delineate this distinction quite lucidly in EoP (p. 13).
Dan: There is a performance cost to enforcing runtime constraints, such as preconditions. There is no performance cost to enforcing concept restrictions, since they are handled at compile time.
Mechanisms (I.E., Non-Value Types)
John: You assert (p. 5) that in order to be a good programmer, you need to understand the principles of generic programming and, hence, the mathematics upon which it is based; this book clearly and convincingly supports that claim. Are there any recent cases where, as a software engineer, you’ve fallen back on more traditional C++ language facilities such as inheritance and virtual functions?
Dan: Neither Alex nor I believe that generic programming and object-oriented programming are in opposition; one can use objects and still embrace the generic programming approach. Furthermore, I don’t think of using OOP features as “falling back,” which implies that this is a compromise. There are situations where inheritance is useful, and I am glad to have it in those situations. Even Alex agrees that there are some situations where the use of objects in application frameworks is practically useful. If you’re building an app with a standard user interface, you don’t want to design your own windows and controls from scratch, you want to inherit from the ones in the framework.
Alex: In my case, I do not use object-oriented features in the work I do. As you know, I have been critical of the claims for OOP. In particular, the way inheritance is implemented in C++ and Java is fundamentally flawed. However, I am not opposed to the idea of inheritance at all. Conceptually, VectorSpace does inherit from AdditiveGroup, and AdditiveGroup inherits from AdditiveMonoid. We show this in the diagrams at the end of chapters 6 and 8. However, in current languages this inheritance cannot be expressed. Inheritance is not able to deal with binary methods such as +. There are no virtual type functions to deal with a ring of coefficients in VectorSpace. If somebody passes you a pointer to an element of a VectorSpace, you need to obtain the type of its coefficients at run time.
I tried for several years to see if there was a way to implement this in current OO languages, but I eventually understood the reasons why it was not possible. I hope that one day language researchers will provide a way to implement the inheritance that is needed.
John: Let’s say we have a need to implement a concrete mechanism, such as a thread pool or a scoped guard. (What would it mean to assign one object of such a type to another?) In your view, should programmers attempt to somehow make every C++ type regular, or can they reasonably omit regular syntax, such as copy construction, assignment, and equality comparison where such operations have no obvious semantics?
Alex: As we discussed earlier, one benefit of being a regular type is that you can reside in an STL container and be operated on by an STL algorithm. Another benefit is that your type will behave in a way most programmers probably expect. However, there may be particular situations where other factors are more important, in which case, of course you might want a type that’s not regular.
Memory Allocators
John: There are a few places in your book where you descend from the purity of mathematics into the adulterated world imposed by modern computers. One particularly fetching example of impurity, which you characterize as “a sad story…” (p. 217), involves how Alex, in his original implementation of the STL, when in need of a maximally sized temporary physical buffer, was forced to acquire it by what amounts to “trial and error” — which you lament has persisted in vendor code to this day. Alex, tell us how memory allocators came into being in the STL: Were they necessitated solely by the same early Intel architectures (related to near and far addresses) that have resulted in the vestigial pointer and reference iterator traits, discussed in section 10.5 (p. 187), or was there some other overarching motivation?
Alex: Some people believe in adding every mechanism that might be needed as an option to the template class. I do not share that view. Concepts should be as minimal as possible.
In the case of allocators, I was forced to invent them in order to get Microsoft to agree to consider including STL in the language. (In the end, they actually voted against STL anyway.)
Allocators were a terrible idea; instead of adding a parameter for what kind of pointer to use, there should be more than one vector type, one for each memory model. As long as the different vector types satisfy the same requirements (concepts) everything would just work.
The whole point of generic programming is to make things simple, not to build everything-and-the-kitchen-sink policies and pass these policies around.
Verification and Testing
John: At the end of Chapter 3, you talk about the importance of multiple proofs for the same theorem (p. 38) — especially when they come from such diverse branches of mathematics as algebra and geometry (e.g., section 4.5) — because of the increase in confidence in the result, which goes directly to the importance of redundancy in testing. However, proofs can be suspect (p. 159) and, even if one “knows” that they have the right algorithm on paper, “there’s many a slip ‘twixt the cup and the lip.” How would you advise your readers to make sure that their code works as they intended — e.g., gcd (p. 59)?
Dan: There are two issues here. One is to prove that the algorithm does what it’s supposed to. We have examples of that in the book, for example in sec. 4.7. The second is to insure that the code actually implements the algorithm. We’ve been a little less helpful here. We do have some very simple test cases available for download on the book’s website (www.fm2gp.com), along with all the code that appears in the book. But a good software engineer would want to use a framework like CppUnit to run tests in a more systematic way, trying important edge cases and generally exploring the space more thoroughly.
Alex: Ideally, someone would design a generic validation suite for GCD (and other related algorithms, such as Extended GCD). The same goes for performance benchmarks.
John: So I’m hearing from Dan that having a good testing framework is important, and from Alex that having a thorough test suite — irrespective of the testing framework used — is important. Obviously one without the other is suboptimal. In my experience, however, the hard part is coming up with comprehensive tests, as one can easily obtain a decent open-source test framework.
Elements of Programming (EoP)
John: You list EoP under Prerequisites (p. 3) as “a useful companion to this one.” How would you contrast the two books? How are these two books related, and how are they (and their intended audiences) different?
Dan: Whether we succeeded or not, we intended the audience for FM2GP to be broader than that of EoP.
EoP is, by design, a math book. By that I mean that it is written in the style and appearance of a book that a mathematician or a serious math student (say, a math major) would read. That gives it a kind of elegance and clarity that pays homage to classic math texts of the past as well as its partial namesake, the original Elements of Euclid. But that also makes it, as Alex often says, a difficult book, and one that can be intimidating to someone (like me) without a strong mathematical background.
FM2GP is different in several ways. I think the biggest difference is that, as we say at the very beginning, it is a programming book. It is written in the style and layout that programmers will probably find more familiar. There is also a lot more explanation. Where EoP might say, “The solution to X is Y, and here is the proof,” FM2GP would say, “X is a problem that arises in such and such circumstances. Here’s an example of it, and here’s why it’s important. Eventually so-and-so discovered that the solution to X is Y. Here is the proof, and this is how to understand it.”
Of course, another obvious difference is that — despite some overlap — the books mostly cover different material.
Our hope is some of the readers of FM2GP will have their interest in certain topics piqued enough to want to delve into the detailed formal foundations found in EoP.
Alex: EoP follows a deductive approach, starting from the first principles and methodically building up on them. It is modeled on Euclid’s Elements. FM2GP uses a historical-inductive approach.
John: In the August 2009 InformIT interview, Alex’s co-author Paul McJones commented that he was not comfortable with the “… conversational style” of Alex’s lecture notes, and pushed for a more impersonal (formal) one for that book. Dan, tell us in what ways you feel you influenced the style of your new book, as discussed briefly in the authors’ notes (p. xiii), and how you would characterize it compared with that of EoP.
Dan: To put it very simply, Paul was a math major in college, while I was a philosophy major. We both went on to be computer scientists, but I think the style of the respective books reflects these origins.
In particular, the fact that I lacked a math background meant that I was constantly stopping Alex and demanding more explanation. How do we know this? How did we get from this step to the next? How would a reader know to use this substitution? Alex and I had relatively few disagreements while writing, but the ones we did have often revolved around something that seemed obvious to him but was not clear to me.
I also felt it was important to capture and share Alex’s enthusiasm for the unity of diverse disciplines. In the lectures our book was based on, the material really came alive when he provided the historical context for a discovery or told a fascinating story about an important person. It’s true that Alex can often get carried away, and even I ended up trimming many of the anecdotes, but I tried to strike a balance. A good example of this was the decision to put the biographies of mathematicians in shaded sidebars. We hope readers will be interested, but those who aren’t can skip them easily.
John: Having read both books, I found the styles markedly different and just as you described here. Again, thanks (from all of us) for persevering in asking those “dumb” questions of Alex.
Alex, you bravely said to me, in a personal correspondence prior to this interview, “…to ask any questions, however uncomfortable, and in any order you like.” No good deed goes unpunished: How — and please be brutally honest — would you contrast your experiences of co-authoring a book with Dan versus Paul?
Alex: The goals of the books were quite different. Both books have their place.
Paul convinced me that it is essential to write a very formal treatise in order to unambiguously define the foundations of generic programming. He thought that the conversational style of my lectures undermines the seriousness of the subject. I agree with him. We wrote a very terse, but — in my opinion — elegant book. Some people appreciate it; but it is clearly not for everyone. I am very grateful to Paul. He is and will always remain my close friend. I would gladly collaborate with him on another book.
Dan had a different idea. He thought that what was needed was a more accessible book, and I agree with him as well. And he led the work in such a direction. I would love to collaborate with him on another book as well.
Paul and Dan share many virtues. They both are very meticulous, hardworking, honest to a fault. Both of them are stubborn, and I was never able to intimidate them into accepting my point of view however hard I tried. I was very fortunate in both cases.
Wrap Up
John: Dan, Alex, what do you each foresee might be your next addition to the field of computer programming?
Alex: At this point I am actually getting ready for retirement relatively soon, and therefore am not thinking about tackling really large new challenges in the field. I still love to write code — in some respects I am a better programmer now than I ever was — but I am finding it harder to accommodate the constraints of a real-world working environment.
While I don’t expect to be personally involved in the future of programming, I hope that my last two books will serve as a foundation for a younger generation of programmers in their search for the right approach.
Dan: Most of my career has actually been focused on information retrieval — the part of computer science that underlies search engines. If I have made a contribution to the field of programming, it is by trying to bring Alex’s ideas to a wider audience. As for what comes next, that’s something I am currently exploring.
John: I want to thank you both for taking the time to give all of us these thoughtful answers to some fairly involved, but important, questions. I’m a better, more prepared professional for having read your book (same goes for EoP). What’s more, the world thanks you for the outstanding contributions you’ve made, and I join them in wishing you only the best in all your future endeavors.
Alex: You’re very welcome. We really appreciate the thought you put into your questions.
Dan: And thank you for the opportunity to share our thoughts.
John Lakos, author of Large Scale C++ Software Design, serves at Bloomberg LP in New York City as a senior architect and mentor for C++ Software Development world-wide. He is also an active voting member of the C++ Standards Committee, Library Working Group. Previously, Dr. Lakos directed the design and development of infrastructure libraries for proprietary analytic financial applications at Bear Stearns. For 12 years prior, Dr. Lakos developed large frameworks and advanced ICCAD applications at Mentor Graphics, for which he holds multiple software patents. His academic credentials include a Ph.D. in Computer Science ('97) and an Sc.D. in Electrical Engineering ('89) from Columbia University. Dr. Lakos received his undergraduate degrees from MIT in Mathematics ('82) and Computer Science ('81). His next book, entitled "Large-Scale C++, Volume I: Process and Architecture", is anticipated in 2015.