Knuth likes to include in those books [The Art of Computer Programming] as much recreational material as he can cram in.
— MARTIN GARDNER, Undiluted Hocus-Pocus (2013)
The second half of this book is devoted to Section 7.2.2.2, "Satisfiability," which addresses one of the most fundamental problems in all of computer science: Given a Boolean function, can its variables be set to at least one pattern of 0s and 1s that will make the function true? This problem arises so often that it has been given a nickname: 'SAT'.
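To make the definition concrete, here is a minimal brute-force check of a tiny two-clause instance, (x1 ∨ ¬x2) ∧ (x2 ∨ x3). The instance is mine, chosen only for illustration, and the method is exhaustive trial over all 2^n patterns, not one of the algorithms presented in Section 7.2.2.2:

    #include <stdio.h>

    /* Toy instance, for illustration only: is (x1 OR NOT x2) AND (x2 OR x3)
       satisfiable?  Literal +k stands for variable xk; -k for its complement. */
    #define VARS    3
    #define CLAUSES 2
    static const int clause[CLAUSES][2] = { {1, -2}, {2, 3} };

    int main(void) {
        for (int bits = 0; bits < (1 << VARS); bits++) {   /* all 2^n patterns */
            int sat = 1;
            for (int c = 0; c < CLAUSES && sat; c++) {
                int ok = 0;                                /* clause c true?   */
                for (int j = 0; j < 2; j++) {
                    int lit = clause[c][j];
                    int v = lit > 0 ? lit : -lit;
                    int val = (bits >> (v - 1)) & 1;       /* value of xv      */
                    if ((lit > 0) == (val == 1)) ok = 1;   /* literal is true  */
                }
                sat = ok;
            }
            if (sat) {
                printf("satisfiable: x1=%d x2=%d x3=%d\n",
                       bits & 1, (bits >> 1) & 1, (bits >> 2) & 1);
                return 0;
            }
        }
        printf("unsatisfiable\n");
        return 0;
    }

Of course exhaustive trial is hopeless once n grows; the whole point of Section 7.2.2.2 is to do enormously better.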
Satisfiability might seem like an abstract exercise in understanding formal systems, but the truth is far different: Revolutionary methods for solving SAT problems emerged at the beginning of the twenty-first century, and they've led to game-changing applications in industry. These so-called "SAT solvers" can now routinely find solutions to practical problems that involve millions of variables and were thought until very recently to be hopelessly difficult.
Satisfiability is important chiefly because Boolean algebra is so versatile. Almost any problem can be formulated in terms of basic logical operations, and the formulation is particularly simple in a great many cases. Section 7.2.2.2 therefore begins with ten typical examples of widely different applications, and closes with detailed empirical results for a hundred different benchmarks. The great variety of these problems—all of which are special cases of SAT—is illustrated on pages 300 and 301 (which are my favorite pages in this book).
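To give a single tiny example of such a formulation (mine, not one of the ten in the text): the constraint "exactly one of x1, x2, x3 is true" becomes the four clauses (x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x3) ∧ (¬x2 ∨ ¬x3), each of which must be satisfied simultaneously; larger constraints compose just as directly.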
The story of satisfiability is the tale of a triumph of software engineering, blended with rich doses of beautiful mathematics. Section 7.2.2.2 explains how such a miracle occurred, by presenting complete details of seven SAT solvers, ranging from the small-footprint methods of Algorithms A and B to the industrial-strength, state-of-the-art methods of Algorithms W, L, and C. (Well, I have to hedge a little: New techniques are continually being discovered; hence SAT technology is ever-growing and the story is ongoing. But I do think that Algorithms W, L, and C compare reasonably well with the best algorithms of their class that were known in 2010. They're no longer at the cutting edge, but they still are amazingly good.)
Wow—Sections 7.2.2.1 and 7.2.2.2 have turned out to be the longest sections, by far, in The Art of Computer Programming—especially Section 7.2.2.2. The SAT problem is evidently a killer app, because it is key to the solution of so many other problems. Consequently I can only hope that my lengthy treatment does not also kill off my faithful readers! As I wrote this material, one topic always seemed to flow naturally into another, so there was no neat way to break either section up into separate subsections. (And anyway the format of TAOCP doesn't allow for a Section 7.2.2.1.3 or a Section 7.2.2.2.6.)
I've tried to ameliorate the reader's navigation problem by adding subheadings at the top of each right-hand page. Furthermore, as always, the exercises appear in an order that roughly parallels the order in which corresponding topics are taken up in the text. Numerous cross-references are provided between text, exercises, and illustrations, so that you have a fairly good chance of keeping in sync. I've also tried to make the index as comprehensive as possible.
Look, for example, at a "random" page—say page 264, which is part of the subsection about Monte Carlo algorithms. On that page you'll see that exercises 302, 303, 299, and 306 are mentioned. So you can guess that the main exercises about Monte Carlo algorithms are numbered in the early 300s. (Indeed, exercise 306 deals with the important special case of "Las Vegas algorithms"; and the next exercises explore a fascinating concept called "reluctant doubling.") This entire book is full of surprises and tie-ins to other aspects of computer science.
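(For the curious: "reluctant doubling" refers to the slowly growing sequence 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, ..., due to Luby, Sinclair, and Zuckerman, often used to schedule the restarts of a randomized solver. The sketch below generates it with a two-register recurrence; that recurrence is the commonly cited formulation, so consult the exercises themselves for the authoritative details.)

    #include <stdio.h>

    /* The "reluctant doubling" sequence 1,1,2,1,1,2,4,1,1,2,1,1,2,4,8,...
       of Luby, Sinclair, and Zuckerman, generated by a two-register
       recurrence; (u & -u) isolates the lowest set bit of u. */
    int main(void) {
        unsigned u = 1, v = 1;
        for (int n = 1; n <= 15; n++) {
            printf("%u ", v);                      /* v is the nth term   */
            if ((u & -u) == v) { u += 1; v = 1; }  /* end of block: reset */
            else               { v <<= 1; }        /* otherwise double    */
        }
        printf("\n");
        return 0;
    }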
As in previous volumes, sections and subsections of the text are occasionally preceded by an asterisk (∗), meaning that the topics discussed there are "advanced" and skippable on a first reading. You might think that a 700-page book has probably been padded with peripheral material. But I constantly had to "cut, cut, cut" while writing it, because a great deal more is known! I found that new and potentially interesting yet unexplored topics kept popping up, more than enough to fill a lifetime; yet I knew that I must move on. So I hope that I've selected for treatment here a significant fraction of the concepts that will be the most important as time passes.
Every week I've been coming across fascinating new things that simply cry out to be part of The Art.
— DONALD E. KNUTH (2008)
Most of this book is self-contained, although there are frequent tie-ins with the topics discussed in previous volumes. Low-level details of machine language programming have already been covered extensively; so the algorithms in the present book are usually specified only at an abstract level, independent of any machine. However, some aspects of combinatorial programming are heavily dependent on low-level details that didn't arise before; in such cases, all examples in this book are based on the MMIX computer, which supersedes the MIX machine that was defined in early editions of Volume 1. Details about MMIX appear in a paperback supplement to that volume called The Art of Computer Programming, Volume 1, Fascicle 1, containing Sections 1.3.1′, 1.3.2′, etc.; they're also available on the Internet, together with downloadable assemblers and simulators.
Another downloadable resource, a collection of programs and data called The Stanford GraphBase, is cited extensively in the examples of this book. Readers are encouraged to play with it, in order to learn about combinatorial algorithms in what I think will be the most efficient and most enjoyable way.
I wrote nearly a thousand computer programs while preparing this material, because I find that I don't understand things unless I try to program them. Most of those programs were quite short, of course; but several of them are rather substantial, and possibly of interest to others. Therefore I've made a selection available by listing some of them on the following webpage: http://www-cs-faculty.stanford.edu/~knuth/programs.html
In particular you can download the programs DLX1, DLX2, DLX3, DLX5, DLX6, and DLX-PRE, which are the experimental versions of Algorithms X, C, M, C$, Z, and P, respectively, that were my constant companions while writing Section 7.2.2.1. Similarly, SAT0, SAT0W, SAT8, SAT9, SAT10, SAT11, SAT11K, SAT13 are the equivalents of Algorithms A, B, W, S, D, L, L, C, respectively, in Section 7.2.2.2. Such programs will be useful for solving many of the exercises, if you don't have access to other XCC solvers or SAT solvers. You can also download SATexamples.tgz from that page; it's a collection of programs that generate data for all 100 of the benchmark examples discussed in the text, and many more.
Several exercises involve the lists of English words that I've used in preparing examples. You'll need the data from http://www-cs-faculty.stanford.edu/~knuth/wordlists.tgz if you have the courage to work the exercises that use such lists.
Special Note: During the years that I've been preparing Volume 4, I've often run across basic techniques of probability theory that I would have put into Section 1.2 of Volume 1 if I'd been clairvoyant enough to anticipate them in the 1960s. Finally I realized that I ought to collect most of them together in one place, because the story of those developments is too interesting to be broken up into little pieces scattered here and there.
Therefore this book begins with a special tutorial and review of probability theory, in an unnumbered section entitled "Mathematical Preliminaries Redux." References to its equations and exercises use the abbreviation 'MPR'. (Think of the word "improvement.")
Incidentally, just after the special MPR section, Section 7.2.2 begins intentionally on a left-hand page; and its illustrations are numbered beginning with Fig. 68. The reason is that Section 7.2.1 ended in Volume 4A on a right-hand page, and its final illustration was Fig. 67. My editor has decided to treat Chapter 7 as a single unit, even though it is being split into several physical volumes.
Special thanks are due to Nikolai Beluhov, Armin Biere, Niklas Eén, Marijn Heule, Holger Hoos, Wei-Hwa Huang, Svante Janson, Ernst Schulte-Geers, George Sicherman, Filip Stappers, and Udo Wermuth, for their detailed comments on my early attempts at exposition, as well as to dozens and dozens of other correspondents who have contributed crucial corrections. My editor at Addison–Wesley, Mark Taub, has expertly shepherded this series of books into the 21st century; and Julie Nahil, as senior content producer, has meticulously ensured that the highest publication standards have continued to be maintained. Thanks also to Tomas Rokicki for keeping my Dell workstation in shipshape order, as well as to Stanford's InfoLab for providing extra computer power when that machine had reached its limits.
I happily offer a "finder's fee" of $2.56 for each error in this book when it is first reported to me, whether that error be typographical, technical, or historical. The same reward holds for items that I forgot to put in the index. And valuable suggestions for improvements to the text are worth 32¢ each. (Furthermore, if you find a better solution to an exercise, I'll actually do my best to give you immortal glory, by publishing your name in subsequent printings:-)
Happy reading!
Stanford, California
D. E. K.
June 2022