Localizing Functionality with Lambda Expressions
Lambda expressions, one of the prominent features of C++11, are unnamed function-like blocks of code that you can define right where a callable entity is needed. As such, lambda expressions are useful for writing compact predicates, numeric computations, and initialization expressions. In this article, I present the syntax of lambda expressions and show how they're used.
Lambda Expressions Versus Function Objects
In many respects, lambda expressions (lambdas for short) are similar to function objects. So why not use function objects in the first place? Technically, you can do that, but function objects require a lot of manual coding. You have to declare a class that includes a constructor, an overloaded function call operator (()), and data members; finally, you have to instantiate an object of that class. All this tedium is relegated to the compiler when you use lambdas. That's because the compiler conceptually transforms every lambda expression you write into a full-blown, anonymous function object known as a closure object.
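To make the comparison concrete, here is a minimal sketch of the boilerplate that a lambda spares you. The names in_range, count_with_functor, count_with_lambda, and check_equivalence are hypothetical and exist only for this illustration; the class approximates what the compiler conceptually generates, not its actual output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written function object: constructor, data members, and operator().
class in_range {
public:
    in_range(double lo, double hi) : lo_(lo), hi_(hi) {}
    bool operator()(double x) const { return x >= lo_ && x < hi_; }
private:
    double lo_;
    double hi_;
};

// Counts the values in [lo, hi) using the function object.
inline std::ptrdiff_t count_with_functor(const std::vector<double>& v,
                                         double lo, double hi) {
    return std::count_if(v.begin(), v.end(), in_range(lo, hi));
}

// Same computation with a lambda; lo and hi are captured by value and
// play the role of the data members above.
inline std::ptrdiff_t count_with_lambda(const std::vector<double>& v,
                                        double lo, double hi) {
    return std::count_if(v.begin(), v.end(),
                         [lo, hi](double x) { return x >= lo && x < hi; });
}

// Sanity check: both versions agree on a small sample.
inline void check_equivalence() {
    const std::vector<double> salaries{900.0, 1000.0, 1100.0, 1499.0, 1500.0};
    assert(count_with_functor(salaries, 1000.0, 1500.0) == 3);
    assert(count_with_lambda(salaries, 1000.0, 1500.0) == 3);
}
```

The lambda version needs no class declaration at all, which is the saving described above.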
Before we consider the syntactic properties of a lambda expression, let's look at an example. In this case, we'll use a lambda to locate the first employee whose salary is within a specified range:
vector<employee> emps {{"John", 1100.0}, {"Jane", 1100.0}};
const auto min_wage = 1000.0;
const auto upper_lim = 1.5 * min_wage;
std::find_if(emps.begin(), emps.end(),
    [=](const employee& e) {return e.salary() >= min_wage && e.salary() < upper_lim;});
The code listing above uses the find_if() algorithm to locate the first employee whose salary is at least min_wage and lower than upper_lim. (For additional information about the new style of initializing vectors, see my article "Get to Know the New C++11 Initialization Forms.")
Notice the last argument of find_if(). Traditionally, you would use a function object or a function pointer as the predicate (the third argument) of find_if(). In C++11, however, the following lambda expression can serve as the predicate instead:
[=](const employee& e) {return e.salary()>= min_wage && e.salary() < upper_lim;}
You can easily tell what the code between the braces does: It returns a bool value indicating whether a given employee's salary is within the specified range. Thus, instead of calling a function or instantiating a function object, the lambda expression computes the result locally.
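For readers who want to compile the example, here is one possible self-contained version. The employee class is an assumption, since the article doesn't show its definition; first_in_range and check_first_in_range are likewise hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical employee class; only salary() is implied by the article.
class employee {
public:
    employee(std::string name, double salary)
        : name_(std::move(name)), salary_(salary) {}
    const std::string& name() const { return name_; }
    double salary() const { return salary_; }
private:
    std::string name_;
    double salary_;
};

// Returns the name of the first employee whose salary lies in
// [min_wage, 1.5 * min_wage), or an empty string if there is none.
inline std::string first_in_range(const std::vector<employee>& emps,
                                  double min_wage) {
    const auto upper_lim = 1.5 * min_wage;
    auto it = std::find_if(emps.begin(), emps.end(),
        [=](const employee& e) {
            return e.salary() >= min_wage && e.salary() < upper_lim;
        });
    return it != emps.end() ? it->name() : std::string();
}

inline void check_first_in_range() {
    const std::vector<employee> emps{{"John", 900.0}, {"Jane", 1100.0}, {"Jim", 1200.0}};
    assert(first_in_range(emps, 1000.0) == "Jane");  // John is below min_wage
    assert(first_in_range(emps, 2000.0).empty());    // nobody earns 2000 or more
}
```

Note that find_if() returns an iterator, so the result has to be compared against emps.end() before it is dereferenced.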
Now let's look more closely at the syntax of a lambda expression.
Deciphering the Syntax
A lambda expression always begins with a lambda introducer, which consists of a pair of brackets ([]). A complete lambda expression looks like this:
[capture clause/*optional*/] (parameters) /*optional*/ -> return-type /*optional*/ {body}
The optional capture clause appears inside the introducer. (I'll discuss capture clauses shortly.) The parameter list can be omitted if the lambda takes no arguments. However, for the sake of code clarity, it's better to leave empty parentheses in that case. The return type is indicated by an arrow (->) and a type name. Finally, the lambda's body is enclosed within a pair of braces.
You can omit the return type if it can be inferred from the body, as is the case in our example. The compiler knows that the following expression evaluates to a bool value and therefore assumes that the lambda's return type is bool:
e.salary()>= min_wage && e.salary() < upper_lim;
However, if a lambda expression has multiple return statements, or if you want to make the return type explicit, you can always specify it like this:
[=](const employee& e)->bool {return e.salary()>= min_wage && e.salary() < upper_lim;}
Keep in mind that find_if() invokes its predicate function with an argument of the type *InputIterator. In our example, *InputIterator is an object of type employee. Hence, the parameter of the lambda expression is of type const employee&. The lambda's body can contain zero or more statements, as would an ordinary function's body. Typically, however, a lambda's body consists of a single statement, because a long body might obscure the code. When you do need a longer body, you'd rather use a named closure instead, as I'll show later.
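The syntax rules above can be sketched in a short example: one lambda whose body has two return statements and therefore spells out its return type, and one that takes no arguments but keeps the empty parentheses. The function names classify, answer, and check_syntax_forms, as well as the labels, are made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Classifies a salary relative to a minimum wage.
inline std::string classify(double salary, double min_wage) {
    // The two return statements yield different types (a string literal
    // and a std::string), so the return type is stated explicitly with ->.
    auto label = [min_wage](double s) -> std::string {
        if (s < min_wage) {
            return "below minimum";
        }
        return std::string("at or above minimum");
    };
    return label(salary);
}

// A lambda that takes no arguments; the empty parentheses are kept
// for clarity, and the return type int is inferred from the body.
inline int answer() {
    auto forty_two = []() { return 42; };
    return forty_two();
}

inline void check_syntax_forms() {
    assert(classify(900.0, 1000.0) == "below minimum");
    assert(classify(1000.0, 1000.0) == "at or above minimum");
    assert(answer() == 42);
}
```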
Capture Clauses
It's time to talk about the capture clause, which is probably the part that's most difficult to understand. A lambda's parameter list defines the types of the arguments that the body can access directly, just as an ordinary function accesses its arguments. However, unlike an ordinary function, a lambda expression can also access variables that are not local to the body of the lambda; for example, variables from the enclosing scope. Such lambdas are said to have external references.
There are two ways to capture variables with external references: capture by value, and capture by reference. Since every lambda expression you write is transformed into a closure object, any external references are mapped to data members of that closure. By contrast, a lambda's parameters are mapped to the parameter list of the overloaded operator() of the closure object. An empty capture clause implies that no variables with external references are captured. That is, the corresponding closure will not contain data members that correspond to variables from the lambda's enclosing scope. The following lambda expression demonstrates that:
[](int n, int m) {return n*m;} //a lambda with no external references
A variable captured by value is copied into a corresponding data member of the closure. A variable captured by reference becomes a reference data member of the closure that is bound to the non-local variable. C++11 lets you specify a default capture. The following default capture captures by value every variable from the lambda's enclosing scope that the body uses:
[=]
To capture all of the variables in the lambda's enclosing scope by reference, use this default capture instead:
[&]
Alternatively, you can specify the capture mechanism for external references individually, like this:
int j;
vector<double> vd;
[j, &vd] //j captured by copy, vd by ref
Here are a few examples that demonstrate how the capture clause handles external references in various situations:
vector<int> vi, vi2;
[&vi](int i) {vi.push_back(i);}; // capture vi by ref
[&](int i) {vi.push_back(i); vi2.push_back(i);}; // vi and vi2 by ref
// vi by value
[vi]() {for (auto x = vi.begin(); x != vi.end(); ++x) {cout << *x << " ";}};
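The practical difference between the two capture mechanisms shows up when the captured variable changes after the lambda is created. The following is a small sketch of that behavior; capture_demo and the variable names are hypothetical.

```cpp
#include <cassert>
#include <vector>

inline void capture_demo() {
    int j = 1;
    std::vector<double> vd;

    // j is copied when the closure is created; vd is captured by reference.
    auto f = [j, &vd]() { vd.push_back(j * 2.0); };

    j = 10;  // does not affect the copy stored inside the closure
    f();

    assert(vd.size() == 1);  // the original vd was modified through the reference
    assert(vd[0] == 2.0);    // computed from the captured value 1, not 10

    // Default capture by reference, with j explicitly captured by value.
    int total = 0;
    auto g = [&, j](int n) { total += n * j; };
    g(3);
    assert(total == 30);     // j was 10 when g was created
}
```

A by-value capture takes its snapshot at the point where the lambda expression is evaluated, not where the closure is later called.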
Naming Closures
C++11 lets you define variables that store the closure objects associated with a given lambda. You can then invoke the lambda later through such a variable:
vector<int> vi;
auto push_back_vi = [&vi](int i) { vi.push_back(i); };
The auto-declared variable push_back_vi has the type of the closure that the compiler will generate for the lambda expression [&vi](int i) { vi.push_back(i); }.
You can use push_back_vi to perform a push_back() operation on the vector vi like this:
push_back_vi(5); // invokes vi.push_back(5);
push_back_vi(6); // invokes vi.push_back(6);
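Notice that if you capture the vector by value, the push_back() operation operates on the closure's own copy of vi and leaves the original vector unchanged. Because a closure's function call operator is const by default, the lambda must also be declared mutable before it can modify that copy:
vector<int> vi;
cout<<vi.size(); //0
auto push_back_vi = [vi](int i) mutable { vi.push_back(i); };
push_back_vi(5); //modifies the closure's copy only
cout<<vi.size(); //still 0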
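To see that a by-value capture really works on a separate copy, the lambda can report the size of its own vector. This is only a sketch; the names copy_capture_demo and push_back_copy are invented here, and the mutable keyword is what permits the body to modify the captured copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

inline void copy_capture_demo() {
    std::vector<int> vi;

    // vi is copied into the closure; mutable lets the body modify that copy.
    auto push_back_copy = [vi](int i) mutable -> std::size_t {
        vi.push_back(i);
        return vi.size();  // size of the copy held by the closure
    };

    assert(push_back_copy(5) == 1);
    assert(push_back_copy(6) == 2);  // the copy persists between calls
    assert(vi.empty());              // the original vector is untouched
}
```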
Applications and Conclusions
Lambdas are useful wherever a small computational expression is needed, as an alternative to a full-blown function. Let's look at a concrete example.
Suppose you have a vector of integers containing 10 elements: 1,2...10. To calculate 10!, which is the product of the vector's elements, you can use the accumulate() algorithm, with a twist. The fourth argument of that algorithm is a callable entity, such as a lambda expression. You can define a lambda expression that multiplies two numbers, store the closure in a named variable, and then use it as the fourth argument of accumulate():
vector<int> vi{1,2,3,4,5,6,7,8,9,10};
auto factorial = [](int i, int j) {return i * j;};
int val = accumulate(vi.begin(), vi.end(), 1, factorial);
cout<<"10!="<<val<<endl; //3628800
How does it work? On every iteration, the factorial closure multiplies two values: the value that it has accumulated thus far, and the current vector element's value. The result is the product of the vector's elements, which in this case equals 10!. Without the lambda expression, you'd have to write a separate function:
inline int factorial (int i, int j) { return i*j; }
There's nothing wrong with using a function. Inline functions such as the above were the hottest OOP commodity of the late 1980s. They were replaced in the mid 1990s by function objects when generic programming and C++98 hit the streets.
Nowadays, lambdas are about to (at least partially) replace function objects, due to the rising popularity of functional programming. Therefore, it's advisable to familiarize yourself with lambdas, because they're already widely used in C++11 projects. Don't get me wrong—lambdas aren't just a fashionable accessory. Unlike inline functions and function objects, lambdas can be defined as direct arguments:
int val = std::accumulate(vi.begin(), vi.end(), 1, [](int i, int j) {return i * j;});
This style has two advantages:
- It eliminates the tedium of defining a function or function object.
- It's safer to use. Once you've defined the lambda locally, other parts of the program cannot call it, take its address, or hack it in any other way. In other words, lambdas are a means of localizing functionality as well as hiding functionality.
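As a compilable sketch of the accumulate() technique discussed above, the function below passes the lambda directly as the fourth argument. For comparison, the second function uses the standard std::multiplies function object from <functional>. The names product, product_with_multiplies, and check_products are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Product of the elements, with the lambda defined as a direct argument.
inline int product(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 1,
                           [](int i, int j) { return i * j; });
}

// The same computation with the predefined function object std::multiplies.
inline int product_with_multiplies(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

inline void check_products() {
    const std::vector<int> vi{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(product(vi) == 3628800);                  // 10!
    assert(product_with_multiplies(vi) == 3628800);
    assert(product(std::vector<int>()) == 1);        // empty range yields the initial value
}
```

Keep in mind that the result type follows the initial value, here an int, so a 32-bit int overflows from 13! onward.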
Danny Kalev is a certified system analyst and software engineer specializing in C, C++, Objective-C and other programming languages. He was a member of the C++ standards committee until 2000 and has since been involved informally in the C++ standardization process. Danny is the author of ANSI/ISO C++ Professional Programmer's Handbook (1999) and The Informit C++ Reference Guide: Techniques, Insight, and Practical Advice on C++ (2005). He was also the Informit C++ Reference Guide. Danny earned an M.A. in linguistics, graduating summa cum laude, and is now pursuing his Ph.D. in applied linguistics.