Move Semantics in C++11, Part 2: Design and Implementation of Special Move Functions
Part 1 of this series presented the semantics of C++11 move operations and their effects on the source and target. This article concludes this series by focusing on the formal properties of moving, providing guidelines for designing move functions, and illustrating the standard C++ move features.
The State of Moved-from Variables
According to the C++11 standard, the state of a moved-from object is unspecified, albeit valid. This loose definition gives implementers enough leeway to do all sorts of things when moving. For example, a moved-from object might be left in an "empty" state, similar to a default constructed object. Another option is to leave the moved-from object in its pre-move state. The policy you should employ depends on the properties of the type and its usage. Let's look at moved-from objects that retain their original value.
When moving from fundamental types such as int and float, the source variable may retain its original value:
int m=7, n=0;
n=m; // "move" m to n; m still holds 7
Seemingly, the programmer should set m to zero because a moved-from object is usually left in a default-initialized state. However, we can think of good reasons why m should be left intact:
- In many cases, a moved-from object is discarded immediately after the move operation. If it has no destructor, it's more efficient to leave the moved-from object intact.
- The C++11 requirement that a moved-from object has an unspecified state is satisfied even if m remains intact.
You therefore need to zero m only if you have a reason to do so. What might be a good reason for zeroing a variable after a move operation? Suppose you have two pointers p1 and p2 and you want to move p1 to p2. Leaving p1 intact could be disastrous because destructors and cleanup code assume that a non-null pointer value indicates resource ownership. The result is the infamous aliasing problem, where two pointers hold the same address and both appear to own it, so the resource may be released twice. To avoid aliasing, always set a moved-from pointer to nullptr:
int *p1 = new int(5);
int *p2 = nullptr;
// move p1 to p2
p2 = p1;
p1 = nullptr;
delete p2; // ok
delete p1; // also ok; deleting a null pointer is harmless
Moving Class Objects
If a variable represents resource ownership, you should set it to its default value explicitly, even if it's not a pointer. Consider:
class String
{
  size_t len;
  char * buf;
  bool empty;
  //..
};
Question: Which of the three data members in this example (len, buf, and empty) should be set to zero in a moved-from String? Answer: All three of them. Obviously, buf should be set to nullptr to avoid aliasing. Certain applications may assume that if len isn't zero or if empty isn't true then the String object still owns resources. Therefore, every value that indicates resource ownership should be set to zero—not just pointers. If you're not sure about the value of a data member, use the default constructor as your guide: The state of a moved-from object should be that of a default-initialized object.
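The rule above can be sketched as a small test: move from an object, then check that the source is indistinguishable from a default-constructed one. MiniString is a hypothetical stand-in for the article's String; its constructors, accessors, and the helpers looks_default() and moved_from_matches_default() are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the article's String. The member names follow
// the text; everything else is an assumption made for this sketch.
class MiniString {
public:
    MiniString() : len(0), buf(nullptr), empty(true) {}

    explicit MiniString(std::size_t n)
        : len(n), buf(n ? new char[n]() : nullptr), empty(n == 0) {}

    // Move constructor: take ownership, then reset every member of the
    // source that could signal ownership, not just the pointer.
    MiniString(MiniString&& other)
        : len(other.len), buf(other.buf), empty(other.empty) {
        other.len = 0;
        other.buf = nullptr;
        other.empty = true;
    }

    // Copying is disabled to keep the sketch short.
    MiniString(const MiniString&) = delete;
    MiniString& operator=(const MiniString&) = delete;

    ~MiniString() { delete[] buf; }

    std::size_t size() const { return len; }
    const char* data() const { return buf; }
    bool is_empty() const { return empty; }

private:
    std::size_t len;
    char* buf;
    bool empty;
};

// Returns true if s is indistinguishable from a default-constructed object.
bool looks_default(const MiniString& s) {
    return s.size() == 0 && s.data() == nullptr && s.is_empty();
}

// Moves from a freshly built object and reports whether the source
// ended up in the default state while the target took over the buffer.
bool moved_from_matches_default() {
    MiniString source(8);
    MiniString target(std::move(source));
    assert(target.data() != nullptr); // the target now owns the buffer
    return looks_default(source) && target.size() == 8;
}
```

Comparing against the default constructor's state, as looks_default() does, is one way to make sure no ownership indicator is left behind in the source.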
Special Member Functions
The move constructor and the move assignment operator are collectively known as the move special member functions. They present a dilemma: Which of the two, if any, should your class define?
Under certain conditions, C++ implicitly declares the special member functions in a class that doesn't declare them explicitly. In C++11, this rule also applies to the two move special member functions. Thus, if the definition of class C doesn't explicitly declare a move assignment operator, one will be implicitly declared as defaulted only if all of the following conditions are met:
- C doesn't have a user-declared copy constructor.
- C doesn't have a user-declared move constructor.
- C doesn't have a user-declared copy assignment operator.
- C doesn't have a user-declared destructor.
- The move assignment operator would not be implicitly defined as deleted.
Consider the following class:
struct S
{
  int n;
  S& operator=(const S&) = default;
};
C++ will not implicitly declare a move assignment operator or a move constructor for S in this case because S has a user-declared copy assignment operator (defaulting it on its first declaration still counts as declaring it). By contrast, class T in the following example will have both an implicitly declared move assignment operator and an implicitly declared move constructor:
class T
{
  int n;
public:
  T();
};
What does an implicitly defined move special member function do? This issue was under debate for a long time. However, the final resolution says that an implicitly defined move assignment operator performs a member-wise move, typically by calling std::move() (a new standard function that I'll discuss shortly).
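The difference between an implicitly declared move constructor and a suppressed one can be observed with a member that records how it was constructed. The following is a minimal sketch, not part of the original article: Tracker, Implicit, Suppressed, Counts, and move_construct_and_count() are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: carries a running tally of how many times the
// value it holds was copy-constructed or move-constructed.
struct Tracker {
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker& o) : copies(o.copies + 1), moves(o.moves) {}
    Tracker(Tracker&& o) : copies(o.copies), moves(o.moves + 1) {}
    Tracker& operator=(const Tracker&) = default;
    Tracker& operator=(Tracker&&) = default;
};

// No user-declared copy operations, move operations, or destructor:
// the compiler implicitly declares a member-wise move constructor.
struct Implicit {
    Tracker t;
};

// A user-declared copy assignment operator (even a defaulted one)
// suppresses the implicit move constructor and move assignment operator,
// so an rvalue argument falls back to the copy constructor.
struct Suppressed {
    Tracker t;
    Suppressed& operator=(const Suppressed&) = default;
};

struct Counts {
    int copies;
    int moves;
};

// Constructs a T from an rvalue T and reports how the member was built.
template <class T>
Counts move_construct_and_count() {
    T source;
    T target(std::move(source));
    assert(target.t.copies + target.t.moves == 1); // exactly one construction
    Counts c = {target.t.copies, target.t.moves};
    return c;
}
```

Under these assumptions, move_construct_and_count<Implicit>() reports one move and no copies, while move_construct_and_count<Suppressed>() reports one copy and no moves: std::move() alone doesn't move anything if the class has no move constructor to select.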
C++11 Move Facilities
C++11 introduced several new facilities for moving. One of them is a static_cast to an rvalue reference type:
static_cast<T&&>(t);
The static_cast expression converts the lvalue t to an rvalue reference of type T&&. This way, you can control which overloaded version of a function is selected: f(T&&) or f(T&).
A class can define both move special member functions and copy special member functions. The designer of class String therefore may choose to define all six special member functions without risking ambiguity. The move special member functions will be invoked when the argument is an rvalue of type String (or is convertible to String&&). Otherwise, the copy special member functions will be invoked.
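As a small illustration of this overload selection, here is a sketch with a hypothetical pair of overloads. The function f(), its return values, and which_overloads() are assumptions made for this example; std::string merely plays the role of T.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload pair used only for this illustration.
// Each one reports which overload was chosen; neither modifies its argument.
const char* f(std::string&)  { return "f(T&)"; }
const char* f(std::string&&) { return "f(T&&)"; }

// Calls f() twice with the same object: once as an lvalue and once
// after casting it to an rvalue reference.
std::string which_overloads() {
    std::string s = "abc";
    std::string result;
    result += f(s);                              // lvalue: selects f(T&)
    result += ' ';
    result += f(static_cast<std::string&&>(s));  // cast: selects f(T&&)
    assert(s == "abc");  // the cast by itself moves nothing
    return result;
}
```

Note that the cast only changes which overload is selected. The object is actually moved from only if the selected function pilfers its resources.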
Let's look at a concrete example of a legacy class that defines all four of the C++03 special member functions and that needs a C++11 facelift:
#include <algorithm> // std::copy
#include <cstddef>   // size_t
#include <iostream>

class String
{
public:
  // copy constructor
  String(const String& other)
    : len(other.len), buf(new char[other.len])
  {
    std::cout << "In String's copy constructor. length = " << len << "." << std::endl;
    std::copy(other.buf, other.buf+len, buf);
  }
  // copy assignment operator
  String& operator=(const String& other)
  {
    std::cout << "In String's copy assignment operator. length = " << other.len << "." << std::endl;
    if (this != &other)
    {
      delete[] buf; // free the current resource
      len = other.len;
      buf = new char[len];
      std::copy(other.buf, other.buf + len, buf);
    }
    return *this;
  }
  explicit String(size_t len=0): len(len)
  {
    //allocate space, without filling it with data
    std::cout << "In String's constructor. length = " << len << "." << std::endl;
    if (len>0)
      buf= new char [len];
    else
      buf=nullptr;
  }
  ~String()
  {
    std::cout << "In String's destructor. length was " << len << "." << std::endl;
    delete[] buf; //even if buf is nullptr, this delete[] is safe
    len=0;
  }
private:
  size_t len;
  char* buf;
};
When implementing move special member functions, adhere to the following guidelines, in this order:
- The target object must release its resources (if any) before acquiring new resources.
- Acquire the resources.
- Set the source's data members to their default values.
Here's what String's move constructor looks like (the cout statements will enable you to see which special member function is called):
//assumes that String(String&&) is also declared inside class String
String::String(String&& other) //the canonical signature of a move constructor
  //first, set the target members to their default values,
  //listed in declaration order (len is declared before buf)
  : len(0)
  , buf(nullptr)
{
  std::cout << "In String's move constructor. length = " << other.len << "." << std::endl;
  //next, pilfer the source's resources
  buf=other.buf;
  len=other.len;
  //finally, set the source's data members to default values to prevent aliasing
  other.buf = nullptr;
  other.len=0;
}
In a similar vein, the move assignment operator looks like this:
String& String::operator=(String && other) //the canonical signature of a move assignment operator
{
  std::cout << "In String's move assignment operator. length = " << other.len << "." << std::endl;
  if (this != &other)
  {
    // first, release the existing resource
    delete[] buf;
    //next, pilfer the source's resources
    buf=other.buf;
    len=other.len;
    //finally, set the source's data members to default values to prevent aliasing
    other.buf=nullptr;
    other.len=0;
  }
  return *this;
}
Overloading with rvalue References
The C++11 overloading rules recognize rvalue references. In addition, many Standard Library components now define overloaded sets of functions that take either const T& or T&& as their parameters. If the argument is an rvalue, the T&& overloaded function will be selected. Otherwise, the const T& overloaded function will be selected:
#include <vector>
using namespace std;
int main()
{
  vector<String> vs;
  vs.push_back(String(25)); //the string's size lets you track specific objects
  vs.push_back(String(50));
  // insert a new element into the second position
  vs.insert(vs.begin() + 1, String(60));
}
When executing the program above, a C++11-compliant implementation will construct the inserted elements with String's move constructor rather than its copy constructor, because in all three function calls the argument is an rvalue. More specifically, it's a prvalue; that is, a temporary. (Its close relative, the xvalue, denotes an object that is about to expire, such as the result of std::move(); the x hints at "expiring.") Both prvalues and xvalues bind to rvalue references. Note that when the vector has to reallocate its storage, it may still copy the existing elements unless String's move constructor is declared noexcept. However, if you use an lvalue argument, push_back(const T&) will be selected instead:
vector<String> vs;
String s(15);
vs.push_back(s); //uses String's copy constructor
Even with lvalue arguments, you can force the implementation to select the move special member functions. For this purpose, either cast the lvalue to T&& using static_cast, or use the standard function std::move() to do the same thing with fewer keystrokes:
vector<String> vs;
String s1(15), s2(20);
vs.push_back(static_cast<String&&>(s1)); //uses String's move constructor
vs.push_back(std::move(s2)); //uses String's move constructor
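To check which overload of push_back() each kind of argument selects without reading console output, you can count constructor calls. The following is a minimal sketch under stated assumptions: Counted, PushCounts, and count_push_backs() are hypothetical names, and the call to reserve() is there only so that reallocation doesn't add extra moves to the tally.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that counts how its instances were constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    // noexcept also lets std::vector move elements when it reallocates.
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

struct PushCounts {
    int copies;
    int moves;
};

// Inserts one lvalue, one std::move()'d lvalue, and one temporary,
// then reports how many copy and move constructions took place.
PushCounts count_push_backs() {
    Counted::copies = 0;
    Counted::moves = 0;

    std::vector<Counted> v;
    v.reserve(3);                // no reallocation: only insertions are counted

    Counted a, b;
    v.push_back(a);              // lvalue: push_back(const T&), one copy
    v.push_back(std::move(b));   // cast to rvalue: push_back(T&&), one move
    v.push_back(Counted());      // temporary: push_back(T&&), one move
    assert(v.size() == 3);

    PushCounts c = {Counted::copies, Counted::moves};
    return c;
}
```

With the storage reserved up front, count_push_backs() reports one copy and two moves.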
Conclusion
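For classes that own resources, such as dynamically allocated buffers, the move special member functions are typically much faster than their copy counterparts because they transfer ownership instead of duplicating the resource. Therefore, object copying can be replaced by moving in the following cases: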
- You're populating a container with temporaries.
- A function is returning an object by value.
- You're passing an rvalue argument by value.
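The last two cases can be sketched with the same counting technique. Probe, sink(), make(), and the three helper functions below are hypothetical names introduced for this illustration. The return-by-value case only checks that no copy takes place, because the compiler is allowed to elide the move altogether.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct Probe {
    static int copies;
    static int moves;
    Probe() {}
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

void reset_counts() { Probe::copies = 0; Probe::moves = 0; }

void sink(Probe) {}                  // takes its argument by value
Probe make() { Probe p; return p; }  // returns a local object by value

// Passing an lvalue by value copies it.
int copies_for_lvalue_argument() {
    reset_counts();
    Probe x;
    sink(x);
    return Probe::copies;
}

// Passing the result of std::move() by value moves it.
int moves_for_moved_argument() {
    reset_counts();
    Probe x;
    sink(std::move(x));
    return Probe::moves;
}

// Returning a local by value is either elided or moved, never copied.
int copies_for_return_by_value() {
    reset_counts();
    Probe r = make();
    (void)r;
    assert(Probe::moves <= 1);  // zero if the move was elided
    return Probe::copies;
}
```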
The performance benefits of move semantics are one of the prominent bonuses of using C++11. If you're designing new classes, consider adding the move special member functions to them. Some of your legacy classes probably could also benefit from a C++11 facelift that includes those special member functions.