3.7 Containers
Much computing involves creating collections of various forms of objects and then manipulating such collections. Reading characters into a string and printing out the string is a simple example. A class with the main purpose of holding objects is commonly called a container. Providing suitable containers for a given task and supporting them with useful fundamental operations are important steps in the construction of any program.
To illustrate the standard library's most useful containers, consider a simple program for keeping names and telephone numbers. This is the kind of program for which different approaches appear ''simple and obvious'' to people of different backgrounds.
3.7.1 Vector
For many C programmers, a built-in array of (name,number) pairs would seem to be a suitable starting point:
struct Entry { string name; int number; }; Entry phone_book[1000] ; void print_entry(int i) // simple use { cout << phone_book[i].name << ' ' << phone_book[i].number << '\n'; }
However, a built-in array has a fixed size. If we choose a large size, we waste space; if we choose a smaller size, the array will overflow. In either case, we will have to write low-level memory-management code. The standard library provides a vector that takes care of that:
vector<Entry> phone_book(1000) ; void print_entry(int i) // simple use, exactly as for array { cout << phone_book[i].name << ' ' << phone_book[i].number << '\n'; } void add_entries(int n) // increase size by n { phone_book.resize(phone_book.size()+n) ; }
The vector member function size() gives the number of elements.
Note the use of parentheses in the definition of phone_book. We made a single object of type vector<Entry> and supplied its initial size as an initializer. This is very different from declaring a built-in array:
vector<Entry> book(1000) ; // vector of 1000 elements vector<Entry> books[1000] ; // 1000 empty vectors
If you make the mistake of using [] where you meant () when declaring a vector, your compiler will almost certainly catch the mistake and issue an error message when you try to use the vector.
A vector is a single object that can be assigned. For example:
void f(vector<Entry>& v) { vector<Entry> v2 = phone_book; v = v2; // ... }
Assigning a vector involves copying its elements. Thus, after the initialization and assignment in f(), v and v2 each holds a separate copy of every Entry in the phone book. When a vector holds many elements, such innocent-looking assignments and initializations can be prohibitively expensive. Where copying is undesirable, references or pointers should be used.
3.7.2 Range Checking
The standard library vector does not provide range checking by default. For example:
void f() { int i = phone_book[1001].number; // 1001 is out of range // ... }
The initialization is likely to place some random value in i rather than giving an error. This is undesirable, so I will use a simple range-checking adaptation of vector, called Vec. A Vec is like a vector, except that it throws an exception of type out_of_range if a subscript is out of range.
Techniques for implementing types such as Vec and for using exceptions effectively are discussed elsewhere in this book. However, the definition here is sufficient for the examples in this book:
template<class T> class Vec : public vector<T> { public: Vec() : vector<T>() { } Vec(int s) : vector<T>(s) { } T& operator[](int i) { return at(i) ; } // range-checked const T& operator[](int i) const { return at(i) ; } // range-checked };
The at() operation is a vector subscript operation that throws an exception of type out_of_range if its argument is out of the vector's range. If necessary, it is possible to prevent access to the vector<T> base.
Returning to the problem of keeping names and telephone numbers, we can now use a Vec to ensure that out-of-range accesses are caught. For example:
Vec<Entry> phone_book(1000) ; void print_entry(int i) // simple use, exactly as for vector { cout << phone_book[i].name << ' ' << phone_book[i].number << '\n'; }
An out-of-range access will throw an exception that the user can catch. For example:
void f() { try { for (int i = 0; i<10000; i++) print_entry(i) ; } catch (out_of_range) { cout << "range error\n"; } }
The exception will be thrown, and then caught, when phone_book[i] is tried with i==1000. If the user doesn't catch this kind of exception, the program will terminate in a well-defined manner rather than proceeding or failing in an undefined manner. One way to minimize surprises from exceptions is to use a main() with a try-block as its body:
int main() try { // your code } catch (out_of_range) { cerr << "range error\n"; } catch (...) { cerr << "unknown exception thrown\n"; }
This provides default exception handlers so that if we fail to catch some exception, an error message is printed on the standard error-diagnostic output stream cerr.
3.7.3 List
Insertion and deletion of phone book entries could be common. Therefore, a list could be more appropriate than a vector for representing a simple phone book. For example:
list<Entry> phone_book;
When we use a list, we tend not to access elements using subscripting the way we commonly do for vectors. Instead, we might search the list looking for an element with a given value. To do this, we take advantage of the fact that a list is a sequence as described in §3.8:
void print_entry(const string& s) { typedef list<Entry>: :const_iterator LI; for (LI i = phone_book.begin() ; i != phone_book.end() ; ++i) { const Entry& e = *i; // reference used as shorthand if (s == e.name) { cout << e.name << ' ' << e.number << '\n'; return; } } }
The search for s starts at the beginning of the list and proceeds until either s is found or the end is reached. Every standard library container provides the functions begin() and end(), which return an iterator to the first and to one-past-the-last element, respectively. Given an iterator i, ++i advances i to refer to the next element. Given an iterator i, the element it refers to is *i.
A user need not know the exact type of the iterator for a standard container. That iterator type is part of the definition of the container and can be referred to by name. When we don't need to modify an element of the container, const_iterator is the type we want. Otherwise, we use the plain iterator type.
Adding elements to a list and removing elements from a list is easy:
void f(const Entry& e, list<Entry>: :iterator i, list<Entry>: :iterator p) { phone_book.push_front(e) ; // add at beginning phone_book.push_back(e) ; // add at end phone_book.insert(i,e) ; // add before the element referred to by 'i' phone_book.erase(p) ; // remove the element referred to by 'p' }
3.7.4 Map
Writing code to look up a name in a list of (name,number) pairs is really quite tedious. In addition, a linear search is quite inefficient for all but the shortest lists. Other data structures directly support insertion, deletion, and searching based on values. In particular, the standard library provides the map type. A map is a container of pairs of values. For example:
map<string,int> phone_book;
In other contexts, a map is known as an associative array or a dictionary.
When indexed by a value of its first type (called the key) a map returns the corresponding value of the second type (called the value or the mapped type). For example:
void print_entry(const string& s) { if (int i = phone_book[s]) cout << s << ' ' << i << '\n'; }
If no match was found for the key s, a default value is returned from the phone_book. The default value for an integer type in a map is 0. Here, I assume that 0 isn't a valid telephone number.
3.7.5 Standard Containers
A map, a list, and a vector can each be used to represent a phone book. However, each has strengths and weaknesses. For example, subscripting a vector is cheap and easy. On the other hand, inserting an element between two elements tends to be expensive. A list has exactly the opposite properties. A map resembles a list of (key,value) pairs except that it is optimized for finding values based on keys.
The standard library provides some of the most general and useful container types to allow the programmer to select a container that best serves the needs of an application:
Standard Container Summary
vector_T_ |
A variable-sized vector |
list_T_ |
A doubly-linked list |
queue_T_ |
A queue |
stack_T_ |
A stack |
deque_T_ |
A double-ended queue |
priority_queue_T_ |
A queue sorted by value |
set_T_ |
A set |
multiset_T_ |
A set in which a value can occur many times |
map_key,val_ |
An associative array |
multimap_key,val_ |
A map in which a key can occur many times |
The standard containers are defined in namespace std and presented in headers <vector>, <list>, <map>, etc.
The standard containers and their basic operations are designed to be similar from a notational point of view. Furthermore, the meanings of the operations are equivalent for the various containers. In general, basic operations apply to every kind of container. For example, push_back() can be used (reasonably efficiently) to add elements to the end of a vector as well as for a list, and every container has a size() member function that returns its number of elements.
This notational and semantic uniformity enables programmers to provide new container types that can be used in a very similar manner to the standard ones. The range-checked vector, Vec (§3.7.2), is an example of that. The uniformity of container interfaces also allows us to specify algorithms independently of individual container types.