The Collections Framework
A framework is a set of classes that form the basis for building advanced functionality. A framework contains superclasses with useful functionality, policies, and mechanisms. The user of a framework forms subclasses to extend the functionality without having to reinvent the basic mechanisms. For example, Swing is a framework for user interfaces.
The Java collections library forms a framework for collection classes. It defines a number of interfaces and abstract classes for implementors of collections (see Figure 2-10), and it prescribes certain mechanisms, such as the iteration protocol. You can use the collection classes without having to know much about the framework—we did just that in the preceding sections. However, if you want to implement generic algorithms that work for multiple collection types or if you want to add a new collection type, it is helpful to understand the framework.
Figure 2-10 The interfaces of the collections framework
There are two fundamental interfaces for collections: Collection and Map. You insert elements into a collection with the add method:
boolean add(E element)
However, maps hold key/value pairs, and you use the put method to insert them.
V put(K key, V value)
To read elements from a collection, you visit them with an iterator. However, you can read values from a map with the get method:
V get(K key)
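The difference between the two access styles can be sketched as follows. This is only an illustration; the class name CollectionVersusMapDemo and its helper methods join and lookUp are made up for this example and are not part of the library.

```java
import java.util.*;

class CollectionVersusMapDemo
{
   // Visits each element with an iterator and joins the elements with commas.
   static String join(Collection<String> c)
   {
      StringBuilder result = new StringBuilder();
      Iterator<String> iter = c.iterator();
      while (iter.hasNext())
      {
         if (result.length() > 0) result.append(",");
         result.append(iter.next());
      }
      return result.toString();
   }

   // Looks up a value in a map directly by key; no iterator is needed.
   static Integer lookUp(Map<String, Integer> map, String key)
   {
      return map.get(key);
   }

   public static void main(String[] args)
   {
      Collection<String> names = new ArrayList<String>();
      names.add("Amy");
      names.add("Bob");
      System.out.println(join(names)); // Amy,Bob

      Map<String, Integer> ages = new HashMap<String, Integer>();
      ages.put("Amy", 31);
      System.out.println(lookUp(ages, "Amy")); // 31
      System.out.println(lookUp(ages, "Zoe")); // null, since there is no such key
   }
}
```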
A List is an ordered collection. Elements are added into a particular position in the container. An object can be placed into its position in two ways: by an integer index and by a list iterator. The List interface defines methods for random access:
void add(int index, E element)
E get(int index)
E remove(int index)
As already discussed, the List interface provides these random access methods whether or not they are efficient for a particular implementation. To make it possible to avoid carrying out costly random access operations, JDK 1.4 introduces a tagging interface, RandomAccess. That interface has no methods, but you can use it to test whether a particular collection supports efficient random access:
if (c instanceof RandomAccess)
{
   // use random access algorithm
}
else
{
   // use sequential access algorithm
}
The ArrayList and Vector classes implement the RandomAccess interface.
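As a concrete version of that test, here is a minimal sketch of a method that picks its traversal strategy from the RandomAccess tag. The class RandomAccessDemo and its sum method are hypothetical names chosen for this example.

```java
import java.util.*;

class RandomAccessDemo
{
   // Sums a list, choosing the traversal strategy by the RandomAccess tag.
   static int sum(List<Integer> list)
   {
      int total = 0;
      if (list instanceof RandomAccess)
      {
         // Indexed access is cheap, for example for an ArrayList.
         for (int i = 0; i < list.size(); i++) total += list.get(i);
      }
      else
      {
         // Sequential access through an iterator, for example for a LinkedList.
         for (int value : list) total += value;
      }
      return total;
   }

   public static void main(String[] args)
   {
      List<Integer> values = Arrays.asList(1, 2, 3, 4);
      System.out.println(sum(new ArrayList<Integer>(values)));  // 10
      System.out.println(sum(new LinkedList<Integer>(values))); // 10
   }
}
```

Both calls produce the same result; only the cost of the traversal differs.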
From a theoretical point of view, it would have made sense to have a separate Array interface that extends the List interface and declares the random access methods. If there were a separate Array interface, then those algorithms that require random access would use Array parameters and you could not accidentally apply them to collections with slow random access. However, the designers of the collections framework chose not to define a separate interface, because they wanted to keep the number of interfaces in the library small. Also, they did not want to take a paternalistic attitude toward programmers. You are free to pass a linked list to algorithms that use random access—you just need to be aware of the performance costs.
The ListIterator interface defines a method for adding an element before the iterator position:
void add(E element)
To get and remove elements at a particular position, you simply use the next and remove methods of the Iterator interface.
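The following sketch shows both operations on a linked list. The class ListIteratorDemo and its two helper methods are assumptions made for illustration, and both helpers assume a non-empty list.

```java
import java.util.*;

class ListIteratorDemo
{
   // Returns a copy of source with element inserted after the first element.
   // Assumes that source is not empty.
   static List<String> insertSecond(List<String> source, String element)
   {
      List<String> list = new LinkedList<String>(source);
      ListIterator<String> iter = list.listIterator();
      iter.next();       // move past the first element
      iter.add(element); // insert before the iterator position
      return list;
   }

   // Returns a copy of source without its first element.
   // Assumes that source is not empty.
   static List<String> removeFirst(List<String> source)
   {
      List<String> list = new LinkedList<String>(source);
      Iterator<String> iter = list.iterator();
      iter.next();   // get the first element
      iter.remove(); // remove the element that next just returned
      return list;
   }

   public static void main(String[] args)
   {
      System.out.println(insertSecond(Arrays.asList("Amy", "Carl"), "Bob")); // [Amy, Bob, Carl]
      System.out.println(removeFirst(Arrays.asList("Amy", "Bob")));          // [Bob]
   }
}
```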
The Set interface is identical to the Collection interface, but the behavior of the methods is more tightly defined. The add method of a set should reject duplicates. The equals method of a set should be defined so that two sets are identical if they have the same elements, but not necessarily in the same order. The hashCode method should be defined such that two sets with the same elements yield the same hash code.
Why make a separate interface if the method signatures are the same? Conceptually, not all collections are sets. Making a Set interface enables programmers to write methods that accept only sets.
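A short sketch of the tightened contract, using a hash set and a tree set that hold the same elements in different orders. The class SetContractDemo and its sameElements method are hypothetical names.

```java
import java.util.*;

class SetContractDemo
{
   // Two sets with the same elements must be equal and have the same hash code.
   static boolean sameElements(Set<String> a, Set<String> b)
   {
      return a.equals(b) && a.hashCode() == b.hashCode();
   }

   public static void main(String[] args)
   {
      Set<String> hashed = new HashSet<String>(Arrays.asList("Amy", "Bob"));
      Set<String> sorted = new TreeSet<String>(Arrays.asList("Bob", "Amy"));
      System.out.println(sameElements(hashed, sorted)); // true
      System.out.println(hashed.add("Amy"));            // false, the duplicate is rejected
      System.out.println(hashed.size());                // 2
   }
}
```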
Finally, the SortedSet and SortedMap interfaces expose the comparison object used for sorting, and they define methods to obtain views of subsets of the collections. We discuss these views in the next section.
Now, let us turn from the interfaces to the classes that implement them. We already discussed that the collection interfaces have quite a few methods that can be trivially implemented from more fundamental methods. Abstract classes supply many of these routine implementations:
AbstractCollection
AbstractList
AbstractSequentialList
AbstractSet
AbstractQueue
AbstractMap
If you implement your own collection class, then you probably want to extend one of these classes so that you can pick up the implementations of the routine operations.
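To give an idea of how little code is needed, here is a sketch of a read-only list that extends AbstractList. The class IntRange is a made-up example, not a library class. It supplies only get and size; methods such as contains, indexOf, iterator, equals, and toString are inherited.

```java
import java.util.*;

// A hypothetical read-only list of the integers from "from" (inclusive)
// to "to" (exclusive).
class IntRange extends AbstractList<Integer>
{
   private final int from;
   private final int to;

   IntRange(int from, int to)
   {
      if (from > to) throw new IllegalArgumentException("from > to");
      this.from = from;
      this.to = to;
   }

   public Integer get(int index)
   {
      if (index < 0 || index >= size())
         throw new IndexOutOfBoundsException("Index: " + index);
      return from + index;
   }

   public int size()
   {
      return to - from;
   }

   public static void main(String[] args)
   {
      List<Integer> range = new IntRange(3, 7);
      System.out.println(range);             // [3, 4, 5, 6]
      System.out.println(range.contains(5)); // true
      System.out.println(range.indexOf(6));  // 3
   }
}
```

Because the class does not override add, set, or remove, the inherited versions throw an UnsupportedOperationException.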
The Java library supplies concrete classes:
LinkedList
ArrayList
HashSet
TreeSet
PriorityQueue
HashMap
TreeMap
Figure 2-11 shows the relationships between these classes.
Figure 2-11 Classes in the collections framework
Finally, a number of “legacy” container classes have been present since JDK 1.0, before there was a collections framework:
Vector
Stack
Hashtable
Properties
They have been integrated into the collections framework—see Figure 2-12. We discuss these classes later in this chapter.
Figure 2-12 Legacy classes in the collections framework
Views and Wrappers
If you look at Figure 2-10 and Figure 2-11, you might think it is overkill to have lots of interfaces and abstract classes to implement a modest number of concrete collection classes. However, these figures don't tell the whole story. By using views, you can obtain other objects that implement the Collection or Map interfaces. You saw one example of this with the keySet method of the map classes. At first glance, it appears as if the method creates a new set, fills it with all keys of the map, and returns it. However, that is not the case. Instead, the keySet method returns an object of a class that implements the Set interface and whose methods manipulate the original map. Such a collection is called a view.
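The view behavior of keySet can be observed directly. In this sketch, the class KeySetViewDemo and its methods are hypothetical, and the map contents are sample data.

```java
import java.util.*;

class KeySetViewDemo
{
   static Map<String, Integer> sampleMap()
   {
      Map<String, Integer> ages = new HashMap<String, Integer>();
      ages.put("Amy", 31);
      ages.put("Bob", 45);
      return ages;
   }

   // Removing a key through the view removes the entry from the map.
   static boolean removalWritesThrough()
   {
      Map<String, Integer> ages = sampleMap();
      Set<String> keys = ages.keySet();
      keys.remove("Amy");
      return !ages.containsKey("Amy");
   }

   // A key set obtained earlier sees entries that are added to the map later.
   static boolean seesLaterInsertions()
   {
      Map<String, Integer> ages = sampleMap();
      Set<String> keys = ages.keySet();
      ages.put("Carl", 27);
      return keys.contains("Carl");
   }

   public static void main(String[] args)
   {
      System.out.println(removalWritesThrough()); // true
      System.out.println(seesLaterInsertions());  // true
   }
}
```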
The technique of views has a number of useful applications in the collections framework. We discuss these applications in the following sections.
Lightweight Collection Wrappers
The static asList method of the Arrays class returns a List wrapper around a plain Java array. This method lets you pass the array to a method that expects a list or collection argument. For example,
Card[] cardDeck = new Card[52];
. . .
List<Card> cardList = Arrays.asList(cardDeck);
The returned object is not an ArrayList. It is a view object with get and set methods that access the underlying array. All methods that would change the size of the array (such as add and the remove method of the associated iterator) throw an UnsupportedOperationException.
As of JDK 5.0, the asList method is declared to have a variable number of arguments. Instead of passing an array, you can also pass individual elements. For example,
List<String> names = Arrays.asList("Amy", "Bob", "Carl");
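The following sketch shows that the wrapper writes through to the array and refuses to change its size. The class AsListDemo and the helper addIsRejected are names invented for this example.

```java
import java.util.*;

class AsListDemo
{
   // Returns true if the list refuses to grow.
   static boolean addIsRejected(List<String> list)
   {
      try
      {
         list.add("Dee");
         return false;
      }
      catch (UnsupportedOperationException e)
      {
         return true;
      }
   }

   public static void main(String[] args)
   {
      String[] names = { "Amy", "Bob", "Carl" };
      List<String> list = Arrays.asList(names);

      list.set(0, "Ann");                      // writes through to the array
      System.out.println(names[0]);            // Ann

      names[1] = "Ben";                        // array changes are visible in the list
      System.out.println(list.get(1));         // Ben

      System.out.println(addIsRejected(list)); // true
   }
}
```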
The method call
Collections.nCopies(n, anObject)
returns an immutable object that implements the List interface and gives the illusion of having n elements, each of which appears as anObject.
For example, the following call creates a List containing 100 strings, all set to "DEFAULT":
List<String> settings = Collections.nCopies(100, "DEFAULT");
There is very little storage cost—the object is stored only once. This is a cute application of the view technique.
The Collections class contains a number of utility methods with parameters or return values that are collections. Do not confuse it with the Collection interface.
The method call
Collections.singleton(anObject)
returns a view object that implements the Set interface (unlike nCopies, which produces a List). The returned object implements an immutable single-element set without the overhead of a full data structure. The methods singletonList and singletonMap behave similarly.
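A brief sketch of both wrappers follows. The class LightweightWrapperDemo and its helper rejectsAdd are hypothetical. If you need a modifiable list, one option is to copy the immutable list into an ArrayList, as shown in main.

```java
import java.util.*;

class LightweightWrapperDemo
{
   // Returns true if the collection refuses to accept a new element.
   static boolean rejectsAdd(Collection<String> c)
   {
      try
      {
         c.add("extra");
         return false;
      }
      catch (UnsupportedOperationException e)
      {
         return true;
      }
   }

   public static void main(String[] args)
   {
      List<String> settings = Collections.nCopies(100, "DEFAULT");
      System.out.println(settings.size());                      // 100
      System.out.println(settings.get(0) == settings.get(99));  // true, one shared object
      System.out.println(rejectsAdd(settings));                 // true

      // Copy the elements if you need a list that can be modified.
      List<String> editable = new ArrayList<String>(settings);
      editable.set(0, "CUSTOM");

      Set<String> only = Collections.singleton("Amy");
      System.out.println(only.size());                          // 1
      System.out.println(rejectsAdd(only));                     // true
   }
}
```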
Subranges
You can form subrange views for a number of collections. For example, suppose you have a list staff and want to extract elements 10 to 19. You use the subList method to obtain a view into the subrange of the list.
List group2 = staff.subList(10, 20);
The first index is inclusive, the second exclusive—just like the parameters for the substring operation of the String class.
You can apply any operations to the subrange, and they are automatically reflected in the entire list. For example, you can erase the entire subrange:
group2.clear(); // staff reduction
The elements are now automatically cleared from the staff list, and group2 is empty.
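Here is a runnable sketch of the same idea, with a list of integers standing in for the staff list. The class SubListDemo and its methods are assumptions made for this example.

```java
import java.util.*;

class SubListDemo
{
   // Builds a list with the elements 0, 1, ..., n - 1.
   static List<Integer> numbered(int n)
   {
      List<Integer> list = new ArrayList<Integer>();
      for (int i = 0; i < n; i++) list.add(i);
      return list;
   }

   // Clears the range [from, to) through a sublist view and returns the new size.
   static int sizeAfterClearingRange(List<Integer> staff, int from, int to)
   {
      List<Integer> group = staff.subList(from, to);
      group.clear(); // removes the elements from staff as well
      return staff.size();
   }

   public static void main(String[] args)
   {
      List<Integer> staff = numbered(25);
      System.out.println(sizeAfterClearingRange(staff, 10, 20)); // 15
      System.out.println(staff.get(10));                         // 20
   }
}
```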
For sorted sets and maps, you use the sort order, not the element position, to form subranges. The SortedSet interface declares three methods:
subSet(from, to)
headSet(to)
tailSet(from)
These return the subsets of all elements that are larger than or equal to from and strictly smaller than to. For sorted maps, the similar methods
subMap(from, to)
headMap(to)
tailMap(from)
return views into the maps consisting of all entries in which the keys fall into the specified ranges.
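The following sketch applies these methods to a tree set and a tree map of sample names. The class SortedRangeDemo and its data are made up for illustration.

```java
import java.util.*;

class SortedRangeDemo
{
   static SortedSet<String> sampleSet()
   {
      return new TreeSet<String>(Arrays.asList("Amy", "Bob", "Carl", "Dee", "Eve"));
   }

   public static void main(String[] args)
   {
      SortedSet<String> names = sampleSet();
      SortedSet<String> middle = names.subSet("Bob", "Dee");
      System.out.println(middle);                // [Bob, Carl]
      System.out.println(names.headSet("Carl")); // [Amy, Bob]
      System.out.println(names.tailSet("Carl")); // [Carl, Dee, Eve]

      // The subrange is a view: an element added to the set shows up in it.
      names.add("Cat");
      System.out.println(middle);                // [Bob, Carl, Cat]

      SortedMap<String, Integer> ages = new TreeMap<String, Integer>();
      ages.put("Amy", 31);
      ages.put("Bob", 45);
      ages.put("Carl", 27);
      System.out.println(ages.headMap("Carl"));  // {Amy=31, Bob=45}
   }
}
```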
Unmodifiable Views
The Collections class has methods that produce unmodifiable views of collections. These views add a runtime check to an existing collection. If an attempt to modify the collection is detected, then an exception is thrown and the collection remains untouched.
You obtain unmodifiable views by six methods:
Collections.unmodifiableCollection
Collections.unmodifiableList
Collections.unmodifiableSet
Collections.unmodifiableSortedSet
Collections.unmodifiableMap
Collections.unmodifiableSortedMap
Each method is defined to work on an interface. For example, Collections.unmodifiableList works with an ArrayList, a LinkedList, or any other class that implements the List interface.
For example, suppose you want to let some part of your code look at, but not touch, the contents of a collection. Here is what you could do:
List<String> staff = new LinkedList<String>();
. . .
lookAt(Collections.unmodifiableList(staff));
The Collections.unmodifiableList method returns an object of a class implementing the List interface. Its accessor methods retrieve values from the staff collection. Of course, the lookAt method can call all methods of the List interface, not just the accessors. But all mutator methods (such as add) have been redefined to throw an UnsupportedOperationException instead of forwarding the call to the underlying collection.
The unmodifiable view does not make the collection itself immutable. You can still modify the collection through its original reference (staff, in our case). And you can still call mutator methods on the elements of the collection.
Because the views wrap the interface and not the actual collection object, you only have access to those methods that are defined in the interface. For example, the LinkedList class has convenience methods, addFirst and addLast, that are not part of the List interface. These methods are not accessible through the unmodifiable view.
The unmodifiableCollection method (as well as the synchronizedCollection and checkedCollection methods discussed later in this section) returns a collection whose equals method does not invoke the equals method of the underlying collection. Instead, it inherits the equals method of the Object class, which just tests whether the objects are identical. If you turn a set or list into just a collection, you can no longer test for equal contents. The view acts in this way because equality testing is not well defined at this level of the hierarchy. The views treat the hashCode method in the same way.
However, the views returned by the unmodifiableSet and unmodifiableList methods do not hide the equals and hashCode methods of the underlying collections.
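These properties can be checked with a short experiment. The class UnmodifiableViewDemo and its helper rejectsAdd are hypothetical names used only in this sketch.

```java
import java.util.*;

class UnmodifiableViewDemo
{
   // Returns true if the collection refuses to accept a new element.
   static boolean rejectsAdd(Collection<String> c)
   {
      try
      {
         c.add("Zoe");
         return false;
      }
      catch (UnsupportedOperationException e)
      {
         return true;
      }
   }

   public static void main(String[] args)
   {
      List<String> staff = new LinkedList<String>(Arrays.asList("Amy", "Bob"));
      List<String> view = Collections.unmodifiableList(staff);

      System.out.println(rejectsAdd(view)); // true
      staff.add("Carl");                    // the original reference still works
      System.out.println(view.size());      // 3, the view sees the change

      // The list view forwards equals; the plain collection view does not.
      System.out.println(view.equals(staff));                                    // true
      System.out.println(Collections.unmodifiableCollection(staff).equals(staff)); // false
   }
}
```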
Synchronized Views
If you access a collection from multiple threads, you need to ensure that the collection is not accidentally damaged. For example, it would be disastrous if one thread tried to add to a hash table while another thread was rehashing the elements.
Instead of implementing thread-safe collection classes, the library designers used the view mechanism to make regular collections thread safe. For example, the static synchronizedMap method in the Collections class can turn any map into a Map with synchronized access methods:
HashMap<String, Employee> hashMap = new HashMap<String, Employee>();
Map<String, Employee> map = Collections.synchronizedMap(hashMap);
You can now access the map object from multiple threads. The methods such as get and put are serialized—each method call must be finished completely before another thread can call another method.
You should make sure that no thread accesses the data structure through the original unsynchronized methods. The easiest way to ensure this is not to save any reference to the original object:
map = Collections.synchronizedMap(new HashMap<String, Employee>());
Note that the views only serialize the methods of the collection. If you use an iterator, you need to manually acquire the lock on the collection object. For example,
synchronized (map)
{
   Iterator<String> iter = map.keySet().iterator();
   while (iter.hasNext())
      . . .
}
You must use the same code if you use a “for each” loop since the loop uses an iterator. Note that the iterator will actually fail with a ConcurrentModificationException if another thread modifies the collection while the iteration is in progress. The synchronization is still required so that the concurrent modification can be reliably detected.
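The sketch below puts both rules together: individual put calls from two threads need no extra locking, but the traversal holds the lock on the map. The class SynchronizedViewDemo and its methods are assumptions made for this example, and the thread and key counts are arbitrary.

```java
import java.util.*;

class SynchronizedViewDemo
{
   // Fills the map from two threads that each insert 1000 distinct keys.
   static int fillFromTwoThreads(Map<Integer, Integer> map)
   {
      Thread[] workers = new Thread[2];
      for (int t = 0; t < workers.length; t++)
      {
         int base = t * 1000;
         workers[t] = new Thread(() ->
            {
               for (int i = 0; i < 1000; i++) map.put(base + i, 1);
            });
         workers[t].start();
      }
      try
      {
         for (Thread w : workers) w.join();
      }
      catch (InterruptedException e)
      {
         throw new IllegalStateException(e);
      }
      return map.size();
   }

   // Iterates over the values while holding the lock on the map.
   static int sumValues(Map<Integer, Integer> map)
   {
      int total = 0;
      synchronized (map) // hold the lock for the whole traversal
      {
         for (int value : map.values()) total += value;
      }
      return total;
   }

   public static void main(String[] args)
   {
      Map<Integer, Integer> map
         = Collections.synchronizedMap(new HashMap<Integer, Integer>());
      System.out.println(fillFromTwoThreads(map)); // 2000
      System.out.println(sumValues(map));          // 2000
   }
}
```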
As a practical matter, the synchronization wrappers have limited utility. You are usually better off using the collections defined in the java.util.concurrent package—see Chapter 1 for more information. In particular, the ConcurrentHashMap map has been carefully implemented so that multiple threads can access it without blocking each other, provided they access different buckets.
Checked Views
JDK 5.0 adds a set of “checked” views that are intended as debugging support for a problem that can occur with generic types. As explained in Volume 1, Chapter 13, it is actually possible to smuggle elements of the wrong type into a generic collection. For example,
ArrayList<String> strings = new ArrayList<String>();
ArrayList rawList = strings; // get warning only, not an error, for compatibility with legacy code
rawList.add(new Date()); // now strings contains a Date object!
The erroneous add command is not detected at run time. Instead, a class cast exception will happen later when another part of the code calls get and casts the result to a String.
A checked view can detect this problem. Define
List<String> safeStrings = Collections.checkedList(strings, String.class);
The view's add method checks that the inserted object belongs to the given class and immediately throws a ClassCastException if it does not. The advantage is that the error is reported at the correct location:
List rawList = safeStrings;
rawList.add(new Date()); // Checked list throws a ClassCastException
The checked views are limited by the runtime checks that the virtual machine can carry out. For example, if you have an ArrayList<Pair<String>>, you cannot protect it from inserting a Pair<Date> since the virtual machine has a single “raw” Pair class.
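The difference between a plain list and a checked view can be sketched as follows. The class CheckedViewDemo and its method rejectsDate are hypothetical names chosen for this example.

```java
import java.util.*;

class CheckedViewDemo
{
   // Tries to smuggle a Date into a list of strings through a raw reference.
   // Returns true if the list rejects the element immediately.
   @SuppressWarnings({ "unchecked", "rawtypes" })
   static boolean rejectsDate(List<String> list)
   {
      List rawList = list;
      try
      {
         rawList.add(new Date());
         return false;
      }
      catch (ClassCastException e)
      {
         return true;
      }
   }

   public static void main(String[] args)
   {
      List<String> strings = new ArrayList<String>();
      System.out.println(rejectsDate(strings));     // false, the Date slips in

      List<String> safeStrings
         = Collections.checkedList(new ArrayList<String>(), String.class);
      System.out.println(rejectsDate(safeStrings)); // true
   }
}
```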
A Note on Optional Operations
A view usually has some restriction—it may be read-only, it may not be able to change the size, or it may support removal, but not insertion, as is the case for the key view of a map. A restricted view throws an UnsupportedOperationException if you attempt an inappropriate operation.
In the API documentation for the collection and iterator interfaces, many methods are described as “optional operations.” This seems to be in conflict with the notion of an interface. After all, isn't the purpose of an interface to lay out the methods that a class must implement? Indeed, this arrangement is unsatisfactory from a theoretical perspective. A better solution might have been to design separate interfaces for read-only views and views that can't change the size of a collection. However, that would have tripled the number of interfaces, which the designers of the library found unacceptable.
Should you extend the technique of “optional” methods to your own designs? We think not. Even though collections are used frequently, the coding style for implementing them is not typical for other problem domains. The designers of a collection class library have to resolve a particularly brutal set of conflicting requirements. Users want the library to be easy to learn, convenient to use, completely generic, idiot proof, and at the same time as efficient as hand-coded algorithms. It is plainly impossible to achieve all these goals simultaneously, or even to come close. But in your own programming problems, you will rarely encounter such an extreme set of constraints. You should be able to find solutions that do not rely on the extreme measure of “optional” interface operations.
java.util.Collections 1.2
- static <E> Collection<E> unmodifiableCollection(Collection<E> c)
- static <E> List<E> unmodifiableList(List<E> c)
- static <E> Set<E> unmodifiableSet(Set<E> c)
- static <E> SortedSet<E> unmodifiableSortedSet(SortedSet<E> c)
- static <K, V> Map<K, V> unmodifiableMap(Map<K, V> c)
- static <K, V> SortedMap<K, V> unmodifiableSortedMap(SortedMap<K, V> c)
construct a view of the collection whose mutator methods throw an UnsupportedOperationException.
- static <E> Collection<E> synchronizedCollection(Collection<E> c)
- static <E> List<E> synchronizedList(List<E> c)
- static <E> Set<E> synchronizedSet(Set<E> c)
- static <E> SortedSet<E> synchronizedSortedSet(SortedSet<E> c)
- static <K, V> Map<K, V> synchronizedMap(Map<K, V> c)
- static <K, V> SortedMap<K, V> synchronizedSortedMap(SortedMap<K, V> c)
construct a view of the collection whose methods are synchronized.
- static <E> Collection<E> checkedCollection(Collection<E> c, Class<E> elementType)
- static <E> List<E> checkedList(List<E> c, Class<E> elementType)
- static <E> Set<E> checkedSet(Set<E> c, Class<E> elementType)
- static <E> SortedSet<E> checkedSortedSet(SortedSet<E> c, Class<E> elementType)
- static <K, V> Map<K, V> checkedMap(Map<K, V> c, Class<K> keyType, Class<V> valueType)
- static <K, V> SortedMap<K, V> checkedSortedMap(SortedMap<K, V> c, Class<K> keyType, Class<V> valueType)
construct a view of the collection whose methods throw a ClassCastException if an element of the wrong type is inserted.
- static <E> List<E> nCopies(int n, E value)
- static <E> Set<E> singleton(E value)
construct a view of the object as either an unmodifiable list with n identical elements, or a set with a single element.
java.util.Arrays 1.2
- static <E> List<E> asList(E... array)
returns a list view of the elements in an array that is modifiable but not resizable.
java.util.List<E> 1.2
- List<E> subList(int firstIncluded, int firstExcluded)
returns a list view of the elements within a range of positions.
java.util.SortedSet<E> 1.2
- SortedSet<E> subSet(E firstIncluded, E firstExcluded)
- SortedSet<E> headSet(E firstExcluded)
- SortedSet<E> tailSet(E firstIncluded)
return a view of the elements within a range.
java.util.SortedMap<K, V> 1.2
- SortedMap<K, V> subMap(K firstIncluded, K firstExcluded)
- SortedMap<K, V> headMap(K firstExcluded)
- SortedMap<K, V> tailMap(K firstIncluded)
return a map view of the entries whose keys are within a range.
Bulk Operations
So far, most of our examples used an iterator to traverse a collection, one element at a time. However, you can often avoid iteration by using one of the bulk operations in the library.
Suppose you want to find the intersection of two sets, the elements that two sets have in common. First, make a new set to hold the result.
Set<String> result = new HashSet<String>(a);
Here, you use the fact that every collection has a constructor whose parameter is another collection that holds the initialization values.
Now, use the retainAll method:
result.retainAll(b);
It retains all elements that also happen to be in b. You have formed the intersection without programming a loop.
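The two steps can be packaged into a small generic helper. The class IntersectionDemo and its intersection method are hypothetical; the sketch copies the first set so that neither argument is modified.

```java
import java.util.*;

class IntersectionDemo
{
   // Returns a new set with the elements that a and b have in common.
   static <T> Set<T> intersection(Set<T> a, Set<T> b)
   {
      Set<T> result = new HashSet<T>(a); // copy, so that a is left untouched
      result.retainAll(b);
      return result;
   }

   public static void main(String[] args)
   {
      Set<String> a = new HashSet<String>(Arrays.asList("Amy", "Bob", "Carl"));
      Set<String> b = new HashSet<String>(Arrays.asList("Bob", "Carl", "Dee"));
      Set<String> common = intersection(a, b);
      System.out.println(common.size());          // 2
      System.out.println(common.contains("Bob")); // true
      System.out.println(a.size());               // 3, the original is unchanged
   }
}
```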
You can carry this idea further and apply a bulk operation to a view. For example, suppose you have a map that maps employee IDs to employee objects and you have a set of the IDs of all employees that are to be terminated.
Map<String, Employee> staffMap = . . .;
Set<String> terminatedIDs = . . .;
Simply form the key set and remove all IDs of terminated employees.
staffMap.keySet().removeAll(terminatedIDs);
Because the key set is a view into the map, the keys and the associated employee objects are automatically removed from the map.
By using a subrange view, you can restrict bulk operations to sublists and subsets. For example, suppose you want to add the first 10 elements of a list to another container. Form a sublist to pick out the first 10:
relocated.addAll(staff.subList(0, 10));
The subrange can also be a target of a mutating operation.
staff.subList(0, 10).clear();
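Here is a runnable sketch of bulk operations on views. Since the Employee class is not shown in this section, the sketch uses strings as stand-ins for employee objects; the class BulkViewDemo, its method afterTermination, and the sample data are all assumptions.

```java
import java.util.*;

class BulkViewDemo
{
   // Returns a copy of staffMap without the entries whose keys are in terminatedIDs.
   static Map<String, String> afterTermination(Map<String, String> staffMap,
      Set<String> terminatedIDs)
   {
      Map<String, String> result = new TreeMap<String, String>(staffMap);
      result.keySet().removeAll(terminatedIDs); // removes keys and values
      return result;
   }

   public static void main(String[] args)
   {
      Map<String, String> staffMap = new TreeMap<String, String>();
      staffMap.put("101", "Amy");
      staffMap.put("102", "Bob");
      staffMap.put("103", "Carl");
      Set<String> terminatedIDs = new HashSet<String>(Arrays.asList("102"));
      System.out.println(afterTermination(staffMap, terminatedIDs)); // {101=Amy, 103=Carl}

      List<String> staff
         = new ArrayList<String>(Arrays.asList("Amy", "Bob", "Carl", "Dee"));
      List<String> relocated = new ArrayList<String>();
      relocated.addAll(staff.subList(0, 2)); // copy the first two elements
      staff.subList(0, 2).clear();           // then remove them from staff
      System.out.println(relocated);         // [Amy, Bob]
      System.out.println(staff);             // [Carl, Dee]
   }
}
```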
Converting Between Collections and Arrays
Because large portions of the Java platform API were designed before the collections framework was created, you occasionally need to translate between traditional arrays and the more modern collections.
If you have an array that you need to turn into a collection, the Arrays.asList wrapper serves this purpose. For example,
String[] values = . . .;
HashSet<String> staff = new HashSet<String>(Arrays.asList(values));
Obtaining an array from a collection is a bit trickier. Of course, you can use the toArray method:
Object[] values = staff.toArray();
But the result is an array of objects. Even if you know that your collection contained objects of a specific type, you cannot use a cast:
String[] values = (String[]) staff.toArray(); // Error!
The array returned by the toArray method was created as an Object[] array, and you cannot change its type. Instead, you use a variant of the toArray method. Give it an array of length 0 of the type that you'd like. The returned array is then created as the same array type:
String[] values = staff.toArray(new String[0]);
If you like, you can construct the array to have the correct size:
staff.toArray(new String[staff.size()]);
In this case, no new array is created.
You may wonder why you don't simply pass a Class object (such as String.class) to the toArray method. However, this method does “double duty,” both to fill an existing array (provided it is long enough) and to create a new array.
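The "double duty" can be observed with a small experiment. The class ToArrayDemo and its helper reusesArray are hypothetical names for this sketch.

```java
import java.util.*;

class ToArrayDemo
{
   // Returns true if toArray filled the given array instead of creating a new one.
   static boolean reusesArray(Collection<String> c, String[] target)
   {
      return c.toArray(target) == target;
   }

   public static void main(String[] args)
   {
      List<String> staff = new ArrayList<String>(Arrays.asList("Amy", "Bob"));

      String[] values = staff.toArray(new String[0]);
      System.out.println(values.length);                      // 2

      System.out.println(reusesArray(staff, new String[0]));  // false, too short
      System.out.println(reusesArray(staff, new String[2]));  // true, exact fit

      String[] larger = { "w", "x", "y", "z" };
      staff.toArray(larger);
      System.out.println(Arrays.toString(larger));            // [Amy, Bob, null, z]
   }
}
```

In the last call, the array is longer than the collection, so the element after the last copied value is set to null and the remaining elements are left alone.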
java.util.Collection<E> 1.2
- <T> T[] toArray(T[] array)
checks whether the array parameter is larger than the size of the collection. If so, it adds all elements of the collection into the array, followed by a null terminator, and it returns the array. If the length of array equals the size of the collection, then the method adds all elements of the collection to the array but does not add a null terminator. If there isn't enough room, then the method creates a new array, of the same type as array, and fills it with the elements of the collection.
Extending the Framework
The collections framework contains all data structures that most programmers will ever need. However, if you do need a specialized data structure, you can easily extend the framework. As an example, we implement a circular array queue—see Example 2-6.
The framework contains a class AbstractQueue that implements all methods of the Queue interface except for size, offer, poll, peek, and iterator. We make use of that class and only implement the missing methods. We then automatically inherit all the remaining methods from the AbstractQueue class.
Most of the methods are straightforward, with the exception of the iterator. We implement the queue iterator as an inner class. This enables the iterator methods to access the fields of the enclosing queue object, that is, the object that constructed the iterator. (As we discussed in Volume 1, Chapter 5, every object of a non-static inner class has a reference to the outer class object that created it.)
Also note the protection against concurrent modification. The queue keeps a count of all modifications. When the iterator is constructed, it makes a copy of that count. Whenever the iterator is used, it checks that the counts still match. If not, it throws a ConcurrentModificationException.
Example 2-6. CircularArrayQueueTest.java
import java.util.*;

public class CircularArrayQueueTest
{
   public static void main(String[] args)
   {
      Queue<String> q = new CircularArrayQueue<String>(5);
      q.add("Amy");
      q.add("Bob");
      q.add("Carl");
      q.add("Deedee");
      q.add("Emile");
      q.remove();
      q.add("Fifi");
      q.remove();
      for (String s : q) System.out.println(s);
   }
}

/**
   A first-in, first-out bounded collection.
*/
class CircularArrayQueue<E> extends AbstractQueue<E>
{
   /**
      Constructs an empty queue.
      @param capacity the maximum capacity of the queue
   */
   @SuppressWarnings("unchecked")
   public CircularArrayQueue(int capacity)
   {
      // A generic array cannot be created directly, hence the cast
      elements = (E[]) new Object[capacity];
      count = 0;
      head = 0;
      tail = 0;
   }

   public boolean offer(E newElement)
   {
      assert newElement != null;
      if (count < elements.length)
      {
         elements[tail] = newElement;
         tail = (tail + 1) % elements.length; // wrap around at the end of the array
         count++;
         modcount++;
         return true;
      }
      else
         return false;
   }

   public E poll()
   {
      if (count == 0) return null;
      E r = elements[head];
      head = (head + 1) % elements.length; // wrap around at the end of the array
      count--;
      modcount++;
      return r;
   }

   public E peek()
   {
      if (count == 0) return null;
      return elements[head];
   }

   public int size()
   {
      return count;
   }

   public Iterator<E> iterator()
   {
      return new QueueIterator();
   }

   private class QueueIterator implements Iterator<E>
   {
      public QueueIterator()
      {
         modcountAtConstruction = modcount;
      }

      public E next()
      {
         if (!hasNext()) throw new NoSuchElementException();
         E r = elements[(head + offset) % elements.length];
         offset++;
         return r;
      }

      public boolean hasNext()
      {
         if (modcount != modcountAtConstruction)
            throw new ConcurrentModificationException();
         // Compare with the number of elements, not the capacity; otherwise
         // the iterator would also visit stale slots of a queue that is not full
         return offset < count;
      }

      public void remove()
      {
         throw new UnsupportedOperationException();
      }

      private int offset;
      private int modcountAtConstruction;
   }

   private E[] elements;
   private int head;
   private int tail;
   private int count;
   private int modcount;
}