Streams

By Khalid A. Mughal and Vasily A. Strelnikov
Jul 3, 2023


This chapter is from the book 

OCP Oracle Certified Professional Java SE 17 Developer (Exam 1Z0-829) Programmer's Guide

Learn More Buy

16.8 Collectors

A collector encapsulates the functions required for performing reduction: the supplier, the accumulator, the combiner, and the finisher. It can provide these functions since it implements the Collector interface (in the java.util.stream package) that defines the methods to create these functions. It is passed as an argument to the collect(Collector) method in order to perform a reduction operation. In contrast, the collect(Supplier, BiConsumer, BiConsumer) method requires the functions supplier, accumulator, and combiner, respectively, to be passed as arguments in the method call.
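The difference between the two collect() overloads can be sketched as follows. This is a minimal illustration, not code from the book: the class name CollectOverloads and the sample strings are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectOverloads {
    // Three-argument form: the supplier, accumulator, and combiner are passed directly.
    static ArrayList<String> withFunctions() {
        return Stream.of("a", "b", "c")
            .collect(ArrayList::new,      // supplier: creates the mutable container
                     ArrayList::add,      // accumulator: adds one element to a container
                     ArrayList::addAll);  // combiner: merges two partial containers
    }

    // One-argument form: a predefined collector bundles the same functions.
    static List<String> withCollector() {
        return Stream.of("a", "b", "c")
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withFunctions());  // [a, b, c]
        System.out.println(withCollector());  // [a, b, c]
    }
}
```

Both methods produce a list with the same elements in encounter order; the collector version simply hides the three functions behind a factory method.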

Details of implementing a collector are not necessary for our purposes, as we will exclusively use the extensive set of predefined collectors provided by the static factory methods of the Collectors class in the java.util.stream package (Table 16.7, p. 1005). In most cases, it should be possible to find a predefined collector for the task at hand. The collectors perform various kinds of reduction, such as accumulating to a map or finding the minimum or maximum element. For example, the Collectors.toList() factory method creates a collector that performs mutable reduction using a list as a mutable container. It can be passed to the collect(Collector) terminal operation of a stream.

It is a common practice to import the static factory methods of the Collectors class in the code so that the methods can be called by their simple names.

import static java.util.stream.Collectors.*;

However, the practice adopted in this chapter is to assume that only the Collectors class is imported, enforcing the connection between the static methods and the class to be done explicitly in the code. Of course, static import of factory methods can be used once familiarity with the collectors is established.

import java.util.stream.Collectors;

The three-argument collect() method is primarily used to implement mutable reduction, whereas the Collectors class provides collectors for both functional and mutable reduction that can be either used in a stand-alone capacity or composed with other collectors.

One group of collectors is designed to collect to a predetermined container, which is evident from the name of the static factory method that creates it: toCollection, toList, toSet, and toMap (p. 979). The overloaded toCollection() and toMap() methods allow a specific implementation of a collection and a map to be used, respectively—for example, a TreeSet for a collection and a TreeMap for a map. In addition, there is the joining() method that creates a collector for concatenating the input elements to a String—however, internally it uses a mutable StringBuilder (p. 984).

Collectors can be composed with other collectors; that is, the partial results from one collector can be additionally processed by another collector (called the downstream collector) to produce the final result. Many collectors that can be used as a downstream collector perform functional reduction such as counting values, finding the minimum and maximum values, summing values, averaging values, and summarizing common statistics for values (p. 998).

Composition of collectors is utilized to perform multilevel grouping and partitioning on stream elements (p. 985). The groupingBy() and partitioningBy() methods return composed collectors to create classification maps. In such a map, the keys are determined by a classifier function, and the values are the result of a downstream collector, called the classification mapping. For example, the CDs in a stream could be classified into a map where the key represents the number of tracks on a CD and the associated value of a key can be a list of CDs with the same number of tracks. The list of CDs with the same number of tracks is the result of an appropriate downstream collector.

Collecting to a `Collection`

The method toCollection(Supplier) creates a collector that uses a mutable container of a specific Collection type to perform mutable reduction. A supplier to create the mutable container is specified as an argument to the method.

The following stream pipeline creates an ArrayList<String> instance with the titles of all CDs in the stream. The constructor reference ArrayList::new returns an empty ArrayList<String> instance, where the element type String is inferred from the context.

ArrayList<String> cdTitles1 = CD.cdList.stream() // Stream<CD>
    .map(CD::title)                              // Stream<String>
    .collect(Collectors.toCollection(ArrayList::new));
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

static <T,C extends Collection<T>> Collector<T,?,C>
       toCollection(Supplier<C> collectionFactory)

Returns a Collector that accumulates the input elements of type T into a new Collection of type C, in encounter order. A new empty Collection of type C is created by the collectionFactory supplier, thus the collection created can be of a specific Collection type.

static <T> Collector<T,?,List<T>> toList()
static <T> Collector<T,?,List<T>> toUnmodifiableList()

Return a Collector that accumulates the input elements of type T into a new List or an unmodifiable List of type T, respectively, in encounter order.

The toList() method gives no guarantees on the type, mutability, serializability, or thread-safety of the returned list.

The unmodifiable list returned does not allow null values.

See also the Stream.toList() terminal operation (p. 972).

static <T> Collector<T,?,Set<T>> toSet()
static <T> Collector<T,?,Set<T>> toUnmodifiableSet()

Return an unordered Collector that accumulates the input elements of type T into a new Set or an unmodifiable Set of type T, respectively.

Collecting to a `List`

The method toList() creates a collector that uses a mutable container of type List to perform mutable reduction. This collector guarantees to preserve the encounter order of the input stream, if it has one. For more control over the type of the list, the toCollection() method can be used. This collector can be used as a downstream collector.

The following stream pipeline creates a list with the titles of all CDs in the stream using a collector returned by the Collectors.toList() method. Although the returned list is modifiable in this case, this is implementation dependent and should not be relied upon.

List<String> cdTitles3 = CD.cdList.stream()      // Stream<CD>
    .map(CD::title)                              // Stream<String>
    .collect(Collectors.toList());
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]
cdTitles3.add("Java Jingles");                   // OK
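When a list must not be modified after collection, the toUnmodifiableList() collector described earlier can be used instead. The sketch below illustrates its two documented restrictions; the class name UnmodifiableListDemo and the literal titles are assumptions made for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UnmodifiableListDemo {
    static List<String> titles() {
        return Stream.of("Java Jive", "Java Jam")
            .collect(Collectors.toUnmodifiableList());
    }

    // Adding to the collected list is rejected with an UnsupportedOperationException.
    static boolean rejectsAdd() {
        try {
            titles().add("Java Jingles");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // A null element in the stream is rejected with a NullPointerException.
    static boolean rejectsNull() {
        try {
            Stream.of("Java Jive", null).collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(titles());       // [Java Jive, Java Jam]
        System.out.println(rejectsAdd());   // true
        System.out.println(rejectsNull());  // true
    }
}
```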

Collecting to a `Set`

The method toSet() creates a collector that uses a mutable container of type Set to perform mutable reduction. The collector does not guarantee to preserve the encounter order of the input stream. For more control over the type of the set, the toCollection() method can be used.

The following stream pipeline creates a set with the titles of all CDs in the stream.

Set<String> cdTitles2 = CD.cdList.stream()       // Stream<CD>
    .map(CD::title)                              // Stream<String>
    .collect(Collectors.toSet());
//[Hot Generics, Java Jive, Lambda Dancing, Keep on Erasing, Java Jam]
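If a predictable ordering of the set is wanted, the toCollection() method can supply a sorted set instead. The following is a minimal sketch of that idea; the class name SortedTitles and the helper method are assumptions, and the titles are passed in as plain strings rather than extracted from CD objects.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class SortedTitles {
    // Collects the titles into a TreeSet, which keeps them in natural (lexicographic) order.
    static TreeSet<String> sortedTitles(List<String> titles) {
        return titles.stream()
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> titles = List.of("Java Jive", "Java Jam", "Lambda Dancing",
                                      "Keep on Erasing", "Hot Generics");
        System.out.println(sortedTitles(titles));
        // [Hot Generics, Java Jam, Java Jive, Keep on Erasing, Lambda Dancing]
    }
}
```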

Collecting to a `Map`

The method toMap() creates a collector that performs mutable reduction to a mutable container of type Map.

static <T,K,U> Collector<T,?,Map<K,U>> toMap(
       Function<? super T,? extends K> keyMapper,
       Function<? super T,? extends U> valueMapper)

static <T,K,U> Collector<T,?,Map<K,U>> toMap(
       Function<? super T,? extends K> keyMapper,
       Function<? super T,? extends U> valueMapper,
       BinaryOperator<U>               mergeFunction)

static <T,K,U,M extends Map<K,U>> Collector<T,?,M> toMap(
       Function<? super T,? extends K> keyMapper,
       Function<? super T,? extends U> valueMapper,
       BinaryOperator<U>               mergeFunction,
       Supplier<M>                     mapSupplier)

Return a Collector that accumulates elements of type T into a Map whose keys and values are the result of applying the provided key and value mapping functions to the input elements.

The keyMapper function produces keys of type K, and the valueMapper function produces values of type U.

In the first method, the mapped keys cannot have duplicates; an IllegalStateException is thrown if that is the case.

In the second and third methods, the mergeFunction binary operator is used to resolve collisions between values associated with the same key, as supplied to Map.merge(Object, Object, BiFunction).

In the third method, the provided mapSupplier function returns a new Map into which the results will be inserted.

The collector returned by the method toMap() uses either a default map or one that is supplied. To be able to create an entry in a Map<K,U> from stream elements of type T, the collector requires two functions:

keyMapper: T -> K, which is a Function to extract a key of type K from a stream element of type T.
valueMapper: T -> U, which is a Function to extract a value of type U for a given key of type K from a stream element of type T.

Additional functions as arguments allow various controls to be exercised on the map:

mergeFunction: (U,U) -> U, which is a BinaryOperator to merge two values that are associated with the same key. The merge function must be specified if collision of values can occur during the mutable reduction, or an IllegalStateException will be thrown.
mapSupplier: () -> M extends Map<K,U>, which is a Supplier that creates a map instance of a specific type to use for mutable reduction. The map created is a subtype of Map<K,U>. Without this function, the collector uses a default map.

Figure 16.15 illustrates collecting to a map. The stream pipeline creates a map of CD titles and their release year; that is, a Map<String, Year>, where K is String and U is Year. The keyMapper CD::title and the valueMapper CD::year extract the title (String) and the year (Year) from each CD in the stream, respectively. The entries are accumulated in a default map (Map<String, Year>).

Figure 16.15 Collecting to a Map

What if we wanted to create a map with CDs and their release year—that is, a Map<CD, Year>? In that case, the keyMapper should return the CD as the key—that is, map a CD to itself. That is exactly what the keyMapper Function.identity() does in the pipeline below.

Map<CD, Year> mapCDToYear = CD.cdList.stream()
    .collect(Collectors.toMap(Function.identity(), CD::year)); // Map<CD, Year>

As there were no duplicates of the key in the previous two examples, there was no collision of values in the map. In the list dupList below, there are duplicates of CDs (CD.cd0, CD.cd1). Executing the pipeline results in a runtime exception at (1).

List<CD> dupList = List.of(CD.cd0, CD.cd1, CD.cd2, CD.cd0, CD.cd1);
Map<String, Year> mapTitleToYear1 = dupList.stream()
    .collect(Collectors.toMap(CD::title, CD::year));       // (1)
// IllegalStateException: Duplicate key Java Jive (attempted merging values 2017 and 2017)

The collision values can be resolved by specifying a merge function. In the pipeline below, the arguments of the merge function (y1, y2) -> y1 at (1) have the same value for the year if we assume that a CD can only be released once. Note that y1 and y2 denote the existing value in the map and the value to merge, respectively. The merge function can return any one of the values to resolve the collision.

Map<String, Year> mapTitleToYear2 = dupList.stream()
    .collect(Collectors.toMap(CD::title, CD::year, (y1, y2) -> y1));       // (1)

The stream pipeline below creates a map of CD titles released each year. As more than one CD can be released in a year, collision of titles can occur for a year. The merge function (tt, t) -> tt + ":" + t concatenates the titles in each year separated by a colon, if necessary. Note that tt and t denote the existing value in the map and the value to merge, respectively.

Map<Year, String> mapTitleToYear3 = CD.cdList.stream()
    .collect(Collectors.toMap(CD::year, CD::title,
                              (tt, t) -> tt + ":" + t));
//{2017=Java Jive:Java Jam, 2018=Lambda Dancing:Keep on Erasing:Hot Generics}

The stream pipeline below creates a map with the lexicographically greatest title released each year. For greater control over the type of the map in which to accumulate the entries, a supplier is specified. The supplier TreeMap::new returns an empty instance of a TreeMap in which the entries are accumulated. The keys in such a map are sorted in their natural order, as the class java.time.Year implements the Comparable<Year> interface.

TreeMap<Year, String> mapYearToLongestTitle = CD.cdList.stream()
    .collect(Collectors.toMap(CD::year, CD::title,
                              BinaryOperator.maxBy(Comparator.naturalOrder()),
                              TreeMap::new));
//{2017=Java Jive, 2018=Lambda Dancing}

The merge function specified is equivalent to the following lambda expression, returning the greater of two strings:

(str1, str2) -> str1.compareTo(str2) > 0 ? str1 : str2
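Note that the natural order of strings is lexicographic, so the greater of two strings is not necessarily the longer one. The sketch below contrasts the two comparators; the record Release is a hypothetical stand-in for the book's CD class, reduced to a year and a title, and the class name MaxTitlePerYear is likewise an assumption.

```java
import java.time.Year;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class MaxTitlePerYear {
    // Hypothetical stand-in for the CD class: only the year and the title.
    record Release(Year year, String title) {}

    static final List<Release> RELEASES = List.of(
        new Release(Year.of(2017), "Java Jive"),
        new Release(Year.of(2017), "Java Jam"),
        new Release(Year.of(2018), "Lambda Dancing"),
        new Release(Year.of(2018), "Keep on Erasing"),
        new Release(Year.of(2018), "Hot Generics"));

    // Keeps, for each year, the title that is greatest according to the given comparator.
    static TreeMap<Year, String> maxTitleBy(Comparator<String> cmp) {
        return RELEASES.stream()
            .collect(Collectors.toMap(Release::year, Release::title,
                                      BinaryOperator.maxBy(cmp),
                                      TreeMap::new));
    }

    public static void main(String[] args) {
        // Lexicographic order: {2017=Java Jive, 2018=Lambda Dancing}
        System.out.println(maxTitleBy(Comparator.naturalOrder()));
        // Order by length: {2017=Java Jive, 2018=Keep on Erasing}
        System.out.println(maxTitleBy(Comparator.comparingInt(String::length)));
    }
}
```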

Collecting to a `ConcurrentMap`

If the collector returned by the Collectors.toMap() method is used in a parallel stream, the multiple partial maps created during parallel execution are merged by the collector to create the final result map. Merging maps can be expensive if keys from one map are merged into another. To address the problem, the Collectors class provides the three overloaded methods toConcurrentMap(), analogous to the three toMap() methods, that return a concurrent collector (that is, a collector that uses a single concurrent map to perform the reduction). A concurrent map is thread-safe and unordered. A concurrent map implements the java.util.concurrent.ConcurrentMap interface, which is a subinterface of the java.util.Map interface (§23.7, p. 1482).

Using a concurrent map avoids merging of maps during parallel execution, as a single map is created that is used concurrently to accumulate the results from the execution of each substream. However, the concurrent map is unordered—any encounter order in the stream is ignored. Usage of the toConcurrentMap() method is illustrated by the following example of a parallel stream to create a concurrent map of CD titles released each year.

ConcurrentMap<Year, String> concMapYearToTitles = CD.cdList
    .parallelStream()
    .collect(Collectors.toConcurrentMap(CD::year, CD::title,
                                        (tt, t) -> tt + ":" + t));
//{2017=Java Jam:Java Jive, 2018=Lambda Dancing:Hot Generics:Keep on Erasing}

Joining

The joining() method creates a collector for concatenating the input elements of type CharSequence to a single immutable String. However, internally it uses a mutable StringBuilder. Note that the collector returned by the joining() methods performs functional reduction, as its result is a single immutable string.

static Collector<CharSequence,?,String> joining()
static Collector<CharSequence,?,String> joining(CharSequence delimiter)
static Collector<CharSequence,?,String> joining(CharSequence delimiter,
                                                CharSequence prefix,
                                                CharSequence suffix)

Return a Collector that concatenates CharSequence elements into a String. The first method concatenates in encounter order. So does the second method, but this method separates the elements by the specified delimiter. The third method in addition applies the specified prefix and suffix to the result of the concatenation.

The wildcard ? stands for the intermediate accumulation type, which is used internally by the collector.

The methods preserve the encounter order, if the stream has one.

Among the classes that implement the CharSequence interface are the String, StringBuffer, and StringBuilder classes.
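Since the joining collectors accept only CharSequence elements, elements of other types must first be mapped to strings. A minimal sketch of both situations follows; the class name JoiningDemo and the sample values are assumptions made for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JoiningDemo {
    // Integer is not a CharSequence, so each element is mapped to a String first.
    static String joinNumbers(List<Integer> numbers) {
        return numbers.stream()
            .map(String::valueOf)
            .collect(Collectors.joining("-", "<", ">"));
    }

    // StringBuilder implements CharSequence, so its instances can be joined directly.
    static String joinBuilders() {
        return Stream.of(new StringBuilder("ab"), new StringBuilder("cd"))
            .collect(Collectors.joining("+"));
    }

    public static void main(String[] args) {
        System.out.println(joinNumbers(List.of(1, 2, 3)));  // <1-2-3>
        System.out.println(joinNumbers(List.of()));         // <>
        System.out.println(joinBuilders());                 // ab+cd
    }
}
```

For an empty stream, the three-argument collector still applies the prefix and the suffix, as the second call shows.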

The stream pipelines below concatenate CD titles to illustrate the three overloaded joining() methods. The CharSequence elements are Strings. The strings are concatenated in the stream encounter order, which is the positional order for lists. The zero-argument joining() method at (1) performs string concatenation of the CD titles using a StringBuilder internally, and returns the result as a string.

String concatTitles1 = CD.cdList.stream()         // Stream<CD>
    .map(CD::title)                               // Stream<String>
    .collect(Collectors.joining());               // (1)
//Java JiveJava JamLambda DancingKeep on ErasingHot Generics

The single-argument joining() method at (2) concatenates the titles using the specified delimiter.

String concatTitles2 = CD.cdList.stream()
    .map(CD::title)
    .collect(Collectors.joining(", "));           // (2) Delimiter
//Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics

The three-argument joining() method at (3) concatenates the titles using the specified delimiter, prefix, and suffix.

String concatTitles3 = CD.cdList.stream()
    .map(CD::title)
    .collect(Collectors.joining(", ", "[", "]"));  // (3) Delimiter, Prefix, Suffix
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

Grouping

Classifying elements into groups based on some criteria is a very common operation. An example is classifying CDs into groups according to the number of tracks on them (this sounds esoteric, but it will illustrate the point). Such an operation can be accomplished by the collector returned by the groupingBy() method. The method is passed a classifier function that is used to classify the elements into different groups. The result of the operation is a classification map whose entries are the different groups into which the elements have been classified. The key in a map entry is the result of applying the classifier function on the element. The key is extracted from the element based on some property of the element—for example, the number of tracks on the CD. The value associated with a key in a map entry comprises those elements that belong to the same group. The operation is analogous to the group-by operation in databases.
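To make the classification step concrete, the sketch below groups strings by their length twice: once with the single-argument groupingBy() collector, and once with an explicit loop that does roughly what the collector does for a sequential stream. The class name GroupByLength and the sample words are assumptions; the loop is an illustration of the idea, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByLength {
    // Declarative version: the classifier String::length computes the key.
    static Map<Integer, List<String>> withCollector(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(String::length));
    }

    // Imperative version: look up (or create) the group for the key, then add the element.
    static Map<Integer, List<String>> withLoop(List<String> words) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String word : words) {
            groups.computeIfAbsent(word.length(), key -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> words = List.of("jive", "jam", "java", "hot");
        System.out.println(withCollector(words).equals(withLoop(words)));  // true
    }
}
```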

There are three versions of the groupingBy() method that provide increasingly more control over the grouping operation.

static <T,K> Collector<T,?,Map<K,List<T>>> groupingBy(
       Function<? super T,? extends K> classifier)

static <T,K,A,D> Collector<T,?,Map<K,D>> groupingBy(
       Function<? super T,? extends K> classifier,
       Collector<? super T,A,D>        downstream)

static <T,K,D,A,M extends Map<K,D>> Collector<T,?,M> groupingBy(
       Function<? super T,? extends K> classifier,
       Supplier<M>                     mapSupplier,
       Collector<? super T,A,D>        downstream)

The Collector returned by the groupingBy() methods implements a group-by operation on input elements to create a classification map.
The classifier function maps elements of type T to keys of some type K. These keys determine the groups in the classification map.

The collector returned by the single-argument method produces a classification map of type Map<K, List<T>>. The keys in this map are the results from applying the specified classifier function to the input elements. The input elements that map to the same key are accumulated into a List by the default downstream collector Collectors.toList().

The two-argument method accepts a downstream collector, in addition to the classifier function. The collector returned by the method is composed with the specified downstream collector that performs a reduction operation on the input elements that map to the same key. It operates on elements of type T and produces a result of type D. The result of type D produced by the downstream collector is the value associated with the key of type K. The composed collector thus results in a classification map of type Map<K, D>.

The three-argument method accepts a map supplier as its second parameter. It creates an empty classification map of type M that is used by the composed collector. The result is a classification map of type M whose key and value types are K and D, respectively.

Figure 16.16 illustrates the groupingBy() operation by grouping CDs according to the number of tracks on them. The classifier function CD::noOfTracks extracts the number of tracks from a CD, and this number acts as a key in the classification map (Map<Integer, List<CD>>). Since the call to the groupingBy() method in Figure 16.16 does not specify a downstream collector, the default downstream collector Collectors.toList() is used to accumulate CDs that have the same number of tracks. The number of groups (that is, the number of distinct keys) is equal to the number of distinct values for the number of tracks on the CDs. Each distinct value for the number of tracks is associated with the list of CDs having that value as the number of tracks.

Figure 16.16 Grouping

The three stream pipelines below result in a classification map that is equivalent to the one in Figure 16.16. The call to the groupingBy() method at (2) specifies the downstream collector explicitly, and is equivalent to the call in Figure 16.16.

Map<Integer, List<CD>> map22 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toList()));  // (2)

The call to the groupingBy() method at (3) specifies the supplier TreeMap::new so that a TreeMap<Integer, List<CD>> is used as the classification map.

Map<Integer, List<CD>> map33 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks,                         // (3)
                                   TreeMap::new,
                                   Collectors.toList()));

The call to the groupingBy() method at (4) specifies the downstream collector Collectors.toSet() that uses a set to accumulate the CDs for a group.

Map<Integer, Set<CD>> map44 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toSet()));   // (4)

The classification maps created by the pipelines above will contain the three entries shown below, but only the groupingBy() method call at (3) can guarantee that the entries will be sorted in a TreeMap<Integer, List<CD>> according to the natural order for the Integer keys.

{
6=[<Jaav, "Java Jam", 6, 2017, JAZZ>],
8=[<Jaav, "Java Jive", 8, 2017, POP>,
   <Genericos, "Keep on Erasing", 8, 2018, JAZZ>],
10=[<Funkies, "Lambda Dancing", 10, 2018, POP>,
    <Genericos, "Hot Generics", 10, 2018, JAZZ>]
}

In general, any collector can be passed as a downstream collector to the groupingBy() method. In the stream pipeline below, the map value in the classification map is a count of the number of CDs having the same number of tracks. The collector Collectors.counting() performs a functional reduction to count the CDs having the same number of tracks (p. 998).

Map<Integer, Long> map55 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks, Collectors.counting()));
//{6=1, 8=2, 10=2}
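As a further illustration of passing an arbitrary collector downstream, the sketch below combines the three-argument groupingBy() method with the joining() collector. It is an assumption-laden example: the class name TitlesByInitial is hypothetical, and plain title strings classified by their first character stand in for CD objects.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

class TitlesByInitial {
    // Groups titles by their first character and joins the titles in each group.
    static TreeMap<Character, String> group(List<String> titles) {
        return titles.stream()
            .collect(Collectors.groupingBy(title -> title.charAt(0),   // classifier
                                           TreeMap::new,                // map supplier
                                           Collectors.joining(", ")));  // downstream collector
    }

    public static void main(String[] args) {
        List<String> titles = List.of("Java Jive", "Java Jam", "Lambda Dancing",
                                      "Keep on Erasing", "Hot Generics");
        System.out.println(group(titles));
        // {H=Hot Generics, J=Java Jive, Java Jam, K=Keep on Erasing, L=Lambda Dancing}
    }
}
```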

Multilevel Grouping

The downstream collector in a groupingBy() operation can be created by another groupingBy() operation, resulting in a multilevel grouping operation—also known as a multilevel classification or cascaded grouping operation. We can extend the multilevel groupingBy() operation to any number of levels by making the downstream collector be a groupingBy() operation.

The stream pipeline below creates a classification map in which the CDs are first grouped by the number of tracks in a CD at (1), and then grouped by the musical genre of a CD at (2).

Map<Integer, Map<Genre, List<CD>>> twoLevelGrp = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks,             // (1)
                 Collectors.groupingBy(CD::genre)));           // (2)

Printing the contents of the resulting classification map would show the following three entries, not necessarily in this order:

{
6={JAZZ=[<Jaav, "Java Jam", 6, 2017, JAZZ>]},
8={JAZZ=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>],
   POP=[<Jaav, "Java Jive", 8, 2017, POP>]},
10={JAZZ=[<Genericos, "Hot Generics", 10, 2018, JAZZ>],
    POP=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}
}

The entries of the resulting classification map can also be illustrated as a two-dimensional matrix, as shown in Figure 16.17, where the CDs are first grouped into rows by the number of tracks, and then grouped into columns by the musical genre. The value of an element in the matrix is a list of CDs which have the same number of tracks (row) and the same musical genre (column).

Figure 16.17 Multilevel Grouping as a Two-Dimensional Matrix

The number of groups in the classification map returned by the above pipeline is equal to the number of distinct values for the number of tracks, as in the single-level groupingBy() operation. However, each value associated with a key in the outer classification map is now an inner classification map that is managed by the second-level groupingBy() operation. The inner classification map has the type Map<Genre, List<CD>>; in other words, the key in the inner classification map is the musical genre of the CD and the value associated with this key is a List of CDs with this musical genre. It is the second-level groupingBy() operation that is responsible for grouping each CD in the inner classification map. Since no explicit downstream collector is specified for the second-level groupingBy() operation, it uses the default downstream collector Collectors.toList().

We can modify the multilevel groupingBy() operation to count the CDs that have the same musical genre and the same number of tracks by specifying an explicit downstream collector for the second-level groupingBy() operation, as shown at (3).

The collector Collectors.counting() at (3) performs a functional reduction by accumulating the count for CDs with the same number of tracks and the same musical genre in the inner classification map (p. 998).

Map<Integer, Map<Genre, Long>> twoLevelGrp2 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks,
                 Collectors.groupingBy(CD::genre,
                                       Collectors.counting())));   // (3)

Printing the contents of the resulting classification map produced by this multilevel groupingBy() operation would show the following three entries, again not necessarily in this order:

{6={JAZZ=1}, 8={JAZZ=1, POP=1}, 10={JAZZ=1, POP=1}}

It is instructive to compare the entries in the resulting classification maps in the two examples illustrated here.

To truly appreciate the groupingBy() operation, the reader is highly encouraged to implement the multilevel grouping examples in an imperative style, without using the Stream API. Good luck!

Grouping to a `ConcurrentMap`

If the collector returned by the Collectors.groupingBy() method is used in a parallel stream, the partial maps created during execution are merged to create the final map, as in the case of the Collectors.toMap() method (p. 983). Merging maps can carry a performance penalty. The Collectors class provides the three groupingByConcurrent() overloaded methods, analogous to the three groupingBy() methods, that return a concurrent collector (that is, a collector that uses a single concurrent map to perform the reduction). The entries in such a map are unordered. A concurrent map implements the java.util.concurrent.ConcurrentMap interface (§23.7, p. 1482).

Usage of the groupingByConcurrent() method is illustrated by the following example of a parallel stream to create a concurrent map of the number of CDs that have the same number of tracks.

ConcurrentMap<Integer, Long> map66 = CD.cdList
    .parallelStream()
    .collect(Collectors.groupingByConcurrent(CD::noOfTracks,
                                             Collectors.counting()));
//{6=1, 8=2, 10=2}

Partitioning

Partitioning is a special case of grouping. The classifier function that was used for grouping is now a partitioning predicate in the partitioningBy() method. The predicate function returns the boolean value true or false. Just as the keys of a classification map are determined by the classifier function in grouping, the keys of the resulting map are determined by the partitioning predicate in partitioning. Thus the keys are always of type Boolean, implying that the resulting map has only two entries: one for the key false and one for the key true. According to the API documentation, both mappings are always present in the map, even if one of the partitions is empty. In other words, the partitioningBy() method creates two partitions from the input elements. The map value associated with a key in the resulting map is managed by a downstream collector, as in the case of the groupingBy() method.

There are two versions of the partitioningBy() method:

static <T> Collector<T,?,Map<Boolean,List<T>>> partitioningBy(
       Predicate<? super T>     predicate)

static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(
       Predicate<? super T>     predicate,
       Collector<? super T,A,D> downstream)

The collector returned by the first method produces a classification map of type Map<Boolean, List<T>>. The keys in this map are the results from applying the partitioning predicate to the input elements. The input elements that map to the same Boolean key are accumulated into a List by the default downstream collector Collectors.toList().

The second method accepts a downstream collector, in addition to the partitioning predicate. The collector returned by the method is composed with the specified downstream collector that performs a reduction operation on the input elements that map to the same key. It operates on elements of type T and produces a result of type D. The result of type D produced by the downstream collector is the value associated with the key of type Boolean. The composed collector thus produces a map of type Map<Boolean, D>.

Figure 16.18 illustrates the partitioningBy() operation by partitioning CDs according to the predicate CD::isPop that determines whether a CD is a pop music CD. The result of the partitioning predicate acts as the key in the resulting map of type Map<Boolean, List<CD>>. Since the call to the partitioningBy() method in Figure 16.18 does not specify a downstream collector, the default downstream collector Collectors.toList() is used to accumulate CDs that map to the same key. The resulting map has two entries or partitions: one for CDs that are pop music CDs and one for CDs that are not. The two entries of the resulting map are also shown below:

Figure 16.18 Partitioning

{false=[<Jaav, "Java Jam", 6, 2017, JAZZ>,
        <Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
        <Genericos, "Hot Generics", 10, 2018, JAZZ>],
 true=[<Jaav, "Java Jive", 8, 2017, POP>,
       <Funkies, "Lambda Dancing", 10, 2018, POP>]}

The values in a partition can be obtained by calling the Map.get() method:

List<CD> popCDs = map1.get(true);
List<CD> nonPopCDs = map1.get(false);

The stream pipeline at (2) is equivalent to the one in Figure 16.18, where the downstream collector is specified explicitly.

Map<Boolean, List<CD>> map2 = CD.cdList.stream()
    .collect(Collectors.partitioningBy(CD::isPop, Collectors.toList())); // (2)

We could have composed a stream pipeline to filter the CDs that are pop music CDs and collected them into a list. We would have to compose a second pipeline to find the CDs that are not pop music CDs. However, the partitioningBy() method does both in a single operation.

Analogous to the groupingBy() method, any collector can be passed as a downstream collector to the partitioningBy() method. In the stream pipeline below, the downstream collector Collectors.counting() performs a functional reduction to count the number of CDs associated with a key (p. 998).

Map<Boolean, Long> map3 = CD.cdList.stream()
    .collect(Collectors.partitioningBy(CD::isPop, Collectors.counting()));
//{false=3, true=2}
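One detail worth checking in code is what happens when no element satisfies the predicate. The API documentation for partitioningBy() states that the returned map always contains mappings for both the false and the true keys. The sketch below illustrates this with integers; the class name PartitionDemo and the even-number predicate are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Partitions the numbers into even (key true) and odd (key false).
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> numbers) {
        return numbers.stream()
            .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = partitionEven(List.of(1, 3, 5));
        System.out.println(parts.containsKey(true));  // true
        System.out.println(parts.get(true));          // []
        System.out.println(parts.get(false));         // [1, 3, 5]
    }
}
```

Because the key true is present even though no number is even, a call to get(true) returns an empty list rather than null.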

Multilevel Partitioning

Like the groupingBy() method, the partitioningBy() operation can be used in multilevel classification. The downstream collector in a partitioningBy() operation can be created by another partitioningBy() operation, resulting in a multilevel partitioning operation—also known as a cascaded partitioning operation. The downstream collector can also be a groupingBy() operation.

In the stream pipeline below, the CDs are partitioned at (1): one partition for CDs that are pop music CDs, and one for those that are not. The CDs that are associated with a key are grouped by the year in which they were released. Note that the CDs that were released in a year are accumulated into a List by the default downstream collector Collectors.toList() that is employed by the groupingBy() operation at (2).

Map<Boolean, Map<Year, List<CD>>> map1 = CD.cdList.stream()
    .collect(Collectors.partitioningBy(CD::isPop,                     // (1)
                 Collectors.groupingBy(CD::year)));                   // (2)

Printing the contents of the resulting map would show the following two entries, not necessarily in this order.

{false={2017=[<Jaav, "Java Jam", 6, 2017, JAZZ>],
        2018=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
              <Genericos, "Hot Generics", 10, 2018, JAZZ>]},
 true={2017=[<Jaav, "Java Jive", 8, 2017, POP>],
       2018=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}}
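A partitioningBy() collector can also serve as the downstream collector of another partitioningBy() operation, as mentioned at the start of this subsection. The sketch below illustrates such cascaded partitioning under stated assumptions: the record Disc is a hypothetical stand-in for the CD class, and the threshold of eight tracks is chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CascadedPartitioningSketch {
  // Hypothetical stand-in for the book's CD class.
  record Disc(String title, int tracks, boolean isPop) {}

  static final List<Disc> DISCS = List.of(
      new Disc("Java Jive", 8, true),
      new Disc("Java Jam", 6, false),
      new Disc("Lambda Dancing", 10, true),
      new Disc("Keep on Erasing", 8, false),
      new Disc("Hot Generics", 10, false));

  // Outer partition: pop or not. Inner partition: more than 8 tracks or not.
  static Map<Boolean, Map<Boolean, List<String>>> titlesByPopThenSize() {
    return DISCS.stream().collect(
        Collectors.partitioningBy(Disc::isPop,
            Collectors.partitioningBy(d -> d.tracks() > 8,
                Collectors.mapping(Disc::title, Collectors.toList()))));
  }

  public static void main(String[] args) {
    Map<Boolean, Map<Boolean, List<String>>> map = titlesByPopThenSize();
    System.out.println(map.get(true).get(true));    // [Lambda Dancing]
    System.out.println(map.get(true).get(false));   // [Java Jive]
    System.out.println(map.get(false).get(true));   // [Hot Generics]
    System.out.println(map.get(false).get(false));  // [Java Jam, Keep on Erasing]
  }
}
```

Each level contributes exactly two keys, so the nested map always has four inner lists, some of which may be empty.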

Filtering Adapter for Downstream Collectors

The filtering() method of the Collectors class encapsulates a predicate and a downstream collector to create an adapter for a filtering operation. (See also the filter() intermediate operation, p. 912.)

static <T,A,R> Collector<T,?,R> filtering(
       Predicate<? super T>     predicate,
       Collector<? super T,A,R> downstream)

Returns a Collector that applies the predicate to input elements of type T to determine which elements should be passed to the downstream collector. This downstream collector accumulates them into results of type R, where the type parameter A is the intermediate accumulation type of the downstream collector.

The following code uses the filtering() operation at (2) to group pop music CDs according to the number of tracks on them. The groupingBy() operation at (1) creates the groups based on the number of tracks on the CDs, but the filtering() operation only allows pop music CDs to pass downstream to be accumulated.

// Filtering downstream from grouping.
Map<Integer, List<CD>> grpByTracksFilterByPopCD = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks,                        // (1)
                 Collectors.filtering(CD::isPop, Collectors.toList())));  // (2)

Printing the contents of the resulting map would show the entries below, not necessarily in this order. Note that the output shows that there were one or more CDs with six tracks, but none of them were pop music CDs. Hence the list of CDs associated with key 6 is empty.

{6=[],
 8=[<Jaav, "Java Jive", 8, 2017, POP>],
 10=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

However, if we run the same query using the filter() intermediate stream operation at (1) prior to grouping, the contents of the result map are different, as shown below.

// Filtering before grouping.
Map<Integer, List<CD>> filterByPopCDGrpByTracks = CD.cdList.stream()
    .filter(CD::isPop)                                                     // (1)
    .collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toList()));

Contents of the result map show that only entries that have a non-empty list as a value are contained in the map. This is not surprising, as any non-pop music CD is discarded before grouping, so only pop music CDs are grouped.

{8=[<Jaav, "Java Jive", 8, 2017, POP>],
 10=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

There are no surprises with partitioning, regardless of whether filtering is done before or after the partitioning, as partitioning always results in a map with two entries: one for the Boolean.TRUE key and one for the Boolean.FALSE key. The code below partitions CDs released in 2018 according to whether a CD is a pop music CD or not.

// Filtering downstream from partitioning.
Map<Boolean, List<CD>> partbyPopCDsFilterByYear = CD.cdList.stream()     // (1)
    .collect(Collectors.partitioningBy(CD::isPop,
                 Collectors.filtering(cd -> cd.year().equals(Year.of(2018)),
                                      Collectors.toList())));
// Filtering before partitioning.
Map<Boolean, List<CD>> filterByYearPartbyPopCDs = CD.cdList.stream()     // (2)
    .filter(cd -> cd.year().equals(Year.of(2018)))
    .collect(Collectors.partitioningBy(CD::isPop, Collectors.toList()));

Both queries at (1) and (2) above will result in the same entries in the result map:

{false=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
        <Genericos, "Hot Generics", 10, 2018, JAZZ>],
 true=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

Mapping Adapter for Downstream Collectors

The mapping() method of the Collectors class encapsulates a mapper function and a downstream collector to create an adapter for a mapping operation. (See also the map() intermediate operation, p. 921.)

static <T,U,A,R> Collector<T,?,R> mapping(
       Function<? super T,? extends U> mapper,
       Collector<? super U,A,R>        downstream)

Returns a Collector that applies the mapper function to input elements of type T and provides the mapped results of type U to the downstream collector that accumulates them into results of type R.

In other words, the method adapts a downstream collector accepting elements of type U to one accepting elements of type T by applying a mapper function to each input element before accumulation, where type parameter A is the intermediate accumulation type of the downstream collector.

The mapping() method at (1) creates an adapter that accumulates a set of CD titles in each year for a stream of CDs. The mapper function maps a CD to its title so that the downstream collector can accumulate the titles in a set.

Map<Year, Set<String>> titlesByYearInSet = CD.cdList.stream()
    .collect(Collectors.groupingBy(
        CD::year,
        Collectors.mapping(                           // (1)
            CD::title,                                // Mapper
            Collectors.toSet())));                    // Downstream collector
System.out.println(titlesByYearInSet);
// {2017=[Java Jive, Java Jam],
//  2018=[Hot Generics, Lambda Dancing, Keep on Erasing]}

The mapping() method at (2) creates an adapter that joins CD titles in each year for a stream of CDs. The mapper function maps a CD to its title so that the downstream collector can join the titles.

Map<Year, String> joinTitlesByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
        CD::year,
        Collectors.mapping(                           // (2)
            CD::title,
            Collectors.joining(":"))));
System.out.println(joinTitlesByYear);
// {2017=Java Jive:Java Jam,
//  2018=Lambda Dancing:Keep on Erasing:Hot Generics}

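The mapping() method at (3) creates an adapter that counts the mapped values in each year for a stream of CDs. The mapper function maps a CD to its number of tracks, and the downstream collector Collectors.counting() counts these mapped values, one per CD. The result is therefore the number of CDs released in each year, as the output shows, and not the total number of tracks. Computing the total number of tracks would require a summing collector, such as Collectors.summingInt(CD::noOfTracks).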

Map<Year, Long> TotalNumOfTracksByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
       CD::year,
       Collectors.mapping(                            // (3)
           CD::noOfTracks,
           Collectors.counting())));
System.out.println(TotalNumOfTracksByYear);           // {2017=2, 2018=3}
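Since Collectors.counting() counts its input elements rather than adding them up, the difference between counting mapped values and summing them can be sketched as follows. The record Disc is a hypothetical stand-in for the CD class, with the year simplified to an int; all names are assumptions of this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CountingVsSummingSketch {
  // Hypothetical stand-in for the book's CD class.
  record Disc(String title, int tracks, int year) {}

  static final List<Disc> DISCS = List.of(
      new Disc("Java Jive", 8, 2017),
      new Disc("Java Jam", 6, 2017),
      new Disc("Lambda Dancing", 10, 2018),
      new Disc("Keep on Erasing", 8, 2018),
      new Disc("Hot Generics", 10, 2018));

  // Counts the mapped values: one value per disc, so this is discs per year.
  static Map<Integer, Long> mappedValuesPerYear() {
    return DISCS.stream().collect(
        Collectors.groupingBy(Disc::year,
            Collectors.mapping(Disc::tracks, Collectors.counting())));
  }

  // Adds up the mapped values: total number of tracks per year.
  static Map<Integer, Integer> totalTracksPerYear() {
    return DISCS.stream().collect(
        Collectors.groupingBy(Disc::year,
            Collectors.summingInt(Disc::tracks)));
  }

  public static void main(String[] args) {
    System.out.println(mappedValuesPerYear().get(2017));  // 2
    System.out.println(mappedValuesPerYear().get(2018));  // 3
    System.out.println(totalTracksPerYear().get(2017));   // 14
    System.out.println(totalTracksPerYear().get(2018));   // 28
  }
}
```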

Flat Mapping Adapter for Downstream Collectors

The flatMapping() method of the Collectors class encapsulates a mapper function and a downstream collector to create an adapter for a flat mapping operation. (See also the flatMap() intermediate operation, p. 924.)

static <T,U,A,R> Collector<T,?,R> flatMapping(
       Function<? super T,? extends Stream<? extends U>> mapper,
       Collector<? super U,A,R>                          downstream)

Returns a Collector that applies the specified mapper function to input elements of type T and provides the mapped results of type U to the downstream collector that accumulates them into results of type R.

That is, the method adapts a downstream collector accepting elements of type U to one accepting elements of type T by applying a flat mapping function to each input element before accumulation, where type parameter A is the intermediate accumulation type of the downstream collector.

The flat mapping function maps an input element to a mapped stream whose elements are flattened (p. 924) and passed downstream. Each mapped stream is closed after its elements have been flattened. An empty stream is substituted if the mapped stream is null.
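The handling of a null mapped stream can be illustrated with a small sketch. The record Shelf and its fields are hypothetical and not part of the book's examples; the mapper deliberately returns null for a shelf without a title list.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMappingNullSketch {
  // Hypothetical type: a shelf whose list of titles may be missing (null).
  record Shelf(String owner, List<String> titles) {}

  static Set<String> allTitles(List<Shelf> shelves) {
    // The mapper returns null when there is no list; flatMapping() then
    // behaves as if an empty stream had been returned for that element.
    Function<Shelf, Stream<String>> toTitles =
        shelf -> shelf.titles() == null ? null : shelf.titles().stream();
    return shelves.stream()
        .collect(Collectors.flatMapping(toTitles, Collectors.toSet()));
  }

  public static void main(String[] args) {
    List<Shelf> shelves = List.of(
        new Shelf("A", List.of("Java Jive", "Java Jam")),
        new Shelf("B", null),                       // No list at all.
        new Shelf("C", List.of("Java Jive")));
    System.out.println(allTitles(shelves).size());  // 2
  }
}
```

Returning Stream.empty() from the mapper is usually the clearer choice; the null case is shown here only to illustrate the documented behavior.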

Given the lists of CDs below, we wish to find all unique CD titles in the lists. A stream of CD lists is created at (1). Each CD list is used to create a stream of CDs whose elements are flattened into the output stream of CDs at (2). Each CD is then mapped to its title at (3), and unique CD titles are accumulated into a set at (4). (Compare this example with the one in Figure 16.9, p. 925, using the flatMap() stream operation.)

// Given lists of CDs:
List<CD> cdListA = List.of(CD.cd0, CD.cd1);
List<CD> cdListB = List.of(CD.cd0, CD.cd1, CD.cd1);

// Find unique CD titles in the given lists:
Set<String> set =
  Stream.of(cdListA, cdListB)                         // (1) Stream<List<CD>>
        .collect(Collectors.flatMapping(List::stream, // (2) Flatten to Stream<CD>
             Collectors.mapping(CD::title,            // (3) Stream<String>
                 Collectors.toSet())));               // (4) Set<String>

Set of unique CD titles in the CD lists:

[Java Jive, Java Jam]

The collectors returned by the flatMapping() method are designed to be used in multilevel grouping operations (p. 987), such as downstream from groupingBy() or partitioningBy() operations. Example 16.13 illustrates such a use with the groupingBy() operation.

In Example 16.13, the class RadioPlaylist at (1) represents a radio station by its name and a list of CDs played by the radio station. Three CD lists are constructed at (2) and used to construct three radio playlists at (3). The radio playlists are stored in a common list of radio playlists at (4). A query is formulated at (5) to find the unique titles of CDs played by each radio station. Referring to the line numbers in Example 16.13:

(6) A stream of type Stream<RadioPlaylist> is created from the list radioPlaylists of type List<RadioPlaylist>.
(7) The radio playlists are grouped according to the name of the radio station (String).
(8) The list of CDs in each radio playlist of type RadioPlaylist is used as the source of a stream, which is then flattened into the output stream of type Stream<CD> by the flatMapping() operation.
(9) Each CD in the stream is mapped to its title.
(10) Each unique CD title is accumulated into the result set of each radio station (Set<String>).

The query at (5) uses four collectors. The result map has the type Map<String, Set<String>>. The output shows the unique titles of CDs played by each radio station.

Example 16.13 Flat mapping

import java.util.List;

// Radio station with a playlist.
public class RadioPlaylist {                                               // (1)
  private String radioStationName;
  private List<CD> playlist;

  public RadioPlaylist(String radioStationName, List<CD> cdList) {
    this.radioStationName = radioStationName;
    this.playlist = cdList;
  }

  public String getRadioStationName() { return this.radioStationName; }
  public List<CD> getPlaylist() { return this.playlist; }
}

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class CollectorsFlatMapping {
  public static void main(String[] args) {
    // Some lists of CDs:                                                     (2)
    List<CD> cdList1 = List.of(CD.cd0, CD.cd1, CD.cd1, CD.cd2);
    List<CD> cdList2 = List.of(CD.cd0, CD.cd0, CD.cd3);
    List<CD> cdList3 = List.of(CD.cd0, CD.cd4);

    // Some radio playlists:                                                  (3)
    RadioPlaylist pl1 = new RadioPlaylist("Radio JVM", cdList1);
    RadioPlaylist pl2 = new RadioPlaylist("Radio JRE", cdList2);
    RadioPlaylist pl3 = new RadioPlaylist("Radio JAR", cdList3);

    // List of radio playlists:                                               (4)
    List<RadioPlaylist> radioPlaylists = List.of(pl1, pl2, pl3);

    // Map of radio station names and set of CD titles they played:           (5)
    Map<String, Set<String>> map = radioPlaylists.stream()                 // (6)
        .collect(Collectors.groupingBy(RadioPlaylist::getRadioStationName, // (7)
             Collectors.flatMapping(rpl -> rpl.getPlaylist().stream(),     // (8)
                 Collectors.mapping(CD::title,                             // (9)
                     Collectors.toSet()))));                               // (10)
    System.out.println(map);
  }
}

Output from the program (edited to fit on the page):

{Radio JAR=[Hot Generics, Java Jive],
 Radio JVM=[Java Jive, Lambda Dancing, Java Jam],
 Radio JRE=[Java Jive, Keep on Erasing]}

Finishing Adapter for Downstream Collectors

The collectingAndThen() method encapsulates a downstream collector and a finisher function to allow the result of the collector to be adapted by the finisher function.

static <T,A,R,RR> Collector<T,A,RR> collectingAndThen(
       Collector<T,A,R> downstream,
       Function<R,RR>   finisher)

Returns a Collector that performs the operation of the downstream collector on input elements of type T, followed by applying the finisher function on the result of type R produced by the downstream collector. The final result is of type RR, the result of the finisher function. In other words, the method adapts a collector to perform an additional finishing transformation.

In the call to the collectingAndThen() method at (1), the collector Collectors.maxBy() at (2) produces an Optional<CD> result that holds the CD with the maximum number of tracks in each group. The finisher function at (3) extracts the number of tracks from the Optional<CD> result, if there is one; otherwise, it returns 0. The collectingAndThen() method thus adapts the Optional<CD> result of its argument collector to an Integer value by applying the finisher function.

Map<Year, Integer> maxTracksByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.collectingAndThen(                                  // (1)
             Collectors.maxBy(Comparator.comparing(CD::noOfTracks)),    // (2)
             optCD -> optCD.map(CD::noOfTracks).orElse(0)))             // (3)
     );
System.out.println(maxTracksByYear);                      // {2017=8, 2018=10}

In the call to the collectingAndThen() method at (4), the collector Collectors.averagingDouble() at (5) produces a result of type Double that is the average number of tracks in each group. The finisher function at (6) maps the Double average value to a string with the specified number format.

Map<Genre, String> avgTracksByGenre = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::genre,
         Collectors.collectingAndThen(                                  // (4)
             Collectors.averagingDouble(CD::noOfTracks),                // (5)
             d -> String.format("%.1f", d)))                            // (6)
     );
System.out.println(avgTracksByGenre);                   // {JAZZ=8.0, POP=9.0}
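Two other typical uses of a finishing adapter are sketched below: converting the Long result of Collectors.counting() to an Integer, and wrapping a collected list in an unmodifiable view. The record Disc is a hypothetical stand-in for the CD class, and the method names are assumptions of this sketch.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectingAndThenSketch {
  // Hypothetical stand-in for the book's CD class.
  record Disc(String title, int year) {}

  static final List<Disc> DISCS = List.of(
      new Disc("Java Jive", 2017),
      new Disc("Java Jam", 2017),
      new Disc("Lambda Dancing", 2018),
      new Disc("Keep on Erasing", 2018),
      new Disc("Hot Generics", 2018));

  // Finisher converts the Long count of each group to an Integer.
  static Map<Integer, Integer> discCountByYear() {
    return DISCS.stream().collect(
        Collectors.groupingBy(Disc::year,
            Collectors.collectingAndThen(Collectors.counting(), Long::intValue)));
  }

  // Finisher wraps the accumulated list in an unmodifiable view.
  static List<String> readOnlyTitles() {
    return DISCS.stream().map(Disc::title).collect(
        Collectors.collectingAndThen(Collectors.toList(),
                                     Collections::unmodifiableList));
  }

  // Helper for the demonstration: true if the list refuses modification.
  static boolean rejectsAdd(List<String> list) {
    try {
      list.add("extra");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(discCountByYear().get(2018));   // 3
    System.out.println(rejectsAdd(readOnlyTitles()));  // true
  }
}
```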

Downstream Collectors for Functional Reduction

The collectors we have seen so far perform a mutable reduction to some mutable container, except for the functional reduction implemented by the joining() method (p. 984). The Collectors class also provides static factory methods that implement collectors which perform functional reduction to compute common statistics, such as summing, averaging, finding maximum and minimum values, and the like.

Like any other collector, the collectors that perform functional reduction can also be used in a standalone capacity as a parameter of the collect() method and as a downstream collector in a composed collector. However, these collectors are most useful when used as downstream collectors.

Collectors performing functional reduction have counterparts in terminal operations for streams that provide equivalent reduction operations (Table 16.8, p. 1008).

Counting

The collector created by the Collectors.counting() method performs a functional reduction to count the input elements.

static <T> Collector<T,?,Long> counting()

The collector returned counts the number of input elements of type T. If there are no elements, the result is Long.valueOf(0L). Note that the result is of type Long.

The wildcard ? represents any type, and in the method declaration, it is the type parameter for the mutable type that is accumulated by the reduction operation.

In the stream pipeline at (1), the collector Collectors.counting() is used in a standalone capacity to count the number of jazz music CDs.

Long numOfJazzCds1 = CD.cdList.stream().filter(CD::isJazz)
    .collect(Collectors.counting());                  // (1) Standalone collector
System.out.println(numOfJazzCds1);                    // 3

In the stream pipeline at (2), the collector Collectors.counting() is used as a downstream collector in a grouping operation that groups the CDs by musical genre and uses the downstream collector to count the number of CDs in each group.

Map<Genre, Long> grpByGenre = CD.cdList.stream()
    .collect(Collectors.groupingBy(
                 CD::genre,
                 Collectors.counting()));             // (2) Downstream collector
System.out.println(grpByGenre);                       // {POP=2, JAZZ=3}
System.out.println(grpByGenre.get(Genre.JAZZ));       // 3

The collector Collectors.counting() performs effectively the same functional reduction as the Stream.count() terminal operation (p. 953) at (3).

long numOfJazzCds2 = CD.cdList.stream().filter(CD::isJazz)
    .count();                                         // (3) Stream.count()
System.out.println(numOfJazzCds2);                    // 3

Finding Min/Max

The collectors created by the Collectors.maxBy() and Collectors.minBy() methods perform a functional reduction to find the maximum and minimum elements in the input elements, respectively. As there might not be any input elements, an Optional<T> is returned as the result.

static <T> Collector<T,?,Optional<T>> maxBy(Comparator<? super T> cmp)
static <T> Collector<T,?,Optional<T>> minBy(Comparator<? super T> cmp)

Return a collector that produces an Optional<T> with the maximum or minimum element of type T according to the specified Comparator, respectively.

The natural order comparator for CDs defined at (1) is used in the stream pipelines below to find the maximum CD. The collector Collectors.maxBy() is used as a standalone collector at (2), using the natural order comparator to find the maximum CD. The Optional<CD> result can be queried for the value.

Comparator<CD> natCmp = Comparator.naturalOrder(); // (1)

Optional<CD> maxCD = CD.cdList.stream()
    .collect(Collectors.maxBy(natCmp));            // (2) Standalone collector
System.out.println("Max CD: "
    + maxCD.map(CD::title).orElse("No CD."));      // Max CD: Java Jive

In the pipeline below, the CDs are grouped by musical genre, and the CDs in each group are reduced to the maximum CD by the downstream collector Collectors.maxBy() at (3). Again, the downstream collector uses the natural order comparator, and the Optional<CD> result in each group can be queried.

// Group CDs by musical genre, and max CD in each group.
Map<Genre, Optional<CD>> grpByGenre = CD.cdList.stream()
    .collect(Collectors.groupingBy(
        CD::genre,
        Collectors.maxBy(natCmp)));       // (3) Downstream collector
System.out.println(grpByGenre);
//{JAZZ=Optional[<Jaav, "Java Jam", 6, 2017, JAZZ>],
// POP=Optional[<Jaav, "Java Jive", 8, 2017, POP>]}

System.out.println("Title of max Jazz CD: "
    + grpByGenre.get(Genre.JAZZ)
                .map(CD::title)
                .orElse("No CD."));       // Title of max Jazz CD: Java Jam

The collectors created by the Collectors.maxBy() and Collectors.minBy() methods are effectively equivalent to the max() and min() terminal operations provided by the stream interfaces (p. 954), respectively. In the pipeline below, the max() terminal operation reduces the stream of CDs to the maximum CD at (4) using the natural order comparator, and the Optional<CD> result can be queried.

Optional<CD> maxCD1 = CD.cdList.stream()
    .max(natCmp);                         // (4) max() on Stream<CD>.
System.out.println("Title of max CD: "
    + maxCD1.map(CD::title)
            .orElse("No CD."));           // Title of max CD: Java Jive
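The reason for the Optional<T> result becomes visible when there are no input elements. The following sketch uses plain strings instead of CDs and a length comparator; the class and method names are assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MinMaxBySketch {
  // Compares strings by their length.
  static final Comparator<String> BY_LENGTH =
      Comparator.comparingInt(String::length);

  static Optional<String> shortest(List<String> titles) {
    return titles.stream().collect(Collectors.minBy(BY_LENGTH));
  }

  static Optional<String> longest(List<String> titles) {
    return titles.stream().collect(Collectors.maxBy(BY_LENGTH));
  }

  public static void main(String[] args) {
    List<String> titles = List.of("Java Jive", "Java Jam", "Lambda Dancing",
                                  "Keep on Erasing", "Hot Generics");
    System.out.println(shortest(titles).orElse("No title"));  // Java Jam
    System.out.println(longest(titles).orElse("No title"));   // Keep on Erasing

    // No input elements: the Optional is empty, so the default is used.
    List<String> none = List.of();
    System.out.println(longest(none).orElse("No title"));     // No title
  }
}
```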

Summing

The summing collectors perform a functional reduction to produce the sum of the numeric results from applying a numeric-valued function to the input elements.

static <T> Collector<T,?,NumType> summingNumType(
       ToNumTypeFunction<? super T> mapper)

Returns a collector that produces the sum of a numtype-valued function applied to the input elements. If there are no input elements, the result is zero. The result is of type NumType.

NumType is Int (but it is Integer when used as a type name), Long, or Double, and the corresponding numtype is int, long, or double.

The collector returned by the Collectors.summingInt() method is used at (1) as a standalone collector to find the total number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.

Integer sumTracks = CD.cdList.stream()
    .collect(Collectors.summingInt(CD::noOfTracks));   // (1) Standalone collector
System.out.println(sumTracks);                         // 42

In the pipeline below, the CDs are grouped by musical genre, and the number of tracks on the CDs in each group is summed by the downstream collector returned by the Collectors.summingInt() method at (2).

Map<Genre, Integer> grpByGenre = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::genre,
         Collectors.summingInt(CD::noOfTracks)));    // (2) Downstream collector
System.out.println(grpByGenre);                      // {POP=18, JAZZ=24}
System.out.println(grpByGenre.get(Genre.JAZZ));      // 24

The collector Collectors.summingInt() performs effectively the same functional reduction at (3) as the IntStream.sum() terminal operation (p. 973).

int sumTracks2 = CD.cdList.stream()                  // (3) Stream<CD>
    .mapToInt(CD::noOfTracks)                        // IntStream
    .sum();
System.out.println(sumTracks2);                      // 42

Averaging

The averaging collectors perform a functional reduction to produce the average of the numeric results from applying a numeric-valued function to the input elements.

static <T> Collector<T,?,Double> averagingNumType(
       ToNumTypeFunction<? super T> mapper)

Returns a collector that produces the arithmetic mean of a numtype-valued function applied to the input elements. If there are no input elements, the result is zero. The result is of type Double.

NumType is Int, Long, or Double, and the corresponding numtype is int, long, or double.

The collector returned by the Collectors.averagingInt() method is used at (1) as a standalone collector to find the average number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.

Double avgNoOfTracks1 = CD.cdList.stream()
    .collect(Collectors
        .averagingInt(CD::noOfTracks));             // (1) Standalone collector
System.out.println(avgNoOfTracks1);                 // 8.4

In the pipeline below, the CDs are grouped by musical genre, and the downstream collector Collectors.averagingInt() at (2) calculates the average number of tracks on the CDs in each group.

Map<Genre, Double> grpByGenre = CD.cdList.stream()
    .collect(Collectors.groupingBy(
       CD::genre,
       Collectors.averagingInt(CD::noOfTracks)      // (2) Downstream collector
       ));
System.out.println(grpByGenre);                     // {POP=9.0, JAZZ=8.0}
System.out.println(grpByGenre.get(Genre.JAZZ));     // 8.0

The collector created by the Collectors.averagingInt() method performs effectively the same functional reduction as the IntStream.average() terminal operation (p. 974) at (3).

OptionalDouble avgNoOfTracks2 = CD.cdList.stream()  // Stream<CD>
    .mapToInt(CD::noOfTracks)                       // IntStream
    .average();                                     // (3)
System.out.println(avgNoOfTracks2.orElse(0.0));     // 8.4
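One difference between the two approaches shows up when there are no input elements: the averaging collector produces zero, whereas IntStream.average() produces an empty OptionalDouble. The sketch below illustrates this with lists of track counts; the class and method names are assumptions made for illustration.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.stream.Collectors;

class AveragingEmptySketch {
  // Average computed by the averaging collector.
  static Double collectorAverage(List<Integer> values) {
    return values.stream().collect(Collectors.averagingInt(Integer::intValue));
  }

  // Average computed by the terminal operation on an IntStream.
  static OptionalDouble streamAverage(List<Integer> values) {
    return values.stream().mapToInt(Integer::intValue).average();
  }

  public static void main(String[] args) {
    List<Integer> tracks = List.of(8, 6, 10, 8, 10);
    System.out.println(collectorAverage(tracks));             // 8.4
    System.out.println(streamAverage(tracks).getAsDouble());  // 8.4

    List<Integer> none = List.of();
    System.out.println(collectorAverage(none));               // 0.0
    System.out.println(streamAverage(none).isPresent());      // false
  }
}
```

With the collector, an average of zero is therefore ambiguous: it may mean that all values were zero or that there were no values at all.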

Summarizing

The summarizing collector performs a functional reduction to produce summary statistics (count, sum, min, max, average) on the numeric results of applying a numeric-valued function to the input elements.

static <T> Collector<T,?,NumTypeSummaryStatistics> summarizingNumType(
       ToNumTypeFunction<? super T> mapper)

Returns a collector that applies a numtype-valued mapper function to the input elements, and returns the summary statistics for the resulting values.

NumType is Int (but it is Integer when used as a type name), Long, or Double, and the corresponding numtype is int, long, or double.

The collector Collectors.summarizingInt() is used at (1) as a standalone collector to summarize the statistics for the number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.

IntSummaryStatistics stats1 = CD.cdList.stream()
    .collect(
      Collectors.summarizingInt(CD::noOfTracks)      // (1) Standalone collector
     );
System.out.println(stats1);
// IntSummaryStatistics{count=5, sum=42, min=6, average=8.400000, max=10}

The IntSummaryStatistics class provides get methods to access the individual results (p. 974).
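As a brief sketch of these get methods, the following code summarizes a list of track counts and reads each statistic individually. The list of values mirrors the track counts used in this section; the class and method names are assumptions made for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryStatsSketch {
  static IntSummaryStatistics summarize(List<Integer> trackCounts) {
    return trackCounts.stream()
        .collect(Collectors.summarizingInt(Integer::intValue));
  }

  public static void main(String[] args) {
    IntSummaryStatistics stats = summarize(List.of(8, 6, 10, 8, 10));
    System.out.println(stats.getCount());    // 5    (long)
    System.out.println(stats.getSum());      // 42   (long)
    System.out.println(stats.getMin());      // 6    (int)
    System.out.println(stats.getMax());      // 10   (int)
    System.out.println(stats.getAverage());  // 8.4  (double)
  }
}
```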

In the pipeline below, the CDs are grouped by musical genre, and the downstream collector created by the Collectors.summarizingInt() method at (2) summarizes the statistics for the number of tracks on the CDs in each group.

Map<Genre, IntSummaryStatistics> grpByGenre = CD.cdList.stream()
  .collect(Collectors.groupingBy(
     CD::genre,
     Collectors.summarizingInt(CD::noOfTracks)));    // (2) Downstream collector
System.out.println(grpByGenre);
//{POP=IntSummaryStatistics{count=2, sum=18, min=8, average=9.000000, max=10},
// JAZZ=IntSummaryStatistics{count=3, sum=24, min=6, average=8.000000, max=10}}

System.out.println(grpByGenre.get(Genre.JAZZ));   // Summary stats for Jazz CDs.
// IntSummaryStatistics{count=3, sum=24, min=6, average=8.000000, max=10}

The collector returned by the Collectors.summarizingInt() method performs effectively the same functional reduction as the IntStream.summaryStatistics() terminal operation (p. 974) at (3).

IntSummaryStatistics stats2 = CD.cdList.stream()
    .mapToInt(CD::noOfTracks)
    .summaryStatistics();                         // (3)
System.out.println(stats2);
// IntSummaryStatistics{count=5, sum=42, min=6, average=8.400000, max=10}

Reducing

Collectors that perform common statistical operations, such as counting, averaging, and so on, are special cases of functional reduction that can be implemented using the Collectors.reducing() method.

static <T> Collector<T,?,Optional<T>> reducing(BinaryOperator<T> bop)

Returns a collector that performs functional reduction, producing an Optional with the cumulative result of applying the binary operator bop on the input elements: e₁ bop e₂ bop e₃ ..., where each eᵢ is an input element. If there are no input elements, an empty Optional<T> is returned.

Note that the collector reduces input elements of type T to a result that is an Optional of type T.

static <T> Collector<T,?,T> reducing(T identity, BinaryOperator<T> bop)

Returns a collector that performs functional reduction, producing the cumulative result of applying the binary operator bop on the input elements: identity bop e₁ bop e₂ ..., where each eᵢ is an input element. The identity value is the initial value to accumulate. If there are no input elements, the identity value is returned.

Note that the collector reduces input elements of type T to a result of type T.

static <T,U> Collector<T,?,U> reducing(
       U                               identity,
       Function<? super T,? extends U> mapper,
       BinaryOperator<U>               bop)

Returns a collector that performs a map-reduce operation. It maps each input element of type T to a mapped value of type U by applying the mapper function, and performs functional reduction on the mapped values of type U by applying the binary operator bop. The identity value of type U is used as the initial value to accumulate. If the stream is empty, the identity value is returned.

Note that the collector reduces input elements of type T to a result of type U.

Collectors returned by the Collectors.reducing() methods perform functional reductions that are effectively equivalent to those of the reduce() methods of the stream interfaces. However, the three-argument method Collectors.reducing(identity, mapper, bop) performs a map-reduce operation using a mapper function and a binary operator bop, whereas the Stream.reduce(identity, accumulator, combiner) performs a reduction using an accumulator and a combiner (p. 955). The accumulator is a BiFunction<U,T,U> that accumulates a partially accumulated result of type U with an element of type T, whereas the bop is a BinaryOperator<U> that accumulates a partially accumulated result of type U with an element of type U.
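The contrast between the two three-argument methods can be sketched with a sum of track counts. The record Disc is a hypothetical stand-in for the CD class, and the method names are assumptions of this sketch.

```java
import java.util.List;
import java.util.stream.Collectors;

class ReducingVsReduceSketch {
  // Hypothetical stand-in for the book's CD class.
  record Disc(String title, int tracks) {}

  static final List<Disc> DISCS = List.of(
      new Disc("Java Jive", 8),
      new Disc("Java Jam", 6),
      new Disc("Lambda Dancing", 10),
      new Disc("Keep on Erasing", 8),
      new Disc("Hot Generics", 10));

  // Map-reduce: the mapper turns a Disc into an Integer, and the binary
  // operator then combines two Integer values.
  static int viaReducingCollector() {
    return DISCS.stream().collect(
        Collectors.reducing(0, Disc::tracks, Integer::sum));
  }

  // Accumulator and combiner: the accumulator combines a partial Integer
  // result with a Disc; the combiner merges two partial Integer results.
  static int viaStreamReduce() {
    return DISCS.stream().reduce(
        0,
        (partialSum, disc) -> partialSum + disc.tracks(),
        Integer::sum);
  }

  public static void main(String[] args) {
    System.out.println(viaReducingCollector());  // 42
    System.out.println(viaStreamReduce());       // 42
  }
}
```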

The following comparators are used in the examples below:

// Comparator to compare CDs by title.
Comparator<CD> cmpByTitle = Comparator.comparing(CD::title);        // (1)
// Comparator to compare strings by their length.
Comparator<String> byLength = Comparator.comparing(String::length); // (2)

The collector returned by the Collectors.reducing() method is used as a standalone collector at (3) to find the longest CD title. The result of the operation is an Optional<String> as there might not be any input elements. This operation is equivalent to using the Stream.reduce() terminal operation at (4).

Optional<String> longestTitle1 = CD.cdList.stream()
    .map(CD::title)
    .collect(Collectors.reducing(
        BinaryOperator.maxBy(byLength)));            // (3) Standalone collector
System.out.println(longestTitle1.orElse("No title"));// Keep on Erasing

Optional<String> longestTitle2 = CD.cdList.stream()  // Stream<CD>
    .map(CD::title)                                  // Stream<String>
    .reduce(BinaryOperator.maxBy(byLength));         // (4) Stream.reduce(bop)

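The collector returned by the one-argument Collectors.reducing() method is used as a downstream collector at (5) to find the CD with the maximum title in each group classified by the year a CD was released. Since the comparator cmpByTitle compares the titles as strings, the maximum title is the one that is last in lexicographical order, and not necessarily the longest one. The collector at (5) is equivalent to the collector returned by the Collectors.maxBy(cmpByTitle) method.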

Map<Year, Optional<CD>> cdWithMaxTitleByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.reducing(                        // (5) Downstream collector
             BinaryOperator.maxBy(cmpByTitle))
         ));
System.out.println(cdWithMaxTitleByYear);
// {2017=Optional[<Jaav, "Java Jive", 8, 2017, POP>],
//  2018=Optional[<Funkies, "Lambda Dancing", 10, 2018, POP>]}
System.out.println(cdWithMaxTitleByYear.get(Year.of(2018))
                       .map(CD::title).orElse("No title")); // Lambda Dancing

The collector returned by the three-argument Collectors.reducing() method is used as a downstream collector at (6) to find the longest title in each group classified by the year a CD was released. Note that the collector maps a CD to its title. The longest title becomes the map value for each group. The collector will return an empty string (i.e., the identity value "") if there are no CDs in the stream. In comparison, the collector Collectors.mapping() at (7) also maps a CD to its title, and uses the downstream collector Collectors.maxBy(byLength) to find the longest title (p. 993). The result in this case is an Optional<String>, as there might not be any input elements.

Map<Year, String> longestTitleByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.reducing("", CD::title,          // (6) Downstream collector
             BinaryOperator.maxBy(byLength))
         ));
System.out.println(longestTitleByYear);  // {2017=Java Jive, 2018=Keep on Erasing}
System.out.println(longestTitleByYear.get(Year.of(2018)));      // Keep on Erasing

Map<Year, Optional<String>> longestTitleByYear2 = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.mapping(CD::title,               // (7) Downstream collector
             Collectors.maxBy(byLength))
         ));
System.out.println(longestTitleByYear2);
// {2017=Optional[Java Jive], 2018=Optional[Keep on Erasing]}
System.out.println(longestTitleByYear2.get(Year.of(2018))
                       .orElse("No title."));        // Keep on Erasing
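
The difference in result type between the two approaches is easiest to see when the collectors are used standalone on an empty stream. The sketch below assumes a simplified Disc record and a comparator BY_LENGTH that orders strings by length; both are hypothetical stand-ins introduced for this illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class ReducingVsMapping {
    // Hypothetical stand-in for the chapter's CD class (an assumption).
    record Disc(String title) {}

    // Assumed comparator: orders strings by their length.
    static final Comparator<String> BY_LENGTH =
        Comparator.comparingInt(String::length);

    // Three-argument reducing(): the identity value "" is the fallback result.
    static String longestTitleOrEmpty(List<Disc> discs) {
        return discs.stream().collect(
            Collectors.reducing("", Disc::title, BinaryOperator.maxBy(BY_LENGTH)));
    }

    // mapping() with maxBy(): no identity value, so the result is an Optional.
    static Optional<String> longestTitle(List<Disc> discs) {
        return discs.stream().collect(
            Collectors.mapping(Disc::title, Collectors.maxBy(BY_LENGTH)));
    }

    public static void main(String[] args) {
        List<Disc> discs = List.of(new Disc("Lambda Dancing"),
                                   new Disc("Keep on Erasing"),
                                   new Disc("Hot Generics"));
        System.out.println(longestTitleOrEmpty(discs));               // Keep on Erasing
        System.out.println(longestTitle(discs).orElse("No title"));   // Keep on Erasing
        System.out.println(longestTitleOrEmpty(List.of()).isEmpty()); // true
        System.out.println(longestTitle(List.of()).isPresent());      // false
    }
}
```

With the identity value, an empty input and an input whose longest title is the empty string give the same result; the Optional variant keeps the two cases apart.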

The pipeline below groups CDs according to the year they were released. For each group, the collector returned by the three-argument Collectors.reducing() method performs a map-reduce operation at (8) that maps each CD to its number of tracks and sums the number of tracks in the group. The collector at (8) is equivalent to the collector returned by the Collectors.summingInt() method at (9).

Map<Year, Integer> noOfTracksByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.reducing(                        // (8) Downstream collector
             0, CD::noOfTracks, Integer::sum)));
System.out.println(noOfTracksByYear);                   // {2017=14, 2018=28}
System.out.println(noOfTracksByYear.get(Year.of(2018)));// 28

Map<Year, Integer> noOfTracksByYear2 = CD.cdList.stream()
    .collect(Collectors.groupingBy(
         CD::year,
         Collectors.summingInt(CD::noOfTracks)));    // (9) Special case collector
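
That the two downstream collectors compute the same totals can be verified with the following sketch. The Disc record and its sample data are assumptions for this illustration and only mimic the parts of the CD class that are needed here.

```java
import java.time.Year;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ReducingVsSummingInt {
    // Hypothetical stand-in for the chapter's CD class (an assumption).
    record Disc(String title, int noOfTracks, Year year) {}

    // Illustrative sample data.
    static final List<Disc> DISCS = List.of(
        new Disc("Java Jive", 8, Year.of(2017)),
        new Disc("Java Jam", 6, Year.of(2017)),
        new Disc("Lambda Dancing", 10, Year.of(2018)),
        new Disc("Keep on Erasing", 8, Year.of(2018)),
        new Disc("Hot Generics", 10, Year.of(2018)));

    // Map-reduce with the three-argument reducing() method:
    // identity 0, mapper Disc::noOfTracks, binary operator Integer::sum.
    static Map<Year, Integer> tracksWithReducing(List<Disc> discs) {
        return discs.stream().collect(Collectors.groupingBy(
            Disc::year,
            Collectors.reducing(0, Disc::noOfTracks, Integer::sum)));
    }

    // The same computation with the special-purpose summingInt() collector.
    static Map<Year, Integer> tracksWithSummingInt(List<Disc> discs) {
        return discs.stream().collect(Collectors.groupingBy(
            Disc::year,
            Collectors.summingInt(Disc::noOfTracks)));
    }

    public static void main(String[] args) {
        Map<Year, Integer> viaReducing = tracksWithReducing(DISCS);
        Map<Year, Integer> viaSumming = tracksWithSummingInt(DISCS);
        System.out.println(viaReducing.get(Year.of(2017)));  // 14
        System.out.println(viaReducing.get(Year.of(2018)));  // 28
        System.out.println(viaReducing.equals(viaSumming));  // true
    }
}
```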

Summary of Static Factory Methods in the `Collectors` Class

The static factory methods of the Collectors class that create collectors are summarized in Table 16.7. All methods are static generic methods, except for the overloaded joining() methods that are not generic. The keyword static is omitted, as are the type parameters of a generic method, since these type parameters are evident from the declaration of the formal parameters to the method. The type parameter declarations have also been simplified, where any bound <? super T> or <? extends T> has been replaced by <T>, without impacting the intent of a method. A reference is also provided for each method in the first column.

The last column in Table 16.7 indicates the function type of the corresponding parameter in the previous column. It is instructive to note how the functional interface parameters provide the parameterized behavior to build the collector returned by a method. For example, the method averagingDouble() returns a collector that computes the average of the stream elements. The parameter function mapper with the functional interface type ToDoubleFunction<T> converts an element of type T to a double when the collector computes the average for the stream elements.
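
As a minimal sketch of this parameterized behavior, the code below passes two different mapper functions of function type T -> double to the averagingDouble() method. The Disc record and the sample data are assumptions for this illustration, not the CD class of the chapter.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

class AveragingDoubleDemo {
    // Hypothetical stand-in for the chapter's CD class (an assumption).
    record Disc(String title, int noOfTracks) {}

    // Illustrative sample data.
    static final List<Disc> DISCS = List.of(
        new Disc("Java Jive", 8),
        new Disc("Java Jam", 6),
        new Disc("Lambda Dancing", 10),
        new Disc("Keep on Erasing", 8),
        new Disc("Hot Generics", 10));

    // The mapper decides which double value is extracted from each element;
    // the collector then computes the average of these values.
    static double average(List<Disc> discs, ToDoubleFunction<Disc> mapper) {
        return discs.stream().collect(Collectors.averagingDouble(mapper));
    }

    public static void main(String[] args) {
        // Mapper Disc -> double: the number of tracks.
        System.out.println(average(DISCS, Disc::noOfTracks));          // 8.4
        // Mapper Disc -> double: the length of the title.
        System.out.println(average(DISCS, d -> d.title().length()));   // 11.6
        // With no input elements the collector returns 0.0.
        System.out.println(average(List.of(), Disc::noOfTracks));      // 0.0
    }
}
```

The collector itself stays the same in all three calls; only the mapper argument changes what is averaged.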

Table 16.7 Static Methods in the `Collectors` Class

Method name (ref.)	Return type	Functional interface parameters	Function type of parameters
`averagingDouble (p. 1000)`	`Collector<T,?,Double>`	`(ToDoubleFunction<T> mapper)`	`T -> double`
`averagingInt (p. 1000)`	`Collector<T,?,Double>`	`(ToIntFunction<T> mapper)`	`T -> int`
`averagingLong (p. 1000)`	`Collector<T,?,Double>`	`(ToLongFunction<T> mapper)`	`T -> long`
`collectingAndThen (p. 997)`	`Collector<T,A,RR>`	`(Collector<T,A,R> downstream,` `Function<R,RR> finisher)`	`(T,A) -> R,` `R -> RR`
`counting (p. 998)`	`Collector<T,?,Long>`	`()`
`filtering (p. 992)`	`Collector<T,?,R>`	`(Predicate<T> predicate,` `Collector<T,A,R> downstream)`	`T -> boolean, (T,A) -> R`
`flatMapping (p. 994)`	`Collector<T,?,R>`	`(Function<T, Stream<U>> mapper,` `Collector<U,A,R> downstream)`	`T->Stream<U>, (U,A) -> R`
`groupingBy (p. 985)`	`Collector<T,?, Map<K,List<T>>>`	`(Function<T,K> classifier)`	`T -> K`
`groupingBy (p. 985)`	`Collector<T,?, Map<K,D>>`	`(Function<T,K> classifier,` `Collector<T,A,D> downstream)`	`T -> K, (T,A) -> D`
`groupingBy (p. 985)`	`Collector<T,?,Map<K,D>>`	`(Function<T,K> classifier,` `Supplier<Map<K,D>> mapSupplier,` `Collector<T,A,D> downstream)`	`T -> K, ()->Map<K,D>, (T,A)->D`
`joining (p. 984)`	`Collector <CharSequence,?,String>`	`()`
`joining (p. 984)`	`Collector <CharSequence,?,String>`	`(CharSequence delimiter)`
`joining (p. 984)`	`Collector <CharSequence,?,String>`	`(CharSequence delimiter,` `CharSequence prefix, CharSequence suffix)`
`mapping (p. 993)`	`Collector<T,?,R>`	`(Function<T,U> mapper,` `Collector<U,A,R> downstream)`	`T -> U, (U,A) -> R`
`maxBy (p. 999)`	`Collector<T,?,Optional<T>>`	`(Comparator<T> comparator)`	`(T,T) -> int`
`minBy (p. 999)`	`Collector<T,?,Optional<T>>`	`(Comparator<T> comparator)`	`(T,T) -> int`
`partitioningBy (p. 989)`	`Collector<T,?, Map<Boolean,List<T>>>`	`(Predicate<T> predicate)`	`T -> boolean`
`partitioningBy (p. 989)`	`Collector<T,?, Map<Boolean,D>>`	`(Predicate<T> predicate,` `Collector<T,A,D> downstream)`	`T -> boolean, (T,A) -> D`
`reducing (p. 1002)`	`Collector<T,?,Optional<T>>`	`(BinaryOperator<T> op)`	`(T,T) -> T`
`reducing (p. 1002)`	`Collector<T,?,T>`	`(T identity, BinaryOperator<T> op)`	`T -> T, (T,T) -> T`
`reducing (p. 1002)`	`Collector<T,?,U>`	`(U identity, Function<T,U> mapper,` `BinaryOperator<U> op)`	`U -> U, T -> U, (U,U) -> U`
`summarizingDouble (p. 1001)`	`Collector<T,?, DoubleSummaryStatistics>`	`(ToDoubleFunction<T> mapper)`	`T -> double`
`summarizingInt (p. 1001)`	`Collector<T,?, IntSummaryStatistics>`	`(ToIntFunction<T> mapper)`	`T -> int`
`summarizingLong (p. 1001)`	`Collector<T,?, LongSummaryStatistics>`	`(ToLongFunction<T> mapper)`	`T -> long`
`summingDouble (p. 978)`	`Collector<T,?,Double>`	`(ToDoubleFunction<T> mapper)`	`T -> double`
`summingInt (p. 978)`	`Collector<T,?,Integer>`	`(ToIntFunction<T> mapper)`	`T -> int`
`summingLong (p. 978)`	`Collector<T,?,Long>`	`(ToLongFunction<T> mapper)`	`T -> long`
`toCollection (p. 979)`	`Collector<T,?,C>`	`(Supplier<C> collFactory)`	`() -> C`
`toList, toUnmodifiableList (p. 980)`	`Collector<T,?,List<T>>`	`()`
`toMap (p. 981)`	`Collector<T,?,Map<K,U>>`	`(Function<T,K> keyMapper,` `Function<T,U> valueMapper)`	`T -> K, T -> U`
`toMap (p. 981)`	`Collector<T,?,Map<K,U>>`	`(Function<T,K> keyMapper,` `Function<T,U> valueMapper, BinaryOperator<U> mergeFunction)`	`T -> K, T -> U, (U,U) -> U`
`toMap (p. 981)`	`Collector<T,?,Map<K,U>>`	`(Function<T,K> keyMapper,` `Function<T,U> valueMapper,` `BinaryOperator<U> mergeFunction,` `Supplier<Map<K,U>> mapSupplier)`	`T -> K, T -> U, (U,U) -> U, ()-> Map<K,U>`
`toSet, toUnmodifiableSet (p. 980)`	`Collector<T,?,Set<T>>`	`()`

Table 16.8 shows a comparison of methods in the stream interfaces that perform reduction operations and static factory methods in the Collectors class that implement collectors with equivalent functionality.

Table 16.8 Method Comparison: The Stream Interfaces and the `Collectors` Class

Method names in the stream interfaces	Static factory method names in the `Collectors` class
`collect (p. 964)`	`collectingAndThen (p. 997)`
`count (p. 953)`	`counting (p. 998)`
`filter (p. 912)`	`filtering (p. 992)`
`flatMap (p. 924)`	`flatMapping (p. 994)`
`map (p. 921)`	`mapping (p. 993)`
`max (p. 954)`	`maxBy (p. 999)`
`min (p. 954)`	`minBy (p. 999)`
`reduce (p. 955)`	`reducing (p. 1002)`
`toList (p. 972)`	`toList (p. 980)`
`average (p. 972)`	`averagingInt, averagingLong, averagingDouble (p. 1001)`
`sum (p. 972)`	`summingInt, summingLong, summingDouble (p. 978)`
`summaryStatistics (p. 972)`	`summarizingInt, summarizingLong, summarizingDouble (p. 1001)`
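
A few of the rows in Table 16.8 can be illustrated side by side. The sketch below is only an illustration under simple assumptions: it uses a hypothetical list of title strings rather than the CD class, and pairs each stream method with the collector from the same row of the table.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class StreamVsCollectors {
    // Illustrative sample data (an assumption for this sketch).
    static final List<String> TITLES = List.of(
        "Java Jive", "Java Jam", "Lambda Dancing", "Keep on Erasing", "Hot Generics");

    // Natural (lexicographic) ordering of strings.
    static final Comparator<String> NATURAL = Comparator.naturalOrder();

    // count() versus counting()
    static long countWithStream(List<String> titles) {
        return titles.stream().count();
    }
    static long countWithCollector(List<String> titles) {
        return titles.stream().collect(Collectors.counting());
    }

    // sum() on an IntStream versus summingInt()
    static int totalLengthWithStream(List<String> titles) {
        return titles.stream().mapToInt(String::length).sum();
    }
    static int totalLengthWithCollector(List<String> titles) {
        return titles.stream().collect(Collectors.summingInt(String::length));
    }

    // max() versus maxBy()
    static Optional<String> maxWithStream(List<String> titles) {
        return titles.stream().max(NATURAL);
    }
    static Optional<String> maxWithCollector(List<String> titles) {
        return titles.stream().collect(Collectors.maxBy(NATURAL));
    }

    public static void main(String[] args) {
        System.out.println(countWithStream(TITLES) + " " + countWithCollector(TITLES));
        // 5 5
        System.out.println(totalLengthWithStream(TITLES) + " "
                           + totalLengthWithCollector(TITLES));
        // 58 58
        System.out.println(maxWithStream(TITLES).equals(maxWithCollector(TITLES)));
        // true
    }
}
```

When the reduction is the terminal operation of the stream, the stream method is usually the more direct choice; the collector form is the one that can be passed as a downstream collector, for example to groupingBy().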
