Exploring Streams API Operations
The Streams API lets you perform various operations on collections and other sources efficiently. You can use parallelism to speed up element processing, and lambdas and method references to reduce the amount of code you write, which makes source code easier to read. This section introduces a subset of the operations that you can perform on stream sources.
Performing Actions on All Stream Elements
Stream<T> provides the following pair of operation methods for performing an action on each stream element:
- void forEach(Consumer<? super T> action): This terminal operation executes action on each stream element. Its behavior is explicitly nondeterministic: for parallel stream pipelines, it doesn't guarantee to respect the stream's encounter order because doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it's responsible for providing the required synchronization.
- void forEachOrdered(Consumer<? super T> action): This method is similar to the previous method except for respecting any defined encounter order. Performing the action for one element happens before performing the action for subsequent elements; for any given element, the action may be performed in whatever thread the library chooses.
Each method takes a java.util.function.Consumer argument named action. Consumer is an example of a predefined functional interface, and it requires a lambda to take a single argument of type T and return nothing. The following example demonstrates the difference between these methods:
    List<String> birds = Arrays.asList("Robin", "Bluejay", "Penguin", "Ostrich", "Canary");
    birds.stream().forEach(System.out::println);
    System.out.println();
    birds.parallelStream().forEach(System.out::println);
    System.out.println();
    birds.parallelStream().forEachOrdered(System.out::println);
If you were to convert this example into an application, compile the source code, and run the resulting class file, you would observe output similar to the following (the order of the second batch can vary from run to run):
    Robin
    Bluejay
    Penguin
    Ostrich
    Canary

    Penguin
    Canary
    Ostrich
    Robin
    Bluejay

    Robin
    Bluejay
    Penguin
    Ostrich
    Canary
The first batch of output shows the encounter order for a sequential stream. Although forEach() doesn't guarantee to respect encounter order for any stream, a sequential stream is typically traversed by a single thread, so in practice the elements are processed in encounter order.
The second batch of output shows that forEach() doesn't respect encounter order for a parallel stream. Because multiple threads are involved, this order can differ from the sequential stream's output order and can vary from run to run.
To respect a parallel stream's encounter order, call forEachOrdered(). The third batch of output shows that the encounter order is respected because it's identical to the first output batch. Note that on a parallel stream forEachOrdered() can be slower than forEach() because enforcing the order limits how much work can proceed in parallel.
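The remark about shared state can be made concrete with a small sketch. The class and method names below (ForEachSketch, collectUnordered, collectOrdered) are hypothetical and chosen only for illustration. Both methods append to a list shared by the worker threads, so the list is wrapped with Collections.synchronizedList(); only the forEachOrdered() variant is expected to reproduce the source order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ForEachSketch {
    // forEach() on a parallel stream: the action may run on several threads,
    // so the shared target list is synchronized. Element order is not guaranteed.
    static List<String> collectUnordered(List<String> source) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    // forEachOrdered() on a parallel stream: the action is applied to the
    // elements in encounter order, so the result mirrors the source list.
    static List<String> collectOrdered(List<String> source) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<String> birds =
            Arrays.asList("Robin", "Bluejay", "Penguin", "Ostrich", "Canary");
        System.out.println(collectUnordered(birds)); // order may vary between runs
        System.out.println(collectOrdered(birds));   // same order as the source list
    }
}
```

Appending to a shared list from forEach() is shown here only to illustrate the synchronization requirement; accumulating results is usually better expressed as a reduction.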
Filtering Stream Elements
Occasionally you'll want to obtain a subset of a source's elements that matches some criterion. For example, you might want to obtain all SalesPerson objects describing salespeople who have generated more than 1,000 sales for the third quarter. Stream<T> provides the following operation method for filtering the stream:
Stream<T> filter(Predicate<? super T> predicate)
This intermediate operation method returns a stream consisting of the elements of this stream that match the given predicate (a Boolean-valued function). The java.util.function.Predicate type requires a lambda to take a single argument of type T and return a boolean. The following example demonstrates this method:
    IntStream stream = IntStream.range(0, 20);
    stream.filter(x -> x%2==0).forEach(System.out::println);
This example first invokes IntStream's IntStream range(int startInclusive, int endExclusive) static factory method to return a sequential ordered stream of integers (with an increment of 1), ranging from 0 through 19. It then invokes filter() on this stream with a lambda that tells filter() to return a new stream consisting of even-numbered integers only. The resulting stream is passed to forEach(System.out::println) to output these elements. The following output is generated:
    0
    2
    4
    6
    8
    10
    12
    14
    16
    18
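The salesperson scenario mentioned at the start of this section could be sketched along the following lines. The SalesPerson class, its thirdQuarterSales field, and the topSellers() method are assumptions made up for this illustration; the article doesn't define them. The filtered elements are gathered with collect(Collectors.toList()), which is discussed under Reduction.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterSketch {
    // Hypothetical type: the class and its fields are illustrative only.
    static final class SalesPerson {
        private final String name;
        private final int thirdQuarterSales;

        SalesPerson(String name, int thirdQuarterSales) {
            this.name = name;
            this.thirdQuarterSales = thirdQuarterSales;
        }

        String getName() { return name; }
        int getThirdQuarterSales() { return thirdQuarterSales; }
    }

    // Keeps only the salespeople whose third-quarter sales exceed the threshold.
    static List<SalesPerson> topSellers(List<SalesPerson> staff, int threshold) {
        return staff.stream()
                    .filter(p -> p.getThirdQuarterSales() > threshold)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SalesPerson> staff = Arrays.asList(
            new SalesPerson("Alice", 1200),
            new SalesPerson("Bob", 800),
            new SalesPerson("Carol", 1000));
        // Only Alice exceeds 1,000 sales; Carol's 1,000 is not "more than" 1,000.
        for (SalesPerson p : topSellers(staff, 1000))
            System.out.println(p.getName());
    }
}
```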
Mapping Stream Elements
A stream yields elements of a specific type, but you might need to map these elements into a new stream of corresponding elements of the same or a different type. For example, suppose you want to map a stream of Employee objects to another stream of employee name String objects. Stream<T> provides the following operation methods for mapping stream elements:
- <R> Stream<R> map(Function<? super T,? extends R> mapper): This intermediate operation returns a stream that consists of the results of applying the given mapper function to the elements of this stream. T is the element type of the stream being mapped, and R is the element type of the new stream.
- DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper): Returns a DoubleStream consisting of the results of applying the given function to the elements of this stream.
- IntStream mapToInt(ToIntFunction<? super T> mapper): Returns an IntStream consisting of the results of applying the given function to the elements of this stream.
- LongStream mapToLong(ToLongFunction<? super T> mapper): Returns a LongStream consisting of the results of applying the given function to the elements of this stream.
For each method, mapper must be stateless and non-interfering. To be stateless, no state must be retained from a previously seen element when processing a new element. This requirement is especially crucial to parallel streams, where a stateful mapper can lead to thread-synchronization issues because of multiple threads executing the mapper.
To be non-interfering, the mapper must not modify the stream's source during the stream pipeline's execution. The exception is a concurrent collection source, which can be modified because it's designed to handle concurrent modification. This requirement applies to all streams because modifications to the source can lead to thrown exceptions, incorrect answers, or some other problem.
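To make the statelessness requirement more tangible, here is a minimal sketch that contrasts a stateful mapper with a stateless alternative. The class and method names (StatelessMapperSketch, labelStateful, labelStateless) are hypothetical. The stateful version depends on a counter shared across elements, so on a parallel stream the label attached to a given element can differ between runs; the stateless version derives each result solely from the index being mapped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StatelessMapperSketch {
    // Stateful (avoid): the mapper reads and updates a counter shared by all
    // elements. The counter is thread-safe, but which number each element
    // receives depends on thread scheduling.
    static List<String> labelStateful(List<String> items) {
        AtomicInteger counter = new AtomicInteger();
        return items.parallelStream()
                    .map(s -> counter.getAndIncrement() + ":" + s)
                    .collect(Collectors.toList());
    }

    // Stateless: each result depends only on the index currently being mapped,
    // and the source list is read but never modified (non-interfering).
    static List<String> labelStateless(List<String> items) {
        return IntStream.range(0, items.size())
                        .parallel()
                        .mapToObj(i -> i + ":" + items.get(i))
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d");
        System.out.println(labelStateful(items));  // labels may be mismatched
        System.out.println(labelStateless(items)); // labels always match positions
    }
}
```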
The map() method takes a java.util.function.Function argument named mapper, which requires a lambda that takes a single argument of type T and returns an object of type R. The following example demonstrates this method:
    class Employee
    {
       private String name;
       private int age;

       Employee(String name, int age)
       {
          this.name = name;
          this.age = age;
       }

       String getName()
       {
          return name;
       }

       int getAge()
       {
          return age;
       }
    }

    List<Employee> employees = Arrays.asList(new Employee("John Doe", 29),
                                             new Employee("Jane Jones", 48),
                                             new Employee("Roger Price", 63),
                                             new Employee("Janet Smith", 27));
    employees.stream().filter(x -> x.getAge() > 45).forEach(System.out::println);
    employees.stream().filter(x -> x.getAge() > 45).map(Employee::getName)
             .forEach(System.out::println);
This example, which is presumably extracted from the main() method of a hypothetical application, first declares a local Employee class and then creates a list of Employee instances. It next obtains a pair of sequential streams over the list and filters out those Employee objects whose ages are less than or equal to 45.
The first stream expression passes each of the remaining Employee objects to forEach(), which outputs them. The second stream expression passes these objects to map(), which returns a new stream consisting of Employee name strings only. This stream is passed to forEach(), which outputs these names. The following output is generated (the hash codes might differ):
    StreamsDemo$1Employee@2f92e0f4
    StreamsDemo$1Employee@28a418fc
    Jane Jones
    Roger Price
The mapToDouble(), mapToInt(), and mapToLong() methods let you avoid autoboxing/unboxing and intermediate wrapper object generation by returning streams of doubles, ints, and longs, respectively. The following example shows how to use mapToInt() to return the ages of those employees over 45 as ints (instead of as Integer objects), which are subsequently output:
    employees.stream().filter(x -> x.getAge() > 45).mapToInt(Employee::getAge)
             .forEach(System.out::println);
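The other two primitive mapping methods work the same way. The following sketch, whose class and method names (PrimitiveMappingSketch, totalOf, totalBytes) and sample data are invented for illustration, parses strings into a DoubleStream and a LongStream and totals them with each stream's sum() method.

```java
import java.util.Arrays;
import java.util.List;

class PrimitiveMappingSketch {
    // mapToDouble() yields a DoubleStream, so the parsed values are never
    // wrapped in Double objects.
    static double totalOf(List<String> readings) {
        return readings.stream().mapToDouble(Double::parseDouble).sum();
    }

    // mapToLong() yields a LongStream in the same manner.
    static long totalBytes(List<String> sizes) {
        return sizes.stream().mapToLong(Long::parseLong).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalOf(Arrays.asList("1.5", "2.25", "3.0")));
        System.out.println(totalBytes(Arrays.asList("1024", "2048")));
    }
}
```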
Reduction
A reduction operation takes a sequence of input elements and combines them into a single result by repeatedly applying a combining operation, such as finding the sum or maximum of a set of numbers, or accumulating elements into a list. The Streams API offers specialized reduction operations such as summation, maximum, and count. It also offers general reduction operations known as reduce and collect.
For example, IntStream offers an int sum() terminal operation method that returns the sum of the stream's ints. We can use this method along with mapToInt() to sum the previous example's ages for those employees who are older than 45, as follows:
    int sum = employees.stream().filter(x -> x.getAge() > 45)
                       .mapToInt(Employee::getAge).sum();
    System.out.println(sum);
Because the only two candidate Employee objects have ages of 48 and 63, this example outputs 111.
Obtaining the combined ages of all Employee objects in the stream isn't useful. However, obtaining their average age may be of some use. IntStream provides an OptionalDouble average() method that can help you with this task.
Unlike sum(), which returns an int, average() returns java.util.OptionalDouble. This class describes container objects that may or may not contain double values. It offers a double getAsDouble() method to return the value when present. However, when this value doesn't exist, getAsDouble() throws java.util.NoSuchElementException.
The following example optimistically calls average() and getAsDouble() to return the average age (55.5), which is subsequently printed:
    double avg = employees.stream().filter(x -> x.getAge() > 45)
                          .mapToInt(Employee::getAge).average().getAsDouble();
    System.out.println(avg);
If the possibility of a thrown exception is a concern, before calling getAsDouble() you could call OptionalDouble's boolean isPresent() method to determine whether the value is present.
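A defensive version of the average calculation might look like the following sketch. The class and method names (AverageSketch, averageOrDefault, averageOrElse) and the fallback parameter are assumptions introduced for illustration. The first method uses the isPresent() check described above; the second relies on OptionalDouble's orElse() method, which returns the supplied default when no value is present.

```java
import java.util.OptionalDouble;
import java.util.stream.IntStream;

class AverageSketch {
    // Checks isPresent() before calling getAsDouble(), so an empty stream
    // never triggers NoSuchElementException.
    static double averageOrDefault(int[] ages, double fallback) {
        OptionalDouble avg = IntStream.of(ages).average();
        return avg.isPresent() ? avg.getAsDouble() : fallback;
    }

    // Same result expressed with orElse().
    static double averageOrElse(int[] ages, double fallback) {
        return IntStream.of(ages).average().orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(averageOrDefault(new int[] { 48, 63 }, -1.0)); // 55.5
        System.out.println(averageOrElse(new int[0], -1.0));              // -1.0
    }
}
```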
For generalized reduction, Stream<T> offers reduce() methods such as the T reduce(T identity, BinaryOperator<T> accumulator) terminal operation method, which uses an identity and an accumulator to reduce a stream of elements (of type T) to a single result. identity is the initial value of the reduction and is also the result when the stream is empty; it must be an identity for the accumulator (0 for addition, for example). The accumulator takes two inputs, a partial result and the next element, and produces a new partial result; the final partial result is the reduced value of the stream.
For example, consider the following simple sequential loop for achieving summation:
    int sum = 0;
    for (int x: numbers)
       sum += x;
The reduce() method is preferable over the example's mutative accumulation because reduction is more abstract (by operating on the stream as a whole rather than on individual elements), and because a properly constructed reduce() operation is inherently parallelizable, so long as the function(s) used to process the elements are associative and stateless. For example, you could write one of the following as the equivalent of the mutative accumulation example:
    int sum = numbers.stream().reduce(0, (x,y) -> x+y);
    int sum = numbers.stream().reduce(0, Integer::sum);
These reduction operations can run safely in parallel with almost no modification, as demonstrated here:
int sum = numbers.parallelStream().reduce(0, Integer::sum);
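The following sketch wraps two parallel reductions in methods so that they can be tried on a small list. The names ReduceSketch, sum(), and max() are hypothetical. Both accumulators are associative, and each identity leaves the other operand unchanged (0 for addition, Integer.MIN_VALUE for maximum), which is what makes the parallel results match the sequential ones.

```java
import java.util.Arrays;
import java.util.List;

class ReduceSketch {
    // Addition is associative and 0 is its identity, so the partial sums
    // computed by different threads can be combined in any grouping.
    static int sum(List<Integer> numbers) {
        return numbers.parallelStream().reduce(0, Integer::sum);
    }

    // Maximum is also associative; Integer.MIN_VALUE acts as the identity
    // and is returned when the list is empty.
    static int max(List<Integer> numbers) {
        return numbers.parallelStream().reduce(Integer.MIN_VALUE, Integer::max);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 4, 1, 5);
        System.out.println(sum(numbers)); // 14
        System.out.println(max(numbers)); // 5
    }
}
```

By contrast, an accumulator such as (x, y) -> x - y is not associative, so it isn't a safe candidate for a parallel reduction.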
Continuing with generalized reduction, Stream<T> offers collect() methods such as the <R,A> R collect(Collector<? super T,A,R> collector) method that performs a mutable reduction operation on the elements of this stream using a Collector, which accumulates input elements into a mutable result container. The Collectors utility class provides Collector implementations that perform various useful reduction operations, such as accumulating elements into a collection.
Consider the following example, which accumulates employee names into a list, which is subsequently output:
    List<String> names = employees.stream().map(Employee::getName)
                                  .collect(Collectors.toList());
    System.out.println(names);
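Collectors provides more than toList(). As a rough sketch, the following code applies two other collectors to the same employee data: Collectors.joining() concatenates the names into one string, and Collectors.partitioningBy() combined with Collectors.mapping() splits the names by an age test. The wrapper class CollectorsSketch and the methods joinNames() and namesByOver45() are hypothetical; Employee is redeclared here only to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsSketch {
    // Same shape as the Employee class used earlier, repeated for completeness.
    static final class Employee {
        private final String name;
        private final int age;

        Employee(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Concatenates all names into a single comma-separated string.
    static String joinNames(List<Employee> employees) {
        return employees.stream()
                        .map(Employee::getName)
                        .collect(Collectors.joining(", "));
    }

    // Splits names into two lists: true -> older than 45, false -> 45 or younger.
    static Map<Boolean, List<String>> namesByOver45(List<Employee> employees) {
        return employees.stream().collect(
            Collectors.partitioningBy(e -> e.getAge() > 45,
                Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
            new Employee("John Doe", 29), new Employee("Jane Jones", 48),
            new Employee("Roger Price", 63), new Employee("Janet Smith", 27));
        System.out.println(joinNames(employees));
        System.out.println(namesByOver45(employees).get(true));
    }
}
```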
Conclusion
The Streams API greatly improves the processing of elements from collections and other sources. This article introduced you to streams and presented various operations that you can perform on them. Because brevity kept me from covering more operations, I leave you with the exercise of digging deeper into the Streams API.