The Stream API Parallelism Map/Filter/Reduce algorithm implementation in JDK Steps of the Map/Filter/Reduce algorithm: Map Changes the type of the data, does not change the number of elements. The mapping step respects the order of your objects Filter Does not change the type of the data, changes the number of elements. You may end up with no data Reduce Produces a result If we want to create an efficient implementation of the Map/Filter/Reduce algorithm, it should not duplicate any data. It should work in the same way, performance wise, as the iterative pattern. Map/Filter/Reduce algorithm implemented by the Collection API would cause duplication of the data (storing the data in an intermediate structure) The Stream API Implementation of the Map/Filter/Reduce algorithm that does not duplicate any data and that does not create any load on the CPU or memory Stream by definition is an empty object - it does not carry any kind of data Every time you call a method on a Stream that returns another Stream, it is going to be a new Stream object Difference between the Stream API and iterative approach is that in the Stream API we describe the computation and not how this computation should be conducted. This is none of my business, this is the business of the API Reduce method triggers the computation of the elements, and those elements are going to be taken one-by-one, first mapped, then filtered and computed if they pass the filtering step. Using Streams is about creating pipelines of operations. Intermediate operation (Map/Filter) Operation that creates a Stream (it does not do anything, it does not process any data) Terminal operation (Reduce) Operation that produces a result (it will trigger the computation of the elements) If you have a pattern using a Stream that does not end up with a Terminal operation, your pattern is not going to process any data. It will be useless code You are not allowed to process the same Stream twice. This is why it is completely useless to create intermittent variables to store the Streams. You should inline it The Stream API The Stream API gives you 4 interfaces: ⚪ Stream<T> - a sequence of elements supporting sequential and parallel aggregate operations Modifier and Type Method Description <R> Stream<R> map(Function<? super T,? extends R> mapper) Returns a stream consisting of the results of applying the given function to the elements of this stream. IntStream mapToInt(ToIntFunction<? super T> mapper) Returns an ⚪ IntStream consisting of the results of applying the given function to the elements of this stream. <R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper) Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element long count = cities.stream() (1) .flatMap(city -> city.getPeople().stream()) (2) .count(); (3) 1 3 cities 2 With 3 people each 3 Returns 9 Stream<T> filter(Predicate<? super T> predicate) Returns a stream consisting of the elements of this stream that match the given predicate. Optional<T> reduce(BinaryOperator<T> accumulator) Performs a reduction on the elements of this stream, using an associative accumulation function, and returns an 🔴 Optional<T> describing the reduced value, if any. See Reducing Data to compute Statistics. T reduce(T identity, BinaryOperator<T> accumulator) Performs a reduction on the elements of this stream, using the provided identity value and an associative accumulation function, and returns the reduced value. See Reducing Data to compute Statistics. <R, A> R collect(Collector<? super T,A,R> collector) Performs a mutable reduction operation on the elements of this stream using a ⚪ Collector<T,A,R>. See The Collectors API and Collecting Data from Streams to create Lists/Sets/Maps. Stream<T> distinct() Returns a stream consisting of the distinct elements (according to 🟢 Object#equals(Object obj)) of this stream. Stream<T> sorted() Returns a stream consisting of the elements of this stream, sorted according to natural order. long count() Returns the count of elements in this stream. Optional<T> min(Comparator<? super T> comparator) Returns the minimum element of this stream according to the provided ⚪ Comparator<T>. Optional<T> max(Comparator<? super T> comparator) Returns the maximum element of this stream according to the provided ⚪ Comparator<T>. Object[] toArray() Returns an array containing the elements of this stream. <A> A[] toArray(IntFunction<A[]> generator) Returns an array containing the elements of this stream, using the provided generator function to allocate the returned array, as well as any additional arrays that might be required for a partitioned execution or for resizing. For example toArray(String[]::new) will return String[]. ⚪ IntStream - int primitive specialization of ⚪ Stream<T> Modifier and Type Method Description OptionalInt min() Returns an 🔴 OptionalInt describing the minimum element of this stream, or an empty optional if this stream is empty. OptionalInt max() Returns an 🔴 OptionalInt describing the maximum element of this stream, or an empty optional if this stream is empty. OptionalDouble average() Returns an 🔴 OptionalDouble describing the arithmetic mean of elements of this stream, or an empty optional if this stream is empty. int sum() Returns the sum of elements in this stream. IntSummaryStatistics summaryStatistics() Returns an 🟢 IntSummaryStatistics describing various summary data about the elements of this stream. ⚪ LongStream - long primitive specialization of ⚪ Stream<T>. It has similar methods to ⚪ IntStream ⚪ DoubleStream - double primitive specialization of ⚪ Stream<T>. It has similar methods to ⚪ IntStream The Collectors API 🔴 Collectors - implementations of ⚪ Collector<T,A,R> that implement various useful reduction operations, such as accumulating elements into collections, summarizing elements according to various criteria, etc. Type parameters of ⚪ Collector<T,A,R>: T - the type of input elements to the reduction operation A - the mutable accumulation type of the reduction operation (often hidden as an implementation detail) R - the result type of the reduction operation Modifier and Type Method Description static <T> Collector<T,?,List<T>> toList() Returns a ⚪ Collector<T,A,R> that accumulates the input elements into a new ⚪ List<T>, in encounter order. static <T> Collector<T,?,List<T>> toUnmodifiableList() Returns a ⚪ Collector<T,A,R> that accumulates the input elements into an unmodifiable ⚪ List<T>, in encounter order. static <T> Collector<T,?,Set<T>> toSet() Returns a ⚪ Collector<T,A,R> that accumulates the input elements into a new ⚪ Set<T>. static <T> Collector<T,?,Set<T>> toUnmodifiableSet() Returns a ⚪ Collector<T,A,R> that accumulates the input elements into an unmodifiable ⚪ Set<T>. static <T, C extends Collection<T>> Collector<T,?,C> toCollection(Supplier<C> collectionFactory) Returns a ⚪ Collector<T,A,R> that accumulates the input elements into a new ⚪ Collection<T>, in encounter order. For example toCollection(MyCollection::new) will return 🟢 MyCollection. static Collector<CharSequence,?,String> joining() Returns a ⚪ Collector<T,A,R> that concatenates the input elements into a 🔴 String, in encounter order. The string won’t be delimited. static Collector<CharSequence,?,String> joining(CharSequence delimiter) Returns a ⚪ Collector<T,A,R> that concatenates the input elements, separated by the specified delimiter, in encounter order. static Collector<CharSequence,?,String> joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix) Returns a ⚪ Collector<T,A,R> that concatenates the input elements, separated by the specified delimiter, with the specified prefix and suffix, in encounter order. static <T, K> Collector<T,?,Map<K,List<T>>> groupingBy(Function<? super T,? extends K> classifier) Returns a ⚪ Collector<T,A,R> implementing a "group by" operation on input elements of type T, grouping elements according to a classification function, and returning the results in a ⚪ Map. static <T, K, A, D> Collector<T,?,Map<K,D>> groupingBy(Function<? super T,? extends K> classifier, Collector<? super T,A,D> downstream) Returns a ⚪ Collector<T,A,R> implementing a cascaded "group by" operation on input elements of type T, grouping elements according to a classification function, and then performing a reduction operation on the values associated with a given key using the specified downstream ⚪ Collector<T,A,R>. static <T> Collector<T,?,Long> counting() Returns a ⚪ Collector<T,A,R> accepting elements of type T that counts the number of input elements. static <T> Collector<T,?,Integer> summingInt(ToIntFunction<? super T> mapper) Returns a ⚪ Collector<T,A,R> that produces the sum of an integer-valued function applied to the input elements. For example summingInt(City::getPopulation) will return population of all cities. Building a Stream from Data in Memory Creating a Stream from Arrays: 🔴 Arrays#stream(T[] array) - returns a sequential Stream<T> with the specified array as its source ⚪ Stream<T>#of(T… values) - returns a sequential ordered stream whose elements are the specified values Stream<String> streamOfStrings = Stream.<String>of("abcd", "efgh"); Creating a Stream from a Text File: 🔴 Files#lines(Path path) - read all lines from a file as a Stream<String> Path path = Path.of("src/main/resources/first-names.txt"); (1) try (Stream<String> lines = Files.lines(path)) { long count = lines.count(); (2) } catch (IOException e) { e.printStackTrace(); } 1 File with 200 lines 2 Returns 200 Creating a Stream from a RegEx: 🔴 Pattern#splitAsStream(CharSequence input) - creates a Stream<String> from the given input sequence around matches of this pattern. String sentence = "the quick brown fox jumps over the lazy dog"; (1) String[] words = sentence.split(" "); long count1 = Arrays.stream(words).count(); (3) (2) long count2 = Pattern.compile(" ").splitAsStream(sentence).count(); (3) 1 Bad Practice: we create an array and store it in memory 2 Best Practice: we do not create an array when we create a ⚪ Stream<T> from 🔴 Pattern 3 Returns 9 Creating an ⚪ IntStream (Stream of ASCII Codes) from a String: 🔴 String#chars() - Returns an ⚪ IntStream of int zero-extending the char values from this sequence. String sentence = "the quick brown fox jumps over the lazy dog"; sentence.chars() .mapToObj(codePoint -> Character.toString(codePoint)) (1) .filter(letter -> !letter.equals(" ")) .distinct().sorted().forEach(System.out::print); 1 Converts ⚪ IntStream to ⚪ Stream<String> Selecting elements of a Stream: IntStream.range(0, 30) .skip(10) (1) .limit(10) (2) .forEach(index -> System.out.print(index + " ")); (3) 1 Skip first 10 elements 2 Take next 10 elements after skip 3 Prints "10 11 12 13 14 15 16 17 18 19" Closing a ⚪ Stream<T> with a ⚪ Predicate<? super T>: takeWhile - consume elements of the ⚪ Stream<T> until ⚪ Predicate<? super T> is true Stream.<Class<?>>iterate(ArrayList.class, c -> c.getSuperclass()) .takeWhile(c -> c != null) .forEach(System.out::println); // Prints: class java.util.ArrayList class java.util.AbstractList class java.util.AbstractCollection class java.lang.Object dropWhile - consume remaining elements of the ⚪ Stream<T> after ⚪ Predicate<? super T> becomes false Stream.<Class<?>>iterate(ArrayList.class, c -> c.getSuperclass()) .dropWhile(c -> !c.equals(AbstractCollection.class)) .forEach(System.out::println); // Prints class java.util.AbstractCollection class java.lang.Object null Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.lang.Class.getSuperclass()" because "c" is null Converting a For Loop to a Stream The following: int sum = 0; int count = 0; for (Person person : people) { if (person.getAge() > 20) { count++; sum += person.getAge(); } } double average = 0d; if (count > 0) { average = sum / count; } can be converted into: double average = people.stream() .mapToInt(Person::getAge) .filter(age -> age > 20) .average().orElseThrow(); The Stream API always computes one thing. Never sacrifice the readability of your code to the performance. The performance here is measured in nanoseconds: double totalAmount = 0; int frequentRenterPoints = 0; String statement = composeHeader(); for (Rental rental : rentals) { totalAmount += computeRentalAmount(rental); frequentRenterPoints += getFrequentRenterPoints(rental); statement += computeStatementLine(rental); } statement += composeFooter(totalAmount, frequentRenterPoints); can be converted into: double totalAmount = rentals.stream() .mapToDouble(this::computeRentalAmount) .sum(); int frequentRenterPoints = rentals.stream() .mapToInt(this::getFrequentRenterPoints) .sum(); String statement = composeHeader(); statement += rentals.stream() .map(this::computeStatementLine) .collect(Collectors.joining()); statement += composeFooter(totalAmount, frequentRenterPoints); Forget about processing your data in one pass (unless you are doing an SQL request). Most of the time when you have a for loop or when you have a ⚪ Stream<T>, the JVM will optimize that for you and get rid of your for loop, get rid of your iteration, and you will have a zero pass of your data, just inline code, extremely performant, and extremely efficient. See Fine-grained optimizations provided by JIT Compiler in Java Reducing Data to compute Statistics ⚪ Stream<T>#reduce(T identity, BinaryOperator<T> accumulator) adds the Identity Element before the elements of the ⚪ Stream<T>. If you have an empty ⚪ Stream<T>, it will return Identity Element If you have only one element in the ⚪ Stream<T>, it will return the reduction of the Identity Element and this only element: Some reduction operations do not have any Identity Element (in case for the ⚪ IntStream#min(), the ⚪ IntStream#max(), the ⚪ IntStream#average() and ⚪ Stream<T>#reduce(BinaryOperator<T> accumulator). 🔴 Optional<T> are used by the Stream API, because in cases where we have an empty ⚪ Stream<T> without any Identity Element we don’t have any result. Collecting Data from Streams to create Lists/Sets/Maps Collector Complex object used to reduce a Stream. Can be used to gather data in collections and maps - it is called as reduction in a "mutable container" or mutable reduction Downstream Collector Collector that is passed to 🔴 Collectors#groupingBy(Function<? super T,? extends K> classifier, Collector<? super T,A,D> downstream) which is applied to the streaming of the list of values The Collector API Uses the ⚪ Stream#collect(Collector<? super T,A,R> collector) that takes ⚪ Collector<T,A,R> implementation as a parameter: Use 🔴 Collectors class and its factory methods You can create your own collectors, but it is complex and tricky // Bad Practice List<Person> people = new ArrayList<>(); List<Person> peopleFromNewYork = new ArrayList<>(); (1) people.stream() .filter(p -> p.getCity().equals("New York")) .forEach(p -> peopleFromNewYork.add(p)); // Best Practice List<Person> people = new ArrayList<>(); List<Person> peopleFromNewYork = people.stream() .filter(p -> p.getCity().equals("New York")) .collect(Collectors.toList()); (2) 1 Creating a ⚪ List<E> to store the result 2 Using ⚪ Collector<T,A,R> to store the result in ⚪ List<E>