Collectors

The Collectors class is the toolkit that powers the terminal .collect() operation on Java Streams. It gives you ready-made strategies for accumulating stream elements into lists, sets, maps, strings, summaries, and more — all without writing a single loop.

What is a Collector?

When a stream pipeline finishes its work, you need somewhere to put the results. The .collect() terminal operation accepts a Collector<T, A, R> — a recipe that describes how to:

Create a mutable container (e.g., a new ArrayList)
Accumulate each element into that container
Optionally combine containers (for parallel streams)
Optionally apply a final transformation
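
The first three steps above map directly onto the three-argument overload Stream.collect(supplier, accumulator, combiner), which has no finisher. The following is a minimal sketch of that mapping; the class name CollectSteps and the helper startingWith are illustrative and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class CollectSteps {
    // Hypothetical helper: keeps the names that start with the given prefix
    static List<String> startingWith(List<String> names, String prefix) {
        return names.stream()
                .filter(n -> n.startsWith(prefix))
                .collect(
                        ArrayList::new,      // 1. create a mutable container
                        ArrayList::add,      // 2. accumulate one element into it
                        ArrayList::addAll);  // 3. combine two containers (parallel streams)
    }

    public static void main(String[] args) {
        System.out.println(startingWith(List.of("Alice", "Bob", "Charlie", "Anna"), "A"));
    }
}
```

A Collector bundles these same functions, plus the optional final transformation, into one reusable object.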

The java.util.stream.Collectors utility class ships with dozens of pre-built collectors so you almost never have to implement one yourself.

import java.util.List;
import java.util.stream.Collectors;

public class BasicCollect {
    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "Anna");

        // Collect elements that start with 'A' into a new List
        List<String> aNames = names.stream()
                .filter(n -> n.startsWith("A"))
                .collect(Collectors.toList());

        System.out.println(aNames);
    }
}

Output:

[Alice, Anna]

Collecting to Basic Containers

`toList()`, `toSet()`, `toUnmodifiableList()`

These are the simplest collectors: each one gathers the stream’s elements into a standard Java collection.

import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ToContainers {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(3, 1, 4, 1, 5, 9, 2, 6, 5);

        // List in encounter order (the spec does not guarantee its type or mutability)
        List<Integer> asList = numbers.stream().collect(Collectors.toList());

        // Set — removes duplicates, order not guaranteed
        Set<Integer> asSet = numbers.stream().collect(Collectors.toSet());

        // Unmodifiable list (Java 10+)
        List<Integer> unmodifiable = numbers.stream()
                .collect(Collectors.toUnmodifiableList());

        System.out.println("List: " + asList);
        System.out.println("Set:  " + asSet);
    }
}

Output:

List: [3, 1, 4, 1, 5, 9, 2, 6, 5]
Set:  [1, 2, 3, 4, 5, 6, 9]

Tip: In Java 16+, you can call toList() directly on the stream (stream.toList()), which returns an unmodifiable list without importing Collectors. It is more concise than Collectors.toUnmodifiableList(), and unlike that collector it accepts null elements.
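
The differences between these list-producing options are easy to check directly. The sketch below assumes Java 16 or later; the class ToListVariants and its helper methods are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToListVariants {
    // Returns true if the list refuses modification
    static boolean rejectsAdd(List<Integer> list) {
        try {
            list.add(42);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true if Collectors.toUnmodifiableList() refuses a null element
    static boolean unmodifiableCollectorRejectsNull() {
        try {
            Stream.of(1, null, 3).collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Stream.toList() (Java 16+) returns an unmodifiable list
        System.out.println(rejectsAdd(Stream.of(1, 2, 3).toList()));
        // So does Collectors.toUnmodifiableList() (Java 10+)
        System.out.println(rejectsAdd(
                Stream.of(1, 2, 3).collect(Collectors.toUnmodifiableList())));
        // Stream.toList() tolerates null elements; the collector does not
        System.out.println(Stream.of(1, null, 3).toList().size());
        System.out.println(unmodifiableCollectorRejectsNull());
    }
}
```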

`joining()` — Concatenating Strings

Collectors.joining() concatenates CharSequence elements, most commonly a Stream<String>, and offers three overloads: plain concatenation, concatenation with a delimiter, and concatenation with a delimiter plus a prefix and suffix.

import java.util.List;
import java.util.stream.Collectors;

public class JoiningExample {
    public static void main(String[] args) {
        List<String> fruits = List.of("Apple", "Banana", "Cherry");

        String plain      = fruits.stream().collect(Collectors.joining());
        String csv        = fruits.stream().collect(Collectors.joining(", "));
        String bracketed  = fruits.stream().collect(Collectors.joining(", ", "[", "]"));

        System.out.println(plain);
        System.out.println(csv);
        System.out.println(bracketed);
    }
}

Output:

AppleBananaCherry
Apple, Banana, Cherry
[Apple, Banana, Cherry]

Collecting to Maps

`toMap()`

toMap() takes a key-mapper and a value-mapper function. With only these two arguments, duplicate keys cause an IllegalStateException, so supply a merge function whenever keys can collide.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapExample {
    public static void main(String[] args) {
        List<String> words = List.of("hello", "world", "java");

        // Map each word to its length
        Map<String, Integer> wordLengths = words.stream()
                .collect(Collectors.toMap(
                        w -> w,          // key: the word itself
                        String::length   // value: its length
                ));

        System.out.println(wordLengths);
    }
}

Output (toMap() makes no ordering guarantee, so the entries may print in a different order):

{hello=5, world=5, java=4}

Warning: If two elements produce the same key and you don’t provide a merge function, toMap() throws IllegalStateException. Add a third argument (existing, replacement) -> existing to keep the first value, or (e, r) -> r to keep the last.
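
The sketch below shows both merge strategies from the warning, keyed by word length so that "hello" and "world" collide. The class ToMapMergeExample and its method names are illustrative assumptions, not part of the original example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeExample {
    // Key each word by its length; on a collision keep the first word seen
    static Map<Integer, String> firstWordByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(
                        String::length,
                        w -> w,
                        (existing, replacement) -> existing));
    }

    // Same keys, but on a collision keep the last word seen
    static Map<Integer, String> lastWordByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(
                        String::length,
                        w -> w,
                        (existing, replacement) -> replacement));
    }

    // Returns true if the two-argument toMap() rejects the duplicate key
    static boolean throwsOnDuplicate(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(String::length, w -> w));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "world", "java");
        System.out.println(firstWordByLength(words).get(5));
        System.out.println(lastWordByLength(words).get(5));
        System.out.println(throwsOnDuplicate(words));
    }
}
```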

Grouping and Partitioning

These two collectors are among the most powerful in the API, letting you split a stream into multiple buckets.

`groupingBy()`

groupingBy() groups elements by a classifier function, producing a Map<K, List<T>>, where K is the classifier’s result type and T is the element type.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingExample {
    record Person(String name, String city) {}

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice", "London"),
                new Person("Bob",   "Paris"),
                new Person("Carol", "London"),
                new Person("Dave",  "Paris"),
                new Person("Eve",   "Berlin")
        );

        Map<String, List<Person>> byCity = people.stream()
                .collect(Collectors.groupingBy(Person::city));

        byCity.forEach((city, residents) ->
                System.out.println(city + ": " + residents.stream()
                        .map(Person::name)
                        .collect(Collectors.joining(", "))));
    }
}

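Output (groupingBy() makes no ordering guarantee for the map, so the cities may print in a different order):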

London: Alice, Carol
Paris: Bob, Dave
Berlin: Eve

`partitioningBy()`

partitioningBy() is a special case of grouping that splits elements into exactly two groups, true and false, based on a predicate. The single-argument form returns a Map<Boolean, List<T>> that always contains both keys, even when one group is empty.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionExample {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        Map<Boolean, List<Integer>> evenOdd = numbers.stream()
                .collect(Collectors.partitioningBy(n -> n % 2 == 0));

        System.out.println("Even: " + evenOdd.get(true));
        System.out.println("Odd:  " + evenOdd.get(false));
    }
}

Output:

Even: [2, 4, 6, 8, 10]
Odd:  [1, 3, 5, 7, 9]

Downstream Collectors

Both groupingBy() and partitioningBy() accept a downstream collector as a second argument, which further processes the elements of each group. Instead of a List per group, you can get a count, a sum, a Set, or any other collected result.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DownstreamExample {
    record Product(String category, double price) {}

    public static void main(String[] args) {
        List<Product> products = List.of(
                new Product("Electronics", 299.99),
                new Product("Electronics", 149.50),
                new Product("Books",        19.99),
                new Product("Books",        34.50),
                new Product("Clothing",     59.99)
        );

        // Group by category, then count products per category
        Map<String, Long> countByCategory = products.stream()
                .collect(Collectors.groupingBy(Product::category, Collectors.counting()));

        // Group by category, then sum prices per category
        Map<String, Double> totalByCategory = products.stream()
                .collect(Collectors.groupingBy(Product::category,
                        Collectors.summingDouble(Product::price)));

        System.out.println("Counts: " + countByCategory);
        System.out.println("Totals: " + totalByCategory);
    }
}

Output (map ordering is not guaranteed):

Counts: {Books=2, Clothing=1, Electronics=2}
Totals: {Books=54.49, Clothing=59.99, Electronics=449.49}
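
partitioningBy() takes a downstream collector in the same way. The sketch below reuses the product data from the example above and splits it at a price threshold; the class name, the helper methods, and the threshold of 50.0 are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionDownstreamExample {
    record Product(String category, double price) {}

    // Same sample data as the groupingBy() example
    static List<Product> sampleProducts() {
        return List.of(
                new Product("Electronics", 299.99),
                new Product("Electronics", 149.50),
                new Product("Books",        19.99),
                new Product("Books",        34.50),
                new Product("Clothing",     59.99));
    }

    // Count products at or above the threshold (true) and below it (false)
    static Map<Boolean, Long> countByPriceBand(List<Product> products, double threshold) {
        return products.stream()
                .collect(Collectors.partitioningBy(
                        p -> p.price() >= threshold,
                        Collectors.counting()));
    }

    // Collect the distinct categories found on each side of the threshold
    static Map<Boolean, Set<String>> categoriesByPriceBand(List<Product> products, double threshold) {
        return products.stream()
                .collect(Collectors.partitioningBy(
                        p -> p.price() >= threshold,
                        Collectors.mapping(Product::category, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Product> products = sampleProducts();
        System.out.println(countByPriceBand(products, 50.0));
        System.out.println(categoriesByPriceBand(products, 50.0));
    }
}
```

Collectors.mapping() is itself a downstream adapter: it transforms each element before handing it to the inner collector, here toSet().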

Summarizing Statistics

When you need count, sum, min, max, and average all at once, summarizingInt/Long/Double() returns an IntSummaryStatistics (or the Long/Double variant).

import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

public class StatsExample {
    public static void main(String[] args) {
        List<Integer> scores = List.of(72, 88, 55, 91, 64, 78);

        IntSummaryStatistics stats = scores.stream()
                .collect(Collectors.summarizingInt(Integer::intValue));

        System.out.println("Count: " + stats.getCount());
        System.out.println("Sum:   " + stats.getSum());
        System.out.println("Min:   " + stats.getMin());
        System.out.println("Max:   " + stats.getMax());
        System.out.printf ("Avg:   %.2f%n", stats.getAverage());
    }
}

Output:

Count: 6
Sum:   448
Min:   55
Max:   91
Avg:   74.67

Quick Reference Table

Collector	Returns	Use case
`toList()`	`List<T>`	Ordered, duplicates kept
`toSet()`	`Set<T>`	Unique elements
`toUnmodifiableList()`	`List<T>`	Read-only list
`joining(delim, prefix, suffix)`	`String`	Concatenate strings
`toMap(k, v)`	`Map<K,V>`	Key-value pairs
`groupingBy(fn)`	`Map<K, List<T>>`	Multi-bucket grouping
`partitioningBy(pred)`	`Map<Boolean, List<T>>`	True/false split
`counting()`	`Long`	Count elements (downstream)
`summingInt/Long/Double(fn)`	`Integer` / `Long` / `Double`	Sum a numeric field
`averagingInt/Long/Double(fn)`	`Double`	Average a numeric field
`summarizingInt/Long/Double(fn)`	`*SummaryStatistics`	All stats at once
`minBy(comparator)` / `maxBy(comparator)`	`Optional<T>`	Min or max element
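
The averaging and min/max collectors in the table have no example elsewhere on this page, so here is a minimal sketch. The class MinMaxAverageExample, its helper methods, and the sample words are illustrative assumptions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MinMaxAverageExample {
    // Average word length; averagingInt yields 0.0 for an empty stream
    static double averageLength(List<String> words) {
        return words.stream().collect(Collectors.averagingInt(String::length));
    }

    // Longest word, or Optional.empty() if there are no words
    static Optional<String> longest(List<String> words) {
        return words.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(String::length)));
    }

    // Shortest word, or Optional.empty() if there are no words
    static Optional<String> shortest(List<String> words) {
        return words.stream()
                .collect(Collectors.minBy(Comparator.comparingInt(String::length)));
    }

    public static void main(String[] args) {
        List<String> words = List.of("kiwi", "banana", "fig");
        System.out.printf("Average length: %.2f%n", averageLength(words));
        System.out.println("Longest:  " + longest(words).orElse("(none)"));
        System.out.println("Shortest: " + shortest(words).orElse("(none)"));
    }
}
```

minBy() and maxBy() return an Optional because an empty stream has no minimum or maximum element.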

Under the Hood

A Collector is defined by four functions, plus a set of characteristics, all declared in the Collector<T, A, R> interface:

supplier() — creates a new mutable container (e.g., () -> new ArrayList<>())
accumulator() — folds one element into the container (e.g., (list, e) -> list.add(e))
combiner() — merges two containers (only called in parallel streams)
finisher() — transforms the container to the final result (e.g., wrapping in Collections.unmodifiableList())

A sequential stream normally never invokes the combiner, but Collector.of() still requires one. The finisher is the optional part: leave it out and the container itself becomes the result. When you use Collectors.toList(), the JDK’s implementation uses an ArrayList as the intermediate container and returns it directly, because its finisher is the identity function.

A parallel stream splits the source into chunks, accumulates each chunk into its own sub-container, and then uses the combiner to merge the sub-containers. Because the way the work is split and merged can vary from run to run, a custom collector’s combiner must be associative to give correct parallel results.
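
The four functions can be assembled with Collector.of(). The sketch below defines a hypothetical custom collector, toReadOnlyList(), and checks that a sequential and a parallel run agree; the class and method names are illustrative and this is not how the JDK implements its own collectors.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class CustomCollectorExample {
    // Hypothetical collector: accumulate into an ArrayList, then expose it read-only
    static <T> Collector<T, List<T>, List<T>> toReadOnlyList() {
        return Collector.of(
                ArrayList::new,                   // supplier: new empty container
                List::add,                        // accumulator: fold in one element
                (left, right) -> {                // combiner: merge two containers
                    left.addAll(right);
                    return left;
                },
                Collections::unmodifiableList);   // finisher: final transformation
    }

    // Collects 1..upTo with the custom collector, optionally in parallel
    static List<Integer> collectRange(int upTo, boolean parallel) {
        Stream<Integer> stream = IntStream.rangeClosed(1, upTo).boxed();
        if (parallel) {
            stream = stream.parallel();
        }
        return stream.collect(toReadOnlyList());
    }

    public static void main(String[] args) {
        List<Integer> sequential = collectRange(1000, false);
        List<Integer> parallel = collectRange(1000, true);
        // The combiner appends right to left, so encounter order is preserved
        System.out.println(sequential.equals(parallel));
    }
}
```

Appending the right container to the left one is associative, which is what keeps the parallel result identical to the sequential one.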

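Note: Collectors can carry the UNORDERED, CONCURRENT, and IDENTITY_FINISH characteristics as hints to the stream infrastructure. For example, toSet() is UNORDERED, which frees a parallel stream from preserving encounter order, but it is not CONCURRENT, so each thread still fills its own HashSet and the combiner merges them. Only CONCURRENT collectors, such as toConcurrentMap() and groupingByConcurrent(), let a parallel stream accumulate directly into one shared container and skip the merge step.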

Related Topics

Stream API — the pipeline that feeds elements into every collector
Stream Operations (filter/map/reduce) — the intermediate steps that shape data before collection
Lambda Expressions — the concise syntax used to write classifier and mapper functions
Functional Interfaces — Function, Predicate, and BinaryOperator underpin collector arguments
Optional — returned by minBy() / maxBy() and other collectors that may produce no result
ArrayList — the default backing container for toList() and groupingBy() results
