Set Interface
The Set interface models the mathematical concept of a set — a collection where every element is unique. If you try to add a duplicate, the Set simply ignores it, making it the perfect tool when you need to store distinct values or check for membership quickly.
What is the Set Interface?
Set<E> lives in java.util and extends the Collection interface. It adds exactly one semantic rule on top of Collection: no two elements may be equal (as defined by equals()). There is no get(index) method — a Set is not ordered by position.
import java.util.HashSet;
import java.util.Set;
public class SetDemo {
public static void main(String[] args) {
Set<String> colors = new HashSet<>();
colors.add("Red");
colors.add("Green");
colors.add("Blue");
colors.add("Red"); // duplicate — silently ignored
System.out.println(colors.size()); // 3, not 4
System.out.println(colors);
}
}
Output:
3
[Red, Blue, Green]
Note: The order of elements printed above is not guaranteed with
HashSet. You’ll see an arbitrary ordering that may differ between runs.
Core Set Operations
Because Set extends Collection, you get all the standard methods plus a few that shine on sets:
| Method | What it does |
|---|---|
add(E e) | Adds element; returns false if already present |
remove(Object o) | Removes element; returns false if absent |
contains(Object o) | Returns true if element is present |
size() | Number of distinct elements |
isEmpty() | true when the set has no elements |
iterator() | Iterates over each element (once) |
clear() | Removes all elements |
addAll(Collection c) | Union — adds all elements from c |
retainAll(Collection c) | Intersection — keeps only elements also in c |
removeAll(Collection c) | Difference — removes all elements found in c |
import java.util.HashSet;
import java.util.Set;
public class SetOperations {
public static void main(String[] args) {
Set<Integer> a = new HashSet<>(Set.of(1, 2, 3, 4));
Set<Integer> b = new HashSet<>(Set.of(3, 4, 5, 6));
// Union
Set<Integer> union = new HashSet<>(a);
union.addAll(b);
System.out.println("Union: " + union);
// Intersection
Set<Integer> intersection = new HashSet<>(a);
intersection.retainAll(b);
System.out.println("Intersection: " + intersection);
// Difference (a - b)
Set<Integer> diff = new HashSet<>(a);
diff.removeAll(b);
System.out.println("Difference: " + diff);
}
}
Output:
Union: [1, 2, 3, 4, 5, 6]
Intersection: [3, 4]
Difference: [1, 2]
The Three Main Implementations
Java ships three general-purpose Set implementations. Choosing the right one comes down to whether you need ordering or sorting.
| Implementation | Order | Null? | Performance | Use when… |
|---|---|---|---|---|
HashSet | None (hash-based) | One null allowed | O(1) add/contains | You only care about uniqueness |
LinkedHashSet | Insertion order | One null allowed | O(1), slight overhead | You need predictable iteration order |
TreeSet | Natural/Comparator sort | No nulls | O(log n) | You need elements sorted |
HashSet
HashSet is the default choice — fast and uses very little ceremony.
import java.util.HashSet;
public class HashSetDemo {
public static void main(String[] args) {
HashSet<String> visited = new HashSet<>();
visited.add("Paris");
visited.add("Tokyo");
visited.add("Paris"); // ignored
System.out.println("Visited " + visited.size() + " unique cities");
System.out.println(visited.contains("Tokyo")); // true
}
}
Output:
Visited 2 unique cities
true
LinkedHashSet
LinkedHashSet preserves the order in which elements were inserted — useful when iteration order matters.
import java.util.LinkedHashSet;
public class LinkedHashSetDemo {
public static void main(String[] args) {
LinkedHashSet<String> steps = new LinkedHashSet<>();
steps.add("Compile");
steps.add("Test");
steps.add("Deploy");
steps.add("Compile"); // duplicate, ignored
steps.forEach(System.out::println);
}
}
Output:
Compile
Test
Deploy
TreeSet
TreeSet keeps elements in their natural sorted order (or a custom Comparator you provide).
import java.util.TreeSet;
public class TreeSetDemo {
public static void main(String[] args) {
TreeSet<Integer> scores = new TreeSet<>();
scores.add(85);
scores.add(42);
scores.add(99);
scores.add(42); // duplicate, ignored
System.out.println(scores); // [42, 85, 99]
System.out.println(scores.first()); // 42
System.out.println(scores.last()); // 99
System.out.println(scores.headSet(85)); // elements < 85: [42]
}
}
Output:
[42, 85, 99]
42
99
[42]
Iterating Over a Set
You can iterate with a for-each loop, an Iterator, or a stream.
import java.util.Set;
import java.util.HashSet;
public class IterateSet {
public static void main(String[] args) {
Set<String> tags = new HashSet<>(Set.of("java", "backend", "oop"));
// for-each
for (String tag : tags) {
System.out.println(tag);
}
// Stream
tags.stream()
.filter(t -> t.length() > 3)
.forEach(System.out::println);
}
}
Tip: If you need to remove elements while iterating, use an
Iteratorand calliterator.remove()— removing via the for-each loop throwsConcurrentModificationException.
equals() and hashCode() Matter
A Set uses equals() to decide whether two elements are duplicates. For HashSet and LinkedHashSet it also calls hashCode() first (for bucket placement). If you store custom objects and forget to override both methods, duplicates will slip through.
import java.util.HashSet;
import java.util.Objects;
class Point {
int x, y;
Point(int x, int y) { this.x = x; this.y = y; }
@Override
public boolean equals(Object o) {
if (!(o instanceof Point p)) return false;
return x == p.x && y == p.y;
}
@Override
public int hashCode() {
return Objects.hash(x, y);
}
@Override
public String toString() { return "(" + x + "," + y + ")"; }
}
public class CustomObjectSet {
public static void main(String[] args) {
HashSet<Point> points = new HashSet<>();
points.add(new Point(1, 2));
points.add(new Point(3, 4));
points.add(new Point(1, 2)); // duplicate — correctly rejected
System.out.println(points.size()); // 2
System.out.println(points);
}
}
Output:
2
[(1,2), (3,4)]
Warning: If you override
equals()but nothashCode(), two “equal” objects may end up in different buckets and both get stored — breaking the no-duplicate guarantee inHashSet.
Set vs List
Choosing between a Set and a List comes down to what matters most:
| Concern | List | Set |
|---|---|---|
| Duplicates allowed? | Yes | No |
| Maintains insertion order? | Yes (ArrayList, LinkedList) | Only LinkedHashSet |
| Position-based access (get(i))? | Yes | No |
| Fastest membership check? | O(n) | O(1) HashSet |
| Sorted automatically? | No | TreeSet only |
Immutable Sets (Java 9+)
Since Java 9 you can create a compact, unmodifiable Set with Set.of():
import java.util.Set;
public class ImmutableSet {
public static void main(String[] args) {
Set<String> days = Set.of("Mon", "Tue", "Wed");
System.out.println(days.contains("Tue")); // true
// days.add("Thu"); // throws UnsupportedOperationException
}
}
Note:
Set.of()does not allow duplicate arguments at creation time — passing duplicates throwsIllegalArgumentExceptionimmediately.
Under the Hood
Understanding how each implementation works internally helps you make better decisions.
HashSet delegates entirely to a HashMap<E, Object> under the hood. Every element you add becomes a key in that map (with a dummy constant as the value). Bucket lookup means add, remove, and contains are all O(1) average — but worst-case O(n) if many keys hash to the same bucket (a hash collision). Java 8 improved this: once a bucket exceeds 8 entries it converts from a linked list to a balanced red-black tree, capping worst-case at O(log n).
LinkedHashSet extends HashSet but its backing LinkedHashMap maintains a doubly-linked list through all entries in insertion order. This gives you the same O(1) operations with only a small extra pointer per entry.
TreeSet is backed by a TreeMap using a red-black tree. Every operation — add, remove, contains — is O(log n), but the tree is always sorted, and you get navigational methods like floor(), ceiling(), headSet(), and tailSet() that HashSet cannot offer.
When you care about concurrent access, wrap a set with Collections.synchronizedSet() or use ConcurrentHashMap.newKeySet() — the standard Set implementations are not thread-safe.
Related Topics
- HashSet — the most common
Setimplementation, backed by a hash table - LinkedHashSet —
HashSetthat preserves insertion order - TreeSet — sorted
Setwith powerful navigational methods - Collection Interface — the root interface
Setextends - List Interface — ordered counterpart to
Setthat allows duplicates - Comparable — how
TreeSetdetermines natural ordering for custom objects