HashSet
HashSet<E> is Java’s most popular implementation of the Set interface. It stores elements in no particular order, refuses duplicates, and offers constant-time performance for the most common operations — add, remove, and contains. If you need a bag of unique values and don’t care about order, HashSet is almost always the right tool.
What is HashSet?
HashSet<E> lives in java.util and is backed internally by a HashMap. Every element you add becomes a key in that map, with a constant dummy value stored alongside it. Because HashMap uses hashing to locate keys, HashSet inherits the same speed.
Key characteristics at a glance:
| Feature | Detail |
|---|---|
| Duplicates | Not allowed |
| Null elements | One null is permitted |
| Order | Unpredictable (not insertion order, not sorted) |
| Thread safety | Not synchronized |
| Average time complexity | O(1) for add / remove / contains |
Creating a HashSet
import java.util.HashSet;
public class CreateHashSet {
public static void main(String[] args) {
// Empty set (default initial capacity 16, load factor 0.75)
HashSet<String> colors = new HashSet<>();
colors.add("Red");
colors.add("Green");
colors.add("Blue");
colors.add("Red"); // duplicate — silently ignored
System.out.println(colors);
System.out.println("Size: " + colors.size());
}
}
Output:
[Red, Blue, Green]
Size: 3
Notice that the duplicate "Red" was ignored, and the printed order may differ from insertion order.
Common Operations
Checking Membership
import java.util.HashSet;
public class ContainsDemo {
public static void main(String[] args) {
HashSet<Integer> primes = new HashSet<>();
primes.add(2);
primes.add(3);
primes.add(5);
primes.add(7);
System.out.println(primes.contains(5)); // true
System.out.println(primes.contains(4)); // false
}
}
Output:
true
false
Removing Elements
import java.util.HashSet;
public class RemoveDemo {
public static void main(String[] args) {
HashSet<String> animals = new HashSet<>();
animals.add("Cat");
animals.add("Dog");
animals.add("Bird");
animals.remove("Dog");
System.out.println(animals); // [Cat, Bird] (order may vary)
animals.clear();
System.out.println(animals.isEmpty()); // true
}
}
Iterating a HashSet
Because HashSet has no index, you iterate with a for-each loop or an Iterator.
import java.util.HashSet;
import java.util.Iterator;
public class IterateDemo {
public static void main(String[] args) {
HashSet<String> langs = new HashSet<>();
langs.add("Java");
langs.add("Python");
langs.add("Kotlin");
// for-each (simplest)
for (String lang : langs) {
System.out.println(lang);
}
// Iterator (allows safe removal during traversal)
Iterator<String> it = langs.iterator();
while (it.hasNext()) {
String lang = it.next();
if (lang.equals("Python")) it.remove();
}
System.out.println(langs);
}
}
Warning: Never call
langs.remove()inside a for-each loop — it throwsConcurrentModificationException. UseIterator.remove()instead.
Set Operations (Union, Intersection, Difference)
HashSet inherits bulk-operation methods from the Collection interface that map directly to classic set algebra.
import java.util.HashSet;
public class SetOperations {
public static void main(String[] args) {
HashSet<Integer> a = new HashSet<>(java.util.Arrays.asList(1, 2, 3, 4, 5));
HashSet<Integer> b = new HashSet<>(java.util.Arrays.asList(4, 5, 6, 7));
// Union
HashSet<Integer> union = new HashSet<>(a);
union.addAll(b);
System.out.println("Union: " + union);
// Intersection
HashSet<Integer> intersection = new HashSet<>(a);
intersection.retainAll(b);
System.out.println("Intersection: " + intersection);
// Difference (a minus b)
HashSet<Integer> diff = new HashSet<>(a);
diff.removeAll(b);
System.out.println("Difference: " + diff);
}
}
Output:
Union: [1, 2, 3, 4, 5, 6, 7]
Intersection: [4, 5]
Difference: [1, 2, 3]
Null in HashSet
HashSet allows exactly one null element (mirroring the single-null rule of HashMap keys).
import java.util.HashSet;
public class NullDemo {
public static void main(String[] args) {
HashSet<String> set = new HashSet<>();
set.add(null);
set.add(null); // second null is a duplicate — ignored
set.add("Hello");
System.out.println(set); // [null, Hello]
System.out.println(set.size()); // 2
}
}
Converting Between Collections
import java.util.*;
public class ConvertDemo {
public static void main(String[] args) {
// List → HashSet (removes duplicates)
List<String> list = Arrays.asList("a", "b", "a", "c", "b");
HashSet<String> set = new HashSet<>(list);
System.out.println(set); // [a, b, c]
// HashSet → ArrayList (if you need indexed access)
List<String> back = new ArrayList<>(set);
System.out.println(back.get(0)); // some element (order not guaranteed)
}
}
Tip: Converting a
Listto aHashSetis one of the fastest ways to deduplicate a collection in Java.
Under the Hood
HashSet is literally a thin wrapper around HashMap<E, Object>:
// Simplified excerpt from OpenJDK source
public class HashSet<E> extends AbstractSet<E> {
private transient HashMap<E, Object> map;
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT) == null;
}
}
When you call add("Red"), the element goes through these steps:
- hashCode() — Java calls
e.hashCode()to compute a raw hash. - Bit-spreading — the
HashMapmixes the hash bits to reduce clustering. - Bucket selection —
hash & (capacity - 1)picks an array slot (bucket). - equals() check — if another element already occupies the bucket,
equals()resolves the collision. If they are equal, the add is rejected (no duplicate). - Resize — when
size > capacity × loadFactor(default 0.75), the internal array doubles and all elements are rehashed.
Why this matters for custom objects: If you store your own classes in a HashSet, you must override both hashCode() and equals() consistently. If two logically equal objects return different hash codes, HashSet will store both — silently breaking the “no duplicates” contract.
import java.util.HashSet;
import java.util.Objects;
public class PersonDemo {
record Person(String name, int age) {
// Records auto-generate correct hashCode & equals — safe to use!
}
public static void main(String[] args) {
HashSet<Person> people = new HashSet<>();
people.add(new Person("Alice", 30));
people.add(new Person("Alice", 30)); // duplicate
people.add(new Person("Bob", 25));
System.out.println(people.size()); // 2
}
}
Note: Java 16+ Records automatically generate
hashCode()andequals()from all components, making them excellent keys andSetelements.
Performance Characteristics
| Operation | Average case | Worst case (many collisions) |
|---|---|---|
add(e) | O(1) | O(n) |
remove(e) | O(1) | O(n) |
contains(e) | O(1) | O(n) |
| Iteration | O(capacity + n) | same |
The worst case happens when many elements hash to the same bucket. Since Java 8, buckets degrade to a red-black tree when they exceed a threshold of 8 entries, capping the worst-case contains at O(log n) rather than O(n).
HashSet vs LinkedHashSet vs TreeSet
All three implement Set, but they differ in ordering and performance:
HashSet | LinkedHashSet | TreeSet | |
|---|---|---|---|
| Order | None | Insertion order | Sorted (natural or Comparator) |
| Performance | O(1) avg | O(1) avg | O(log n) |
| Null allowed | Yes (one) | Yes (one) | No (throws NPE) |
| Use when | Just uniqueness | Unique + insertion order | Unique + sorted |
For insertion-order preservation, reach for LinkedHashSet. For sorted order, use TreeSet.
Synchronized / Concurrent Alternatives
HashSet is not thread-safe. If multiple threads share a HashSet, protect it:
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
// Option 1: synchronized wrapper
Set<String> syncSet = Collections.synchronizedSet(new HashSet<>());
// Option 2: ConcurrentHashMap-backed set (higher throughput)
Set<String> concurrentSet = java.util.concurrent.ConcurrentHashMap.newKeySet();
See Concurrent Collections for a deeper discussion.
Related Topics
- Set Interface — the contract that
HashSet,LinkedHashSet, andTreeSetall fulfill - LinkedHashSet — a
HashSetthat remembers insertion order - TreeSet — a sorted
Setbacked by a red-black tree - Working of HashMap — the internal hashing mechanics
HashSetshares withHashMap - Comparable and Comparator — how to control element ordering in sorted sets
- Concurrent Collections — thread-safe alternatives when multiple threads share a
Set