Exactly-Once in Streams
In a distributed stream processor, a single message can be read, transformed, written, and have its offset committed across several independent steps. If the application crashes between any of those steps, you either reprocess the message (duplicates) or lose it. Kafka Streams solves this with exactly-once semantics (EOS): enabling processing.guarantee=exactly_once_v2 makes the entire read-process-write cycle atomic, so each input record affects state and output results exactly once even across failures and rebalances.
What exactly-once actually guarantees
The “exactly-once” guarantee in Kafka Streams is scoped to processing that stays inside Kafka. For a record consumed from an input topic, Streams guarantees that its effect on three things commits as a single atomic unit:
- the input offset (so the record is marked consumed),
- any state store changelog records (so local state survives failover),
- and all output records produced downstream.
Either all three commit together or none do. This eliminates duplicate output and prevents state from drifting out of sync with the offsets that produced it. It does not magically make external side effects idempotent — a REST call or JDBC insert inside a mapValues is outside the transaction and can run more than once on retry.
EOS guarantees apply to Kafka-to-Kafka pipelines. If your topology writes to an external system, you still need idempotent writes or an outbox pattern to avoid duplicates there.
How it works under the hood
Kafka Streams builds EOS on Kafka’s transactional producer and the idempotent broker protocol. Each StreamThread runs a producer with a transactional.id and wraps every commit interval in a transaction:
beginTransaction()
-> produce output records to result topics
-> produce changelog records for state stores
-> sendOffsetsToTransaction(inputOffsets, consumerGroup)
commitTransaction() // offsets + changelog + output all atomic
Offsets are committed through the transaction via sendOffsetsToTransaction, not via a normal consumer commit. Downstream consumers reading with isolation.level=read_committed only ever see records from committed transactions, so aborted work is invisible. The older exactly_once mode used one producer per input partition; exactly_once_v2 uses a single producer per thread with a fencing-aware protocol, which scales far better and is the only EOS mode you should use today.
Configuration
Set the guarantee on the Streams config. Everything else (the transactional producer, read_committed isolation, and the commit cadence) is configured automatically.
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-eos");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
StreamsConfig.EXACTLY_ONCE_V2); // "exactly_once_v2"
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100); // lower for fresher output
KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();
In Spring Boot, configure it through application.yml:
spring:
kafka:
streams:
application-id: orders-eos
bootstrap-servers: localhost:9092
properties:
processing.guarantee: exactly_once_v2
commit.interval.ms: 100
The application.id doubles as the consumer group and the prefix for transactional IDs, so it must be stable and unique per application.
Requirements and constraints
EOS depends on transaction coordination, which imposes a minimum broker topology and a few config rules.
| Requirement | Reason |
|---|---|
| Brokers running a recent Kafka release (KRaft mode supported) | Transaction protocol and producer fencing |
Replication factor >= 3 for __transaction_state and your topics | Transactional state must survive broker loss |
min.insync.replicas >= 2 | Avoids data loss that would break atomicity |
Single application.id per logical app | Drives transactional fencing across instances |
Downstream readers set isolation.level=read_committed | Otherwise they see aborted records |
A development single-broker cluster cannot satisfy replication factor 3; lower StreamsConfig.REPLICATION_FACTOR_CONFIG and the topic factors for local testing only.
Performance cost
Transactions add overhead: every commit interval pays for beginTransaction / commitTransaction round trips to the transaction coordinator, and the broker writes transaction markers into the partitions. The biggest lever is commit.interval.ms. With EOS, Streams defaults this to 100 ms (versus 30 s for at-least-once) to keep end-to-end latency low, but more frequent commits mean more transactions and more marker overhead. Tuning is a latency-vs-throughput tradeoff:
| Knob | Effect |
|---|---|
Lower commit.interval.ms | Lower latency, lower throughput, more transactions |
Higher commit.interval.ms | Higher throughput, higher latency, larger batches per transaction |
read_committed consumers | Read latency until the producing transaction commits |
Don’t push
commit.interval.msextremely low to “feel real-time.” Each commit is a transaction; sub-50 ms intervals can dominate broker load with marker writes and crush throughput.
In practice EOS typically costs a modest throughput reduction compared with at-least-once, in exchange for eliminating duplicates entirely — almost always the right trade for financial, billing, and dedup-sensitive pipelines.
Best practices
- Always use
exactly_once_v2; the legacyexactly_oncemode is deprecated and does not scale. - Provision
__transaction_stateand all topics with replication factor >= 3 andmin.insync.replicas=2before enabling EOS in production. - Set every downstream consumer (including non-Streams apps) to
isolation.level=read_committed, or they will read uncommitted output. - Keep external side effects out of the topology, or make them idempotent — EOS covers Kafka, not your database or HTTP calls.
- Tune
commit.interval.msagainst your real latency SLA; start at the 100 ms default and raise it if throughput matters more than freshness. - Keep
application.idstable across deployments so producer fencing and offset ownership work correctly during rebalances.