Offset Commit Strategies

A Kafka consumer’s committed offset is the bookmark that tells the broker where to resume after a restart or rebalance. When you commit, relative to when you actually process a record, is the main factor in whether you get at-most-once delivery (records can be lost but are never duplicated) or at-least-once delivery (records can be duplicated but are never lost). Getting this wrong is the classic source of silently dropped messages in production, so it pays to understand exactly what each strategy guarantees.

How committing works

A commit records the offset of the next record to read for each partition. Kafka stores these offsets in the internal __consumer_offsets topic, keyed by group ID, topic, and partition. On the next poll() after a restart or rebalance, the consumer resumes from the committed position. If no offset exists for the group, auto.offset.reset decides where to start.
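The resume rule above can be sketched as a small, Kafka-free model. The class and method names below are hypothetical, and the model is deliberately simplified: it ignores the case where a committed offset has fallen out of the retained range, and it treats any reset policy other than "earliest" or "latest" as an error.

```java
import java.util.OptionalLong;

/** Toy model of how a consumer picks its starting position for one partition. */
final class ResumePosition {
    /**
     * @param committed   the committed offset for the group, if any
     * @param resetPolicy value of auto.offset.reset ("earliest" or "latest")
     * @param logStart    first offset still retained in the partition
     * @param logEnd      offset the next produced record will receive
     */
    static long resolve(OptionalLong committed, String resetPolicy, long logStart, long logEnd) {
        if (committed.isPresent()) {
            // The committed value already means "next record to read".
            return committed.getAsLong();
        }
        switch (resetPolicy) {
            case "earliest": return logStart;
            case "latest":   return logEnd;
            default:
                throw new IllegalStateException(
                    "no committed offset and reset policy is " + resetPolicy);
        }
    }

    public static void main(String[] args) {
        // The last processed record had offset 41, so 42 was committed.
        System.out.println(resolve(OptionalLong.of(42), "latest", 0, 100));    // 42
        System.out.println(resolve(OptionalLong.empty(), "earliest", 0, 100)); // 0
        System.out.println(resolve(OptionalLong.empty(), "latest", 0, 100));   // 100
    }
}
```

The point to notice is that a committed offset of 42 means record 41 was the last one handled, not record 42.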

There are three broad strategies: automatic commits, manual synchronous commits, and manual asynchronous commits.

Automatic commits

Setting enable.auto.commit=true (the default) makes the consumer commit the offsets returned by the most recent poll() roughly every auto.commit.interval.ms (default 5000 ms). The commit is triggered from inside poll() once the interval has elapsed, not by a separate timer thread.

enable.auto.commit=true
auto.commit.interval.ms=5000

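The danger: the offset is committed based on what was returned by poll(), not on what was successfully processed. If you finish every record of a batch before calling poll() again, a crash causes redelivery rather than loss. But as soon as processing outlives the poll() call that returned the records (a hand-off to a worker thread or an asynchronous callback, for example), a later poll() can commit offsets for records that were never handled. If the application crashes at that point, those records are skipped on restart: at-most-once behavior hiding behind a convenient default.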

Warning: Auto-commit silently trades correctness for convenience. If you do any real work per record — writing to a database, calling a downstream service — auto-commit can lose messages on crash. Reserve it for fire-and-forget pipelines where losing a few records is acceptable.

Manual commits: the at-least-once pattern

The fix is to disable auto-commit and commit only after processing succeeds. If the consumer crashes mid-batch, the uncommitted records are redelivered on restart, giving at-least-once delivery. Redelivery means duplicates, so downstream logic should be idempotent.

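import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();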
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "orders-processor");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("enable.auto.commit", "false");

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(List.of("orders"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record); // do the real work first
        }
        consumer.commitSync(); // commit only after the whole batch succeeds
    }
}
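To make the difference between committing before and after processing concrete, here is a toy simulation with no Kafka dependency. Everything in it is an assumption for illustration: the class name, the single simulated crash, and the model of a restart as "resume from the committed offset".

```java
import java.util.ArrayList;
import java.util.List;

/** Toy simulation: one crash, one restart, no Kafka involved. */
final class CommitOrderSimulation {
    /**
     * Processes offsets [0, total) in batches of batchSize (must be > 0) and
     * crashes once, just before handling offset crashAt. Returns every offset
     * handled, in order, across the first run and the restart.
     */
    static List<Long> run(long total, int batchSize, long crashAt, boolean commitBeforeProcessing) {
        List<Long> handled = new ArrayList<>();
        long committed = 0;        // next offset to read, as stored for the group
        boolean crashed = false;
        while (committed < total) {
            long batchStart = committed;
            long batchEnd = Math.min(batchStart + batchSize, total);
            if (commitBeforeProcessing) {
                committed = batchEnd;  // bookmark moves before the work is done
            }
            boolean restart = false;
            for (long offset = batchStart; offset < batchEnd; offset++) {
                if (!crashed && offset == crashAt) {
                    crashed = true;    // process dies; in-memory progress is lost
                    restart = true;
                    break;
                }
                handled.add(offset);
            }
            if (restart) {
                continue;              // restart resumes from the committed offset
            }
            committed = batchEnd;      // commit after the whole batch succeeded
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(run(6, 3, 4, true));  // [0, 1, 2, 3]
        System.out.println(run(6, 3, 4, false)); // [0, 1, 2, 3, 3, 4, 5]
    }
}
```

With the commit placed first, offsets 4 and 5 are never handled. With the commit placed last, nothing is lost, but offset 3 is handled twice, which is the duplicate that idempotent processing has to absorb.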

commitSync()

commitSync() blocks until the broker acknowledges the commit and retries automatically on retriable errors. It is simple and safe, but the blocking call adds latency to your poll loop and caps throughput.

commitAsync()

commitAsync() fires the commit and returns immediately, so the loop keeps polling while the commit is in flight. This is faster, but it does not retry on failure — retrying a stale offset could overwrite a newer successful commit. Use the callback to log failures.

consumer.commitAsync((offsets, exception) -> {
    if (exception != null) {
        log.warn("Async commit failed for offsets {}", offsets, exception);
    }
});
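If you do want to retry a failed async commit, one widely described approach is to tag each attempt with a monotonically increasing sequence number and retry only when no newer commit has been sent in the meantime. The helper below is a hypothetical, Kafka-free sketch of that bookkeeping; the class and method names are not part of the Kafka client.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: decides whether a failed async commit is still worth retrying. */
final class CommitSequence {
    private final AtomicLong latest = new AtomicLong();

    /** Call right before each commitAsync(); returns the ticket for that attempt. */
    long nextTicket() {
        return latest.incrementAndGet();
    }

    /** Call from the commit callback on failure; true only if no newer commit was sent since. */
    boolean safeToRetry(long ticket) {
        return ticket == latest.get();
    }

    public static void main(String[] args) {
        CommitSequence seq = new CommitSequence();
        long first = seq.nextTicket();
        long second = seq.nextTicket();
        System.out.println(seq.safeToRetry(first));  // false: a newer commit exists
        System.out.println(seq.safeToRetry(second)); // true: this is the latest attempt
    }
}
```

In a real consumer you would take a ticket before calling commitAsync(), capture it in the callback, and retry the same offsets only when the callback reports an exception and safeToRetry(ticket) returns true. In every other case, logging and moving on is the safer choice.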

Committing specific offsets

By default the commit methods commit the offsets returned by the last poll(). To commit at a finer granularity — for example, after every record, or per partition — pass an explicit Map<TopicPartition, OffsetAndMetadata>. Remember the committed value must be the offset of the next record, so add 1 to the record’s offset.

Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
    process(record);
    offsets.put(
        new TopicPartition(record.topic(), record.partition()),
        new OffsetAndMetadata(record.offset() + 1, "processed")
    );
    consumer.commitAsync(offsets, null);
}
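The offset bookkeeping in this pattern can be factored into a small tracker. The sketch below is hypothetical and avoids the Kafka types so that it runs on its own: it keys partitions by a "topic-partition" string and stores plain long values, where real code would build a Map<TopicPartition, OffsetAndMetadata> from the same numbers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers the next offset to commit for each partition. */
final class ProcessedOffsets {
    // Key is "topic-partition" (for example "orders-0");
    // value is the last processed offset + 1.
    private final Map<String, Long> next = new HashMap<>();

    /** Record that the given offset has been fully processed. */
    void markProcessed(String topic, int partition, long offset) {
        // Long::max keeps the bookmark from ever moving backwards.
        next.merge(topic + "-" + partition, offset + 1, Long::max);
    }

    /** Copy of the offsets that are safe to commit right now. */
    Map<String, Long> snapshot() {
        return new HashMap<>(next);
    }

    public static void main(String[] args) {
        ProcessedOffsets tracker = new ProcessedOffsets();
        tracker.markProcessed("orders", 0, 41);
        tracker.markProcessed("orders", 1, 7);
        tracker.markProcessed("orders", 0, 42);
        System.out.println(tracker.snapshot().get("orders-0")); // 43
        System.out.println(tracker.snapshot().get("orders-1")); // 8
    }
}
```

Because the tracker only advances after a record has been handled, committing its snapshot never moves the bookmark past a record that has not been processed, even when an exception interrupts a batch halfway through.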

commitAsync during the loop, commitSync on close

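A common production pattern combines both: use commitAsync() in the hot loop for throughput, and a final blocking commitSync() in a finally block so that the last offsets are durably committed before the consumer leaves the group. An occasional async failure is tolerable because the next successful commit, or the final commitSync(), supersedes it. Note that the no-argument commitSync() commits the position reached by the last poll(), so this pattern assumes the loop exits between batches, as it does when WakeupException is thrown from poll(). If process() can throw mid-batch, commit explicitly tracked offsets instead.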

try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record);
        }
        consumer.commitAsync(); // fast path, no blocking
    }
} catch (WakeupException e) {
    // expected on shutdown
} finally {
    try {
        consumer.commitSync(); // durable final commit
    } finally {
        consumer.close();
    }
}

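Illustrative log output (assuming the application logs each step; the snippet above prints nothing itself):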

Processed orders 0-149 (partition 0), committing async...
Async commit ack: {orders-0=150}
Shutdown requested (WakeupException)
Final commitSync: {orders-0=150}
Consumer closed cleanly

Strategy comparison

Strategy	Delivery guarantee	Throughput	Retries	Best for
`enable.auto.commit=true`	Not tied to processing; can lose records on crash	High	N/A	Fire-and-forget, loss-tolerant
`commitSync()` after processing	At-least-once	Lower (blocks)	Yes, automatic	Correctness over speed
`commitAsync()` after processing	At-least-once	High	No	High throughput, idempotent consumers
Async loop + sync on close	At-least-once	High	Best effort + final sync	Most production consumers

Tip: None of these strategies give exactly-once on their own. For exactly-once across consume-transform-produce flows, use Kafka transactions with sendOffsetsToTransaction() instead of standalone commits.

Best Practices

Disable enable.auto.commit for any consumer that does real per-record work, and commit explicitly after processing.
Prefer commitAsync() in the poll loop for throughput, and always finish with a commitSync() in a finally block before closing.
Make your processing idempotent — at-least-once delivery means records can be reprocessed after a rebalance or crash.
When committing explicit offsets, always commit record.offset() + 1, the position of the next record to read.
Never retry commitAsync() blindly; a late retry can overwrite a newer offset. Log failures and let the next commit move forward.
Keep batches small enough that redelivery after a crash stays cheap, and commit at least once per poll to bound duplicate reprocessing.
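
One way to make processing idempotent, as recommended above, is to remember the highest offset already applied for each partition and skip anything at or below it. The sketch below is hypothetical and keeps that state in memory purely for illustration; a real consumer would need to store it durably, ideally in the same transaction as the side effect it guards, or a crash would wipe the record of what was applied.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical guard that skips records already applied, keyed by partition. */
final class AppliedOffsets {
    // Stand-in for durable storage: partition -> highest offset applied so far.
    private final Map<Integer, Long> lastApplied = new HashMap<>();

    /** Returns true if the record is new and should be applied; false if it is a replay. */
    boolean shouldApply(int partition, long offset) {
        Long last = lastApplied.get(partition);
        if (last != null && offset <= last) {
            return false; // redelivered after a crash or rebalance
        }
        lastApplied.put(partition, offset);
        return true;
    }

    public static void main(String[] args) {
        AppliedOffsets guard = new AppliedOffsets();
        System.out.println(guard.shouldApply(0, 10)); // true: first time seen
        System.out.println(guard.shouldApply(0, 10)); // false: replay
        System.out.println(guard.shouldApply(0, 11)); // true: next record
    }
}
```

This relies on records within a partition arriving in offset order, which is why the guard is kept per partition rather than per topic.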