Skip to content
Apache Kafka kf spring 4 min read

Error Handling

When a @KafkaListener method throws an exception, Spring for Apache Kafka does not silently drop the record — it hands the failure to a container-level error handler that decides whether to retry, skip, or recover. The default strategy retries forever with a short pause, which is rarely what you want in production. DefaultErrorHandler lets you configure a precise back-off policy, classify exceptions as retryable or fatal, and route exhausted records to a recoverer such as a dead-letter topic. Getting this right is the difference between a resilient consumer and one that blocks an entire partition on a single poison message.

How error handling works

For record listeners, Spring uses a CommonErrorHandler. The out-of-the-box implementation is DefaultErrorHandler, which replaced the older SeekToCurrentErrorHandler. When your listener throws, the container catches the exception and asks the error handler what to do. DefaultErrorHandler performs seek-on-error: instead of committing the failed offset, it seeks the consumer back to the failed record (and any unprocessed records in the same batch) so they are re-delivered on the next poll. Retries therefore happen in memory within the same consumer thread — the offset is only advanced once the record succeeds or is recovered.

This seek behaviour is important: while a record is being retried, that partition makes no forward progress. A blocking retry policy keeps ordering intact but can stall the partition, which is exactly why a bounded back-off plus a recoverer matters.

Configuring back-off

The back-off controls how long the container waits between retry attempts. Spring ships two common implementations:

BackOffBehaviourTypical use
FixedBackOff(interval, maxAttempts)Constant delay, fixed number of retriesPredictable, simple transient errors
ExponentialBackOffDelay grows by a multiplier each attemptDownstream service under load
ExponentialBackOffWithMaxRetriesExponential growth capped by a retry countMost production scenarios

A FixedBackOff(2000L, 3) retries the record up to 3 times with a 2-second pause between attempts. Use FixedBackOff.UNLIMITED_ATTEMPTS to retry indefinitely (the legacy default), though that is discouraged.

import org.springframework.util.backoff.FixedBackOff;
import org.springframework.kafka.listener.DefaultErrorHandler;

// 3 retries, 2 seconds apart
DefaultErrorHandler errorHandler =
        new DefaultErrorHandler(new FixedBackOff(2000L, 3L));

For exponential back-off, prefer the max-retries variant so the consumer cannot retry forever:

import org.springframework.util.backoff.ExponentialBackOffWithMaxRetries;

ExponentialBackOffWithMaxRetries backOff = new ExponentialBackOffWithMaxRetries(5);
backOff.setInitialInterval(1000L);   // first pause: 1s
backOff.setMultiplier(2.0);          // 1s, 2s, 4s, 8s, 16s
backOff.setMaxInterval(10_000L);     // cap each pause at 10s

DefaultErrorHandler errorHandler = new DefaultErrorHandler(backOff);

Classifying retryable and non-retryable exceptions

Not every failure deserves a retry. A DeserializationException or a validation error will fail identically on every attempt — retrying just wastes time and blocks the partition. DefaultErrorHandler maintains an exception classifier you can tune.

errorHandler.addNotRetryableExceptions(
        IllegalArgumentException.class,
        org.springframework.kafka.support.serializer.DeserializationException.class);

Records that throw a non-retryable exception skip the back-off entirely and go straight to the recoverer. By default, several framework exceptions (such as DeserializationException, MessageConversionException, and MethodArgumentResolutionException) are already treated as fatal.

If you prefer an allow-list approach, switch the classifier so that only listed exceptions retry and everything else is treated as fatal:

errorHandler.setClassifications(java.util.Map.of(), false); // default: retry nothing
errorHandler.addRetryableExceptions(
        org.springframework.web.client.ResourceAccessException.class);

Tip: addNotRetryableExceptions matches subclasses too. Listing a broad parent like RuntimeException will short-circuit retries for almost everything — be specific.

The recoverer callback

Once retries are exhausted (or the exception is fatal), DefaultErrorHandler invokes its ConsumerRecordRecoverer. The default simply logs the failure and advances the offset so the record is not re-delivered. Supply your own recoverer to take real action — the most common being to forward the record to a dead-letter topic via DeadLetterPublishingRecoverer.

import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;

var recoverer = new DeadLetterPublishingRecoverer(kafkaTemplate);
DefaultErrorHandler errorHandler =
        new DefaultErrorHandler(recoverer, new FixedBackOff(2000L, 3L));

The recoverer receives the failing ConsumerRecord and the Exception, so a custom one can alert, persist, or transform before giving up:

DefaultErrorHandler errorHandler = new DefaultErrorHandler(
        (record, exception) ->
                log.error("Giving up on {}-{}@{}: {}",
                        record.topic(), record.partition(),
                        record.offset(), exception.getMessage()),
        new FixedBackOff(1000L, 2L));

Wiring it into the container factory

The error handler is registered on the ConcurrentKafkaListenerContainerFactory, so every @KafkaListener using that factory inherits it.

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Order> kafkaListenerContainerFactory(
            ConsumerFactory<String, Order> consumerFactory,
            DefaultErrorHandler errorHandler) {

        var factory = new ConcurrentKafkaListenerContainerFactory<String, Order>();
        factory.setConsumerFactory(consumerFactory);
        factory.setCommonErrorHandler(errorHandler);
        return factory;
    }

    @Bean
    public DefaultErrorHandler errorHandler() {
        var handler = new DefaultErrorHandler(new FixedBackOff(2000L, 3L));
        handler.addNotRetryableExceptions(IllegalArgumentException.class);
        handler.setCommitRecovered(true); // commit offset after recovery
        return handler;
    }
}

With setCommitRecovered(true), the container commits the recovered offset so the poison record is not replayed after a restart. If you register a DefaultErrorHandler bean in a Spring Boot app, the auto-configuration will pick it up for the default factory automatically.

You can observe retries with a RetryListener:

errorHandler.setRetryListeners((record, ex, attempt) ->
        log.warn("Retry {} for offset {}", attempt, record.offset()));

A failing listener with the config above produces log output like this:

WARN  Retry 1 for offset 42
WARN  Retry 2 for offset 42
WARN  Retry 3 for offset 42
ERROR Giving up on orders-0@42: downstream timeout

Best Practices

  • Always bound your retries — use FixedBackOff with a finite count or ExponentialBackOffWithMaxRetries, never UNLIMITED_ATTEMPTS in production.
  • Mark deserialization and validation failures as non-retryable so a poison message does not block the partition.
  • Pair DefaultErrorHandler with a DeadLetterPublishingRecoverer so exhausted records are captured rather than lost.
  • Keep total retry time short relative to max.poll.interval.ms; long blocking back-offs can trigger a consumer rebalance.
  • Set setCommitRecovered(true) to avoid replaying recovered poison records after a restart.
  • Use a RetryListener and metrics to make retry storms visible instead of silent.
Last updated June 1, 2026
Was this helpful?