Batching & linger.ms
Batching is the single most effective lever for Kafka producer throughput. Instead of shipping every record to the broker the instant send() is called, the producer groups records destined for the same partition into batches and sends them as one request. Larger batches amortize network and broker overhead across many records, trading a tiny amount of latency for a large gain in throughput. Tuning batch.size and linger.ms is how you dial in that trade-off for your workload.
How batching works
When you call send(), the record is not transmitted immediately. It is serialized, assigned a partition, and appended to an in-memory batch for that partition, held in the producer's record accumulator. A separate background I/O thread (the “Sender”) drains ready batches and sends them to the broker.
A batch becomes eligible to send when either of two conditions is met:
- The batch reaches batch.size bytes (the batch is full).
- linger.ms milliseconds have elapsed since the first record was added to the batch.
Whichever happens first wins. Crucially, batching is per partition — each partition has its own accumulating batch, so a topic with many partitions has many batches filling in parallel.
send(recordA) ─┐
send(recordB) ─┼──► [ Record Accumulator ]
send(recordC) ─┘      │ partition 0: [A][B][C] ──┐ (full OR linger expired)
                      │ partition 1: [D]         │
                      ▼                          ▼
                 Sender thread ───────────► Broker (single produce request)
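The two send conditions can be sketched as a small predicate. This is a simplified model for illustration, not the client's actual accumulator logic (the real producer has further triggers, such as flush()); the names BatchReadiness and isReady are hypothetical.

```java
/** Simplified model of when a per-partition batch becomes eligible to send. */
public final class BatchReadiness {

    private BatchReadiness() {}

    /**
     * @param batchBytes    bytes currently accumulated in the batch
     * @param batchSize     configured batch.size in bytes
     * @param firstAppendMs timestamp at which the first record entered the batch
     * @param nowMs         current timestamp
     * @param lingerMs      configured linger.ms
     * @return true if the batch is full or the linger time has elapsed
     */
    public static boolean isReady(int batchBytes, int batchSize,
                                  long firstAppendMs, long nowMs, long lingerMs) {
        boolean full = batchBytes >= batchSize;
        boolean lingerExpired = (nowMs - firstAppendMs) >= lingerMs;
        return full || lingerExpired; // whichever happens first wins
    }

    public static void main(String[] args) {
        // 4 KB in a 16 KB batch, 2 ms into a 10 ms linger: keep waiting
        System.out.println(isReady(4096, 16384, 0L, 2L, 10L));   // false
        // Same batch once 10 ms have passed: eligible to send
        System.out.println(isReady(4096, 16384, 0L, 10L, 10L));  // true
        // A full batch is eligible immediately
        System.out.println(isReady(16384, 16384, 0L, 1L, 10L));  // true
    }
}
```

With lingerMs set to 0 the predicate is always true, which matches the behavior of sending whatever has accumulated as soon as the Sender thread is free.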
The two key knobs
| Config | Default | Meaning |
|---|---|---|
| batch.size | 16384 (16 KB) | Target upper bound in bytes for a batch sent to one partition. A single record larger than this is not combined with others and goes out in its own, larger batch. |
| linger.ms | 0 (5 from Kafka 4.0 onward) | Time the producer waits to let more records accumulate before sending a non-full batch. |
| buffer.memory | 33554432 (32 MB) | Total memory available for buffering unsent records across all partitions. |
| max.request.size | 1048576 (1 MB) | Upper bound on the entire produce request (may contain multiple batches). |
With linger.ms=0 (the default before Kafka 4.0), the producer still batches whatever records are already waiting when the Sender thread is free; it just doesn’t wait deliberately. Setting linger.ms to a small positive value introduces an intentional delay so more records pile into each batch.
Increasing linger.ms adds at most that many milliseconds of latency to a record, and only when traffic is light. Under high load, batches fill on size before the linger timer fires, so the setting adds little or no latency.
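A back-of-the-envelope model makes this concrete: a batch fills in roughly batch.size divided by the byte rate arriving for that partition, and the linger timer only matters when that fill time is longer than linger.ms. The sketch below assumes a steady per-partition byte rate and ignores compression; LingerEstimate and its methods are hypothetical names, not part of the Kafka client.

```java
/** Rough estimate of how long the first record in a batch waits before the batch is sent. */
public final class LingerEstimate {

    private LingerEstimate() {}

    /** Milliseconds needed to fill one batch at a steady per-partition byte rate. */
    public static double fillTimeMs(int batchSizeBytes, double bytesPerSecondPerPartition) {
        return batchSizeBytes / bytesPerSecondPerPartition * 1000.0;
    }

    /** Worst-case accumulator wait: whichever trigger (size or linger) fires first. */
    public static double maxAddedWaitMs(int batchSizeBytes,
                                        double bytesPerSecondPerPartition,
                                        long lingerMs) {
        return Math.min(lingerMs, fillTimeMs(batchSizeBytes, bytesPerSecondPerPartition));
    }

    public static void main(String[] args) {
        int batchSize = 16384; // 16 KB
        long lingerMs = 10;

        // Light traffic: 100 KB/s to this partition, so filling takes about 160 ms
        // and the 10 ms linger timer fires first.
        System.out.println(maxAddedWaitMs(batchSize, 102_400, lingerMs));

        // Heavy traffic: 16 MB/s to this partition, so the batch fills in under 1 ms
        // and the linger timer never fires.
        System.out.println(maxAddedWaitMs(batchSize, 16_777_216, lingerMs));
    }
}
```

In the first case the wait is capped by linger.ms; in the second the batch fills in under a millisecond, so raising linger.ms would not change the result.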
Throughput vs. latency trade-off
Bigger batches mean fewer, larger requests — less per-request CPU, fewer network round trips, and better compression ratios (see compression). The cost is that records may sit in the accumulator slightly longer.
The table below illustrates the typical shape of the trade-off for a moderate-throughput workload (numbers are illustrative, not guaranteed):
| batch.size | linger.ms | Approx. throughput | p99 produce latency |
|---|---|---|---|
| 16 KB | 0 | ~120 MB/s | ~3 ms |
| 64 KB | 5 | ~280 MB/s | ~8 ms |
| 256 KB | 20 | ~420 MB/s | ~25 ms |
| 1 MB | 100 | ~480 MB/s | ~110 ms |
Throughput climbs steeply at first and then flattens: past a point of diminishing returns you pay latency without buying much more throughput. For most high-volume pipelines, a batch.size of 64–256 KB and a linger.ms of 5–20 ms is a strong starting point.
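One way to see the flattening is to compute how much throughput each step buys per extra millisecond of p99 latency. The sketch below uses only the illustrative figures from the table; DiminishingReturns and gainPerMs are hypothetical names.

```java
/** Marginal throughput gained per extra millisecond of p99 latency between table rows. */
public final class DiminishingReturns {

    private DiminishingReturns() {}

    /** Extra MB/s gained per extra millisecond of p99 latency when moving from setting 1 to setting 2. */
    public static double gainPerMs(double mbps1, double p99Ms1, double mbps2, double p99Ms2) {
        return (mbps2 - mbps1) / (p99Ms2 - p99Ms1);
    }

    public static void main(String[] args) {
        // Rows of the illustrative table: {throughput in MB/s, p99 latency in ms}
        double[][] rows = { {120, 3}, {280, 8}, {420, 25}, {480, 110} };
        for (int i = 1; i < rows.length; i++) {
            double gain = gainPerMs(rows[i - 1][0], rows[i - 1][1], rows[i][0], rows[i][1]);
            System.out.printf("step %d: %.1f MB/s per ms%n", i, gain);
        }
    }
}
```

For these numbers the gain drops from about 32 MB/s per millisecond on the first step to about 8.2 on the second and about 0.7 on the third, which is why the middle rows are the usual starting point.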
Tuning with the plain Java client
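import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

// Throughput-oriented batching
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 128 * 1024);           // 128 KB batches
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);                    // wait up to 10 ms
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64 * 1024 * 1024);  // 64 MB buffer
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");          // batching + compression pair well

// try-with-resources closes the producer, which also waits for pending sends
try (Producer<String, String> producer = new KafkaProducer<>(props)) {
    for (int i = 0; i < 1_000_000; i++) {
        // send() is asynchronous: it only appends the record to a batch
        producer.send(new ProducerRecord<>("events", "key-" + i, "payload-" + i));
    }
    producer.flush(); // force any lingering partial batches out
}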
flush() blocks until every buffered record has been sent (or failed), regardless of linger.ms. It is the right way to drain the accumulator before shutdown or at a checkpoint boundary.
Tuning in Spring Boot
In a Spring for Apache Kafka application, set the same keys under spring.kafka.producer:
spring:
  kafka:
    bootstrap-servers: broker1:9092,broker2:9092
    producer:
      batch-size: 131072        # 128 KB
      buffer-memory: 67108864   # 64 MB
      compression-type: lz4
      properties:
        linger.ms: 10           # linger.ms has no dedicated property, set via properties
Spring exposes batch-size, buffer-memory, and compression-type as first-class properties, but linger.ms must be supplied through the generic properties map, since it has no dedicated relaxed-binding key.
You can verify the producer’s effective settings — and watch batching in action — through its metrics:
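import org.apache.kafka.clients.producer.Producer;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.stereotype.Component;

@Component
public class BatchMetricsLogger {

    private final ProducerFactory<String, String> producerFactory;

    public BatchMetricsLogger(ProducerFactory<String, String> producerFactory) {
        this.producerFactory = producerFactory;
    }

    public void logBatchSize() {
        // Close the handle when done. With DefaultKafkaProducerFactory this typically
        // releases the shared producer rather than shutting it down.
        try (Producer<String, String> producer = producerFactory.createProducer()) {
            producer.metrics().forEach((name, metric) -> {
                // batch-size-avg is reported in the producer-metrics group
                if (name.group().equals("producer-metrics") && name.name().equals("batch-size-avg")) {
                    System.out.printf("avg batch size = %.0f bytes%n", (double) metric.metricValue());
                }
            });
        }
    }
}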
Output:
avg batch size = 124982 bytes
An average batch size near your batch.size means batches are filling on size (you are throughput-bound and well tuned). An average far below it means the linger timer is firing first — raise linger.ms if you want bigger batches.
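The comparison can be reduced to a fill ratio, batch-size-avg divided by batch.size. The sketch below is one possible way to express it; BatchFillCheck is a hypothetical helper, the 0.8 cut-off is an arbitrary illustrative threshold rather than a Kafka constant, and the NaN guard assumes the metric may report NaN before any batch has been sent.

```java
/** Interprets batch-size-avg relative to the configured batch.size. */
public final class BatchFillCheck {

    private BatchFillCheck() {}

    /** Ratio of the observed average batch size to batch.size, or NaN if it cannot be computed. */
    public static double fillRatio(double batchSizeAvg, int batchSize) {
        if (Double.isNaN(batchSizeAvg) || batchSize <= 0) {
            return Double.NaN;
        }
        return batchSizeAvg / batchSize;
    }

    /** Classifies the ratio; 0.8 is an illustrative threshold chosen for this example. */
    public static String describe(double batchSizeAvg, int batchSize) {
        double ratio = fillRatio(batchSizeAvg, batchSize);
        if (Double.isNaN(ratio)) {
            return "no data yet";
        }
        return ratio >= 0.8 ? "mostly filling on size" : "mostly sent on linger timeout";
    }

    public static void main(String[] args) {
        System.out.println(describe(124_982, 131_072)); // ratio about 0.95
        System.out.println(describe(9_500, 131_072));   // ratio about 0.07
    }
}
```

With the sample value of 124982 bytes against a 128 KB batch.size, the ratio is about 0.95, so the first call reports that batches are mostly filling on size.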
Best Practices
- Start with batch.size=64KB–256KB and linger.ms=5–20, then measure batch-size-avg and adjust toward the regime your latency budget allows.
- Pair larger batches with compression (lz4 or zstd); compression operates per batch, so bigger batches compress better.
- Ensure buffer.memory is large enough for batch.size × number of active partitions; when the buffer fills, send() blocks for up to max.block.ms and then throws.
- Keep batch.size below max.request.size and below the broker's message.max.bytes (or the topic's max.message.bytes); a batch that exceeds the broker limit is rejected.
- Use flush() (or a graceful close) before shutdown so partial, lingering batches are not lost.
- Treat linger.ms as nearly free under sustained load: it only adds latency when traffic is too sparse to fill batches by size.
- Watch record-queue-time-avg and request-latency-avg together; rising queue time with stable request latency means you have room to increase linger.ms.
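The two sizing rules in this list can be checked mechanically before deploying a configuration. The following is a minimal sketch; BatchConfigCheck is a hypothetical helper, not part of the Kafka client, and it treats batch.size × active partitions as a worst-case estimate of buffer demand.

```java
import java.util.ArrayList;
import java.util.List;

/** Sanity checks for the relationship between batch.size, buffer.memory and max.request.size. */
public final class BatchConfigCheck {

    private BatchConfigCheck() {}

    /** Returns one human-readable warning per violated sizing rule (empty if none). */
    public static List<String> warnings(long batchSize, long bufferMemory,
                                        long maxRequestSize, int activePartitions) {
        List<String> found = new ArrayList<>();
        // Worst case: every active partition holds one full batch at the same time
        long worstCaseBytes = batchSize * activePartitions;
        if (worstCaseBytes > bufferMemory) {
            found.add("buffer.memory (" + bufferMemory
                    + ") is below batch.size x partitions (" + worstCaseBytes + ")");
        }
        if (batchSize > maxRequestSize) {
            found.add("batch.size (" + batchSize
                    + ") exceeds max.request.size (" + maxRequestSize + ")");
        }
        return found;
    }

    public static void main(String[] args) {
        long batchSize = 128 * 1024;            // 128 KB
        long bufferMemory = 64L * 1024 * 1024;  // 64 MB
        long maxRequestSize = 1024 * 1024;      // 1 MB

        // 200 partitions need about 25 MB of buffer: no warnings
        System.out.println(warnings(batchSize, bufferMemory, maxRequestSize, 200));
        // 600 partitions need about 75 MB of buffer: one warning
        System.out.println(warnings(batchSize, bufferMemory, maxRequestSize, 600));
    }
}
```

With 128 KB batches and a 64 MB buffer, the check passes for 200 active partitions and flags buffer.memory for 600, where the worst case is about 75 MB.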