
String & Byte Serdes

Every record that travels through Kafka is ultimately a pair of byte arrays — one for the key and one for the value. The serializers and deserializers (collectively serdes) you configure decide how your Java objects become those bytes and back again. Kafka ships a handful of built-in serdes for the most common primitive types, and reaching for them first — before pulling in JSON, Avro, or Protobuf — keeps your pipelines lean, fast, and free of extra dependencies when the payload is genuinely simple.

The built-in serdes

The org.apache.kafka:kafka-clients library ships ready-made serializer and deserializer classes in the org.apache.kafka.common.serialization package. They are the same classes whether you use the raw producer/consumer API or Spring for Apache Kafka, since Spring simply delegates to them.

Type       | Serializer           | Deserializer           | Wire format
String     | StringSerializer     | StringDeserializer     | UTF-8 bytes (encoding configurable)
byte[]     | ByteArraySerializer  | ByteArrayDeserializer  | raw bytes (pass-through)
ByteBuffer | ByteBufferSerializer | ByteBufferDeserializer | raw bytes
Integer    | IntegerSerializer    | IntegerDeserializer    | 4-byte big-endian
Long       | LongSerializer       | LongDeserializer       | 8-byte big-endian
Short      | ShortSerializer      | ShortDeserializer      | 2-byte big-endian
Float      | FloatSerializer      | FloatDeserializer      | 4-byte IEEE 754
Double     | DoubleSerializer     | DoubleDeserializer     | 8-byte IEEE 754
UUID       | UUIDSerializer       | UUIDDeserializer       | UTF-8 text of toString()
Void       | VoidSerializer       | VoidDeserializer       | always null
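
To make the "8-byte big-endian" entry concrete, the sketch below reproduces that layout with java.nio only. It is an illustration of the wire format, not Kafka's own LongSerializer source, and the class and method names (WireFormatDemo, encodeLong, decodeLong, hex) are hypothetical.

```java
import java.nio.ByteBuffer;

/** Illustrates the fixed-width big-endian layout used for Long values. */
final class WireFormatDemo {

    /** Encodes a long as 8 bytes; ByteBuffer writes big-endian by default. */
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Decodes exactly 8 big-endian bytes back into a long. */
    static long decodeLong(byte[] data) {
        if (data.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + data.length);
        }
        return ByteBuffer.wrap(data).getLong();
    }

    /** Renders bytes as space-separated uppercase hex for inspection. */
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeLong(42L);
        System.out.println(hex(bytes));        // 00 00 00 00 00 00 00 2A
        System.out.println(decodeLong(bytes)); // 42
    }
}
```

The most significant byte comes first, so a small value such as 42 is mostly leading zero bytes. That is worth knowing when you inspect a topic with a hex dump.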

For Kafka Streams the equivalents live in org.apache.kafka.common.serialization.Serdes, which bundles a matching serializer/deserializer pair: Serdes.String(), Serdes.Long(), Serdes.ByteArray(), Serdes.UUID(), and so on.

Note that UUIDSerializer writes the textual form (36 characters), not the 16 raw bytes. If you need the compact binary representation you must serialize the two long halves yourself.
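
One way to produce the compact form is sketched below, using only java.nio and java.util. The UuidBytes class and its toBytes/fromBytes methods are hypothetical helpers, not part of kafka-clients.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Hypothetical helper: packs a UUID into 16 bytes instead of 36 characters. */
final class UuidBytes {

    /** Writes the most significant half first, then the least significant half. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Reads the two halves back in the same order they were written. */
    static UUID fromBytes(byte[] data) {
        if (data == null || data.length != 16) {
            throw new IllegalArgumentException("expected exactly 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(data);
        long most = buf.getLong();
        long least = buf.getLong();
        return new UUID(most, least);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        byte[] packed = toBytes(id);
        System.out.println(packed.length);                // 16
        System.out.println(id.toString().length());       // 36
        System.out.println(fromBytes(packed).equals(id)); // true
    }
}
```

With helpers like these, the simplest integration is probably to call toBytes yourself and send the result through ByteArraySerializer. Wrapping the same logic in your own Serializer<UUID> and Deserializer<UUID> implementations is the alternative.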

Configuring the raw producer and consumer

With the plain client API you set the serdes through the key.serializer / value.serializer (producer) and key.deserializer / value.deserializer (consumer) properties. The example below keys records by a String and sends a Long event count.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Keys are Strings and values are Longs; the generic types on KafkaProducer must match.
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());

// try-with-resources closes the producer when the block exits.
try (KafkaProducer<String, Long> producer = new KafkaProducer<>(props)) {
    // send() is asynchronous: it returns once the record has been buffered.
    producer.send(new ProducerRecord<>("page-views", "home", 42L));
    // flush() blocks until the buffered records have been sent.
    producer.flush();
}

The matching consumer just swaps in the deserializers:

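import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.Properties;

// Use a separate Properties object: serializer settings do not belong on a consumer.
Properties consumerProps = new Properties();
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "page-view-reader"); // example group id
consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName());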
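
The deserializers have to match what the producer actually wrote, because the bytes carry no type information. The sketch below simulates the mismatch with plain JDK calls. SerdeMismatchDemo and its methods are hypothetical stand-ins for the Kafka classes, and the IllegalArgumentException stands in for the SerializationException that LongDeserializer raises on a wrong-sized payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Simulates reading a payload with the wrong deserializer type. */
final class SerdeMismatchDemo {

    /** Stand-in for LongSerializer: 8 big-endian bytes. */
    static byte[] writeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Stand-in for StringDeserializer: accepts any bytes and never complains. */
    static String readAsString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    /** Stand-in for LongDeserializer: rejects anything that is not 8 bytes. */
    static long readAsLong(byte[] data) {
        if (data.length != Long.BYTES) {
            throw new IllegalArgumentException("size of data is not 8, got " + data.length);
        }
        return ByteBuffer.wrap(data).getLong();
    }

    public static void main(String[] args) {
        // A Long read as a String: no error, just seven NUL characters and '*'.
        String garbage = readAsString(writeLong(42L));
        System.out.println(garbage.length());  // 8
        System.out.println(garbage.charAt(7)); // *

        // A 4-byte String read as a Long: rejected because of its length.
        try {
            readAsLong("home".getBytes(StandardCharsets.UTF_8));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // An 8-byte String read as a Long: accepted, but the number is meaningless.
        System.out.println(readAsLong("order-12".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The last case is the dangerous one: a length check is the only validation a fixed-width deserializer can perform, so a payload that happens to be the right size passes without complaint.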

Configuring serdes in Spring Boot

In a Spring Boot 3.x application the same classes are wired declaratively through application.yml. Spring’s auto-configured KafkaTemplate and listener containers pick these up automatically.

spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: analytics
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer

Producing a plain String value is then a one-liner:

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class EventPublisher {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public EventPublisher(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(String userId, String rawPayload) {
        kafkaTemplate.send("events", userId, rawPayload);
    }
}

Choosing the String encoding

StringSerializer and StringDeserializer default to UTF-8. The serializer's encoding is configurable via key.serializer.encoding, value.serializer.encoding, or the generic serializer.encoding property; the deserializer reads the corresponding key.deserializer.encoding, value.deserializer.encoding, or deserializer.encoding. Both ends must agree: a mismatch silently corrupts non-ASCII characters rather than throwing.

value.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer.encoding=UTF-8
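
The silent corruption is easy to reproduce without a broker. The sketch below mimics a producer and a consumer configured with different charsets, using plain JDK string conversion; EncodingMismatchDemo and roundTrip are hypothetical names, and the text is written with Unicode escapes so the example does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Simulates a producer and consumer that disagree on the String encoding. */
final class EncodingMismatchDemo {

    /** Encodes with the writer's charset, then decodes with the reader's. */
    static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] wire = text.getBytes(writer);
        return new String(wire, reader);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Matching encodings: the text survives.
        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(matched.equals(original)); // true

        // UTF-8 written, ISO-8859-1 read: the two-byte accent becomes two characters.
        String mojibake = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mojibake.equals("caf\u00c3\u00a9")); // true

        // ISO-8859-1 written, UTF-8 read: the invalid byte becomes U+FFFD.
        String replaced = roundTrip(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(replaced.contains("\ufffd")); // true

        // Pure ASCII hides the problem entirely.
        String ascii = roundTrip("cafe", StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(ascii.equals("cafe")); // true
    }
}
```

None of the four conversions throws, which is why an encoding mismatch tends to go unnoticed until the first non-ASCII payload arrives.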

You can confirm what is actually on a topic with the console consumer, which by default prints each record's value decoded as a String:

kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic events \
  --from-beginning

Output:

{"loginAttempt":true}
home
order-123

When raw String or bytes is enough

Built-in serdes shine when the payload is naturally a primitive or when you are deliberately treating the body as opaque:

  • Keys. Partitioning keys are almost always a String, Long, or UUID. Using StringSerializer for the key while a richer serde handles the value is a very common and recommended pattern.
  • Metrics and counters. A topic carrying Long timestamps or counts needs nothing more than LongSerializer.
  • Pass-through pipelines. When a service routes, mirrors, or buffers messages without inspecting them, ByteArraySerializer avoids any deserialize/re-serialize round trip and preserves the bytes exactly.
  • Pre-encoded payloads. If an upstream system already produced compressed or encrypted bytes, treat them as byte[] and stay out of the way.
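
The pass-through and pre-encoded cases deserve a closer look, because routing binary data through a String serde is not lossless. The sketch below compares the two paths with JDK calls only; PassThroughDemo, viaString, and viaBytes are hypothetical names that stand in for a relay configured with String serdes versus byte[] serdes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Compares a String round trip with a byte[] pass-through for binary payloads. */
final class PassThroughDemo {

    /** What a relay using String serdes effectively does: decode, then re-encode. */
    static byte[] viaString(byte[] payload) {
        String decoded = new String(payload, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    /** What a relay using byte[] serdes effectively does: hand the bytes on untouched. */
    static byte[] viaBytes(byte[] payload) {
        return payload;
    }

    public static void main(String[] args) {
        // First four bytes of a gzip stream; 0x8B is not valid UTF-8 on its own.
        byte[] binary = {(byte) 0x1F, (byte) 0x8B, 0x08, 0x00};

        System.out.println(Arrays.equals(viaBytes(binary), binary));  // true
        System.out.println(Arrays.equals(viaString(binary), binary)); // false
        System.out.println(viaString(binary).length);                 // 6
    }
}
```

The String path replaces the invalid byte with U+FFFD, which re-encodes as three bytes, so the payload grows from four bytes to six and can no longer be decompressed.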

You should reach for a structured format instead once the value has multiple fields, needs to evolve over time, or must be validated by independent producers and consumers. A free-form String of JSON works for prototypes, but it carries no schema, no compatibility guarantees, and no compile-time type safety — exactly the problems that JSON, Avro, and Protobuf serdes plus a Schema Registry exist to solve.

A String containing hand-built JSON is a frequent source of production incidents: one team renames a field, deserialization on the other side keeps “working,” and data is silently dropped. If your payload is structured, use a real schema rather than StringSerializer.

Best Practices

  • Keep keys simple — prefer String, Long, or UUID serdes so partition assignment stays predictable and debuggable.
  • Keep the String encoding at UTF-8 on both producer and consumer. If you override it, override it on both sides, because a mismatch corrupts non-ASCII text without raising an error.
  • Use ByteArraySerializer for true pass-through routing; do not deserialize and re-serialize bytes you never inspect.
  • Remember UUIDSerializer emits 36-character text, not 16 bytes — measure the size impact before using it on high-volume keys.
  • Reserve raw String/JSON values for prototypes; graduate structured payloads to Avro or Protobuf with a Schema Registry before they reach production.
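  • Always configure matching serializer and deserializer types end to end. StringDeserializer decodes the 8 bytes written by LongSerializer into meaningless text without raising an error, while LongDeserializer throws a SerializationException on any payload that is not exactly 8 bytes long.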
Last updated June 1, 2026