Avro Serialization
Apache Avro is the long-standing default serialization format for the Confluent Kafka ecosystem, and for good reason: it produces compact binary records, enforces a strict schema at write time, and — paired with a Schema Registry — lets producers and consumers evolve independently over the years a topic stays alive. The schema is defined once in a .avsc file rather than embedded in every record, so payloads stay small while the contract stays explicit. This page shows how to define an Avro schema, generate typed classes, and wire up KafkaAvroSerializer/KafkaAvroDeserializer against a registry.
Defining an Avro schema
An Avro schema is a JSON document, conventionally stored in a .avsc file under src/main/avro/. It declares a fully-qualified record type with strongly typed, named fields. Optional fields are modelled as a union with null, and defaults make later evolution backward-compatible.
{
  "type": "record",
  "name": "OrderPlaced",
  "namespace": "com.devcraftly.events",
  "fields": [
    { "name": "orderId", "type": "string" },
    { "name": "customerId", "type": "string" },
    { "name": "amountCents", "type": "long" },
    { "name": "currency", "type": "string", "default": "USD" },
    { "name": "placedAt", "type": { "type": "long", "logicalType": "timestamp-millis" } }
  ]
}
Generating Java classes
Avro ships a code generator that turns each .avsc file into a typed SpecificRecord class. With Gradle, the gradle-avro-plugin runs it automatically as part of compilation; with Maven, the avro-maven-plugin binds to the generate-sources phase.
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.12.0</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>schema</goal></goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/avro</sourceDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
This produces com.devcraftly.events.OrderPlaced, a builder-based class with typed getters and setters, which you use directly in your producer and consumer code. To pull in KafkaAvroSerializer, add the Confluent serializer dependency (io.confluent:kafka-avro-serializer) and the Confluent Maven repository to your build; the artifact is not published to Maven Central.
The wire format
Avro records on Kafka are not self-describing. The serializer registers the schema with the registry once, gets back an integer ID, and prefixes the binary payload with a 5-byte header. The consumer reads the ID, fetches (and caches) the matching schema, and decodes the rest.
[ magic byte: 0x00 ][ 4-byte schema ID (big-endian) ][ Avro binary payload ]
The magic byte (0x00) marks the Confluent wire format; the schema ID lets a consumer deserialize even when the writer used a different schema version than the one it was compiled against.
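The framing described above can be illustrated with a minimal sketch that uses only the JDK. ConfluentWireFormat is a hypothetical helper written for this page, not a class from any Confluent library, and the sample bytes are made up; in real code KafkaAvroDeserializer does this work for you.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper (not a Confluent API) that splits a framed record into header and payload. */
final class ConfluentWireFormat {
    static final byte MAGIC_BYTE = 0x0;
    static final int HEADER_SIZE = 5; // 1 magic byte + 4-byte schema ID

    /** Returns the schema ID stored big-endian in bytes 1..4 of the record. */
    static int schemaId(byte[] record) {
        if (record == null || record.length < HEADER_SIZE) {
            throw new IllegalArgumentException("record is shorter than the 5-byte header");
        }
        ByteBuffer buf = ByteBuffer.wrap(record); // ByteBuffer reads big-endian by default
        byte magic = buf.get();
        if (magic != MAGIC_BYTE) {
            throw new IllegalArgumentException("unexpected magic byte: " + magic);
        }
        return buf.getInt();
    }

    /** Returns the Avro binary payload that follows the header. */
    static byte[] payload(byte[] record) {
        schemaId(record); // reuse the length and magic-byte validation
        return Arrays.copyOfRange(record, HEADER_SIZE, record.length);
    }

    public static void main(String[] args) {
        // Made-up record: magic byte, schema ID 42, then four payload bytes.
        byte[] framed = {0x00, 0x00, 0x00, 0x00, 0x2A, 0x06, 0x66, 0x6F, 0x6F};
        System.out.println("schema id = " + schemaId(framed));               // schema id = 42
        System.out.println("payload bytes = " + payload(framed).length);     // payload bytes = 4
    }
}
```

A helper like this is mainly useful for debugging, for example to check which schema ID a raw record on a topic was written with before looking that ID up in the registry.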
Producer configuration
A Spring Boot producer sets KafkaAvroSerializer as the value serializer and points it at the registry. In production, auto.register.schemas: false forces schemas to be registered deliberately (via CI) so an incompatible change is rejected before it reaches the broker.
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
      properties:
        schema.registry.url: http://localhost:8081
        auto.register.schemas: false
The producer code uses the generated type directly:
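import com.devcraftly.events.OrderPlaced;
import java.time.Instant;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderProducer {
    private final KafkaTemplate<String, OrderPlaced> kafkaTemplate;

    public OrderProducer(KafkaTemplate<String, OrderPlaced> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(String orderId, String customerId, long amountCents) {
        OrderPlaced event = OrderPlaced.newBuilder()
                .setOrderId(orderId)
                .setCustomerId(customerId)
                .setAmountCents(amountCents)
                .setCurrency("USD")
                // Classes generated by Avro 1.9+ expose timestamp-millis as java.time.Instant.
                .setPlacedAt(Instant.now())
                .build();
        // Key by orderId so all events for one order go to the same partition.
        kafkaTemplate.send("orders", orderId, event);
    }
}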
Consumer configuration
The consumer uses KafkaAvroDeserializer. The critical setting is specific.avro.reader: true. Without it, the deserializer returns a GenericRecord (a dynamic, map-like view of the fields) instead of your generated OrderPlaced class.
spring:
  kafka:
    consumer:
      group-id: order-service
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
      properties:
        schema.registry.url: http://localhost:8081
        specific.avro.reader: true
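import com.devcraftly.events.OrderPlaced;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderConsumer {
    // Receives the typed event because specific.avro.reader is true.
    @KafkaListener(topics = "orders", groupId = "order-service")
    public void onOrder(OrderPlaced event) {
        System.out.printf("Order %s for customer %s: %d %s%n",
                event.getOrderId(), event.getCustomerId(),
                event.getAmountCents(), event.getCurrency());
    }
}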
Output:
Order ord-9341 for customer cust-77: 4999 USD
Inspecting schemas from the CLI
You can confirm a schema is registered under the expected subject. The default subject for a topic’s value is <topic>-value under the TopicNameStrategy.
curl -s http://localhost:8081/subjects/orders-value/versions/latest | jq .
Output:
{
"subject": "orders-value",
"version": 1,
"id": 1,
"schema": "{\"type\":\"record\",\"name\":\"OrderPlaced\",...}"
}
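The same lookup URL can be derived in code. The following is a minimal sketch, assuming the TopicNameStrategy naming shown above; RegistrySubjects and its method names are hypothetical and exist only for illustration, and the sketch builds the URL without calling the registry.

```java
import java.net.URI;

/** Hypothetical helper (not a Confluent API) for building Schema Registry lookup URLs. */
final class RegistrySubjects {

    /** Subject name under TopicNameStrategy: <topic>-key or <topic>-value. */
    static String subjectFor(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    /** URL of the latest registered version of a topic's value schema. */
    static URI latestValueSchemaUri(String registryBaseUrl, String topic) {
        // Drop a trailing slash so the joined path has no double slash.
        String base = registryBaseUrl.endsWith("/")
                ? registryBaseUrl.substring(0, registryBaseUrl.length() - 1)
                : registryBaseUrl;
        return URI.create(base + "/subjects/" + subjectFor(topic, false) + "/versions/latest");
    }

    public static void main(String[] args) {
        // Prints: http://localhost:8081/subjects/orders-value/versions/latest
        System.out.println(latestValueSchemaUri("http://localhost:8081", "orders"));
    }
}
```

The resulting URI can be passed to any HTTP client, such as java.net.http.HttpClient, to fetch the same JSON document the curl command returns.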
Avro option reference
| Property | Where | Purpose |
|---|---|---|
| schema.registry.url | producer + consumer | Endpoint(s) of the Schema Registry |
| auto.register.schemas | producer | Register new schemas automatically (false in prod) |
| use.latest.version | producer | Serialize against the latest registered schema |
| specific.avro.reader | consumer | Return generated SpecificRecord instead of GenericRecord |
| value.subject.name.strategy | producer | Subject naming (TopicNameStrategy, RecordNameStrategy, …) |
Always set specific.avro.reader: true on consumers that work with generated classes. The default (false) silently yields GenericRecord, which compiles fine but throws ClassCastException the moment you cast to your event type.
Best Practices
- Keep .avsc files in version control as the source of truth and generate classes at build time; never hand-write the generated code.
- Give every optional field a default and use ["null", "type"] unions so later additions remain backward-compatible under the registry’s compatibility rules.
- Set auto.register.schemas: false in production and register schemas through CI so breaking changes are caught before deployment.
- Enable specific.avro.reader: true on consumers to get typed objects rather than GenericRecord.
- Run the Schema Registry highly available; clients call it the first time they encounter a schema or schema ID and cache the result afterwards.
- Pin a deliberate per-subject compatibility mode (usually BACKWARD) instead of relying on registry defaults.