Kafka Architecture Overview

Apache Kafka is a distributed event streaming platform built around a deceptively simple idea: an append-only, partitioned, replicated commit log. Once you understand how the handful of moving parts fit together — brokers, the controller, topics, partitions, replication, producers, and consumer groups — almost everything else in Kafka becomes a variation on a theme. This page gives you the full map before you dive into the deep-dive pages, so you always know where each component sits and how data flows through the system in production.

The big picture

At the highest level, a Kafka deployment is a cluster of one or more brokers. Producers write records, brokers durably store them in partitioned, replicated logs, and consumers read them back at their own pace. A subset of brokers (or dedicated nodes) form the KRaft controller quorum, which manages cluster metadata. Everything else — Kafka Connect, Kafka Streams, Schema Registry — is built on top of this core using the same client protocol.

                          ┌──────────────────────── KAFKA CLUSTER ────────────────────────┐
                          │                                                                │
   ┌───────────┐          │   ┌─────────── KRaft Controller Quorum (Raft) ───────────┐    │
   │ Producer  │──write──▶│   │   ctrl-1 (leader)   ctrl-2 (follower)   ctrl-3 (...)  │    │
   │ (app)     │          │   └───────────────────────────────────────────────────────┘  │
   └───────────┘          │            ▲ metadata: topics, partitions, ISR, configs        │
                          │            │                                                    │
   ┌───────────┐          │   ┌────────┴───────┐  ┌────────────────┐  ┌────────────────┐  │
   │ Producer  │──write──▶│   │   Broker 1     │  │   Broker 2     │  │   Broker 3     │  │
   └───────────┘          │   │ ┌────────────┐ │  │ ┌────────────┐ │  │ ┌────────────┐ │  │
                          │   │ │ orders-P0  │L│  │ │ orders-P0  │F│  │ │ orders-P1  │L│  │
                          │   │ │ orders-P1  │F│  │ │ orders-P2  │L│  │ │ orders-P2  │F│  │
   ┌───────────┐          │   │ └────────────┘ │  │ └────────────┘ │  │ └────────────┘ │  │
   │ Consumer  │◀──read───│   └────────────────┘  └────────────────┘  └────────────────┘  │
   │ Group A   │          │       L = leader replica      F = follower replica             │
   └───────────┘          └────────────────────────────────────────────────────────────┘
        ▲                                                                       ▲
        │                                                                       │
   ┌────┴─────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┴───────┐
   │ Consumer │   │ Kafka Connect│   │ Kafka Streams│   │ Schema Registry (Avro/Proto)  │
   │ Group B  │   │  (sources/   │   │  (stream     │   │  + REST Proxy, ksqlDB, etc.   │
   └──────────┘   │   sinks)     │   │   processing)│   └───────────────────────────────┘
                  └──────────────┘   └──────────────┘

Brokers and the cluster

A broker is a single Kafka server process. It receives writes from producers, persists them to disk, serves reads to consumers, and replicates data to and from its peers. Brokers hold partition data but do not negotiate cluster topology among themselves: each one learns the layout from the metadata maintained by the controller. You scale throughput and storage horizontally by adding brokers; partitions are spread across them so that no single node becomes a bottleneck.

Each broker has a unique integer node.id and advertises a set of listeners so clients can reach it.

# server.properties (KRaft combined broker+controller, single-node dev)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
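listeners=PLAINTEXT://:9092,CONTROLLER://:9093
# Required in KRaft mode: names the listener used for controller traffic
controller.listener.names=CONTROLLER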
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/var/lib/kafka/data

The controller and KRaft

Modern Kafka manages metadata with KRaft (Kafka Raft) — no ZooKeeper required. A small, odd-numbered quorum of controller nodes runs the Raft consensus protocol to maintain the authoritative metadata log: which topics and partitions exist, which broker leads each partition, the in-sync replica (ISR) set, ACLs, and dynamic configs. One controller is the active leader; the rest are hot followers ready to take over.

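KRaft has been production-ready since Kafka 3.3, and Kafka 4.0 removed ZooKeeper mode entirely, so it is the only metadata mode supported going forward. Run an odd number of controllers (typically 3 or 5): a quorum of 3 tolerates one controller failure and a quorum of 5 tolerates two, whereas an even count adds a node without adding fault tolerance. Raft's majority rule is what prevents split-brain, because only a side that holds more than half of the voters can elect a leader.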
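
The quorum sizing advice comes down to simple majority arithmetic. The following is a minimal sketch of that arithmetic only; the class and method names are illustrative and are not part of Kafka.

```java
// Sketch: majority arithmetic for a Raft-style controller quorum.
// QuorumMath and its methods are hypothetical names, not Kafka APIs.
final class QuorumMath {
    /** Votes needed for a strict majority of the given number of voters. */
    public static int majority(int voters) {
        return voters / 2 + 1;
    }

    /** Voters that can fail while a majority is still reachable. */
    public static int toleratedFailures(int voters) {
        return voters - majority(voters);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("voters=%d majority=%d tolerated=%d%n",
                n, majority(n), toleratedFailures(n));
        }
    }
}
```

Running it shows that 3 and 4 voters both tolerate a single failure, which is why the fourth controller buys nothing, while 5 voters tolerate two.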

Topics, partitions, and offsets

A topic is a named stream of records (e.g. orders). Each topic is split into one or more partitions, and each partition is an ordered, immutable, append-only log. Every record in a partition gets a monotonically increasing offset. Ordering is guaranteed within a partition, not across partitions — so the partition is also the unit of parallelism.
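
To make the offset model concrete, here is a toy in-memory sketch of a partition as an append-only list. It is an illustration of the idea only, not how Kafka stores data on disk, and the `PartitionLog` class is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: one partition modelled as an append-only list (hypothetical class).
// Offsets are positions in that list and are independent per partition.
final class PartitionLog {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the offset it was assigned. */
    public long append(String value) {
        records.add(value);
        return records.size() - 1L;
    }

    /** Reads the record stored at an offset; stored records never change. */
    public String read(long offset) {
        return records.get((int) offset);
    }

    /** Offset that the next appended record will receive. */
    public long endOffset() {
        return records.size();
    }

    public static void main(String[] args) {
        PartitionLog p0 = new PartitionLog();
        PartitionLog p1 = new PartitionLog();
        System.out.println("p0 offset: " + p0.append("order created"));
        System.out.println("p0 offset: " + p0.append("order paid"));
        // A different partition starts its own offset sequence at 0.
        System.out.println("p1 offset: " + p1.append("order created"));
    }
}
```

Because each partition numbers its records independently, an offset is only meaningful together with its partition, which is why ordering holds within a partition and not across them.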

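Records are routed to partitions by the producer. With the default partitioner, a keyed record goes to the partition given by a murmur2 hash of the serialized key modulo the partition count, so all events for the same key land in the same partition and stay ordered. Keyless records are spread across partitions for balance; since Kafka 2.4 the default partitioner does this by filling a batch for one partition before moving to the next, rather than rotating on every record.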
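
Key-based routing can be sketched as "hash the key bytes, then take the result modulo the partition count". The example below uses `Arrays.hashCode` purely as a stand-in hash for brevity; Kafka's default partitioner uses murmur2, so the partition numbers printed here will not match a real cluster. `KeyRouting` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of key-to-partition routing (hypothetical class).
// ASSUMPTION: Arrays.hashCode stands in for Kafka's murmur2 hash.
final class KeyRouting {
    public static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"order-1042", "order-1043", "order-1044"};
        for (String key : keys) {
            // The same key always maps to the same partition for a fixed count,
            // but the mapping can change when the partition count changes.
            System.out.printf("%s -> %d of 3, %d of 6%n",
                key, partitionFor(key, 3), partitionFor(key, 6));
        }
    }
}
```

The modulo step is the reason a later change in partition count can move existing keys to different partitions.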

Replication and high availability

Every partition is replicated to replication.factor brokers. One replica is the leader, which handles all writes and, by default, all reads for that partition (since Kafka 2.4 consumers can optionally fetch from followers). The others are followers that copy the leader’s log. Followers that are caught up form the in-sync replicas (ISR). If a leader fails, the controller promotes an in-sync follower, so no acknowledged data is lost, provided producers use acks=all and unclean leader election stays disabled.

Setting	Typical value	Why it matters
`replication.factor`	3	Keeps three copies of each partition; with `min.insync.replicas=2`, one broker can fail while writes continue
`min.insync.replicas`	2	Rejects `acks=all` writes when fewer than this many replicas are in sync
`acks` (producer)	`all`	Waits for the full ISR before acknowledging
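
How these three settings interact can be sketched as two small rules: a write with acks=all is accepted only while the ISR is at least min.insync.replicas, and a new leader is chosen from the surviving ISR. This is a simplified model with hypothetical names; the real controller also considers replica assignment order and other state.

```java
import java.util.List;
import java.util.Optional;

// Simplified model of ISR-based durability (hypothetical class, not Kafka code).
final class IsrSketch {
    /** With acks=all, the leader accepts a write only if the ISR is large enough. */
    public static boolean acceptsWrite(List<Integer> isr, int minInsyncReplicas) {
        return isr.size() >= minInsyncReplicas;
    }

    /** Picks the first in-sync replica other than the failed broker, if any. */
    public static Optional<Integer> electLeader(List<Integer> isr, int failedBroker) {
        return isr.stream().filter(id -> id != failedBroker).findFirst();
    }

    public static void main(String[] args) {
        int minIsr = 2;
        System.out.println(acceptsWrite(List.of(1, 2, 3), minIsr)); // all replicas in sync
        System.out.println(acceptsWrite(List.of(2, 3), minIsr));    // one broker down
        System.out.println(acceptsWrite(List.of(3), minIsr));       // two brokers down
        System.out.println(electLeader(List.of(1, 2, 3), 1));       // leader 1 failed
    }
}
```

With replication.factor=3 and min.insync.replicas=2, the model keeps accepting writes after one broker is lost and starts rejecting them after a second, which trades availability for the guarantee that every acknowledged record exists on at least two replicas.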

Producers and consumers

Producers publish records to topics. They batch records, optionally compress them, and can be configured for idempotent or transactional, exactly-once delivery. Consumers subscribe to topics and pull records, tracking their position with committed offsets stored in the internal __consumer_offsets topic.
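
As a rough illustration of the producer options mentioned above, the sketch below builds a `Properties` object with durability and batching settings. The configuration keys are standard producer configs; the specific values (lz4, 20 ms, 64 KiB) and the `ProducerSettings` helper are assumptions chosen for the example, not recommendations from this page. In a real application the result would be passed to a `KafkaProducer` constructor.

```java
import java.util.Properties;

// Sketch: producer settings for durable, batched writes (hypothetical helper).
// Values marked "illustrative" are assumptions; tune them for your workload.
final class ProducerSettings {
    public static Properties durableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                // wait for all in-sync replicas
        props.put("enable.idempotence", "true"); // retries do not create duplicates
        props.put("compression.type", "lz4");    // illustrative codec choice
        props.put("linger.ms", "20");            // illustrative: wait up to 20 ms to fill a batch
        props.put("batch.size", "65536");        // illustrative: 64 KiB per-partition batch
        return props;
    }

    public static void main(String[] args) {
        durableProducerProps("localhost:9092").forEach(
            (key, value) -> System.out.println(key + "=" + value));
    }
}
```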

Consumers organize into consumer groups. Kafka assigns each partition to exactly one consumer in a group, so adding consumers (up to the partition count) scales read throughput, while a separate group sees the same data independently — enabling fan-out.

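import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrderConsumer {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-processor");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        // try-with-resources closes the consumer if the loop exits with an exception.
        try (var consumer = new KafkaConsumer<String, String>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                // poll() must be called regularly or the consumer is removed from the group.
                var records = consumer.poll(Duration.ofMillis(500));
                for (var record : records) {
                    System.out.printf("p=%d off=%d key=%s%n",
                        record.partition(), record.offset(), record.key());
                }
            }
        }
    }
}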

Example output (partitions and offsets depend on the data in your cluster):

p=0 off=15 key=order-1042
p=1 off=8 key=order-1043
p=0 off=16 key=order-1042
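
The rule that each partition belongs to exactly one consumer in a group can be sketched with a simple round-robin assignment. This is a deliberately simplified model with hypothetical names; Kafka's built-in assignors (range, round-robin, sticky, cooperative sticky) handle multiple topics and rebalancing and are more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified round-robin partition assignment (hypothetical class, not Kafka code).
// Assumes consumer names are unique.
final class GroupAssignmentSketch {
    public static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        // Each partition gets exactly one owner within the group.
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("c1", "c2"), 3));
        // A fourth consumer on a 3-partition topic receives nothing.
        System.out.println(assign(List.of("c1", "c2", "c3", "c4"), 3));
    }
}
```

With two consumers and three partitions, one consumer owns two partitions and the other owns one; with four consumers, the fourth gets an empty assignment and sits idle.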

The surrounding ecosystem

The core log is intentionally minimal; richer capabilities live in companion projects that speak the same protocol. Kafka Connect moves data in and out of Kafka via reusable source and sink connectors with no custom code. Kafka Streams is a Java library for stateful stream processing — joins, aggregations, windowing — directly on topics. Schema Registry stores and enforces Avro, Protobuf, or JSON schemas so producers and consumers evolve safely, and tools like ksqlDB and the REST Proxy round out the platform.

Best practices

Run KRaft controllers as a dedicated, odd-sized quorum (3 or 5) in production and keep them separate from heavily loaded brokers.
Set replication.factor=3 and min.insync.replicas=2 for durable topics, and use acks=all on producers to avoid silent data loss.
Choose partition counts up front based on target throughput and consumer parallelism — increasing partitions later changes key-to-partition mapping.
Pick meaningful record keys so related events stay ordered in the same partition; only go keyless when ordering truly does not matter.
Use one consumer group per logical application, and never run more consumers than partitions (the extras sit idle).
Push integration and transformation work into Connect and Streams instead of hand-rolling plumbing in your services.
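
For the partition-count advice above, one common rule of thumb is to divide the target throughput by the measured per-partition throughput on the producer side and on the consumer side, then take the larger result. The sketch below encodes that rule; the helper name and the example figures are hypothetical, and the per-partition numbers must come from your own benchmarks.

```java
// Rule-of-thumb partition estimate (hypothetical helper, not a Kafka API).
// ASSUMPTION: per-partition throughput figures are measured on your own hardware.
final class PartitionEstimate {
    public static int estimate(double targetMBps,
                               double producerMBpsPerPartition,
                               double consumerMBpsPerPartition) {
        double neededForProducers = targetMBps / producerMBpsPerPartition;
        double neededForConsumers = targetMBps / consumerMBpsPerPartition;
        // The slower side determines how many partitions are required.
        return (int) Math.ceil(Math.max(neededForProducers, neededForConsumers));
    }

    public static void main(String[] args) {
        // Example figures only: 100 MB/s target, 10 MB/s per partition when
        // producing, 5 MB/s per partition when consuming.
        System.out.println(estimate(100.0, 10.0, 5.0));
    }
}
```

With these example figures the consumer side is the constraint, so the estimate is 20 partitions; adding headroom for growth is usually cheaper than repartitioning later.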