Core Concepts & Glossary

Kafka has a small but dense vocabulary, and every term carries operational weight: misunderstanding what an offset or an ISR actually means is how teams end up with data loss, stuck consumers, or under-replicated partitions in production. This page is a quick-reference glossary of the core concepts you will meet on every other page of these docs. Each term gets one or two precise sentences; skim it now, then bookmark it for the moment a config key or a log line stops making sense.

Cluster and node terms

These describe the physical and logical layout of a Kafka deployment.

Term	Definition
Cluster	A group of one or more cooperating brokers that share the same metadata and together store all topics.
Broker	A single Kafka server process that stores partition data on disk and serves produce/fetch requests. Each broker has a unique numeric `node.id` (`broker.id` in ZooKeeper-mode clusters).
Controller	The broker (or dedicated node) that owns cluster metadata — topic creation, partition assignment, leader election. In modern Kafka, controllers form a Raft quorum (KRaft).
KRaft	Kafka Raft: the built-in consensus protocol that stores metadata in an internal `__cluster_metadata` log, replacing ZooKeeper. Production-ready for new clusters since Kafka 3.3 and the only mode from 4.0 onward.
ZooKeeper	The legacy external coordination service Kafka used to store metadata before KRaft. Removed entirely in Kafka 4.0; you should not deploy it for new clusters.

KRaft vs ZooKeeper is the single biggest architectural shift in Kafka’s history. If you are starting fresh, run KRaft. Only touch ZooKeeper concepts when migrating or maintaining a pre-4.0 cluster that still runs in ZooKeeper mode.

Topic and storage terms

This group covers how records are organized and durably stored.

Term	Definition
Topic	A named, append-only category of records (e.g. `orders`). Topics are logical; the physical unit is the partition.
Partition	An ordered, immutable, append-only log that is the unit of parallelism and ordering. A topic is split into one or more partitions, each identified as `topic-N`.
Offset	A monotonically increasing 64-bit integer that uniquely identifies a record’s position within a partition. Offsets are per-partition, never global.
Replica	A copy of a partition stored on a broker. The `replication.factor` controls how many copies exist; replicas are how Kafka survives broker failure.
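Leader	The single replica of a partition that handles all writes at a given time and, by default, all reads. Producers always talk to the leader; consumers do too unless follower fetching (`replica.selector.class` on brokers, `client.rack` on consumers) is configured.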
Follower	A replica that passively fetches records from the leader to stay in sync. A follower is promoted to leader if the current leader fails.
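ISR	In-Sync Replicas: the set of replicas (the leader plus any followers) that have kept up with the leader within `replica.lag.time.max.ms`. By default only ISR members are eligible to become leader; setting `unclean.leader.election.enable=true` relaxes this at the risk of data loss.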
High watermark	The highest offset that has been replicated to all ISR members. Consumers can only read up to the high watermark, which guarantees they never see unreplicated (potentially lost) data.
Retention	The policy that decides when old records are deleted, by time (`retention.ms`) or size (`retention.bytes`).
Log compaction	A retention mode (`cleanup.policy=compact`) that retains at least the latest record for each key and eventually discards older ones, ideal for changelog/state topics rather than time-bounded event streams.
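
The high watermark definition above can be made concrete with a small model. This is a minimal sketch, not broker code: it assumes each replica reports a log end offset (the next offset it would write), and it treats the high watermark as the smallest such value across the ISR, so records at offsets below it are on every in-sync replica. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;

public class HighWatermarkSketch {

    // logEndOffsets: broker id -> log end offset (next offset that replica would write).
    // isr: broker ids currently in sync.
    // Returns the smallest log end offset among ISR members; every offset below
    // this value exists on all in-sync replicas.
    static long highWatermark(Map<Integer, Long> logEndOffsets, Set<Integer> isr) {
        return isr.stream()
                .mapToLong(logEndOffsets::get)
                .min()
                .orElseThrow(() -> new IllegalArgumentException("ISR must not be empty"));
    }

    public static void main(String[] args) {
        // Illustrative numbers only: broker 3 is far behind the leader (broker 1).
        Map<Integer, Long> logEndOffsets = Map.of(1, 120L, 2, 118L, 3, 95L);

        // While broker 3 is still in the ISR, it holds the high watermark back.
        System.out.println(highWatermark(logEndOffsets, Set.of(1, 2, 3))); // prints 95

        // Once broker 3 is dropped from the ISR, the high watermark can advance.
        System.out.println(highWatermark(logEndOffsets, Set.of(1, 2)));    // prints 118
    }
}
```

The second call shows why a slow follower is eventually removed from the ISR: keeping it there would stall what consumers are allowed to read.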

You can inspect a partition’s leader and ISR directly:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --topic orders

Output:

Topic: orders   PartitionCount: 3   ReplicationFactor: 3
  Topic: orders   Partition: 0   Leader: 1   Replicas: 1,2,3   Isr: 1,2,3
  Topic: orders   Partition: 1   Leader: 2   Replicas: 2,3,1   Isr: 2,3,1
  Topic: orders   Partition: 2   Leader: 3   Replicas: 3,1,2   Isr: 3,1

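A partition whose `Isr` list is shorter than its `Replicas` list is under-replicated: a follower has fallen behind or its broker is down. In the output above, partition 2 is missing broker 2 from its ISR. Treat the `UnderReplicatedPartitions` broker metric as a top-tier alert.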
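
The under-replication rule can be expressed as a simple comparison. The sketch below is illustrative only: it assumes you have already extracted the `Replicas` and `Isr` fields from the describe output as comma-separated strings, and the class and method names are hypothetical rather than part of any Kafka API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UnderReplicatedCheck {

    // Parses a comma-separated broker id list such as "3,1,2".
    static List<Integer> parseIds(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .map(Integer::valueOf)
                .collect(Collectors.toList());
    }

    // A partition is under-replicated when fewer replicas are in sync than are assigned.
    static boolean isUnderReplicated(List<Integer> replicas, List<Integer> isr) {
        return isr.size() < replicas.size();
    }

    public static void main(String[] args) {
        // Values taken from partitions 0 and 2 of the sample output.
        System.out.println(isUnderReplicated(parseIds("1,2,3"), parseIds("1,2,3"))); // prints false
        System.out.println(isUnderReplicated(parseIds("3,1,2"), parseIds("3,1")));   // prints true
    }
}
```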

Client terms

These describe the applications that read and write data, and how they coordinate.

Term	Definition
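Producer	A client that publishes records to topic partitions. For keyed records the default partitioner hashes the key, so equal keys land on the same partition; records without a key are spread across partitions, and a custom partitioner can override either behavior.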
Consumer	A client that subscribes to topics and reads records in offset order, committing its progress so it can resume after a restart.
Consumer group	A set of consumers sharing a `group.id` that cooperatively divide a topic’s partitions, with each partition consumed by exactly one member at a time. This is how you scale consumption horizontally.
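Rebalance	The process of reassigning partitions across a consumer group’s members when a consumer joins, leaves, or fails. With eager assignment, all consumption in the group pauses during a rebalance; cooperative assignment pauses only the partitions that change owner.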
Lag	The difference between the latest offset in a partition and a consumer group’s committed offset — i.e. how far behind a consumer is. The key health metric for any consumer.
acks	The producer durability setting: `acks=0` (fire-and-forget), `acks=1` (leader has written the record), `acks=all` (every in-sync replica has acknowledged it). Use `acks=all` whenever you cannot afford to lose records.
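
To illustrate why keyed records keep their per-key ordering, here is a minimal sketch of hash-based partition selection. Kafka's default partitioner applies a murmur2 hash to the serialized key bytes; this example substitutes `Arrays.hashCode` as a stand-in, so the partition numbers it produces will not match a real cluster. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyPartitionSketch {

    // Maps a key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);        // stand-in hash, NOT Kafka's murmur2
        return (hash & 0x7fffffff) % numPartitions;  // clear the sign bit so the result is non-negative
    }

    public static void main(String[] args) {
        int partitions = 3;
        // The same key always maps to the same partition, which preserves per-key ordering.
        System.out.println(partitionFor("order-42", partitions) == partitionFor("order-42", partitions)); // prints true
        // Different keys may or may not share a partition.
        System.out.println(partitionFor("order-43", partitions));
    }
}
```

Note that the mapping depends on the partition count: adding partitions to a topic changes where existing keys land, which is one reason to size keyed topics deliberately up front.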

A minimal producer config showing the durability and identity keys above:

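import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

var props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.ACKS_CONFIG, "all");          // wait for full ISR
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // no duplicates on retry
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

// send() is asynchronous; closing the producer flushes any buffered records.
try (var producer = new KafkaProducer<String, String>(props)) {
    // The key "order-42" determines the target partition of the "orders" topic.
    producer.send(new ProducerRecord<>("orders", "order-42", "{\"id\":42}"));
}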

To check a consumer group’s lag from the CLI:

kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group order-processor
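
The describe output reports `CURRENT-OFFSET`, `LOG-END-OFFSET`, and `LAG` for each partition, where lag is the log end offset minus the committed offset. The sketch below reproduces that arithmetic on plain maps. It is illustrative only: the class and method names are hypothetical, and treating a partition with no committed offset as committed at 0 is an assumption made to keep the example short (a real consumer's starting point depends on `auto.offset.reset`).

```java
import java.util.Map;

public class LagSketch {

    // Lag for one partition: how many records sit between the committed position and the log end.
    static long partitionLag(long logEndOffset, long committedOffset) {
        return Math.max(0L, logEndOffset - committedOffset);
    }

    // Sums lag over all partitions of a topic for one consumer group.
    static long totalLag(Map<Integer, Long> logEndOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0L;
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            // Assumption: a missing committed offset is treated as 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += partitionLag(entry.getValue(), committed);
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative numbers, keyed by partition index.
        Map<Integer, Long> logEnd = Map.of(0, 1500L, 1, 1200L, 2, 900L);
        Map<Integer, Long> committed = Map.of(0, 1500L, 1, 1150L, 2, 700L);
        System.out.println(totalLag(logEnd, committed)); // prints 250
    }
}
```

Because offsets are per-partition, lag is only meaningful partition by partition; the total is a convenient summary, but a single stuck partition can hide behind a healthy-looking sum.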

Best practices

Treat offsets as per-partition: never assume ordering or numbering carries across partitions of a topic.
Run `acks=all` plus `min.insync.replicas=2` with `replication.factor=3` for any topic you cannot afford to lose.
Monitor consumer lag and under-replicated partitions as primary SLO signals; both surface problems before users notice.
Keep rebalances rare and fast by using cooperative-sticky assignment and tuning `session.timeout.ms` and `max.poll.interval.ms` to your real workload.
Choose retention vs. compaction deliberately: time/size retention for event streams, compaction for keyed state and changelogs.
Deploy new clusters on KRaft; do not introduce ZooKeeper into any greenfield system.
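
The durability recommendation above comes down to simple arithmetic: with `acks=all`, the leader rejects writes once the ISR shrinks below `min.insync.replicas`. The following sketch models that rule under stated assumptions; the class and method names are hypothetical and this is not broker code.

```java
public class DurabilitySketch {

    // With acks=all, a write is accepted only while the ISR has at least
    // min.insync.replicas members (the leader counts as one of them).
    static boolean acceptsAcksAllWrites(int isrSize, int minInsyncReplicas) {
        return isrSize >= minInsyncReplicas;
    }

    // How many replicas can drop out of the ISR while acks=all writes still succeed.
    static int tolerableFailures(int replicationFactor, int minInsyncReplicas) {
        return replicationFactor - minInsyncReplicas;
    }

    public static void main(String[] args) {
        int replicationFactor = 3;
        int minInsyncReplicas = 2;

        System.out.println(tolerableFailures(replicationFactor, minInsyncReplicas)); // prints 1
        System.out.println(acceptsAcksAllWrites(2, minInsyncReplicas));              // prints true
        System.out.println(acceptsAcksAllWrites(1, minInsyncReplicas));              // prints false
    }
}
```

With these settings one broker can fail without blocking producers, and every acknowledged record still exists on at least two replicas. Setting `min.insync.replicas` equal to `replication.factor` would leave no headroom: a single broker outage would stop `acks=all` writes.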
