What is Container Orchestration?

When you run one container on one server, life is simple: you start it, you watch the logs, you restart it if it crashes. But the moment you have dozens of containers spread across several machines, doing all of that by hand becomes impossible. Container orchestration is the practice of using a tool to automatically place, run, connect, heal, and scale containers across a group of machines. This page explains the real problems orchestration solves, introduces the two main options (Kubernetes and Docker Swarm), and sets honest expectations about the complexity you are signing up for.

A quick recap: what is a container?

A container is a lightweight, isolated process started from an image: a package that holds your application plus everything it needs to run (code, libraries, runtime). You usually build images and run containers with Docker. A container is not a virtual machine: it shares the host’s Linux kernel instead of booting its own operating system, so it typically starts in about a second or less and uses far fewer resources.

Running a single container on Ubuntu is easy:

docker run -d --name web -p 80:80 nginx:1.27
docker ps

Output (the CREATED column is omitted here for width):

CONTAINER ID   IMAGE         COMMAND                  STATUS         PORTS                NAMES
8f3c1a2b9d44   nginx:1.27    "/docker-entrypoint.…"   Up 12 seconds  0.0.0.0:80->80/tcp   web

This works great for one app on one box. The trouble starts when one box is no longer enough.

The problems orchestration solves

Imagine you now run 40 containers across 5 Ubuntu servers. Here is what suddenly gets hard, and what an orchestrator does for you.

Scheduling (where does each container run?)

Scheduling means deciding which machine each container should run on. If server A has plenty of free memory and server B is full, a new container should go to server A. Doing this in your head across 5 servers is error-prone. An orchestrator watches the free CPU and memory on every machine and places each container automatically.
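
To make the placement decision concrete, here is a minimal sketch in plain bash. It is not how Kubernetes or Swarm is implemented: the node names, the memory figures, and the pick_node function are all made up for illustration, and real schedulers weigh many more factors than free memory.

```shell
#!/usr/bin/env bash
# Hypothetical inventory: "node-name free-memory-MiB", one node per line.
nodes='server-a 6144
server-b 512
server-c 2048'

# pick_node NEEDED_MIB: print the node with the most free memory
# that can still fit the request, or nothing if no node fits.
pick_node() {
  local needed="$1"
  printf '%s\n' "$nodes" |
    awk -v need="$needed" '$2 >= need' |  # keep nodes with enough room
    sort -k2,2nr |                        # most free memory first
    head -n 1 |
    cut -d' ' -f1
}

pick_node 1024   # prints: server-a
```

An empty result means no machine has room, which is the point where a real orchestrator would leave the container pending.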

When to use this: as soon as you have more than one server hosting containers. With a single server there is only one place a container can go, so there is no placement decision to make and an orchestrator is overkill.

Scaling (run more copies under load)

Scaling means changing how many copies (called replicas) of your app are running. On Black Friday you might want 20 copies of your checkout service; at 3 a.m. you might want 3. An orchestrator lets you say “I want 20 replicas” and it makes that true, spreading them across machines. Some, such as Kubernetes, can even add or remove replicas automatically based on CPU usage (autoscaling); Docker Swarm has no built-in autoscaler.
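
Autoscaling usually comes down to a small piece of arithmetic. The sketch below uses a proportional rule (similar in shape to the one Kubernetes documents for its Horizontal Pod Autoscaler): scale the replica count by the ratio of observed CPU to target CPU, round up, and clamp to a minimum and maximum. The function name and all the numbers are assumptions for illustration.

```shell
#!/usr/bin/env bash
# desired_replicas CURRENT CPU_PERCENT TARGET_PERCENT MIN MAX
# Proportional rule: ceil(CURRENT * CPU / TARGET), clamped to [MIN, MAX].
desired_replicas() {
  local current="$1" cpu="$2" target="$3" min="$4" max="$5"
  # Integer ceiling division: (a + b - 1) / b
  local want=$(( (current * cpu + target - 1) / target ))
  if (( want < min )); then want=$min; fi
  if (( want > max )); then want=$max; fi
  echo "$want"
}

# 5 replicas at 90% CPU with a 60% target: 5 * 90 / 60 = 7.5, rounded up.
desired_replicas 5 90 60 3 20   # prints: 8
```

The minimum and maximum matter in practice: the floor keeps a quiet night from scaling you down to nothing, and the ceiling keeps a traffic spike from exhausting the cluster.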

Self-healing (restart and replace failures)

If a container crashes, an orchestrator restarts it. If an entire server dies, the orchestrator notices the containers that were on it are gone and re-creates them on the surviving servers. This self-healing is the feature that lets you sleep at night. Without it, a 2 a.m. crash means a 2 a.m. phone call.
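
Under the hood, self-healing is typically a loop that compares what you asked for with what is actually running and then closes the gap. Here is a toy version of one pass of that loop. The replica names, states, and the reconcile function are hypothetical; a real orchestrator reads this state from its nodes rather than from a variable.

```shell
#!/usr/bin/env bash
# Hypothetical snapshot of the cluster: "replica state node", one per line.
observed='web-1 running server-a
web-2 exited server-b
web-3 running server-c'

# reconcile DESIRED: compare the desired replica count with what is
# running and print the action a control loop would take next.
reconcile() {
  local desired="$1" running
  running=$(printf '%s\n' "$observed" |
    awk '$2 == "running" { n++ } END { print n + 0 }')
  if (( running < desired )); then
    echo "start $(( desired - running )) replica(s)"
  elif (( running > desired )); then
    echo "stop $(( running - desired )) replica(s)"
  else
    echo "nothing to do"
  fi
}

reconcile 3   # prints: start 1 replica(s)
```

A dead server looks the same to this loop as a crashed container: its replicas simply stop counting as running, so the next pass starts replacements elsewhere.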

Networking and service discovery

With containers moving between machines and getting new IP addresses constantly, how does your web app find your database? An orchestrator gives each group of containers a stable internal name. This is service discovery: looking up a service by name instead of by IP address. Your app just connects to the hostname “database”, and the orchestrator routes traffic to whichever containers are currently healthy. It also handles load balancing (spreading incoming requests evenly across the replicas).
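
The idea of “look up by name, skip the unhealthy ones, take turns” fits in a few lines. This is only a sketch: the registry contents, the IP addresses, and the resolve function are invented for illustration, and real orchestrators do this through internal DNS and virtual IPs rather than a text table.

```shell
#!/usr/bin/env bash
# Hypothetical service registry: "service ip state", one replica per line.
registry='database 10.0.1.4 healthy
database 10.0.2.7 unhealthy
database 10.0.3.9 healthy
web 10.0.1.5 healthy'

# resolve SERVICE REQUEST_NUMBER: pick a healthy replica of SERVICE,
# rotating through them round-robin as the request number increases.
resolve() {
  local service="$1" request="$2"
  printf '%s\n' "$registry" |
    awk -v svc="$service" -v n="$request" '
      $1 == svc && $3 == "healthy" { ips[count++] = $2 }
      END { if (count > 0) print ips[n % count] }'
}

resolve database 0   # prints: 10.0.1.4
resolve database 1   # prints: 10.0.3.9
resolve database 2   # prints: 10.0.1.4
```

Notice that the caller only ever says “database”. The unhealthy replica at 10.0.2.7 is never returned, and the caller does not need to know it exists.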

Rollouts and rollbacks

When you ship a new version, you do not want all 20 replicas to switch at once — a bug could take down everything. A rolling update replaces replicas a few at a time, checking each new one is healthy before moving on. If something goes wrong, a rollback returns you to the previous version with one command.
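
The batch-then-check logic of a rolling update can be simulated without a cluster. In this sketch the health check is a stand-in (any version listed in the BROKEN_VERSIONS variable counts as unhealthy), and the function names and version numbers are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical health check. A real one would probe the new replica over
# the network; here any version in BROKEN_VERSIONS counts as unhealthy.
is_healthy() {
  case " ${BROKEN_VERSIONS:-} " in
    *" $1 "*) return 1 ;;
    *)        return 0 ;;
  esac
}

# rolling_update OLD NEW TOTAL BATCH: replace TOTAL replicas BATCH at a
# time, stopping and rolling back as soon as a batch fails its check.
rolling_update() {
  local old="$1" new="$2" total="$3" batch="$4"
  local updated=0 step
  (( batch >= 1 )) || { echo "batch must be at least 1" >&2; return 2; }
  while (( updated < total )); do
    # The last batch may be smaller than BATCH.
    step=$(( total - updated < batch ? total - updated : batch ))
    echo "starting $step replica(s) on $new"
    if ! is_healthy "$new"; then
      echo "health check failed: returning all replicas to $old"
      return 1
    fi
    updated=$(( updated + step ))
    echo "$updated/$total replicas now on $new"
  done
}

rolling_update 1.26 1.27 5 2
```

Run it again as BROKEN_VERSIONS=1.27 rolling_update 1.26 1.27 5 2 to see the failure path: the rollout stops at the first batch, so the old replicas that were not yet touched keep serving traffic.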

The two main options

Tool	Best for	Complexity	Notes
Kubernetes	Production at scale, teams, multi-cloud	High	The industry standard; huge ecosystem; steep learning curve
Docker Swarm	Small teams, simple multi-host setups	Low	Built into Docker; fewer features but far easier to learn

Kubernetes (often shortened to K8s: “K”, then 8 letters, then “s”) is the de facto standard. Every major cloud provider (AWS, Google Cloud, Azure) offers a managed Kubernetes service, and the ecosystem of add-ons is enormous. The cost is complexity: there are many concepts to learn before you are productive.

Docker Swarm is Docker’s own built-in orchestrator. You turn a group of Docker hosts into a cluster with one command on the first host and one join command on each additional host. It does scheduling, scaling, self-healing, and rolling updates with a fraction of the concepts Kubernetes requires.

You can initialise a Swarm cluster on Ubuntu in seconds:

sudo docker swarm init --advertise-addr 192.168.1.10

Output:

Swarm initialized: current node (x9k2...) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-49nj1... 192.168.1.10:2377
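
Once a swarm exists, the features described earlier map onto a handful of docker service commands. The script below is a sketch, not a recommended deployment: the service name web, port 8080, and the newer image tag nginx:1.28 are assumptions (check that the tag exists before relying on it). It prints the commands by default instead of running them, so you can read it safely on a machine without Docker.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a Swarm manager node to run them for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumed newer tag for the update step; override with NEW_IMAGE=...
NEW_IMAGE="${NEW_IMAGE:-nginx:1.28}"

swarm_demo() {
  # Scheduling: Swarm decides which nodes run the 3 replicas.
  run docker service create --name web --replicas 3 --publish 8080:80 nginx:1.27
  # Scaling: change the desired replica count.
  run docker service scale web=5
  # Rolling update: 1 replica at a time, waiting 10s between batches.
  run docker service update --image "$NEW_IMAGE" \
    --update-parallelism 1 --update-delay 10s web
  # Rollback: return to the previous service definition.
  run docker service rollback web
  # Show which node each replica landed on.
  run docker service ps web
}

swarm_demo
```

Port 8080 is used here so the service does not collide with the standalone web container from earlier, which already holds port 80 on its host.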

Be honest about the complexity

Orchestration is powerful, but it is not free. Kubernetes in particular adds a large number of moving parts: a control plane to maintain, networking plugins to configure, and a vocabulary of new terms (Pods, Deployments, Services, Ingress). For a hobby project or a single small app, plain Docker (or even just systemd running a container) is often the right answer.

Gotcha: do not reach for Kubernetes just because it is popular. Running it yourself means securing and upgrading the control plane, which is a serious ongoing job. If you genuinely need it, prefer a managed service so the provider handles that part for you.

A good rule of thumb: stay with single-host Docker until you feel real pain (you have outgrown one machine, you need zero-downtime deploys across machines, or you need automatic recovery from server failure). Let the pain pull you toward orchestration, rather than adopting it speculatively.

Best practices

Start small. Run containers on a single host with Docker and systemd until you actually outgrow one machine.
Choose the simplest tool that fits. Try Docker Swarm before Kubernetes if your needs are modest.
Prefer managed Kubernetes (EKS, GKE, AKS) over self-hosting so you are not responsible for the control plane.
Always define health checks (readiness and liveness probes in Kubernetes, a HEALTHCHECK in Docker and Swarm) so self-healing and rolling updates actually work.
Keep your container images small and pinned to a specific version tag (e.g. nginx:1.27, never latest) for predictable rollouts.
Treat your cluster configuration as code: store every manifest or compose file in Git so deployments are reproducible.