Deploying Node.js on Kubernetes

Once your Node.js app is packaged as a container image, Kubernetes (k8s) becomes the engine that runs it reliably at scale. It schedules your containers across a fleet of machines, restarts ones that crash, rolls out new versions without downtime, and adds or removes replicas as traffic shifts. The trade-off is that you describe everything declaratively in YAML manifests — Deployments, Services, probes, and config objects — and let the cluster reconcile reality to match. This page walks through the manifests that turn a single Node.js image into a resilient, auto-scaling service.

The Deployment manifest

A Deployment is the core object: it declares how many identical replicas of your container should run and which image they use. Kubernetes creates Pods (one or more containers sharing a network namespace) to satisfy that desired state and recreates them if they die. You never edit a Pod directly — you change the Deployment and let the controller roll the change out.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-api
  labels:
    app: node-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: node-api
  template:
    metadata:
      labels:
        app: node-api
    spec:
      containers:
        - name: node-api
          image: registry.example.com/node-api:1.4.0
          ports:
            - containerPort: 8080
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "8080"

Pin a concrete image tag (1.4.0), never latest. Kubernetes starts a rollout only when the Pod template changes, so pushing new code under a moving tag does not trigger an update and leaves rollbacks without a known version to return to. Apply the manifest with kubectl:

kubectl apply -f deployment.yaml
kubectl rollout status deployment/node-api

Output:

deployment.apps/node-api created
Waiting for deployment "node-api" rollout to finish: 0 of 3 updated replicas are available...
deployment "node-api" successfully rolled out
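The tag-pinning rule can be checked mechanically, for example in a CI step that inspects manifests before they are applied. The following is a minimal sketch, not an official tool: isPinnedImage is a hypothetical helper name, and it only handles the common registry/name:tag and name@sha256:digest shapes.

```javascript
// Hypothetical helper: returns true when an image reference carries an
// explicit tag other than "latest", or an immutable digest.
function isPinnedImage(ref) {
  if (ref.includes("@sha256:")) return true; // digest references never move

  // Look only at the last path segment so a registry port
  // (e.g. registry.example.com:5000/app) is not mistaken for a tag.
  const lastSegment = ref.slice(ref.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  if (colon === -1) return false; // no tag: the runtime falls back to "latest"

  const tag = lastSegment.slice(colon + 1);
  return tag !== "" && tag !== "latest";
}

console.log(isPinnedImage("registry.example.com/node-api:1.4.0"));  // true
console.log(isPinnedImage("registry.example.com/node-api:latest")); // false
```

A check like this catches the untagged case too, which is easy to miss because an image reference without a tag silently resolves to latest.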

Exposing the app with a Service

Pods are ephemeral, and each replacement Pod gets a new IP, so clients should not address Pods directly. A Service gives the set of Pods a stable virtual IP and DNS name and load-balances requests across the ready replicas that match its label selector.

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-api
spec:
  selector:
    app: node-api
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

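A ClusterIP Service is reachable only inside the cluster: at http://node-api from Pods in the same namespace, or by its namespace-qualified name (node-api.<namespace>) from other namespaces. To accept external traffic, front it with an Ingress (HTTP routing and TLS) or use a LoadBalancer Service on a cloud provider.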
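Other workloads in the cluster typically build the Service address from its name rather than hard-coding an IP. The sketch below shows one way to do that; serviceUrl is a hypothetical helper, the namespace "default" is only an example, and cluster.local is assumed as the cluster domain (it is the usual default, but clusters can be configured differently).

```javascript
// Hypothetical helper: builds the in-cluster URL for a Service.
// With no namespace, the short name resolves within the caller's own namespace.
function serviceUrl({ name, namespace, port = 80, clusterDomain = "cluster.local" }) {
  const host = namespace
    ? `${name}.${namespace}.svc.${clusterDomain}` // fully qualified Service name
    : name;
  // Omit the port when it is the HTTP default.
  return port === 80 ? `http://${host}` : `http://${host}:${port}`;
}

console.log(serviceUrl({ name: "node-api" }));
// http://node-api
console.log(serviceUrl({ name: "node-api", namespace: "default" }));
// http://node-api.default.svc.cluster.local
```

Note that the port here is the Service port (80), not the containerPort (8080); the Service forwards to targetPort on the Pods.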

Liveness and readiness probes

Probes are how Kubernetes knows whether your container is healthy. A liveness probe that fails causes the container to be restarted; a readiness probe that fails removes the Pod from the Service load balancer without killing it. Use readiness to gate traffic during startup or while a dependency (like the database) is temporarily unavailable. Both probes are declared on the container entry in the Deployment's Pod template:

          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

Keep the endpoints cheap and distinct. Liveness should answer “is the event loop alive?”, while readiness should answer “can I serve a real request right now?”:

import { createServer } from "node:http";
import { isDbConnected } from "./db.js";

const server = createServer((req, res) => {
  if (req.url === "/healthz") {
    res.writeHead(200).end("ok");
    return;
  }
  if (req.url === "/readyz") {
    const ready = isDbConnected();
    res.writeHead(ready ? 200 : 503).end(ready ? "ready" : "not ready");
    return;
  }
  res.writeHead(404).end();
});

server.listen(Number(process.env.PORT) || 8080, "0.0.0.0");

Don’t point the liveness probe at a route that checks the database. If the DB blips, every replica fails liveness, gets restarted simultaneously, and you turn a brief outage into a crash loop. Database health belongs in readiness.
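To keep /readyz cheap, the readiness handler can read a cached flag instead of querying the database on every probe. The following is a minimal sketch of what a module like ./db.js might do internally; createReadinessTracker, markOk, markFailed, and the maxAgeMs default are all hypothetical names and values chosen for illustration, not part of any library.

```javascript
// Hypothetical helper: remembers when a dependency was last known healthy,
// so the readiness handler can answer synchronously without doing I/O.
function createReadinessTracker({ maxAgeMs = 15000, now = Date.now } = {}) {
  let lastOkAt = null; // timestamp of the last successful dependency check

  return {
    // Call after each successful background ping of the dependency.
    markOk() {
      lastOkAt = now();
    },
    // Call when a ping fails or the connection drops.
    markFailed() {
      lastOkAt = null;
    },
    // Ready only if a success was recorded within the last maxAgeMs.
    isReady() {
      return lastOkAt !== null && now() - lastOkAt <= maxAgeMs;
    },
  };
}

const tracker = createReadinessTracker();
console.log(tracker.isReady()); // false: nothing has been checked yet
tracker.markOk();
console.log(tracker.isReady()); // true
```

A background timer would call markOk or markFailed after pinging the database, and isDbConnected could simply return tracker.isReady(). Treating a stale success as "not ready" means a stalled checker takes the Pod out of rotation rather than leaving it marked healthy forever.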

Resource requests and limits

Each container should declare how much CPU and memory it needs. Requests drive scheduling (Kubernetes places the Pod on a node with that capacity free); limits cap usage. A container exceeding its memory limit is OOM-killed, and CPU above the limit is throttled.

          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"

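100m means 0.1 of a CPU core. V8 sizes its heap from its own defaults, which, depending on the Node.js version, may not track the container's cgroup memory limit, so set the old-space size (in MiB) explicitly through NODE_OPTIONS in the container's env list. Keep it below the memory limit, here 192 of 256 MiB, so that V8 collects garbage more aggressively before the container reaches the limit and there is headroom left for buffers, stacks, and other non-heap memory: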

            - name: NODE_OPTIONS
              value: "--max-old-space-size=192"
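The arithmetic behind these values can be sketched in a few lines. The helpers below are hypothetical and deliberately partial (only the m CPU suffix and the Ki/Mi/Gi memory suffixes are handled); the 0.75 heap fraction is an assumption that happens to reproduce the 192 MiB used above, not a rule from Kubernetes or Node.js.

```javascript
// Converts a Kubernetes CPU quantity ("100m" or "0.5") to a number of cores.
function cpuToCores(quantity) {
  return quantity.endsWith("m")
    ? Number(quantity.slice(0, -1)) / 1000 // millicores
    : Number(quantity);
}

// Converts a binary memory quantity ("256Mi", "1Gi") to MiB.
// This sketch handles only the Ki, Mi, and Gi suffixes.
function memoryToMiB(quantity) {
  const match = /^(\d+)(Ki|Mi|Gi)$/.exec(quantity);
  if (!match) throw new Error(`unsupported memory quantity: ${quantity}`);
  const factor = { Ki: 1 / 1024, Mi: 1, Gi: 1024 }[match[2]];
  return Number(match[1]) * factor;
}

// Suggests a --max-old-space-size value (MiB) as a fraction of the limit,
// leaving the remainder for non-heap memory. The fraction is an assumption.
function oldSpaceSizeFor(memoryLimit, heapFraction = 0.75) {
  return Math.floor(memoryToMiB(memoryLimit) * heapFraction);
}

console.log(cpuToCores("100m"));       // 0.1
console.log(oldSpaceSizeFor("256Mi")); // 192
```

Inside a running container, require("node:v8").getHeapStatistics().heap_size_limit reports the heap limit V8 actually applied, which is a quick way to confirm the flag took effect.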

Horizontal pod autoscaling

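A HorizontalPodAutoscaler (HPA) changes the replica count automatically based on observed metrics, most commonly CPU. A CPU utilization target works only when the container defines a CPU request, because utilization is measured as a percentage of that request, and the cluster must expose resource metrics (typically through metrics-server).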

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: node-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-api
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

kubectl apply -f hpa.yaml
kubectl get hpa node-api

Output:

NAME       REFERENCE             TARGETS    MINPODS   MAXPODS   REPLICAS
node-api   Deployment/node-api   42%/70%    3         12        3

When average CPU utilization across the Pods climbs past 70% of the request, the HPA adds Pods up to the maximum; when traffic falls, it scales back toward the minimum, by default only after a five-minute stabilization window. Because a Node.js process runs JavaScript on a single thread, run one process per container and let the HPA add replicas rather than embedding the cluster module.
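The scaling decision follows the ratio rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the min and max. The sketch below is a simplified model of that rule for building intuition; desiredReplicas is a hypothetical function, it applies the default 10% tolerance, and it ignores stabilization windows, unready Pods, and missing metrics that the real controller accounts for.

```javascript
// Simplified model of the HPA replica calculation (not the real controller).
function desiredReplicas(current, currentUtilization, targetUtilization, { min, max, tolerance = 0.1 }) {
  const ratio = currentUtilization / targetUtilization;

  // Within tolerance of the target, the HPA leaves the replica count alone.
  if (Math.abs(ratio - 1) <= tolerance) return current;

  const desired = Math.ceil(current * ratio);
  return Math.min(max, Math.max(min, desired)); // clamp to [min, max]
}

const bounds = { min: 3, max: 12 };
console.log(desiredReplicas(3, 42, 70, bounds));  // 3  (would be 2, held at minReplicas)
console.log(desiredReplicas(3, 140, 70, bounds)); // 6  (double the load, double the Pods)
```

The first call matches the 42%/70% output shown earlier: the ratio suggests two replicas, but minReplicas keeps the Deployment at three.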

Configuration with ConfigMaps and Secrets

Keep configuration out of the image so one artifact ships to every environment. A ConfigMap holds non-sensitive settings; a Secret holds credentials. Secret values are only base64-encoded, not encrypted, by default, so enable encryption at rest or source them from your provider's secret store or an external manager.

apiVersion: v1
kind: ConfigMap
metadata:
  name: node-api-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "search,beta-ui"
---
apiVersion: v1
kind: Secret
metadata:
  name: node-api-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgres://user:pass@db:5432/app"

Inject both into the container as environment variables so your existing process.env code keeps working unchanged:

          envFrom:
            - configMapRef:
                name: node-api-config
            - secretRef:
                name: node-api-secrets

Environment variables are read once when a container starts, so changing a ConfigMap or Secret consumed through envFrom neither updates nor restarts running Pods. Trigger a fresh rollout with kubectl rollout restart deployment/node-api to pick up new values.
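On the application side, it helps to read and validate these variables once at startup, so a missing Secret fails the rollout immediately instead of surfacing on the first request. The following is a minimal sketch using the keys from the manifests above; loadConfig and the shape of the returned object are hypothetical choices for illustration.

```javascript
// Hypothetical helper: reads injected settings once and fails fast if a
// required value is absent. Pass a plain object instead of process.env in tests.
function loadConfig(env = process.env) {
  const missing = ["DATABASE_URL"].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`missing required env vars: ${missing.join(", ")}`);
  }

  return {
    logLevel: env.LOG_LEVEL || "info",
    // "search,beta-ui" becomes a Set of flag names; empty entries are dropped.
    featureFlags: new Set(
      (env.FEATURE_FLAGS || "")
        .split(",")
        .map((flag) => flag.trim())
        .filter(Boolean)
    ),
    databaseUrl: env.DATABASE_URL,
    port: Number(env.PORT) || 8080,
  };
}

const config = loadConfig({
  LOG_LEVEL: "info",
  FEATURE_FLAGS: "search,beta-ui",
  DATABASE_URL: "postgres://user:pass@db:5432/app",
});
console.log(config.featureFlags.has("beta-ui")); // true
```

If loadConfig throws during startup, the container exits, the new Pods never become ready, and the rolling update stalls with the previous version still serving traffic.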

Best Practices

Deploy via a Deployment with multiple replicas; never manage bare Pods, and always pin a concrete image tag.
Define separate liveness and readiness probes — keep liveness dependency-free and put downstream checks in readiness.
Set CPU and memory requests and limits, and keep --max-old-space-size below the memory limit to avoid OOM kills.
Scale horizontally with an HPA on CPU (or custom metrics); run one Node.js process per container.
Externalize config in ConfigMaps and secrets in Secrets, injected via envFrom, so a single image promotes across environments.
Configure a rolling update strategy and use kubectl rollout undo to roll back to the previous revision when a release misbehaves.