API Gateway Pattern

In a microservices architecture, clients should never talk to dozens of services directly. An API gateway is a single entry point that sits in front of your services, routing each request to the right place, aggregating responses, and enforcing cross-cutting concerns like authentication and rate limiting at the edge. This keeps clients simple, hides your internal topology, and gives you one place to apply policy across the whole system.

Why a single entry point

Without a gateway, every client needs to know the network location of every service, handle each service’s auth scheme, and stitch together data from multiple calls. That couples clients tightly to your backend and leaks internal structure. A gateway solves this by acting as a reverse proxy: clients see one stable URL and protocol, while the gateway forwards traffic internally and absorbs change.

Concern	Without gateway	With gateway
Service discovery	Client knows every host	Hidden behind gateway
Authentication	Repeated per service	Validated once at the edge
Rate limiting	Inconsistent	Centralised policy
Aggregation	Client makes N calls	Gateway composes one response
Protocol/versioning	Exposed to clients	Translated internally

Request routing

The core job of a gateway is routing: map an incoming path to an upstream service. The http-proxy-middleware package makes this concise on top of Express.

import express from "express";
import { createProxyMiddleware } from "http-proxy-middleware";

const app = express();

const services = {
  "/api/users": "http://users-service:3001",
  "/api/orders": "http://orders-service:3002",
  "/api/catalog": "http://catalog-service:3003",
};

for (const [path, target] of Object.entries(services)) {
  app.use(
    path,
    createProxyMiddleware({
      target,
      changeOrigin: true,
      pathRewrite: { [`^${path}`]: "" },
    })
  );
}

app.listen(8080, () => console.log("Gateway listening on :8080"));

A request to GET /api/orders/42 is forwarded to http://orders-service:3002/42. The client never learns that orders-service exists.

Tip: Keep route definitions data-driven (a map or config file) rather than hard-coded handlers. This lets you add or relocate services without touching gateway logic, and makes the routing table easy to audit.
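To make the data-driven idea concrete, the following is a minimal sketch of resolving a request path against a routing table without Express. The helper name resolveRoute is hypothetical and not part of http-proxy-middleware. It assumes the input is a path with no query string, and it strips the matched prefix in the same way the loop above does.

```javascript
// Routing table: public path prefix -> internal upstream (same values as above).
const routes = {
  "/api/users": "http://users-service:3001",
  "/api/orders": "http://orders-service:3002",
  "/api/catalog": "http://catalog-service:3003",
};

// Hypothetical helper: returns the upstream URL for a request path,
// or null when no route matches.
function resolveRoute(table, requestPath) {
  // Try longer prefixes first so a more specific route wins over a shorter one.
  const prefixes = Object.keys(table).sort((a, b) => b.length - a.length);
  for (const prefix of prefixes) {
    // Match only on a segment boundary: "/api/users" must not match "/api/usersettings".
    if (requestPath === prefix || requestPath.startsWith(prefix + "/")) {
      return table[prefix] + requestPath.slice(prefix.length);
    }
  }
  return null;
}

console.log(resolveRoute(routes, "/api/orders/42")); // http://orders-service:3002/42
console.log(resolveRoute(routes, "/api/unknown")); // null
```

Because the table is plain data, it could equally be loaded from a config file and reviewed or diffed on its own.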

Authentication and rate limiting at the edge

Validating credentials once at the gateway means downstream services can trust an authenticated identity passed along in a header. Combine a JWT check with a rate limiter so abusive clients are stopped before they reach your services.

import express from "express";
import { rateLimit } from "express-rate-limit";
import jwt from "jsonwebtoken";

const app = express();

const limiter = rateLimit({
  windowMs: 60_000,
  limit: 100,
  standardHeaders: "draft-7",
  legacyHeaders: false,
});

app.use(limiter);

app.use((req, res, next) => {
  const header = req.headers.authorization ?? "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;
  if (!token) return res.status(401).json({ error: "missing token" });

  try {
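    // Pin the accepted algorithm (HS256 is assumed here because a shared secret is used).
    const payload = jwt.verify(token, process.env.JWT_SECRET, { algorithms: ["HS256"] });
    // Overwrite any client-supplied x-user-id with the verified subject.
    req.headers["x-user-id"] = payload.sub;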
    next();
  } catch {
    res.status(401).json({ error: "invalid token" });
  }
});

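Downstream services read x-user-id and skip token verification, trusting the gateway as the security boundary. That trust holds only if the services are unreachable from outside the cluster and the gateway overwrites any x-user-id header a client sends, as the middleware above does on every authenticated request.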
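To make the windowMs and limit options concrete, the sketch below counts requests per client in fixed time windows. It illustrates the idea only and is not how express-rate-limit is implemented. The name createFixedWindowLimiter and its injectable now clock are hypothetical.

```javascript
// Hypothetical fixed-window limiter: allows at most `limit` requests per key per window.
function createFixedWindowLimiter({ windowMs, limit, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }

  return function allow(key) {
    const t = now();
    const current = windows.get(key);
    // Start a fresh window if this key is new or its window has expired.
    if (!current || t - current.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // Over the limit: a gateway would answer 429 here.
  };
}

// Usage with a fake clock so the behaviour is deterministic.
let clock = 0;
const allow = createFixedWindowLimiter({ windowMs: 60_000, limit: 2, now: () => clock });
console.log(allow("client-a")); // true
console.log(allow("client-a")); // true
console.log(allow("client-a")); // false
clock = 60_000;
console.log(allow("client-a")); // true (a new window has started)
```

The counters in this sketch live in process memory and are never evicted, so each gateway instance would count independently. A horizontally scaled gateway would need a shared store for a limit to hold across instances.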

Response aggregation

Aggregation lets one client call replace several. The gateway fans out to multiple services in parallel with Promise.all and composes a single response, cutting round trips for the client.

import express from "express";

const app = express();

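// Fetch JSON from an upstream. fetch() resolves on 4xx/5xx responses, so check
// r.ok explicitly. The 2-second timeout is an assumed value.
async function getJson(url) {
  const r = await fetch(url, { signal: AbortSignal.timeout(2_000) });
  if (!r.ok) throw new Error(`upstream responded with ${r.status}`);
  return r.json();
}

app.get("/api/dashboard/:userId", async (req, res) => {
  // Encode the parameter so it cannot alter the upstream path.
  const userId = encodeURIComponent(req.params.userId);

  try {
    // Fan out in parallel; Promise.all rejects as soon as any call fails.
    const [user, orders, recommendations] = await Promise.all([
      getJson(`http://users-service:3001/${userId}`),
      getJson(`http://orders-service:3002/by-user/${userId}`),
      getJson(`http://catalog-service:3003/recommend/${userId}`),
    ]);

    res.json({ user, orders, recommendations });
  } catch (err) {
    res.status(502).json({ error: "upstream failure", detail: err.message });
  }
});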

app.listen(8080);

Example request and response:

GET /api/dashboard/7
{
  "user": { "id": "7", "name": "Ada Lovelace" },
  "orders": [ { "id": 42, "total": 89.0 } ],
  "recommendations": [ { "sku": "BK-1843", "title": "Notes on the Engine" } ]
}

Warning: A naive aggregator fails entirely if one upstream is down. Wrap individual calls and return partial data with Promise.allSettled when the UI can tolerate missing sections, rather than failing the whole request.
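One way to return partial data is sketched below. The aggregate helper is a hypothetical name. It takes an object of named async functions, runs them in parallel with Promise.allSettled, and reports which sections failed instead of rejecting the whole call. The stubbed functions stand in for the real upstream requests.

```javascript
// Hypothetical helper: runs named calls in parallel and keeps whatever succeeded.
async function aggregate(calls) {
  const names = Object.keys(calls);
  // The async wrapper turns synchronous throws into rejections as well.
  const results = await Promise.allSettled(names.map(async (name) => calls[name]()));

  const data = {};
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      data[names[i]] = result.value;
    } else {
      data[names[i]] = null; // Section is missing; the UI decides how to render that.
      failed.push(names[i]);
    }
  });
  return { data, failed };
}

// Usage with stubbed upstream calls.
aggregate({
  user: async () => ({ id: "7", name: "Ada Lovelace" }),
  orders: async () => [{ id: 42, total: 89.0 }],
  recommendations: async () => {
    throw new Error("catalog-service unavailable");
  },
}).then(({ data, failed }) => {
  console.log(failed); // [ 'recommendations' ]
  console.log(data.recommendations); // null
});
```

A route handler could return data with a 200 status when only optional sections appear in failed, and fall back to 502 when a section it considers essential (for example user) is missing.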

Backend-for-Frontend (BFF)

A single gateway often struggles to serve very different clients well: a mobile app wants small, tailored payloads, while a web SPA wants richer data. The Backend-for-Frontend variant runs a dedicated gateway per client type. Each BFF aggregates and shapes responses for exactly one frontend, owned by the team that builds that frontend.

            +-----------------+
 Mobile --> |  Mobile BFF     | --\
            +-----------------+    \
                                    >--> users / orders / catalog
            +-----------------+    /
 Web    --> |  Web BFF        | --/
            +-----------------+

This avoids a bloated one-size-fits-all gateway and lets each client evolve its API independently, at the cost of some duplicated routing logic.
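The difference between two BFFs often comes down to how the same upstream data is shaped. The sketch below uses made-up sample data and hypothetical function names (toMobileDashboard, toWebDashboard) to show a trimmed mobile payload next to a richer web payload. The email and items fields are assumptions for illustration.

```javascript
// Sample upstream data (field shapes are assumed for illustration).
const user = { id: "7", name: "Ada Lovelace", email: "ada@example.com" };
const orders = [
  { id: 42, total: 89.0, items: [{ sku: "BK-1843", qty: 1 }] },
  { id: 43, total: 12.5, items: [{ sku: "PN-0001", qty: 5 }] },
];

// Mobile BFF: small payload, just enough for a summary screen.
function toMobileDashboard(user, orders) {
  return {
    name: user.name,
    orderCount: orders.length,
    totalSpent: orders.reduce((sum, o) => sum + o.total, 0),
  };
}

// Web BFF: richer payload with full order details.
function toWebDashboard(user, orders) {
  return {
    user: { id: user.id, name: user.name, email: user.email },
    orders: orders.map((o) => ({ id: o.id, total: o.total, items: o.items })),
  };
}

console.log(toMobileDashboard(user, orders));
console.log(toWebDashboard(user, orders));
```

Each function would live in its own BFF, so the mobile team can drop or add fields without coordinating with the web team.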

Best Practices

Treat the gateway as dumb plumbing: routing, auth, aggregation, and rate limiting belong here — business logic does not.
Add timeouts and circuit breakers on every upstream call so a slow service cannot exhaust the gateway’s resources.
Propagate a correlation ID header (e.g. x-request-id) on entry so requests can be traced across services.
Use Promise.allSettled for aggregation when partial responses are acceptable, and fail fast only when all data is required.
Keep the gateway stateless so you can scale it horizontally behind a load balancer.
Run a BFF per client only when client needs genuinely diverge; otherwise a shared gateway is simpler to operate.
Never expose internal services directly — the gateway must be the only public ingress.
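The timeout and circuit-breaker advice above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: createBreaker is a hypothetical helper, and the threshold, cooldown, and 2-second timeout values are arbitrary. The breaker opens after a number of consecutive failures, rejects immediately while open, and lets a trial call through once the cooldown has elapsed.

```javascript
// Hypothetical minimal circuit breaker around any async function.
function createBreaker(fn, { threshold = 3, cooldownMs = 10_000, now = Date.now } = {}) {
  let failures = 0; // Consecutive failures seen so far.
  let openedAt = null; // Timestamp when the circuit last opened, or null if closed.

  return async function guarded(...args) {
    if (openedAt !== null && now() - openedAt < cooldownMs) {
      throw new Error("circuit open"); // Fail fast without touching the upstream.
    }
    try {
      const result = await fn(...args);
      failures = 0; // Any success closes the circuit again.
      openedAt = null;
      return result;
    } catch (err) {
      failures += 1;
      if (failures >= threshold) openedAt = now();
      throw err;
    }
  };
}

// Usage: wrap an upstream call that also carries its own timeout.
const getOrders = createBreaker((userId) =>
  fetch(`http://orders-service:3002/by-user/${encodeURIComponent(userId)}`, {
    signal: AbortSignal.timeout(2_000), // Abort if the upstream takes longer than 2 s.
  }).then((r) => r.json())
);
```

The state in this sketch is held per gateway instance, which fits the stateless-gateway advice: each instance learns independently that an upstream is unhealthy and no shared storage is required.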