Caching for Performance

Caching is the cheapest latency you will ever buy: serving a stored result avoids recomputation, repeated database round-trips, and redundant serialization. The trick is that caching is never one thing — it is a stack of independent layers, each with its own scope, lifetime, and invalidation rules. A well-tuned NestJS service typically combines an HTTP cache at the edge, an application cache in Redis, and query result caching at the data layer. This page shows how to apply each layer and, just as importantly, how to decide what not to cache.

The caching layers

Think of a request travelling from the browser down to the database. Each hop is an opportunity to short-circuit. The closer to the client you cache, the cheaper the hit — but the harder invalidation becomes, because you no longer control the copy.

Layer	Where it lives	Typical TTL	Best for
HTTP cache	Browser, CDN, reverse proxy	Seconds to hours	Public, mostly static GET responses
Application cache	Redis / in-memory	Seconds to minutes	Expensive computed values, hot lookups
Query result cache	ORM / database	Seconds	Repeated identical SQL within a request burst

The rule of thumb: cache as close to the client as the data’s volatility allows, and never cache user-private data in a shared layer.

HTTP caching with Cache-Control and ETag

HTTP caching lets browsers and CDNs reuse responses without touching your server at all. You control it with the Cache-Control header for freshness, and ETag for cheap revalidation. NestJS exposes the response object so you can set both, but for ETag it is cleaner to let the framework hash the body.

import { Controller, Get, Header, Param } from '@nestjs/common';
import { ArticlesService } from './articles.service';

@Controller('articles')
export class ArticlesController {
  constructor(private readonly articles: ArticlesService) {}

  @Get(':slug')
  // public: shared caches (CDN, proxy) may store it; max-age: fresh for 60 s;
  // stale-while-revalidate: a stale copy may be served for 300 s more while it is refetched
  @Header('Cache-Control', 'public, max-age=60, stale-while-revalidate=300')
  findOne(@Param('slug') slug: string) {
    return this.articles.findBySlug(slug);
  }
}

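ETag generation comes from the underlying HTTP adapter. With the default Express adapter, weak ETags are generated out of the box; to switch to strong validators, set the option once in main.ts. A client that already has the resource sends If-None-Match, and the server answers 304 Not Modified with an empty body when nothing changed.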

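// main.ts, inside the async bootstrap function
const app = await NestFactory.create(AppModule);
// Express-specific setting: emit strong validators instead of the default weak ones
app.getHttpAdapter().getInstance().set('etag', 'strong');
await app.listen(3000);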

Output:

$ curl -i http://localhost:3000/articles/intro
HTTP/1.1 200 OK
Cache-Control: public, max-age=60, stale-while-revalidate=300
ETag: "a3f9c1e8"

$ curl -i -H 'If-None-Match: "a3f9c1e8"' http://localhost:3000/articles/intro
HTTP/1.1 304 Not Modified
ETag: "a3f9c1e8"
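
The exchange above comes down to comparing the client's If-None-Match value with the resource's current validator. The following is a minimal sketch of that decision, assuming the validator has already been computed. The names matchesIfNoneMatch and revalidate are hypothetical and are not part of NestJS or Express; the HTTP adapter performs an equivalent check for you.

```typescript
// Illustrative only: with ETags enabled, the HTTP adapter makes this decision itself.
type Revalidation = { status: 304 } | { status: 200; body: string };

// Strip the weak-validator prefix so that W/"abc" and "abc" compare equal.
// If-None-Match uses weak comparison (RFC 9110).
function opaqueTag(etag: string): string {
  const trimmed = etag.trim();
  return trimmed.startsWith('W/') ? trimmed.slice(2) : trimmed;
}

function matchesIfNoneMatch(header: string | undefined, currentEtag: string): boolean {
  if (header === undefined || header.trim() === '') return false;
  if (header.trim() === '*') return true; // "*" matches any current representation
  const current = opaqueTag(currentEtag);
  // The header may list several validators.
  // Simplification: assumes the tags themselves contain no commas.
  return header.split(',').some((candidate) => opaqueTag(candidate) === current);
}

function revalidate(
  header: string | undefined,
  currentEtag: string,
  body: string,
): Revalidation {
  return matchesIfNoneMatch(header, currentEtag)
    ? { status: 304 } // the client's copy is still valid: send headers only
    : { status: 200, body }; // validator changed or absent: send the full body
}
```

Because the comparison is weak, a client holding W/"a3f9c1e8" still gets a 304 when the server's current validator is "a3f9c1e8".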

Never send Cache-Control: public on responses that depend on the authenticated user. A shared CDN can serve one user’s data to another. Use private (browser only) or no-store for anything personalized.
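
One way to make that rule hard to get wrong is to derive the header from an explicit policy rather than typing directive strings by hand. This is a sketch under assumptions: the CachePolicy shape and the cacheControlFor helper are hypothetical names introduced for illustration, not NestJS APIs.

```typescript
// Hypothetical policy object; the field names are assumptions for illustration.
interface CachePolicy {
  personalized: boolean; // does the body depend on the authenticated user?
  sensitive?: boolean; // must it never be written to any cache?
  maxAgeSeconds?: number; // freshness lifetime, defaults to 0
  staleWhileRevalidateSeconds?: number;
}

function cacheControlFor(policy: CachePolicy): string {
  // Sensitive responses are never stored anywhere, regardless of other fields.
  if (policy.sensitive) return 'no-store';

  // Personalized responses may live in the user's browser, never in a shared cache.
  const scope = policy.personalized ? 'private' : 'public';
  const directives = [scope, `max-age=${policy.maxAgeSeconds ?? 0}`];

  if (policy.staleWhileRevalidateSeconds !== undefined) {
    directives.push(`stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`);
  }
  return directives.join(', ');
}
```

In a controller, the returned string could be set on the response object, for example with res.setHeader on a response injected via @Res({ passthrough: true }). The point of the design is that public is only reachable when personalized is explicitly false.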

Application caching with Redis

For values that are expensive to compute but identical across users — a leaderboard, an exchange rate, an aggregated report — cache the result in Redis so every instance of your service shares it. NestJS ships @nestjs/cache-manager; pair it with the Keyv Redis store.

npm install @nestjs/cache-manager cache-manager @keyv/redis

import { Module } from '@nestjs/common';
import { CacheModule } from '@nestjs/cache-manager';
import { createKeyv } from '@keyv/redis';

@Module({
  imports: [
    CacheModule.registerAsync({
      isGlobal: true,
      useFactory: () => ({
        stores: [createKeyv('redis://localhost:6379')],
        ttl: 30_000, // default TTL in milliseconds
      }),
    }),
  ],
})
export class AppModule {}

Inject the cache manager and apply the cache-aside pattern: check the cache, fall through to the source on a miss, then populate the cache for next time.

import { Inject, Injectable } from '@nestjs/common';
import { CACHE_MANAGER } from '@nestjs/cache-manager';
import { Cache } from 'cache-manager';
import { RatesGateway } from './rates.gateway';

@Injectable()
export class RatesService {
  constructor(
    @Inject(CACHE_MANAGER) private readonly cache: Cache,
    private readonly gateway: RatesGateway,
  ) {}

  async getRate(pair: string): Promise<number> {
    const key = `rate:${pair}`;
    const cached = await this.cache.get<number>(key);
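    // A miss is reported as undefined or null depending on the cache-manager version
    if (cached !== undefined && cached !== null) {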
      return cached;
    }

    const fresh = await this.gateway.fetchRate(pair); // slow upstream call
    await this.cache.set(key, fresh, 15_000); // 15s TTL for volatile prices
    return fresh;
  }

  async invalidate(pair: string): Promise<void> {
    await this.cache.del(`rate:${pair}`);
  }
}

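The matching write path must invalidate. Whenever a mutation changes the source of truth, delete the affected keys immediately after the write commits. A Redis delete cannot join a database transaction, so a concurrent reader can still repopulate the key with the old value in a narrow window; the TTL bounds how long such an entry survives.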
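
The ordering on the write path can be sketched without Redis or a database. Everything below is an assumption made for illustration: MemoryCache stands in for the shared cache, the rates map stands in for the source of truth, and RateStore is a hypothetical class, not the RatesService above.

```typescript
// In-memory stand-in for the shared cache, so the sketch runs on its own.
class MemoryCache {
  private readonly entries = new Map<string, { value: number; expiresAt: number }>();

  async get(key: string): Promise<number | undefined> {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  async set(key: string, value: number, ttlMs: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlMs });
  }

  async del(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

class RateStore {
  private readonly rates = new Map<string, number>(); // stand-in source of truth
  private readonly cache: MemoryCache;

  constructor(cache: MemoryCache) {
    this.cache = cache;
  }

  // Read path: cache-aside, as in getRate.
  async read(pair: string): Promise<number | undefined> {
    const key = `rate:${pair}`;
    const cached = await this.cache.get(key);
    if (cached !== undefined) return cached;

    const fresh = this.rates.get(pair);
    if (fresh !== undefined) await this.cache.set(key, fresh, 15_000);
    return fresh;
  }

  // Write path: commit first, then drop the cached copy.
  async write(pair: string, value: number): Promise<void> {
    this.rates.set(pair, value); // 1. update the source of truth
    await this.cache.del(`rate:${pair}`); // 2. invalidate so the next read refetches
  }
}
```

Deleting the key rather than overwriting it keeps the write path simple: the next read repopulates the cache from the source of truth, so the cache never holds a value the store did not produce.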

Query result caching

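The lowest layer caches SQL results. With TypeORM you can opt a query into the cache without touching your service logic, provided query caching is enabled in the data source options (cache: true). By default the ORM keys the stored result by the generated query and its parameters; passing an explicit cache id, as below, replaces that key and lets you clear the entry by name.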

const orders = await this.dataSource
  .getRepository(Order)
  .createQueryBuilder('order')
  .where('order.customerId = :id', { id })
  .cache(`orders:${id}`, 5_000) // cache id + TTL in ms
  .getMany();

This is ideal for read-heavy bursts — a dashboard that fires the same aggregate query for every widget. Keep TTLs short (a few seconds); query caching is about deduplicating a stampede, not long-term storage.
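
A TTL cache only helps once the first result has been stored; identical queries that arrive while the first one is still running all miss. A complementary technique, which is not what TypeORM's .cache() does, is to coalesce concurrent calls for the same key into a single in-flight promise. A minimal sketch, where SingleFlight is a hypothetical helper name:

```typescript
// Coalesces concurrent calls for the same key into one in-flight promise.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const pending = this.inFlight.get(key);
    if (pending !== undefined) return pending; // join the call already running

    // Start the load and forget it once it settles, whether it succeeded or failed,
    // so a later call triggers a fresh load.
    const started = load().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, started);
    return started;
  }
}
```

Wrapping a query as flight.run(`orders:${id}`, () => loadOrders(id)), with loadOrders being whatever function runs the query, means that every widget requesting the same data during one burst shares a single database round-trip. Results are not retained after the promise settles, so this complements a short TTL rather than replacing it.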

Choosing TTLs and invalidation boundaries

Two questions decide every cache entry. First: how stale can this be before a user notices or it becomes wrong? That answer is your TTL. Reference data tolerates minutes; pricing or inventory tolerates seconds; a bank balance tolerates no staleness at all, so cache only the page around it, never the number itself.

Second: what event makes this entry wrong? That event defines your invalidation boundary. Prefer explicit deletion on write over long TTLs when correctness matters, and use TTL as a safety net for entries you cannot reliably invalidate. Group related keys under a common prefix (user:42:*) so a single mutation can clear everything it affects.
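
Prefix grouping can be sketched with a plain Map standing in for the cache. The helper names cacheKey and invalidatePrefix are hypothetical, and the key layout is an assumption chosen for illustration.

```typescript
// Build a deterministic key: the same inputs always yield the same string,
// regardless of the order in which the properties were written.
function cacheKey(namespace: string, parts: Record<string, string | number>): string {
  const suffix = Object.keys(parts)
    .sort()
    .map((name) => `${name}=${parts[name]}`)
    .join(':');
  return suffix ? `${namespace}:${suffix}` : namespace;
}

// Delete every entry whose key starts with the prefix; returns how many were removed.
function invalidatePrefix(store: Map<string, unknown>, prefix: string): number {
  let removed = 0;
  for (const key of Array.from(store.keys())) {
    if (key.startsWith(prefix)) {
      store.delete(key);
      removed += 1;
    }
  }
  return removed;
}
```

Here cacheKey('user:42', { sort: 'date', page: 2 }) yields user:42:page=2:sort=date, and invalidatePrefix(store, 'user:42:') clears every entry for that user. The trailing colon in the prefix matters: without it, keys for user:420 would be removed as well. Against Redis, the equivalent of the loop is a SCAN with a MATCH pattern, which walks the keyspace and is best kept off hot paths.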

Best practices

Layer caches deliberately: edge for public reads, Redis for shared compute, ORM cache for query bursts — do not rely on a single tier.
Set Cache-Control: private or no-store on anything user-specific; only truly public data may be public.
Always pair a TTL with explicit invalidation on write; treat TTL as a backstop, not the primary strategy.
Key cache entries deterministically (include all inputs) and namespace them so bulk invalidation is a prefix scan.
Keep query-cache TTLs in the single-digit seconds — their job is stampede protection, not persistence.
Measure hit ratio in production; as a rough guide, a cache that stays below about 80% hits can cost more in complexity and extra round-trips than it saves.
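
Measuring the hit ratio needs little more than two counters around the cache lookup. A minimal sketch, with HitRatioCounter as a hypothetical name; in a real service the counts would more likely be exported to whatever metrics system is already in place.

```typescript
// Tracks cache lookups and reports the fraction that were hits.
class HitRatioCounter {
  private hits = 0;
  private misses = 0;

  // Call once per lookup, passing whether the cache returned a value.
  record(hit: boolean): void {
    if (hit) {
      this.hits += 1;
    } else {
      this.misses += 1;
    }
  }

  // Ratio in [0, 1]; undefined until at least one lookup has been recorded.
  ratio(): number | undefined {
    const total = this.hits + this.misses;
    return total === 0 ? undefined : this.hits / total;
  }
}
```

In a cache-aside method such as getRate, calling record(true) on the early return and record(false) before the upstream call would be enough to feed it.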