Caching Strategies

Caching means keeping a copy of expensive-to-produce data somewhere fast so you don’t have to rebuild or refetch it every time. In a cloud system you rarely cache in just one place: you cache at the edge (close to users), in your application layer, and in front of your database. Done well, caching can cut latency from hundreds of milliseconds to single-digit milliseconds and reduce both database load and database cost. Done badly, it serves stale data and falls over under load. This page walks through the three layers and the patterns that keep caching from biting you.

The three caching layers

Think of a request travelling from a user’s browser to your data. You can short-circuit it at three points.

Layer	AWS service	What it caches	When to use
Edge	CloudFront (a content delivery network, or CDN)	Static files, images, API responses	Global users, public or cacheable content
Application / database	ElastiCache (managed Redis or Memcached)	Query results, sessions, computed values	Hot data read far more than it changes
DynamoDB	DAX (DynamoDB Accelerator)	DynamoDB item reads	Read-heavy DynamoDB tables needing microsecond reads

You can and often should use all three together. Each one removes load from the layer behind it.

CloudFront: caching at the edge

CloudFront is AWS’s CDN — a global network of edge locations that hold copies of your content near your users. A user in Tokyo gets served from a Tokyo edge location instead of crossing an ocean to your origin (the source server, like an S3 bucket or a load balancer).

When to use this: any publicly cacheable content, such as images, CSS, JavaScript, videos, and even GET API responses that are the same for many users. When NOT to: highly personalised, per-user responses (unless the cache key includes whatever distinguishes users, such as a cookie or header), or anything that must always be perfectly fresh.

Console steps

Open the CloudFront console and click Create distribution.
Set Origin domain to your S3 bucket or load balancer DNS name.
Under Cache key and origin requests, pick a cache policy. CachingOptimized is the sensible default for static assets.
Choose a Price class (cheaper classes skip the most expensive regions).
Click Create distribution and wait for Deployed status.

CLI equivalent

aws cloudfront create-distribution \
  --origin-domain-name my-assets.s3.amazonaws.com \
  --default-root-object index.html

Output (abridged):

{
  "Distribution": {
    "Id": "E1A2B3C4D5E6F7",
    "DomainName": "d111111abcdef8.cloudfront.net",
    "Status": "InProgress"
  }
}

Invalidation

When you change a file but its TTL (time to live — how long a cached copy is considered fresh) hasn’t expired, CloudFront keeps serving the old one. You invalidate to force a refresh:

aws cloudfront create-invalidation \
  --distribution-id E1A2B3C4D5E6F7 \
  --paths "/css/*" "/index.html"

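Cost note: the first 1,000 invalidation paths per month are free; after that each path costs about $0.005. A wildcard such as /css/* counts as a single path, so the real price of invalidating /* on every deploy is not the fee but the flushed cache: every object has to be refetched from the origin. Instead, add a content hash to filenames (e.g. app.9f8e7d.js) so new files have new URLs and old ones simply expire.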
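The content-hash naming above can be sketched in a few lines of Python. This is a minimal illustration, not what any particular build tool does: the function name hashed_name, the choice of SHA-256, and the 6-character digest length are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 6) -> str:
    """Insert a short content hash before the extension, e.g. app.js -> app.<hash>.js.

    hashed_name, SHA-256 and the 6-character length are illustrative choices.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    # Same directory, same extension; only the stem gains the hash.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    # Changing the file's bytes changes its URL, so no invalidation is needed.
    print(hashed_name("app.js", b"console.log('v1');"))
    print(hashed_name("app.js", b"console.log('v2');"))
```

Because the name depends only on the file's bytes, an unchanged file keeps its URL across deploys and stays cached, while a changed file gets a fresh URL that CloudFront has never seen.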

ElastiCache: caching at the app and database layer

ElastiCache is managed Redis or Memcached running in your VPC (Virtual Private Cloud — your private network in AWS). Your application reads from it instead of hitting the database for hot data. The two core patterns are cache-aside and write-through.

Cache-aside (lazy loading)

Your app checks the cache first. On a miss, it reads the database, stores the result in the cache, then returns it. The cache only ever holds data someone actually asked for.

read(key):
  value = cache.get(key)
  if value is null:                 # cache miss
      value = db.query(key)
      cache.set(key, value, ttl=300)
  return value

When to use this: the default for most read-heavy workloads. Downside: the first read of any key is always slow (a miss), and the cache can hold stale data until the TTL expires.
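Here is a runnable version of the same read path, as a sketch rather than production code. TTLCache is a hypothetical in-memory stand-in for Redis or Memcached (so the example runs without a server), and load_from_db is whatever function fetches the value from your database; both names are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for Redis/Memcached with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._items[key] = (self._clock() + ttl, value)


def read(key: str, cache: TTLCache, load_from_db: Callable[[str], Any],
         ttl: float = 300) -> Any:
    """Cache-aside: try the cache, fall back to the database, then populate the cache."""
    value = cache.get(key)
    if value is None:  # cache miss
        value = load_from_db(key)
        cache.set(key, value, ttl)
    return value


if __name__ == "__main__":
    cache = TTLCache()
    print(read("user:42", cache, lambda k: f"row for {k}"))  # miss: hits the "database"
    print(read("user:42", cache, lambda k: "never called"))  # hit: served from the cache
```

One design caveat of this sketch: because None signals a miss, a key that legitimately has no value in the database is looked up again on every read.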

Write-through

Every time you write to the database, you also update the cache as part of the same write path. As long as every write goes through that path, the cache is never stale for data that’s been written, but you end up caching data that may never be read.

When to use this: data that is written and then read back almost immediately, where staleness is unacceptable. Often combined with cache-aside.
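A minimal write-through sketch, combined with a cache-aside read for keys that were never written through it. Plain dicts stand in for the database and the cache, and the class name WriteThroughStore is hypothetical; a real version would call your database client and Redis instead.

```python
from typing import Any, Dict


class WriteThroughStore:
    """Illustrative only: dicts stand in for the real database and cache."""

    def __init__(self, db: Dict[str, Any], cache: Dict[str, Any]) -> None:
        self.db = db
        self.cache = cache

    def write(self, key: str, value: Any) -> None:
        # Database first: it is the source of truth. If this raises,
        # the cache is left untouched and cannot get ahead of the database.
        self.db[key] = value
        self.cache[key] = value

    def read(self, key: str) -> Any:
        if key in self.cache:
            return self.cache[key]
        value = self.db[key]  # cache-aside fallback; KeyError if the key is unknown
        self.cache[key] = value
        return value


if __name__ == "__main__":
    store = WriteThroughStore(db={}, cache={})
    store.write("price:sku-1", 10)
    store.write("price:sku-1", 12)
    print(store.read("price:sku-1"))  # the latest write, served from the cache
```

Writing to the database before the cache is a deliberate choice in this sketch: a failed database write never leaves a value in the cache that the database does not have.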

Pattern	Cache freshness	Write cost	Wasted cache space
Cache-aside	Stale until TTL	Cheap	Low (only read data)
Write-through	Always fresh	Extra write each time	Higher (caches unread data)

Creating an ElastiCache Redis cluster

aws elasticache create-cache-cluster \
  --cache-cluster-id app-cache \
  --engine redis \
  --cache-node-type cache.t4g.micro \
  --num-cache-nodes 1 \
  --security-group-ids sg-0a1b2c3d

Output (abridged):

{
  "CacheCluster": {
    "CacheClusterId": "app-cache",
    "CacheClusterStatus": "creating",
    "Engine": "redis",
    "CacheNodeType": "cache.t4g.micro"
  }
}

Console: open ElastiCache > Redis caches > Create, pick a node type, choose your VPC and a security group, then create. Cost note: a single cache.t4g.micro node runs roughly $12 per month on demand (the exact figure varies by region), which is usually far cheaper than scaling the database to absorb the same reads.

DAX: caching for DynamoDB

DAX (DynamoDB Accelerator) is a read-through and write-through cache built specifically for DynamoDB. Your app talks to DAX through the DAX client, which exposes the same API calls as DynamoDB, and DAX returns cached reads in microseconds instead of the single-digit milliseconds DynamoDB itself takes.

When to use this: read-heavy DynamoDB tables with the same items read over and over (product catalogues, leaderboards). When NOT to: write-heavy tables, workloads that need strongly consistent reads (DAX only caches eventually consistent reads and passes strongly consistent ones straight through to DynamoDB, so they gain nothing), or apps already using ElastiCache for the same data.

aws dax create-cluster \
  --cluster-name orders-dax \
  --node-type dax.t3.small \
  --replication-factor 3 \
  --iam-role-arn arn:aws:iam::111122223333:role/DaxRole \
  --subnet-group-name dax-subnets

Choosing TTLs and surviving the cache stampede

The hardest problem in caching is invalidation — knowing when a cached value is no longer true. Two failure modes hurt most.

Stale data: your TTL is too long and users see old prices or counts. Shorten the TTL, or use write-through / active invalidation for data that must be fresh.

Cache stampede (thundering herd): a popular key expires, and thousands of requests miss the cache at the same instant and all hammer the database to rebuild it — exactly when it’s busiest. Three defences:

Add jitter to TTLs. Instead of a fixed 300 seconds, use 300 + random(0, 60) so keys don’t all expire together.
Use a lock. The first request to miss takes a short-lived lock (e.g. a Redis SET with NX and an expiry, so a crashed holder cannot block everyone) and rebuilds the value; others briefly serve the old value or wait.
Degrade gracefully. If ElastiCache is unreachable, your code must fall through to the database, not crash. Treat the cache as an optimisation, never as the source of truth.
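The first two defences can be sketched as follows. This is a single-process illustration only: a per-key threading.Lock stands in for the Redis lock, a dict stands in for the cache, and the names jittered_ttl and SingleFlight are hypothetical. In this version, requests that lose the race wait for the rebuild rather than serving the old value.

```python
import random
import threading
from typing import Any, Callable, Dict


def jittered_ttl(base: int = 300, spread: int = 60) -> int:
    """Base TTL plus 0..spread random seconds, so keys do not all expire together."""
    return base + random.randint(0, spread)


class SingleFlight:
    """One caller rebuilds a missing key; concurrent callers wait and reuse the result.

    In-process stand-in for a distributed lock; not safe across multiple servers.
    """

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, threading.Lock] = {}

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_rebuild(self, key: str, cache: Dict[str, Any],
                       rebuild: Callable[[str], Any]) -> Any:
        value = cache.get(key)
        if value is not None:
            return value  # fast path: no locking on a hit
        with self._lock_for(key):
            value = cache.get(key)  # re-check: another thread may have rebuilt it
            if value is None:
                value = rebuild(key)
                cache[key] = value
            return value


if __name__ == "__main__":
    flight = SingleFlight()
    cache: Dict[str, Any] = {}
    print(flight.get_or_rebuild("top-products", cache, lambda k: ["a", "b"]))
    print(jittered_ttl())  # somewhere between 300 and 360
```

The re-check inside the lock is what turns thousands of simultaneous misses into a single database query: every waiter finds the freshly rebuilt value and returns without calling rebuild.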

Warning: never let a cache outage take down your whole app. Wrap cache calls in try/catch (or short timeouts) so a missing cache means “slower”, not “down”.
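A sketch of that wrapper, assuming the cache is reached through two callables, cache_get and cache_set (hypothetical names; in practice they would wrap your Redis client, configured with short timeouts). Any cache failure is logged and the request falls through to the database.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("cache")


def read_with_fallback(
    key: str,
    cache_get: Callable[[str], Optional[Any]],
    cache_set: Callable[[str, Any, int], None],
    load_from_db: Callable[[str], Any],
    ttl: int = 300,
) -> Any:
    """Cache-aside read where a cache failure means 'slower', not 'down'."""
    try:
        value = cache_get(key)
    except Exception:  # assumption: narrow this to your client's error types
        logger.warning("cache read failed for %r; falling back to the database", key)
        return load_from_db(key)
    if value is not None:
        return value
    value = load_from_db(key)
    try:
        cache_set(key, value, ttl)
    except Exception:
        logger.warning("cache write failed for %r; serving the uncached value", key)
    return value


if __name__ == "__main__":
    def broken_get(key: str) -> Any:
        raise ConnectionError("cache unreachable")

    def broken_set(key: str, value: Any, ttl: int) -> None:
        raise ConnectionError("cache unreachable")

    # Still answers from the database even though every cache call fails.
    print(read_with_fallback("user:42", broken_get, broken_set, lambda k: "from-db"))
```

Catching a broad Exception keeps the sketch short; in real code you would catch the specific connection and timeout errors your client raises, so that genuine bugs still surface.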

Best practices

Cache the hottest, most expensive-to-produce data first, and measure your slow queries before adding a cache everywhere.
Always set a TTL; an entry with no expiry is a future stale-data incident.
Add jitter to TTLs to prevent synchronised expiry and stampedes.
Pick cache-aside as your default; layer write-through only where staleness is unacceptable.
Use content hashes in CloudFront filenames so you almost never need invalidations.
Make the cache optional in code: a cache miss or outage must fall through to the origin.
Put ElastiCache and DAX inside your VPC with tight security groups — never expose a cache to the internet.