When to Go Serverless
Serverless is powerful, but it is not a default to reach for on every project. The right question is not “is serverless cool?” but “does my workload’s shape match what serverless does well?” This page gives you a framework for deciding when to pick serverless (here, mostly AWS Lambda, a service that runs your code without you managing servers) and when a container or a plain virtual machine will serve you better or cost less.
What serverless is really good at
Serverless shines when your work arrives as discrete events and your traffic is uneven. You pay per request and per millisecond of run time, with nothing charged while a function sits idle, and the platform scales from zero to thousands of parallel executions automatically. There are no servers to patch, no operating system to harden, and no capacity to guess at in advance.
The strongest fits are:
- Event-driven work. A file lands in Amazon S3 (Simple Storage Service, AWS object storage), a message arrives on a queue, a row changes in a database. Each event triggers one short function. This is the classic serverless sweet spot.
- Spiky or unpredictable traffic. A coupon-code endpoint that is quiet for days and then hammered during a sale. With servers you pay 24/7 for the peak; with Lambda you pay for the spike and nothing in between.
- Glue logic. Small pieces of code that connect AWS services together: resize an image, transform a record, forward a notification. These rarely justify a running server.
- Low operations overhead. A small team that wants to ship features, not run a fleet. Serverless removes most of the undifferentiated work of keeping servers alive.
Tip: Serverless is best when your traffic is variable and your priority is speed-to-market. If you can ship a feature this week without thinking about server capacity, that velocity is often worth more than a marginal cost difference.
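To make the event-driven case concrete, here is a minimal sketch of a Python Lambda handler for the "file lands in S3" trigger. The function name `handler`, the bucket name, and the object key in the sample event are hypothetical; the event layout follows the S3 notification format, where each record carries a bucket name and a URL-encoded object key.

```python
import urllib.parse


def handler(event, context):
    """Handle one batch of S3 "object created" notifications."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event payloads (a space arrives as "+").
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Real work (resize, transform, forward) would go here.
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}


if __name__ == "__main__":
    # Hypothetical sample event for a local dry run; names are illustrative only.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-uploads"},
                    "object": {"key": "photos/my+cat.jpg"}}}
        ]
    }
    print(handler(sample_event, None))
```

Each invocation handles its records and exits, which is why this shape of work needs no running server between events.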
Where serverless struggles
Serverless has hard edges. Knowing them up front saves you a painful migration later.
- Steady, high throughput. If your service runs flat-out around the clock, you are essentially renting compute by the millisecond at a premium. A right-sized container or EC2 (Elastic Compute Cloud, AWS virtual servers) fleet is usually cheaper at sustained load.
- Long-running jobs. A single Lambda invocation can run for at most 15 minutes. Video encoding, large batch ETL (Extract, Transform, Load), or a long training job will hit that wall.
- Ultra-low latency. Cold starts (the delay while Lambda creates a fresh execution environment, which happens on the first request and whenever it scales out) typically add from under 100 milliseconds to more than a second, depending on runtime and package size. For consistent single-digit-millisecond responses, a warm container behind a load balancer is more predictable.
- Heavy local state. Lambda is stateless and its local disk is temporary. Workloads that keep large in-memory caches or local data between requests fit servers better.
- Large dependencies. Deployment size is capped: 250 MB unzipped for a .zip package, including all layers, or up to 10 GB when the function is packaged as a container image. Huge native libraries or ML models can be awkward to fit.
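The hard limits in the list above can be turned into a quick pre-flight check. This is a sketch only: the `Workload` fields and the `lambda_blockers` helper are hypothetical names, and the thresholds are the limits quoted above (15 minutes, 250 MB unzipped, 10 GB image), which you should confirm against current AWS quotas.

```python
from dataclasses import dataclass
from typing import List

MAX_RUN_SECONDS = 15 * 60      # per-invocation time limit quoted above
MAX_ZIP_UNZIPPED_MB = 250      # .zip package, layers included
MAX_IMAGE_MB = 10 * 1024       # container image limit (10 GB)


@dataclass
class Workload:
    """Hypothetical description of a job you are considering for Lambda."""
    max_run_seconds: float
    package_mb: float
    needs_local_state: bool = False


def lambda_blockers(w: Workload) -> List[str]:
    """Return the reasons this workload does not fit Lambda (empty list if none)."""
    blockers = []
    if w.max_run_seconds > MAX_RUN_SECONDS:
        blockers.append("runs longer than the 15-minute limit")
    if w.package_mb > MAX_IMAGE_MB:
        blockers.append("package exceeds the 10 GB container image limit")
    elif w.package_mb > MAX_ZIP_UNZIPPED_MB:
        blockers.append("too large for a .zip package; needs a container image")
    if w.needs_local_state:
        blockers.append("relies on local state that Lambda does not keep")
    return blockers


if __name__ == "__main__":
    print(lambda_blockers(Workload(max_run_seconds=40 * 60, package_mb=120)))
    print(lambda_blockers(Workload(max_run_seconds=30, package_mb=120)))
```

Latency and cost are left out on purpose: they are trade-offs to model, not hard stops.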
The cost gotcha at high volume
This is the trap teams fall into. Serverless looks free when traffic is light, so people assume it is always cheapest. It is not.
Lambda bills per request plus GB-seconds (configured memory in GB multiplied by run time in seconds). At low or bursty volume that is a bargain. At high sustained volume, those per-request and per-GB-second charges add up and can cost several times what a right-sized container or EC2 fleet costs for the same work, because servers amortize their fixed cost across constant use.
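The billing formula can be sketched in a few lines. The two prices below are illustrative assumptions (a commonly published x86 list price); they vary by region and architecture and change over time, and the sketch ignores the free tier. The request counts, 200 ms duration, and 512 MB memory size are hypothetical inputs.

```python
# Assumed list prices, for illustration only; check current pricing for your region.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD
PRICE_PER_GB_SECOND = 0.0000166667  # USD


def lambda_monthly_cost(requests, avg_duration_ms, memory_mb):
    """Estimate monthly Lambda cost: request charge plus GB-second charge."""
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_charge = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_charge = gb_seconds * PRICE_PER_GB_SECOND
    return request_charge + compute_charge


if __name__ == "__main__":
    # Hypothetical function: 200 ms average run time, 512 MB of memory.
    for requests in (2_000_000, 50_000_000):
        cost = lambda_monthly_cost(requests, avg_duration_ms=200, memory_mb=512)
        print(f"{requests:>12,} requests/month -> ${cost:,.2f}")
```

The result grows linearly with volume, duration, and memory, so doubling any one of them roughly doubles the bill. That linear growth is what an always-on server does not have.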
| Scenario | Best fit | Why |
|---|---|---|
| 2M requests/month, spiky | Lambda | Pay only for spikes; scales to zero |
| 50M requests/month, steady | ECS/Fargate or EC2 | Constant load makes fixed-cost servers cheaper |
| Cron-style job, every few minutes | Lambda | Trivial work, no idle server |
| 24/7 CPU-bound service | EC2 / containers | No cold starts, cheaper at full utilization |
| Job that runs 40 minutes | Containers / AWS Batch | Exceeds Lambda’s 15-min limit |
A rough rule: if a function is busy more than ~40-60% of the time every hour of the day, model the cost against a container before committing. The crossover varies, so always estimate with the AWS Pricing Calculator.
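As a rough way to apply that rule, the sketch below computes how busy a function is and the request volume at which Lambda would cost the same as a fixed-price container. Everything here is an assumption for illustration: the helper names, the default prices (illustrative list prices that vary by region), and the $30/month container figure. It is not a substitute for the AWS Pricing Calculator.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600


def busy_fraction(requests_per_month, avg_duration_ms):
    """Fraction of the month spent running code.

    Values above 1.0 mean more than one execution is running on average.
    """
    busy_seconds = requests_per_month * avg_duration_ms / 1000
    return busy_seconds / SECONDS_PER_MONTH


def break_even_requests(container_monthly_cost, avg_duration_ms, memory_mb,
                        price_per_million_requests=0.20,
                        price_per_gb_second=0.0000166667):
    """Monthly request count at which Lambda cost equals the container cost."""
    per_request = (price_per_million_requests / 1_000_000
                   + (avg_duration_ms / 1000) * (memory_mb / 1024) * price_per_gb_second)
    return container_monthly_cost / per_request


if __name__ == "__main__":
    # Hypothetical inputs: a $30/month container, 200 ms requests, 512 MB memory.
    print(f"busy fraction: {busy_fraction(5_000_000, 200):.2f}")
    print(f"break-even:    {break_even_requests(30, 200, 512):,.0f} requests/month")
```

Below the break-even volume Lambda is the cheaper option under these assumptions; above it, the container wins, provided it can actually absorb the load at that size.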
Warning: Do not pick serverless as a blanket default to avoid “managing servers.” At high constant volume that decision can quietly multiply your monthly bill. Match the architecture to the traffic shape, not to fashion.
Compute options compared
| Factor | Lambda (serverless) | Fargate (serverless containers) | EC2 (virtual machines) |
|---|---|---|---|
| Scale to zero | Yes | No (needs at least 1 running task to serve traffic) | No |
| Max run time | 15 minutes | Unlimited | Unlimited |
| Cold starts | Yes (mitigable) | None per request; new tasks take time to start | None when warm |
| Ops effort | Lowest | Low-medium | Highest |
| Best traffic shape | Spiky / event-driven | Variable, longer tasks | Steady, high utilization |
| Cost at high sustained load | Highest | Medium | Lowest (if well sized) |
A quick way to check your current pattern
Before deciding, look at how your existing or expected traffic behaves. If you already run something, CloudWatch (AWS monitoring) can show you the request pattern.
Console steps
- Open the CloudWatch console and choose Metrics > All metrics.
- Pick the Lambda (or ApiGateway) namespace and select Invocations or Count.
- Set the period to 1 hour over the last 2 weeks and read the shape: flat means steady, jagged means spiky.
AWS CLI
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Invocations \
  --dimensions Name=FunctionName,Value=order-processor \
  --start-time 2026-06-01T00:00:00Z \
  --end-time 2026-06-15T00:00:00Z \
  --period 3600 \
  --statistics Sum
Output:
{
  "Label": "Invocations",
  "Datapoints": [
    { "Timestamp": "2026-06-13T14:00:00Z", "Sum": 184213.0, "Unit": "Count" },
    { "Timestamp": "2026-06-13T15:00:00Z", "Sum": 612944.0, "Unit": "Count" },
    { "Timestamp": "2026-06-13T16:00:00Z", "Sum": 9821.0, "Unit": "Count" }
  ]
}
A pattern that swings from fewer than 10,000 to more than 600,000 invocations per hour is a textbook serverless fit. A line that stays flat near its peak every hour of the day is a sign to price out containers.
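If you would rather not eyeball the graph, a peak-to-mean ratio gives a crude numeric read of the same shape. The sketch below parses output in the format shown above; the helper names and the 2.0 threshold are assumptions chosen for illustration, not an AWS convention, and three datapoints are far too few for a real decision.

```python
import json


def peak_to_mean(datapoints):
    """Ratio of the busiest hour to the average hour."""
    sums = [dp["Sum"] for dp in datapoints]
    if not sums:
        raise ValueError("no datapoints to analyze")
    return max(sums) / (sum(sums) / len(sums))


def traffic_shape(datapoints, spiky_threshold=2.0):
    """Label traffic "spiky" when the peak hour is well above the average hour."""
    return "spiky" if peak_to_mean(datapoints) >= spiky_threshold else "steady"


if __name__ == "__main__":
    # The sample output from above, as it would be read from the CLI.
    response = json.loads("""
    {
      "Label": "Invocations",
      "Datapoints": [
        {"Timestamp": "2026-06-13T14:00:00Z", "Sum": 184213.0, "Unit": "Count"},
        {"Timestamp": "2026-06-13T15:00:00Z", "Sum": 612944.0, "Unit": "Count"},
        {"Timestamp": "2026-06-13T16:00:00Z", "Sum": 9821.0, "Unit": "Count"}
      ]
    }
    """)
    print(traffic_shape(response["Datapoints"]))  # spiky
```

Run it over the full two weeks of hourly data rather than a handful of hours, and treat the label as a prompt to price things out, not as a verdict.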
Best practices
- Start with the traffic shape. Spiky and event-driven leans serverless; flat and high-volume leans containers or EC2.
- Estimate cost at projected peak volume, not at today’s trickle, using the AWS Pricing Calculator before you commit.
- Keep Lambda for work that finishes within 15 minutes; route longer jobs to AWS Batch or ECS, or use Step Functions to orchestrate smaller steps.
- Plan for cold starts if latency matters: keep functions small, use provisioned concurrency, or choose a faster runtime.
- Mix and match. Many strong systems use Lambda for glue and event handling while running steady core services on containers.
- Favor speed-to-market early. Ship on serverless to validate, then migrate hot paths to containers only when the numbers justify it.