Blue/Green & Canary Deployments
Every time you ship new code you are taking a risk: the new version might be slower, buggy, or just broken. A deployment strategy is the plan for how you replace the old running code with the new code so that, if something goes wrong, your users barely notice. The three strategies you will meet on AWS are rolling, blue/green, and canary. This page explains what each one does, when to reach for it, and how to set it up on AWS with both the console and the CLI (Command Line Interface, the aws terminal tool).
The three strategies at a glance
Think of your application as running on a fleet of servers (or containers, or Lambda functions) behind a load balancer (a service that spreads incoming requests across many servers).
- Rolling / in-place updates the servers you already have, a few at a time, replacing old code with new.
- Blue/green stands up a second, complete copy of your environment with the new code, then flips all traffic to it at once.
- Canary sends a small slice of real traffic (say 10%) to the new version first, watches it, then shifts the rest.
| Strategy | Extra infra needed | Rollback speed | Downtime | Best for |
|---|---|---|---|---|
| Rolling / in-place | None (reuses servers) | Slow — must redeploy old code | Reduced capacity during deploy | Routine, low-risk changes; cost-sensitive workloads |
| Blue/green | Double (two full environments) | Instant — flip traffic back | Zero | High-stakes releases needing a clean, fast rollback |
| Canary | Slight (new version runs alongside) | Fast — stop the shift | Zero | Validating risky changes against real traffic gradually |
Rolling / in-place deployments
A rolling deployment takes your existing servers and updates them in batches. For example, with 10 servers it might update 2 at a time, waiting for each pair to pass health checks before moving on. No new infrastructure is created, so it is the cheapest option.
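To make the batching concrete, here is a minimal sketch that only prints the batches a rolling deploy would walk through. The function name `rolling_batches` is hypothetical, and the sketch does not model the health-check wait between batches.

```shell
# Print the batches a rolling deploy would walk through.
# $1 = total instance count, $2 = batch size (both illustrative inputs).
rolling_batches() {
  total=$1
  batch_size=$2
  first=1
  while [ "$first" -le "$total" ]; do
    last=$((first + batch_size - 1))
    # The final batch may be smaller than the others.
    if [ "$last" -gt "$total" ]; then
      last=$total
    fi
    echo "batch: instances $first-$last"
    first=$((last + 1))
  done
}

# The 10-server, 2-at-a-time example from the text: five batches.
rolling_batches 10 2
```

While one batch is being updated, the fleet serves traffic with that many fewer instances, which is the reduced capacity the comparison table refers to.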
When to use this: routine releases, internal tools, or when doubling your infrastructure cost is not worth it. When NOT to: when you cannot tolerate reduced capacity mid-deploy, or when a fast, clean rollback is critical — rolling back means redeploying the old code, which is slow.
With AWS CodeDeploy (the AWS service that automates code rollouts), an EC2/on-premises deployment group defaults to an in-place strategy.
aws deploy create-deployment \
  --application-name MyWebApp \
  --deployment-group-name Prod-InPlace \
  --deployment-config-name CodeDeployDefault.OneAtATime \
  --revision "revisionType=S3,s3Location={bucket=my-artifacts,key=app-v2.zip,bundleType=zip}"
Output:
{
  "deploymentId": "d-AB12CD34E"
}
The config CodeDeployDefault.OneAtATime updates one instance at a time. The other predefined EC2/on-premises configs are CodeDeployDefault.HalfAtATime and CodeDeployDefault.AllAtOnce.
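Once create-deployment has returned a deploymentId like the one in the output above, a script usually needs to know how the rollout ended. The following is a sketch, assuming the AWS CLI is installed and configured; the function names are hypothetical, while `aws deploy get-deployment` and the `aws deploy wait deployment-successful` waiter are standard CodeDeploy CLI commands.

```shell
# Print the current status of a CodeDeploy deployment (e.g. InProgress, Succeeded, Failed).
# $1 = deployment ID, such as the d-AB12CD34E shown in the sample output.
deployment_status() {
  aws deploy get-deployment \
    --deployment-id "$1" \
    --query "deploymentInfo.status" \
    --output text
}

# Block until the deployment succeeds. The waiter exits non-zero if the
# deployment fails, is stopped, or the waiter times out.
wait_for_deployment() {
  if aws deploy wait deployment-successful --deployment-id "$1"; then
    echo "deployment $1 succeeded"
  else
    echo "deployment $1 did not succeed: $(deployment_status "$1")" >&2
    return 1
  fi
}

# Usage (not run here): wait_for_deployment d-AB12CD34E
```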
Blue/green deployments
In a blue/green deployment, blue is your current live environment and green is a brand-new, identical environment running the new code. You test green privately. When you are happy, you switch the load balancer so 100% of traffic goes to green in one move. Blue stays untouched, so rolling back is just flipping traffic back to blue — near-instant.
When to use this: big releases, framework upgrades, anything where you want a guaranteed, immediate escape hatch. When NOT to: when budget is tight (you pay for two full environments during the cutover) or your app has heavy stateful resources that cannot easily be duplicated.
Doing it with CodeDeploy and an Application Load Balancer
An Application Load Balancer (ALB) routes traffic to target groups (a target group is a set of servers the load balancer can send requests to). Blue/green works by giving the ALB a green target group and then swapping which group it serves.
Console steps:
- Open the CodeDeploy console and choose Create deployment group.
- Set Deployment type to Blue/green.
- Under Environment configuration, choose Automatically copy Amazon EC2 Auto Scaling group so AWS provisions the green fleet for you.
- Under Load balancer, select your ALB target group (e.g. tg-blue).
- Pick a Deployment settings option (e.g. Reroute traffic immediately for a clean cutover).
- Set Original instances to Terminate after a wait time (e.g. 1 hour) so you keep blue around for rollback, then create the group.
CLI equivalent (create the deployment against a blue/green group):
aws deploy create-deployment \
  --application-name MyWebApp \
  --deployment-group-name Prod-BlueGreen \
  --revision "revisionType=S3,s3Location={bucket=my-artifacts,key=app-v2.zip,bundleType=zip}"
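The command above assumes the Prod-BlueGreen group already exists. As a rough CLI counterpart to the console steps, the sketch below creates such a group. The service role ARN and Auto Scaling group name are placeholders you would supply, and `create_blue_green_group` is a hypothetical wrapper name; the option values mirror the console choices (copy the Auto Scaling group, reroute immediately, terminate blue after 60 minutes).

```shell
# Create a blue/green deployment group behind the tg-blue target group.
# $1 = CodeDeploy service role ARN (placeholder)
# $2 = name of the Auto Scaling group that backs the blue fleet (placeholder)
create_blue_green_group() {
  aws deploy create-deployment-group \
    --application-name MyWebApp \
    --deployment-group-name Prod-BlueGreen \
    --service-role-arn "$1" \
    --auto-scaling-groups "$2" \
    --deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL \
    --load-balancer-info "targetGroupInfoList=[{name=tg-blue}]" \
    --blue-green-deployment-configuration '{
      "greenFleetProvisioningOption": {"action": "COPY_AUTO_SCALING_GROUP"},
      "deploymentReadyOption": {"actionOnTimeout": "CONTINUE_DEPLOYMENT"},
      "terminateBlueInstancesOnDeploymentSuccess": {
        "action": "TERMINATE",
        "terminationWaitTimeInMinutes": 60
      }
    }'
}

# Usage (not run here):
#   create_blue_green_group arn:aws:iam::111122223333:role/CodeDeployServiceRole my-web-asg
```

Here `CONTINUE_DEPLOYMENT` corresponds to rerouting traffic immediately, and the 60-minute wait is the window in which blue is still available for rollback.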
Cost warning: during the cutover window you are paying for both the blue and green fleets. If each environment is 10 m5.large instances (~$0.096/hr each), an extra fleet costs roughly $0.96/hr, or about $23/day if you forget to terminate blue. Always set an auto-termination time.
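The arithmetic behind that warning can be reproduced with a small helper. This is only a sketch: `idle_fleet_cost` is a hypothetical name, and the $0.096/hr price is the approximate figure quoted above, which varies by region and over time.

```shell
# Extra cost of keeping one idle fleet running.
# $1 = instance count, $2 = hourly price per instance in USD, $3 = hours kept alive.
idle_fleet_cost() {
  awk -v n="$1" -v price="$2" -v hours="$3" \
    'BEGIN { printf "%.2f\n", n * price * hours }'
}

idle_fleet_cost 10 0.096 1    # prints 0.96  (one hour of the extra fleet)
idle_fleet_cost 10 0.096 24   # prints 23.04 (a full day if blue is never terminated)
```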
Canary deployments
A canary deployment (named after the “canary in a coal mine”) shifts a small percentage of traffic to the new version, pauses to watch metrics like error rate and latency, then shifts the rest. If the canary looks unhealthy, you stop and roll back having only exposed a fraction of users.
When to use this: validating genuinely risky changes against live traffic without betting everything at once. When NOT to: very small fleets where “10%” rounds to one server anyway, or changes that must be all-or-nothing (e.g. a breaking API contract).
Canary with Lambda aliases
For AWS Lambda (serverless functions), a function alias (a named pointer to a version) can split traffic between two versions using weights.
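# Assumes version 2 has already been published (e.g. with aws lambda publish-version).
# Make v2 the alias's primary version, but route 90% of invocations back to v1,
# so only 10% reach the canary.
aws lambda update-alias \
  --function-name checkout \
  --name prod \
  --function-version 2 \
  --routing-config "AdditionalVersionWeights={1=0.9}"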
Output:
{
  "AliasArn": "arn:aws:lambda:us-east-1:111122223333:function:checkout:prod",
  "FunctionVersion": "2",
  "RoutingConfig": {
    "AdditionalVersionWeights": { "1": 0.9 }
  }
}
Here version 1 keeps 90% and version 2 (the canary) gets 10%. CodeDeploy can automate this with built-in configs like CodeDeployDefault.LambdaCanary10Percent5Minutes (10% for 5 minutes, then 100%).
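If you shift traffic by hand rather than through CodeDeploy, the canary ends in one of two ways: promote the new version to 100% or send everything back to the old one. A minimal sketch of both follows, using the same `checkout` function and `prod` alias as above; the wrapper names are hypothetical. Passing an empty `AdditionalVersionWeights={}` clears the weighted routing so the alias serves only its primary version.

```shell
# Point the prod alias at a single version and clear any weighted routing.
# $1 = the version that should receive 100% of invocations.
point_alias_at() {
  aws lambda update-alias \
    --function-name checkout \
    --name prod \
    --function-version "$1" \
    --routing-config 'AdditionalVersionWeights={}'
}

# Canary looked healthy: send all traffic to the new version.
promote_canary() { point_alias_at "$1"; }

# Canary looked unhealthy: send all traffic back to the previous version.
rollback_canary() { point_alias_at "$1"; }

# Usage (not run here):
#   promote_canary 2
#   rollback_canary 1
```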
Canary with ALB weighted target groups
For containers or EC2, an ALB listener rule can forward to two target groups with weights — tg-blue weighted 90 and tg-green weighted 10 — to mimic a canary at the load-balancer level.
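As an illustration of the weighted forward, the sketch below sets the weights on a listener's default action with `aws elbv2 modify-listener`. The ARNs are placeholders you would supply and `set_canary_weights` is a hypothetical helper name. For a non-default listener rule, `aws elbv2 modify-rule` with `--actions` accepts the same forward configuration.

```shell
# Split traffic between the blue and green target groups on a listener's default action.
# $1 = listener ARN, $2 = tg-blue ARN, $3 = tg-green ARN (all placeholders)
# $4 = share of traffic for green, as a weight out of 100
set_canary_weights() {
  listener_arn=$1
  blue_tg_arn=$2
  green_tg_arn=$3
  green_weight=$4
  blue_weight=$((100 - green_weight))
  aws elbv2 modify-listener \
    --listener-arn "$listener_arn" \
    --default-actions "[{
      \"Type\": \"forward\",
      \"ForwardConfig\": {
        \"TargetGroups\": [
          {\"TargetGroupArn\": \"$blue_tg_arn\", \"Weight\": $blue_weight},
          {\"TargetGroupArn\": \"$green_tg_arn\", \"Weight\": $green_weight}
        ]
      }
    }]"
}

# Usage (not run here): start at 10% green, later call again with 100 to finish the shift.
#   set_canary_weights "$LISTENER_ARN" "$TG_BLUE_ARN" "$TG_GREEN_ARN" 10
```

Calling the helper again with a weight of 0 for green is the load-balancer-level rollback.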
The database gotcha
Blue/green and canary make rollback feel free — just flip traffic. But this only rolls back code. It does not roll back a database schema change. If your green deploy ran a migration that dropped a column, switching traffic back to blue points old code at a database that no longer matches it, and you get errors anyway.
Gotcha: decouple database migrations from the traffic cutover. Make schema changes backward-compatible (additive only — add columns, never drop in the same release) and deploy them before the code that needs them. Remove old columns in a later, separate release once no version depends on them. This “expand and contract” pattern keeps both blue and green able to run against the same database.
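The two phases can be kept apart as separate migration steps. The sketch below only prints the SQL for each phase; the `orders` table and both column names are hypothetical, and the statements use generic SQL that may need adjusting for your database.

```shell
# Emit the SQL for one phase of an expand-and-contract migration.
# $1 = "expand" or "contract"
migration_sql() {
  case "$1" in
    expand)
      # Ship before the new code: additive only, so blue and green both keep working.
      echo "ALTER TABLE orders ADD COLUMN shipping_notes TEXT;"
      ;;
    contract)
      # Ship in a later release, once no running version reads the old column.
      echo "ALTER TABLE orders DROP COLUMN notes;"
      ;;
    *)
      echo "usage: migration_sql expand|contract" >&2
      return 1
      ;;
  esac
}

migration_sql expand
```

The output could be piped to whatever database client you use. The point of the split is that the contract step never ships in the same release as the traffic cutover.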
Best practices
- Default to rolling for everyday changes; reserve blue/green and canary for higher-risk releases.
- Always wire health checks and CloudWatch alarms into CodeDeploy so a failing deploy auto-rolls-back instead of needing a human.
- For blue/green, set an auto-termination timer on the old fleet — long enough to roll back, short enough to avoid paying for two environments indefinitely.
- Make every database migration backward-compatible and ship it separately from the code cutover (expand-and-contract).
- Be careful with stateful resources (in-memory sessions, local disk) — store session state externally (e.g. in a cache) so traffic can move freely.
- Start canaries small (5–10%) and bake for several minutes while watching error rate and latency before shifting the rest.
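The alarm and auto-rollback practice from the list above might be wired up like this. It is a sketch, assuming a CloudWatch alarm already exists; the alarm name is a placeholder and `enable_auto_rollback` is a hypothetical wrapper around `aws deploy update-deployment-group`.

```shell
# Attach a CloudWatch alarm to a deployment group and roll back automatically
# when a deployment fails or the alarm fires mid-deploy.
# $1 = deployment group name, $2 = existing CloudWatch alarm name (placeholder)
enable_auto_rollback() {
  aws deploy update-deployment-group \
    --application-name MyWebApp \
    --current-deployment-group-name "$1" \
    --alarm-configuration "enabled=true,alarms=[{name=$2}]" \
    --auto-rollback-configuration \
      "enabled=true,events=DEPLOYMENT_FAILURE,DEPLOYMENT_STOP_ON_ALARM"
}

# Usage (not run here): enable_auto_rollback Prod-BlueGreen checkout-5xx-rate
```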