Lifecycle Hooks & Warm Pools
When an Auto Scaling group (a service that adds or removes servers automatically based on load) launches a new instance, that instance often is not ready to serve traffic the instant it boots. It may need to install software, download config, or warm a cache first. Lifecycle hooks let you pause an instance at a key moment so you can run that setup or do a clean shutdown. Warm pools go a step further: they keep pre-initialized instances on standby so scale-out is almost instant. Both features help you avoid the classic problem of sending real users to a half-baked server.
What is a lifecycle hook?
A lifecycle hook is a checkpoint in the life of an Auto Scaling Group (ASG) instance. The ASG pauses the instance in a wait state and waits for you to say “I’m done.” There are two hook types:
- Launch hook: fires when a new instance starts, before it enters service. The instance sits in Pending:Wait while you bootstrap it (install agents, register with a service, pull secrets).
- Terminate hook: fires when an instance is about to be removed. The instance sits in Terminating:Wait so you can drain connections, flush logs, or back up data before it disappears.
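While an instance is paused, the ASG will not move it forward on its own. You must call the complete-lifecycle-action API to release it. If you don’t, the hook eventually times out and the ASG applies the default result you configured: ABANDON (the default, which terminates a launching instance) or CONTINUE (proceed anyway). For a terminate hook the instance is terminated either way; ABANDON only skips any remaining hooks, while CONTINUE lets them run.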
Gotcha: A forgotten complete-lifecycle-action call is the #1 lifecycle hook bug. The instance stays in Pending:Wait for the full heartbeat timeout (one hour by default, 7200 seconds at most; repeated heartbeats can stretch the total wait to 48 hours or 100 times the timeout, whichever is smaller), so your scale-out appears frozen. Always pair every hook with code that signals completion, and set a sane timeout (300 to 900 seconds is typical).
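The timeout arithmetic in this gotcha is easy to misjudge, so here is a small sketch of it. It assumes the limits described in the Auto Scaling documentation: the total wait is capped at the smaller of 48 hours or 100 times the heartbeat timeout. The function name max_wait_seconds is made up for this example.

```shell
# Hypothetical helper: longest time an instance can stay in a wait state
# for a given heartbeat timeout, even if you keep sending heartbeats.
# Assumes the documented cap: min(48 hours, 100 x heartbeat timeout).
max_wait_seconds() {
  timeout="$1"
  cap=$((48 * 3600))               # 48 hours in seconds
  by_heartbeat=$((timeout * 100))  # 100 heartbeat periods
  if [ "$by_heartbeat" -lt "$cap" ]; then
    echo "$by_heartbeat"
  else
    echo "$cap"
  fi
}

max_wait_seconds 600    # prints 60000 (about 16.7 hours)
max_wait_seconds 7200   # prints 172800 (the 48-hour cap)
```

With a 600-second timeout, a hook that is never completed but keeps receiving heartbeats could still hold an instance for most of a day, which is why the completion call matters more than the timeout value.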
When to use a lifecycle hook
Use a launch hook when an instance needs guaranteed setup before it takes traffic — for example, downloading a large model file or registering with a configuration server. Use a terminate hook when losing in-flight work is costly, such as draining a queue worker or uploading the last batch of metrics. Do not use a hook for setup that finishes in a few seconds — plain user-data scripts or a baked AMI (Amazon Machine Image, a server template) are simpler.
Creating a lifecycle hook
Console steps
- Open the EC2 console and go to Auto Scaling Groups.
- Select your group, then open the Instance management tab.
- Under Lifecycle hooks, choose Create lifecycle hook.
- Set Lifecycle transition to Instance launch or Instance terminate.
- Set Heartbeat timeout (e.g. 600 seconds) and Default result (ABANDON or CONTINUE).
- Choose Create.
CLI equivalent
aws autoscaling put-lifecycle-hook \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name bootstrap-hook \
--lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
--heartbeat-timeout 600 \
--default-result ABANDON
Output:
(no output on success; exit code 0)
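The command above creates a launch hook. A terminate hook uses the same call with a different transition value. The sketch below wraps both cases in one function; put_hook and the hook name drain-hook are hypothetical names for illustration, while web-asg is the group from the example above.

```shell
# Sketch: create a launch or terminate hook with one wrapper.
# put_hook is a hypothetical helper, not part of the AWS CLI.
put_hook() {
  # $1 = hook name, $2 = "launch" or "terminate",
  # $3 = heartbeat timeout in seconds, $4 = default result (optional)
  case "$2" in
    launch)    transition="autoscaling:EC2_INSTANCE_LAUNCHING" ;;
    terminate) transition="autoscaling:EC2_INSTANCE_TERMINATING" ;;
    *) echo "unknown transition: $2" >&2; return 2 ;;
  esac
  aws autoscaling put-lifecycle-hook \
    --auto-scaling-group-name web-asg \
    --lifecycle-hook-name "$1" \
    --lifecycle-transition "$transition" \
    --heartbeat-timeout "$3" \
    --default-result "${4:-ABANDON}"
}

# Example call (needs AWS credentials, so it is left commented out):
# put_hook drain-hook terminate 300
```

Rejecting unknown transition names up front avoids sending a malformed request to the API.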
When the instance enters the wait state, your bootstrap script (often triggered by an Amazon EventBridge rule or an in-instance agent) finishes the work and then signals completion:
aws autoscaling complete-lifecycle-action \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name bootstrap-hook \
--lifecycle-action-result CONTINUE \
--instance-id i-0a1b2c3d4e5f
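One way to make sure the completion call is never skipped is to wrap the setup step so that every exit path sends a result. This is a minimal sketch, not a definitive implementation: signal_hook and bootstrap_and_signal are hypothetical helper names, and the group and hook names are the ones used in the examples above.

```shell
# Sketch: run a setup command and always report a result to the hook.
ASG_NAME="${ASG_NAME:-web-asg}"
HOOK_NAME="${HOOK_NAME:-bootstrap-hook}"

signal_hook() {
  # $1 = instance ID, $2 = CONTINUE or ABANDON
  aws autoscaling complete-lifecycle-action \
    --auto-scaling-group-name "$ASG_NAME" \
    --lifecycle-hook-name "$HOOK_NAME" \
    --lifecycle-action-result "$2" \
    --instance-id "$1"
}

bootstrap_and_signal() {
  # $1 = instance ID, remaining arguments = the setup command to run
  bs_instance="$1"; shift
  if "$@"; then
    signal_hook "$bs_instance" CONTINUE   # setup succeeded
  else
    signal_hook "$bs_instance" ABANDON    # setup failed: give up on it
    return 1
  fi
}

# Example call (the script path is a placeholder):
# bootstrap_and_signal i-0a1b2c3d4e5f /path/to/your-bootstrap.sh
```

Sending ABANDON on failure lets the ASG replace the instance right away instead of waiting for the timeout.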
If setup is still running and needs more time, send a heartbeat to restart the timeout timer instead of letting it expire:
aws autoscaling record-lifecycle-action-heartbeat \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name bootstrap-hook \
--instance-id i-0a1b2c3d4e5f
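For a task whose duration is hard to predict, the heartbeat can be sent in a loop while the task runs. The sketch below assumes a hypothetical helper named run_with_heartbeat and reuses the group and hook names from the examples above. Choose an interval comfortably shorter than the hook's heartbeat timeout.

```shell
# Sketch: keep a lifecycle action alive while a long setup task runs.
run_with_heartbeat() {
  # $1 = instance ID, $2 = seconds between heartbeats, rest = task command
  hb_instance="$1"; hb_interval="$2"; shift 2
  "$@" &                         # start the long task in the background
  hb_pid=$!
  while kill -0 "$hb_pid" 2>/dev/null; do
    # The task is still running, so extend the wait period.
    aws autoscaling record-lifecycle-action-heartbeat \
      --auto-scaling-group-name web-asg \
      --lifecycle-hook-name bootstrap-hook \
      --instance-id "$hb_instance" >/dev/null
    sleep "$hb_interval"
  done
  wait "$hb_pid"                 # return the task's own exit status
}

# Example call (the script path is a placeholder):
# run_with_heartbeat i-0a1b2c3d4e5f 120 /path/to/long-download.sh
```

The loop only notices that the task has finished after the current sleep ends, so a very long interval delays the completion signal by up to that amount.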
What is a warm pool?
A warm pool is a group of pre-initialized instances that the ASG keeps ready in advance. Instead of launching a brand-new instance from scratch during a traffic spike (which can take minutes once you add boot time, bootstrap, and health checks), the ASG pulls a ready instance out of the pool in seconds.
Warm pool instances can sit in one of three states:
| Pool state | What it means | Cost |
|---|---|---|
| Stopped | Instance is fully prepared, then stopped. Cheapest: you pay only for EBS storage. | Lowest |
| Running | Instance stays on, ready instantly. Fastest, but you pay full compute. | Highest |
| Hibernated | RAM is saved to disk, so the instance resumes with its caches still warm. | Low-medium |
When demand rises, a pool instance is started (or resumed) and moved into the live ASG. Pair this with a launch lifecycle hook so the slow setup happens once, while in the pool, not on the hot path.
When to use a warm pool
Use a warm pool when your instances are slow to become ready (long boot, big downloads, JIT warm-up) and you face sudden, spiky traffic where every second of cold-start latency hurts. Do not bother if your instances boot and pass health checks in under a minute, or if your load grows smoothly — the standby cost won’t pay off.
Cost tip: A Stopped warm pool costs only EBS storage, roughly $0.08/GB-month for gp3 (prices vary by Region), so a 30 GB root volume is about $2.40/month per standby instance. That is far cheaper than running idle compute, and usually trivial next to the revenue lost to a slow scale-out.
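The same arithmetic scales to any pool size. This is a rough estimate only: standby_cost is a hypothetical helper, and the price argument should come from current pricing for your Region and volume type.

```shell
# Hypothetical helper: monthly EBS cost of a Stopped warm pool.
# $1 = volume size in GB, $2 = price per GB-month, $3 = standby instances
standby_cost() {
  awk -v gb="$1" -v price="$2" -v n="$3" \
    'BEGIN { printf "%.2f\n", gb * price * n }'
}

standby_cost 30 0.08 1   # prints 2.40
standby_cost 30 0.08 2   # prints 4.80
```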
Creating a warm pool (CLI)
aws autoscaling put-warm-pool \
--auto-scaling-group-name web-asg \
--pool-state Stopped \
--min-size 2 \
--max-group-prepared-capacity 10
Output:
(no output on success; exit code 0)
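Because put-warm-pool prints nothing, it helps to check that the pool actually fills. The sketch below counts pool instances in a given state with describe-warm-pool; warm_pool_count is a hypothetical helper name, and it assumes the response lists instances under Instances with a LifecycleState field.

```shell
# Sketch: count warm pool instances in a given lifecycle state.
warm_pool_count() {
  # $1 = ASG name, $2 = lifecycle state such as Warmed:Stopped
  # Text output puts the states on one tab-separated line, so split
  # them onto separate lines and count exact matches.
  aws autoscaling describe-warm-pool \
    --auto-scaling-group-name "$1" \
    --query 'Instances[].LifecycleState' \
    --output text |
    tr '\t' '\n' | grep -c -x "$2"
}

# Example call (needs AWS credentials, so it is left commented out):
# warm_pool_count web-asg Warmed:Stopped
```

Note that grep exits non-zero when the count is 0, so the function's exit status also reports whether any instance matched.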
In the console: open the ASG, go to the Instance management tab, find Warm pool, choose Create warm pool, then set the pool state, minimum size, and prepared capacity.
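A launch hook (autoscaling:EC2_INSTANCE_LAUNCHING) also fires while an instance is being prepared for the warm pool, so your heavy setup is done before the instance ever serves a request. The same hook fires again when the instance later moves from the pool into service, so keep the bootstrap idempotent and make the second pass quick.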
Best Practices
- Always call complete-lifecycle-action from your bootstrap/teardown logic, and add error handling so a script crash still signals ABANDON rather than hanging.
- Keep heartbeat timeouts as short as the work allows; use record-lifecycle-action-heartbeat for legitimately long tasks instead of a huge timeout.
- Bake as much as possible into a custom AMI so hooks and warm-pool prep do less work and run faster.
- Prefer Stopped warm pools to control cost; switch to Running or Hibernated only when start time is still too slow.
- Set max-group-prepared-capacity so the pool plus live group never exceeds your ASG max size and budget.
- Test the full flow in a staging ASG: trigger a scale-out, watch instances move from Warmed:Stopped to InService, and confirm health checks pass.
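For the staging test in the last item, a small polling loop can watch one instance until it reaches InService. This is a sketch under assumptions: wait_for_state is a hypothetical helper, and it relies on describe-auto-scaling-instances reporting the instance's LifecycleState.

```shell
# Sketch: poll until an instance reaches the wanted lifecycle state.
wait_for_state() {
  # $1 = instance ID, $2 = target state, $3 = max attempts, $4 = delay (s)
  attempt=1
  while [ "$attempt" -le "$3" ]; do
    state=$(aws autoscaling describe-auto-scaling-instances \
      --instance-ids "$1" \
      --query 'AutoScalingInstances[0].LifecycleState' \
      --output text)
    echo "attempt $attempt: $state"
    [ "$state" = "$2" ] && return 0
    sleep "$4"
    attempt=$((attempt + 1))
  done
  return 1                       # gave up before reaching the target
}

# Example call (needs AWS credentials, so it is left commented out):
# wait_for_state i-0a1b2c3d4e5f InService 30 10
```

The printed attempts give a simple timeline of the states the instance passed through, which is useful when comparing warm and cold launches.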