
Scaling Policies: Target Tracking, Step & Scheduled

An Auto Scaling Group (ASG — a group of EC2 instances that AWS keeps the right number of running for you) needs rules that tell it when to add or remove instances. Those rules are called scaling policies. Picking the right policy is the difference between an app that smoothly absorbs traffic and one that falls over at the worst moment (or one that burns money running idle servers). This page compares the main policy types in plain English and shows you how to set each one up in both the AWS Management Console and the AWS Command Line Interface (CLI — the terminal tool for controlling AWS).

The four policy types at a glance

There are four ways an ASG can decide to change its instance count. The first three react to metrics (numbers like CPU usage); the last reacts to the clock.

| Policy type | How it decides | Best for | Skip it when |
| --- | --- | --- | --- |
| Target tracking | You set a target value for one metric (e.g. “keep CPU at 50%”) and AWS does the math | Almost everything; this is the default choice | Your load doesn’t map cleanly to a single metric |
| Step scaling | You define thresholds with different-sized steps (e.g. CPU > 70% adds 1, CPU > 90% adds 3) | Cases where you need fine control over how aggressively you scale | Target tracking already does the job (it usually does) |
| Simple scaling | One threshold, one action, then a cooldown wait | Legacy setups only; AWS recommends against it now | Always, in practice; use step or target tracking |
| Scheduled scaling | A time-based rule that changes capacity at set times | Predictable, recurring traffic (business hours, nightly batch) | Traffic is unpredictable |

Recommendation: Start with target tracking for every new app. It is the simplest to reason about and AWS tunes the underlying alarms for you. Reach for the others only when you have a specific reason.

Target tracking scaling

Target tracking works like the thermostat in your house. You say “keep it at 22 degrees” and the system heats or cools as needed. With AWS you say “keep average CPU at 50%” and the ASG adds instances when CPU climbs above that and removes them when it drops below.

When to use this: Use it as your default for web servers, APIs, and most stateless workloads. When NOT to: avoid it if your bottleneck isn’t captured by a single metric, or if scaling out doesn’t actually relieve the metric you picked.

Common target metrics:

  • ASGAverageCPUUtilization — average CPU across the group.
  • ALBRequestCountPerTarget — requests per instance behind an Application Load Balancer (ALB — the AWS layer-7 load balancer). Often better than CPU for web traffic.
  • ASGAverageNetworkIn / ASGAverageNetworkOut — bytes in/out.

Console steps

  1. Open the EC2 console, go to Auto Scaling Groups, and select your group.
  2. Open the Automatic scaling tab and choose Create dynamic scaling policy.
  3. Set Policy type to Target tracking scaling.
  4. Pick a Metric type (e.g. Average CPU utilization) and a Target value (e.g. 50).
  5. Leave Disable scale in unchecked so the group can shrink too. Click Create.

CLI command

aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu-target-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
    "TargetValue": 50.0
  }'

Output:

{
    "PolicyARN": "arn:aws:autoscaling:us-east-1:111122223333:scalingPolicy:a1b2c3d4-...:autoScalingGroupName/web-asg:policyName/cpu-target-50",
    "Alarms": [
        {"AlarmName": "TargetTracking-web-asg-AlarmHigh-...", "AlarmARN": "arn:aws:cloudwatch:..."},
        {"AlarmName": "TargetTracking-web-asg-AlarmLow-...", "AlarmARN": "arn:aws:cloudwatch:..."}
    ]
}

Notice AWS created the CloudWatch alarms (the monitoring rules) for you — that is the big convenience of target tracking.
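
For a web tier behind an ALB, the same command can track request count instead of CPU. The following is a minimal sketch under assumptions: the policy name, the target of 1000 requests per instance, and the ResourceLabel value are placeholders for illustration, not values from this page. The small `run` helper is also an addition; it prints the command by default and only calls AWS when `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch: target tracking on ALB request count per target.
# RESOURCE_LABEL is a placeholder. The real value has the form
# app/<alb-name>/<alb-id>/targetgroup/<target-group-name>/<target-group-id>.
RESOURCE_LABEL="app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210"

# Hypothetical target: about 1000 requests per instance.
CONFIG=$(cat <<EOF
{
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "ALBRequestCountPerTarget",
    "ResourceLabel": "${RESOURCE_LABEL}"
  },
  "TargetValue": 1000.0
}
EOF
)

# Print the command unless DRY_RUN=0. The printed form is for inspection
# only, because shell quoting is lost when the arguments are echoed.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

run aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name alb-requests-1000 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration "$CONFIG"
```

Unlike the CPU metric, `ALBRequestCountPerTarget` needs the `ResourceLabel` so AWS knows which load balancer and target group to measure.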

Step and simple scaling

Step scaling lets you say “do different things at different severity levels.” For example: add 1 instance when CPU is between 70% and 85%, but add 3 instances once it reaches 85%. Simple scaling is the older, weaker version: one threshold, one action, then a forced cooldown before it can act again.

When to use step scaling: when a single proportional target isn’t enough and you want bigger jumps during severe spikes. When NOT to: for ordinary workloads — target tracking is less work and behaves well. Avoid simple scaling entirely in new designs; it cannot react during its cooldown, which makes it slow.

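Unlike target tracking, step scaling does not create its own alarm. You create the policy first, then create a CloudWatch alarm yourself and set the policy's ARN as the alarm action.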

aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu-step-out \
  --policy-type StepScaling \
  --adjustment-type ChangeInCapacity \
  --metric-aggregation-type Average \
  --step-adjustments \
    MetricIntervalLowerBound=0,MetricIntervalUpperBound=15,ScalingAdjustment=1 \
    MetricIntervalLowerBound=15,ScalingAdjustment=3

Output:

{
    "PolicyARN": "arn:aws:autoscaling:us-east-1:111122223333:scalingPolicy:f9e8d7c6-...:autoScalingGroupName/web-asg:policyName/cpu-step-out"
}
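
The policy does nothing until an alarm invokes it. A minimal sketch of that alarm follows, assuming a 70% threshold on the group's average CPU. The alarm name, period, and evaluation count are illustrative choices, and `POLICY_ARN` is a placeholder for the full `PolicyARN` returned by the command above. The `run` helper prints the command by default and only calls AWS when `DRY_RUN=0`.

```shell
#!/bin/sh
# Placeholder: replace with the PolicyARN returned by put-scaling-policy.
POLICY_ARN="arn:aws:autoscaling:REGION:ACCOUNT_ID:scalingPolicy:POLICY_ID:autoScalingGroupName/web-asg:policyName/cpu-step-out"

# Print the command unless DRY_RUN=0, so the sketch is safe to run as-is.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

# Alarm on the group's average CPU. It fires at 70% or higher, which
# matches the step policy's lower bound of 0 (inclusive).
create_cpu_alarm() {
  run aws cloudwatch put-metric-alarm \
    --alarm-name web-asg-cpu-high \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=web-asg \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 70 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$POLICY_ARN"
}

create_cpu_alarm
```

A 300-second period is used here because it works with basic EC2 monitoring; with detailed monitoring enabled, a 60-second period would react faster.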

The step bounds are offsets from the alarm threshold, not absolute CPU values. If your alarm fires at 70% CPU, the first step (0 to 15 points over the threshold, i.e. from 70% up to but not including 85%) adds 1 instance; at 85% or higher the second step adds 3.
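
To make the offset arithmetic concrete, here is a small sketch that mimics how the two step adjustments in the policy above would be selected. The `step_adjustment` function is a hypothetical helper written for this page, not part of the AWS CLI, and it uses whole-number percentages for simplicity.

```shell
#!/bin/sh
# step_adjustment METRIC THRESHOLD
# Prints how many instances the example policy would add.
# Lower bounds are inclusive and upper bounds are exclusive:
#   offset in [0, 15)  -> add 1
#   offset >= 15       -> add 3
step_adjustment() {
  metric=$1
  threshold=$2
  offset=$((metric - threshold))
  if [ "$offset" -lt 0 ]; then
    echo 0    # below the alarm threshold, so no adjustment
  elif [ "$offset" -lt 15 ]; then
    echo 1
  else
    echo 3
  fi
}

for cpu in 65 70 84 85 95; do
  echo "CPU ${cpu}% -> add $(step_adjustment "$cpu" 70)"
done
```

With a threshold of 70, the loop reports 0 for 65%, 1 for 70% and 84%, and 3 for 85% and 95%.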

Scheduled scaling

Scheduled scaling changes capacity at fixed times, regardless of current metrics. It is the answer to predictable demand: an internal tool that’s busy 9am-6pm on weekdays, a storefront that spikes every day at noon, or a nightly report job.

When to use this: any traffic pattern you can predict by the clock. When NOT to: truly random load — there’s nothing to schedule.

Console steps

  1. In your Auto Scaling Group, open the Automatic scaling tab.
  2. Under Scheduled actions, choose Create scheduled action.
  3. Give it a name, set Min, Max, and Desired capacity.
  4. Set Recurrence with a cron expression (e.g. 0 8 * * MON-FRI) and a time zone. Click Create.

CLI command

aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name web-asg \
  --scheduled-action-name scale-up-business-hours \
  --recurrence "0 8 * * MON-FRI" \
  --time-zone "America/New_York" \
  --min-size 4 --max-size 20 --desired-capacity 6

Output:

(no output on success — run describe-scheduled-actions to confirm)
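
A scale-up action usually needs a matching scale-down action, otherwise the group stays at its daytime size. The sketch below assumes an evening action at 20:00 on weekdays that drops the group to 2 instances; the action name, time, and sizes are illustrative, not taken from this page. It also shows the confirmation command. The `run` helper prints each command by default and only calls AWS when `DRY_RUN=0`.

```shell
#!/bin/sh
# Print the command unless DRY_RUN=0. The printed form is for inspection
# only, because shell quoting (e.g. around the cron string) is lost.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

# Hypothetical companion action: shrink the group at 20:00 on weekdays.
scale_down_evening() {
  run aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name web-asg \
    --scheduled-action-name scale-down-evening \
    --recurrence "0 20 * * MON-FRI" \
    --time-zone "America/New_York" \
    --min-size 2 --max-size 20 --desired-capacity 2
}

# List the scheduled actions on the group to confirm they exist.
list_scheduled_actions() {
  run aws autoscaling describe-scheduled-actions \
    --auto-scaling-group-name web-asg
}

scale_down_evening
list_scheduled_actions
```

Keeping the same `--max-size` on both actions leaves room for dynamic scaling to add instances overnight if an unexpected spike arrives.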

The key gotcha: combine, don’t choose

The most common mistake is treating these as either/or. They layer together. Pure reactive scaling (target tracking or step) only acts after a metric rises, so during a sharp, known spike your users feel slow responses while new instances boot up (which takes a couple of minutes).

Gotcha: Use target tracking as your always-on baseline, then add a scheduled action that raises the minimum and desired capacity a few minutes before a known spike (like the start of business hours). Raising the minimum matters: if only desired capacity goes up, target tracking can scale the group back in before the traffic arrives. You scale out ahead of demand instead of chasing latency after it has already climbed. Target tracking then handles the surprises within the day.

A concrete cost note: scheduled scaling lets you set a lower baseline overnight. Dropping from 6 to 2 m5.large instances (about $0.096/hour each on-demand, us-east-1) for 12 hours saves roughly 4 × $0.096 × 12 ≈ $4.61 per day, or about $138 over a 30-day month, from a single scheduled action.
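
The estimate is easy to rerun with your own numbers. This sketch reproduces the arithmetic from the paragraph above; the `savings` function is a hypothetical helper, and the hourly price is the approximate figure quoted in the text, so check current pricing before relying on it.

```shell
#!/bin/sh
# savings INSTANCES_REMOVED HOURLY_PRICE HOURS_PER_DAY DAYS
# Prints the estimated saving in dollars, rounded to two decimals.
savings() {
  awk -v n="$1" -v p="$2" -v h="$3" -v d="$4" \
    'BEGIN { printf "%.2f\n", n * p * h * d }'
}

# 6 -> 2 instances means 4 fewer, for 12 hours a night.
echo "Per day:   \$$(savings 4 0.096 12 1)"
echo "Per month: \$$(savings 4 0.096 12 30)"
```

Run as-is, this prints 4.61 per day and 138.24 for a 30-day month. If the scale-down only applies on weekdays, pass the number of weekdays instead of 30.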

Best practices

  • Make target tracking your default; reach for step scaling only when you need uneven, severity-based jumps.
  • Never rely on simple scaling for new workloads — its cooldown makes it react too slowly.
  • Layer a scheduled action ahead of known spikes so capacity is ready before demand, not after.
  • Set a sensible min size so you always have headroom for the boot time of new instances.
  • Prefer ALBRequestCountPerTarget over CPU for web tiers — request count tracks user demand more directly.
  • Keep scale-in enabled (don’t disable it) so you actually reap the cost savings when load drops.
  • Test scale-out under a realistic load test, and confirm new instances pass health checks before they receive traffic.
Last updated June 15, 2026