Cost-Aware Architecture
Most cloud cost problems are not caused by forgetting to turn something off. They are baked into the architecture on day one. The shape of your design decides how the bill grows: a chatty, always-on, data-transfer-heavy system stays expensive no matter how hard you tune it later, while a design that matches spending to actual load can be cheap by default. Cost-aware architecture means treating money as a first-class design constraint, right next to performance and reliability, before you write any code.
Why architecture locks in cost
When you pick a compute model, a storage class, or where data flows, you are signing up for a long-term spending pattern. You can right-size an instance in five minutes, but you cannot easily move a monolith that sends terabytes between Availability Zones (isolated groups of data centers inside an AWS Region) without a rewrite. The expensive mistakes are structural, not operational.
The core idea is simple: match the spend curve to the load curve. If load is spiky or unpredictable, pay per use. If load is a steady baseline, buy a commitment for a discount. Mixing these correctly is most of cost optimization.
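One way to make "match the spend curve to the load curve" concrete is to price a day of demand against different commitment levels. The sketch below is illustrative only: the function names, the demand profile, and both hourly rates are hypothetical and do not come from AWS pricing.

```python
def blended_cost(hourly_demand, committed_units, committed_rate, on_demand_rate):
    """Cost of serving hourly_demand (capacity units needed in each hour).

    Committed units are paid for every hour whether used or not;
    demand above the commitment is bought at the on-demand rate.
    """
    total = 0.0
    for demand in hourly_demand:
        overflow = max(0, demand - committed_units)
        total += committed_units * committed_rate + overflow * on_demand_rate
    return total


def best_commitment(hourly_demand, committed_rate, on_demand_rate):
    """Try every commitment level from 0 to peak demand; return (units, cost)."""
    candidates = range(0, max(hourly_demand) + 1)
    return min(
        ((units, blended_cost(hourly_demand, units, committed_rate, on_demand_rate))
         for units in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical day: a steady baseline of 2 units with a 4-hour spike to 10.
demand = [2] * 20 + [10] * 4
# Hypothetical rates: $0.10 per unit-hour on demand, $0.06 when committed.
units, cost = best_commitment(demand, committed_rate=0.06, on_demand_rate=0.10)
print(f"commit to {units} units, daily cost ${cost:.2f}")
```

With this profile the cheapest plan commits to the baseline only and buys the spike on demand, which is the mix the paragraph above describes.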
Match the compute model to the load shape
The biggest lever is choosing how you run code. Pay-per-use options cost nothing when idle. Provisioned options cost money 24/7 but get cheaper per hour when you commit.
| Load shape | Best fit | Why |
|---|---|---|
| Spiky / unpredictable / event-driven | AWS Lambda (run code without servers), Fargate (serverless containers) | You pay only while code or tasks run; little or no cost at idle |
| Steady 24/7 baseline | EC2 with a Savings Plan or Reserved Instance | Up to ~72% off on-demand for a 1- or 3-year commitment |
| Fault-tolerant batch / CI / rendering | EC2 Spot Instances | Up to ~90% off, but AWS can reclaim with 2 minutes’ notice |
| Mostly idle web app | Lambda or App Runner | No paying for empty servers overnight |
When NOT to use serverless: sustained, high request rates around the clock can cost more on Lambda than on committed EC2. Run the numbers: pay-per-use wins for variable load, commitments win for a predictable baseline.
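"Running the numbers" can be as simple as the following sketch. The Lambda defaults are assumed list prices ($0.20 per million requests and $0.0000166667 per GB-second), so check current pricing for your Region. The EC2 fleet size and hourly rate are hypothetical, and the free tier and any supporting services are ignored.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # simplified 30-day month


def lambda_monthly_cost(requests_per_second, avg_duration_s, memory_gb,
                        price_per_million_requests=0.20,     # assumed list price
                        price_per_gb_second=0.0000166667):   # assumed list price
    """Estimate monthly Lambda cost for a constant request rate."""
    requests = requests_per_second * SECONDS_PER_MONTH
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def ec2_monthly_cost(instance_count, hourly_rate):
    """Estimate monthly cost of an always-on fleet at a flat hourly rate."""
    return instance_count * hourly_rate * 30 * 24


# Hypothetical workload: 100 ms per request at 512 MB of memory,
# compared with a hypothetical 2-instance fleet at $0.10 per hour.
fleet = ec2_monthly_cost(instance_count=2, hourly_rate=0.10)
for rps in (1, 10, 100):
    serverless = lambda_monthly_cost(rps, avg_duration_s=0.1, memory_gb=0.5)
    cheaper = "Lambda" if serverless < fleet else "EC2"
    print(f"{rps:>4} req/s: Lambda ${serverless:,.2f} vs EC2 ${fleet:,.2f} -> {cheaper}")
```

The comparison only holds if the hypothetical fleet can actually serve the load, so size the fleet from a load test before trusting the result.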
Right-size and use Graviton
Graviton is AWS’s own Arm-based processor. Graviton instances are typically priced up to ~20% lower and can deliver up to ~40% better price-performance than comparable Intel/AMD instances. Most modern runtimes (Java, Node.js, Python, Go) run on Arm with no code change, although native dependencies may need Arm builds. Right-sizing means matching instance size to real usage instead of guessing high.
Find under-used instances with AWS Compute Optimizer.
aws compute-optimizer get-ec2-instance-recommendations \
--instance-arns arn:aws:ec2:us-east-1:111122223333:instance/i-0a1b2c3d4e5f
Output:
{
"instanceRecommendations": [
{
"instanceArn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0a1b2c3d4e5f",
"currentInstanceType": "m5.2xlarge",
"finding": "OVER_PROVISIONED",
"recommendationOptions": [
{ "instanceType": "m7g.large", "rank": 1, "savingsOpportunityPercentage": 64.2 }
]
}
]
}
Here m7g.large is a Graviton instance — smaller and cheaper, with 64% savings.
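To act on these findings across many instances, the JSON can be summarized with a few lines of Python. This is a minimal sketch that assumes the abbreviated response shape shown above. The real response carries more fields and may nest or spell some of them differently, so adjust the keys to what your CLI version returns. The function name is hypothetical.

```python
import json

# Sample mirrors the abbreviated output shown above (an assumption, not the full API shape).
sample_output = """
{
  "instanceRecommendations": [
    {
      "instanceArn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0a1b2c3d4e5f",
      "currentInstanceType": "m5.2xlarge",
      "finding": "OVER_PROVISIONED",
      "recommendationOptions": [
        { "instanceType": "m7g.large", "rank": 1, "savingsOpportunityPercentage": 64.2 }
      ]
    }
  ]
}
"""


def over_provisioned(recommendations):
    """Return (instance_id, current_type, suggested_type, savings_pct) tuples."""
    rows = []
    for rec in recommendations["instanceRecommendations"]:
        if rec["finding"] != "OVER_PROVISIONED":
            continue
        options = rec.get("recommendationOptions") or []
        if not options:
            continue
        # Rank 1 is treated as the top recommendation.
        top = min(options, key=lambda opt: opt["rank"])
        instance_id = rec["instanceArn"].split("/")[-1]
        rows.append((instance_id, rec["currentInstanceType"],
                     top["instanceType"], top["savingsOpportunityPercentage"]))
    return rows


for row in over_provisioned(json.loads(sample_output)):
    print("%s: %s -> %s (%.1f%% savings)" % row)
```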
Minimize data transfer
Moving data is one of the most underestimated costs. Within a single AZ, transfer over private IP addresses is free. But cross-AZ traffic costs about $0.01/GB each way, and data leaving AWS to the internet (egress) costs around $0.09/GB. A busy microservice mesh that ignores AZ placement can run up thousands of dollars a month in transfer alone.
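A quick estimate shows how these rates add up. The sketch below uses the approximate per-GB figures quoted above. The function name and traffic volumes are hypothetical, and tiered discounts are ignored.

```python
CROSS_AZ_PER_GB_EACH_WAY = 0.01  # approximate rate quoted above
INTERNET_EGRESS_PER_GB = 0.09    # approximate rate quoted above


def monthly_transfer_cost(cross_az_gb, egress_gb):
    """Estimate monthly data transfer cost in dollars."""
    # Cross-AZ traffic is charged in each direction, so each GB is counted twice.
    cross_az = cross_az_gb * CROSS_AZ_PER_GB_EACH_WAY * 2
    egress = egress_gb * INTERNET_EGRESS_PER_GB
    return cross_az + egress


# Hypothetical mesh: 50 TB between AZs and 10 TB out to the internet per month.
print(f"${monthly_transfer_cost(cross_az_gb=50_000, egress_gb=10_000):,.2f}")
```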
Rules of thumb:
- Keep tightly chatting services in the same AZ where high availability allows.
- Put a VPC Gateway Endpoint in front of S3 and DynamoDB. Traffic to them then skips the NAT Gateway, which avoids the $0.045/GB NAT data processing fee.
- Serve public content through CloudFront (AWS’s CDN); CloudFront egress is cheaper than direct S3/EC2 egress and caches reduce origin traffic.
aws ec2 create-vpc-endpoint \
--vpc-id vpc-0a1b2c3d \
--service-name com.amazonaws.us-east-1.s3 \
--route-table-ids rtb-0a1b2c3d
Output:
{
"VpcEndpoint": {
"VpcEndpointId": "vpce-0a1b2c3d",
"VpcEndpointType": "Gateway",
"ServiceName": "com.amazonaws.us-east-1.s3",
"State": "available"
}
}
Gotcha: Gateway Endpoints for S3 and DynamoDB are completely free and save NAT Gateway fees. Not adding them is leaving money on the table for almost no effort.
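A small audit can flag VPCs that still lack these endpoints. The sketch below assumes endpoint records shaped like the `VpcEndpoints` list returned by `aws ec2 describe-vpc-endpoints`. The function name and the second VPC ID are hypothetical.

```python
def missing_gateway_endpoints(vpc_ids, endpoints, region):
    """Map each VPC to the S3/DynamoDB gateway endpoint services it lacks."""
    wanted = {f"com.amazonaws.{region}.s3", f"com.amazonaws.{region}.dynamodb"}
    present = {vpc_id: set() for vpc_id in vpc_ids}
    for ep in endpoints:
        is_active_gateway = (ep.get("VpcEndpointType") == "Gateway"
                             and ep.get("State") == "available")
        if is_active_gateway and ep.get("VpcId") in present:
            present[ep["VpcId"]].add(ep["ServiceName"])
    # Only report VPCs that are missing at least one of the wanted services.
    return {vpc_id: sorted(wanted - have)
            for vpc_id, have in present.items() if wanted - have}


# Hypothetical inventory: one VPC has the S3 endpoint, the other has none.
endpoints = [
    {"VpcEndpointId": "vpce-0a1b2c3d", "VpcEndpointType": "Gateway",
     "VpcId": "vpc-0a1b2c3d", "ServiceName": "com.amazonaws.us-east-1.s3",
     "State": "available"},
]
report = missing_gateway_endpoints(["vpc-0a1b2c3d", "vpc-0e5f6a7b"], endpoints, "us-east-1")
for vpc_id, services in report.items():
    print(vpc_id, "is missing", ", ".join(services))
```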
Pick the right storage tier
Storage is rarely “one size.” Match the access pattern to the class.
| Need | S3 storage class | Note |
|---|---|---|
| Unknown or changing access pattern | S3 Intelligent-Tiering | Auto-moves objects between tiers for a small per-object monitoring fee; safest default |
| Rarely accessed but needed fast | S3 Standard-IA | Cheaper storage, small retrieval fee |
| Archive, minutes-to-hours retrieval | S3 Glacier Flexible Retrieval | Very cheap long-term |
| Deep archive, retrieval within 12-48 hours | S3 Glacier Deep Archive | Cheapest of all |
A lifecycle rule moves old objects to cheaper tiers automatically.
{
"Rules": [{
"ID": "archive-old-logs",
"Filter": { "Prefix": "logs/" },
"Status": "Enabled",
"Transitions": [
{ "Days": 30, "StorageClass": "STANDARD_IA" },
{ "Days": 90, "StorageClass": "GLACIER" }
]
}]
}
aws s3api put-bucket-lifecycle-configuration \
--bucket my-app-logs --lifecycle-configuration file://lifecycle.json
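Before applying a rule, it can help to check which class an object of a given age would end up in. The sketch below replays the transitions from the rule above. It assumes objects start in STANDARD and ignores the fact that S3 applies transitions asynchronously. The helper name is hypothetical.

```python
import json

# Same rule as lifecycle.json above.
lifecycle = {
    "Rules": [{
        "ID": "archive-old-logs",
        "Filter": {"Prefix": "logs/"},
        "Status": "Enabled",
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
    }]
}


def storage_class_for_age(age_days, rule, initial_class="STANDARD"):
    """Return the storage class an object of age_days would be in under the rule."""
    current = initial_class
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle["Rules"][0]
for age in (10, 45, 120):
    print(age, "days ->", storage_class_for_age(age, rule))

# Write the file that the put-bucket-lifecycle-configuration command reads.
with open("lifecycle.json", "w") as fh:
    json.dump(lifecycle, fh, indent=2)
```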
Tag everything for visibility
You cannot optimize what you cannot see. Cost allocation tags (key-value labels on resources) let AWS Cost Explorer break the bill down by team, environment, or feature.
Console steps to activate cost tags:
- Open the Billing and Cost Management console.
- In the left menu choose Cost allocation tags.
- Find your tag key (for example Environment) under User-defined cost allocation tags.
- Select it and choose Activate. It becomes a filter in Cost Explorer within ~24 hours.
CLI equivalent — tag a resource and activate the key:
aws ec2 create-tags --resources i-0a1b2c3d4e5f \
--tags Key=Environment,Value=prod Key=Team,Value=payments
aws ce update-cost-allocation-tags-status \
--cost-allocation-tags-status TagKey=Environment,Status=Active
Output:
{
"Errors": []
}
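Tags only help if they are applied consistently, so a periodic coverage check is worth automating. The sketch below assumes a hypothetical inventory format, a list of dicts with `ResourceId` and `Tags` keys, which you would build from whatever describe or list calls you already run. The function name is also hypothetical.

```python
REQUIRED_TAGS = {"Environment", "Team"}


def missing_tags(resources, required=REQUIRED_TAGS):
    """Map each resource ID to the required tag keys it lacks."""
    report = {}
    for resource in resources:
        keys = {tag["Key"] for tag in resource.get("Tags", [])}
        missing = required - keys
        if missing:
            report[resource["ResourceId"]] = sorted(missing)
    return report


# Hypothetical inventory: the second instance has no Team tag, the third has no tags.
inventory = [
    {"ResourceId": "i-0a1b2c3d4e5f",
     "Tags": [{"Key": "Environment", "Value": "prod"},
              {"Key": "Team", "Value": "payments"}]},
    {"ResourceId": "i-0f9e8d7c6b5a",
     "Tags": [{"Key": "Environment", "Value": "dev"}]},
    {"ResourceId": "i-01234567890a"},
]
for resource_id, keys in missing_tags(inventory).items():
    print(resource_id, "is missing", ", ".join(keys))
```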
Best Practices
- Decide the compute model from the load shape first: pay-per-use for variable traffic, Savings Plans for steady baseline, Spot for interruptible batch.
- Default new compute to Graviton (Arm) unless a dependency blocks it.
- Treat data transfer as a design constraint — co-locate chatty services and add free S3/DynamoDB Gateway Endpoints.
- Set S3 lifecycle rules on day one so old data drifts to cheaper tiers automatically.
- Tag every resource with Environment and Team, and activate those tags for Cost Explorer.
- Run AWS Compute Optimizer and Cost Explorer monthly; act on right-sizing findings.
- Model the cost of two or three candidate designs before building — a structural fix later is far more expensive than choosing well now.