The Well-Architected Framework

The AWS Well-Architected Framework is a set of questions and best practices AWS publishes to help you judge whether a cloud design is sound. It is not a product you deploy. It is a structured way of thinking, organised into six “pillars”, that surfaces the tradeoffs hiding inside your architecture before they turn into a 3 a.m. outage or a surprise bill. The most important idea on this whole page: the framework is about making tradeoffs on purpose, not about scoring full marks on every pillar.

Why a framework at all

When you build on AWS, every choice quietly trades one good thing for another. A single small server is cheap but fragile. Spreading copies across three data centres is reliable but costs more. Encrypting and logging everything is secure but slower to operate. Left implicit, these tradeoffs get made by accident. The Well-Architected Framework gives you a shared checklist so a team can say out loud, “we chose lower cost here, and we accept that this service can be down for an hour.”

The single biggest misuse of this framework is treating it like an exam to ace. Optimising cost can hurt reliability; optimising reliability raises cost. A “perfect” architecture in all six pillars at once usually does not exist. Use the framework to find the tradeoffs you are already making without realising it.

The six pillars

Pillar	Plain-English question it asks	What it pushes you toward	What it can cost you
Operational excellence	Can we run, monitor and change this safely?	Automation, runbooks, observability	Time spent building tooling
Security	Who can touch what, and is data protected?	Least-privilege access, encryption, audit logs	Some speed and convenience
Reliability	Does it recover from failure on its own?	Redundancy, backups, health checks	Higher infrastructure cost
Performance efficiency	Are we using the right resources, sized right?	Modern instance types, serverless, caching	Engineering effort to right-size
Cost optimization	Are we paying only for what we need?	Smaller resources, Savings Plans, shutdowns	Reduced headroom and redundancy
Sustainability	Are we minimising energy and waste?	Efficient regions, less idle capacity	Sometimes conflicts with raw performance

A short tour of each:

Operational excellence

This pillar is about running the system day to day. It asks whether you deploy with automation (not manual clicking), whether you can see what is happening through metrics and logs, and whether you learn from failures. Think Infrastructure as Code (defining your servers in a file instead of by hand) and dashboards in Amazon CloudWatch (AWS’s monitoring service).

Security

This pillar covers identity, data protection and detection. The core practice is least privilege: every user and service gets only the permissions it needs, granted through AWS Identity and Access Management (IAM, the service that controls who can do what). It also covers encrypting data and keeping audit trails with AWS CloudTrail (a service that records API activity in your account; management events are logged by default, while data events such as individual object reads must be switched on).
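
To make least privilege concrete, here is a minimal sketch that builds an IAM policy document granting read access to a single storage bucket and nothing else. The bucket name and the helper functions are hypothetical, chosen only for illustration; the policy structure (Version, Statement, Effect, Action, Resource) is the standard IAM JSON format.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-checkout-receipts"


def read_only_bucket_policy(bucket: str) -> dict:
    """Build a policy that allows listing and reading one bucket, nothing more."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOneBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Sid": "ReadObjectsInOneBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def uses_wildcard_action(policy: dict) -> bool:
    """Return True if any statement grants a wildcard action such as "*" or "s3:*"."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(action == "*" or action.endswith(":*") for action in actions):
            return True
    return False


policy = read_only_bucket_policy(BUCKET)
print(json.dumps(policy, indent=2))
print("wildcard actions present:", uses_wildcard_action(policy))
```

The wildcard check is a crude review aid, not a substitute for a proper policy analysis: it only catches the most obvious way a policy grows beyond what a service needs.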

Reliability

This pillar is about recovering from failure and meeting demand. It pushes you to run across multiple Availability Zones (isolated groups of one or more data centres within a Region), to take backups, and to use health checks so traffic routes away from broken instances automatically.
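
The idea behind health checks can be sketched in a few lines. This is a toy model, not how a real load balancer is implemented: the Target class, the instance IDs and the round-robin rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Target:
    instance_id: str  # hypothetical identifier
    zone: str
    healthy: bool     # result of the most recent health check


def routable(targets: list) -> list:
    """Keep only targets whose last health check passed."""
    return [t for t in targets if t.healthy]


def pick_target(targets: list, request_number: int) -> Target:
    """Round-robin across healthy targets; fail loudly if none are left."""
    candidates = routable(targets)
    if not candidates:
        raise RuntimeError("no healthy targets: the service is down")
    return candidates[request_number % len(candidates)]


fleet = [
    Target("i-aaa", "us-east-1a", healthy=True),
    Target("i-bbb", "us-east-1b", healthy=True),
]

# Simulate the first zone failing its health checks.
fleet[0].healthy = False

served_by = {pick_target(fleet, n).instance_id for n in range(4)}
print(served_by)
```

With two zones, losing one still leaves a target to serve traffic. With a single target, the same failure leaves nothing to route to, which is the gap this pillar is pointing at.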

Performance efficiency

This pillar asks whether you picked the right tool and sized it correctly. Are you on a current-generation instance type? Could a managed or serverless service do the job with less waste? Are you caching repeated work?
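
As a small illustration of caching repeated work, the sketch below memoises a function in process memory with Python's functools.lru_cache. The function itself is a hypothetical stand-in for something expensive; on AWS the same role is often played by a managed cache in front of a database or API, but the principle is identical.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=128)
def price_with_tax(amount_cents: int, tax_rate_percent: int) -> int:
    """Hypothetical stand-in for an expensive computation or remote lookup."""
    global call_count
    call_count += 1
    return amount_cents + (amount_cents * tax_rate_percent) // 100


# Five identical requests: only the first one does the work.
results = [price_with_tax(1000, 20) for _ in range(5)]
print(results[0], call_count)
print(price_with_tax.cache_info())
```

The tradeoff is the usual one for caches: less repeated work in exchange for memory and the risk of serving a stale answer if the underlying data changes.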

Cost optimization

This pillar asks whether you are paying only for value received. It covers turning off idle resources, choosing Savings Plans (a discount in exchange for committing to a steady level of spend for one or three years), and matching capacity to real demand.
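
Spotting idle resources usually starts with utilisation data. The sketch below flags instances whose average CPU stays under a threshold. The 5% threshold, the instance names and the sample figures are assumptions for the example; in practice the numbers would come from your monitoring service.

```python
from statistics import mean

# Assumed threshold for this sketch: below 5% average CPU counts as idle.
IDLE_CPU_PERCENT = 5.0


def idle_instances(cpu_samples: dict, threshold: float = IDLE_CPU_PERCENT) -> list:
    """Return instance IDs whose average CPU utilisation is below the threshold."""
    return sorted(
        instance_id
        for instance_id, samples in cpu_samples.items()
        if samples and mean(samples) < threshold  # skip instances with no data
    )


# Hypothetical hourly CPU percentages per instance.
samples = {
    "i-web": [35.0, 42.5, 38.0],
    "i-forgotten-test-box": [0.8, 1.1, 0.9],
    "i-new": [],  # no data yet, so it is not flagged
}

candidates = idle_instances(samples)
print(candidates)
```

A flagged instance is a candidate for review, not for automatic deletion: low CPU can also mean a standby node that exists precisely for reliability.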

Sustainability

Added in 2021, this pillar asks you to reduce the energy and hardware your workload consumes. Practices include choosing Regions where the electricity supply has a lower carbon intensity, removing idle resources, and using managed services that pack many customers onto shared hardware.

How the pillars pull against each other

A concrete example. Suppose a small API runs on one t3.medium instance (i-0a1b2c3d4e5f) in a single subnet (subnet-0a1b2c3d).

Cost optimization is happy: one instance is about $30/month on demand.
Reliability is unhappy: if that Availability Zone fails, the whole API is down.

To satisfy reliability you add a second instance in another zone behind a load balancer. Now reliability improves, but your compute cost roughly doubles to around $60/month, plus about $16/month for the load balancer. There is no free win here — you traded money for resilience. The framework’s job is to make that trade visible and deliberate, not to tell you redundancy is always “correct”.
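
The arithmetic behind those figures can be written out as a short sketch. The hourly rates are assumptions picked to match the approximate numbers above, not quoted prices, and the load balancer figure covers only its hourly charge, not usage-based charges.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)

# Assumed on-demand rates in USD per hour; check the current price list.
T3_MEDIUM_HOURLY = 0.0416
LOAD_BALANCER_HOURLY = 0.0225


def monthly(hourly_rate: float, count: int = 1) -> float:
    """Monthly cost of `count` resources billed at `hourly_rate`."""
    return hourly_rate * HOURS_PER_MONTH * count


single_zone = monthly(T3_MEDIUM_HOURLY)
two_zones = monthly(T3_MEDIUM_HOURLY, count=2) + monthly(LOAD_BALANCER_HOURLY)

print(f"single zone: ${single_zone:.2f}/month")
print(f"two zones:   ${two_zones:.2f}/month")
print(f"premium for resilience: ${two_zones - single_zone:.2f}/month")
```

Under these assumptions the resilient design costs more than twice the single-instance one, because the load balancer is an extra line item on top of the second instance. Whether that premium is worth paying depends on what an hour of downtime costs you.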

The Well-Architected Tool

AWS provides a free service called the AWS Well-Architected Tool that walks you through the pillar questions and records your answers as a “workload review”. It flags risks as High Risk Issues (HRIs) or Medium Risk Issues and lets you track improvements over time.

When to use it: before a major launch, after an incident, or on a recurring schedule (for example quarterly) for important workloads. When not to bother: for a throwaway prototype or a personal side project — the overhead outweighs the benefit.

Console steps

1. Open the AWS Management Console and go to the AWS Well-Architected Tool.
2. Choose Define workload, give it a name, pick an environment (Production or Pre-production), and select the Regions it runs in.
3. Select the AWS Well-Architected Framework lens (a “lens” is a question set; there are also specialised lenses like Serverless).
4. Choose Start reviewing and answer the questions pillar by pillar. Select “None of these” honestly when you follow none of the listed practices, and mark a question with “Question does not apply to this workload” when it is genuinely not relevant.
5. Open the Improvement plan tab to see prioritised High and Medium Risk Issues with links to guidance.

CLI equivalent

aws wellarchitected create-workload \
  --workload-name "checkout-api-prod" \
  --description "Customer checkout API" \
  --environment PRODUCTION \
  --aws-regions us-east-1 \
  --lenses wellarchitected \
  --review-owner "[email protected]"

Output:

{
    "WorkloadId": "a1b2c3d4e5f60718293a4b5c6d7e8f90",
    "WorkloadArn": "arn:aws:wellarchitected:us-east-1:123456789012:workload/a1b2c3d4e5f60718293a4b5c6d7e8f90"
}

List the risks the review found:

aws wellarchitected list-lens-review-improvements \
  --workload-id a1b2c3d4e5f60718293a4b5c6d7e8f90 \
  --lens-alias wellarchitected
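
The command returns JSON containing a list of improvement summaries, each with a risk level. As a rough sketch of what to do with that output, the code below keeps only High and Medium risks and sorts High first. The sample data is made up (the question IDs and titles are hypothetical); only the field names ImprovementSummaries, QuestionId, PillarId, QuestionTitle and Risk follow the shape of the real response.

```python
import json

# Made-up sample shaped like the command's JSON output.
SAMPLE_OUTPUT = """
{
  "WorkloadId": "a1b2c3d4e5f60718293a4b5c6d7e8f90",
  "LensAlias": "wellarchitected",
  "ImprovementSummaries": [
    {"QuestionId": "q-backups", "PillarId": "reliability",
     "QuestionTitle": "How do you back up data?", "Risk": "MEDIUM"},
    {"QuestionId": "q-permissions", "PillarId": "security",
     "QuestionTitle": "How do you manage permissions?", "Risk": "HIGH"},
    {"QuestionId": "q-demand", "PillarId": "costOptimization",
     "QuestionTitle": "How do you match supply to demand?", "Risk": "NONE"}
  ]
}
"""

# Lower number means more urgent; any other risk value is skipped.
RISK_PRIORITY = {"HIGH": 0, "MEDIUM": 1}


def prioritised_risks(response: dict) -> list:
    """Return High and Medium risk items, High first."""
    summaries = response.get("ImprovementSummaries", [])
    flagged = [s for s in summaries if s.get("Risk") in RISK_PRIORITY]
    return sorted(flagged, key=lambda s: RISK_PRIORITY[s["Risk"]])


risks = prioritised_risks(json.loads(SAMPLE_OUTPUT))
for item in risks:
    print(f'{item["Risk"]:<6} {item["PillarId"]:<12} {item["QuestionTitle"]}')
```

In practice the dictionary would come from the command's own JSON output rather than a hard-coded string. Sorting by risk gives a team a short, ordered list to discuss, which fits the framework's purpose better than trying to clear every item at once.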