SQL vs NoSQL on AWS
One of the first big decisions in any cloud project is which kind of database to use. The two broad families are SQL (relational) and NoSQL (non-relational). On AWS this usually means choosing between Amazon RDS or Aurora on the SQL side, and Amazon DynamoDB on the NoSQL side. This page explains how they differ, which use cases fit each one, and the one rule that should drive your choice: pick based on your access patterns and scale needs, not on hype.
What “SQL” and “NoSQL” actually mean
A SQL database (SQL stands for Structured Query Language) stores data in tables with rows and columns, and a fixed schema (a defined shape for your data). Tables can be linked together with relationships, and you query them using flexible SQL: filters, sorts, aggregations, and joins (combining rows from two or more tables). Examples on AWS are Amazon RDS (Relational Database Service, a managed service for engines like PostgreSQL and MySQL) and Amazon Aurora (AWS’s high-performance, MySQL/PostgreSQL-compatible engine).
A NoSQL database stores data in a more flexible format and is built for predictable performance at huge scale. AWS’s flagship NoSQL service is Amazon DynamoDB, a key-value and document database. There is no fixed schema across items, no joins, and queries must follow patterns you designed in advance.
The core trade-off: SQL gives you query flexibility, NoSQL gives you scale predictability.
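The trade-off can be sketched locally, with no AWS account. This is a minimal illustration, not how either service is implemented: Python's built-in `sqlite3` stands in for a relational engine, and a plain dictionary stands in for a key-value table. The table names, keys, and sample rows are made up for the example.

```python
import sqlite3

# Relational side: two normalized tables, queried with an ad-hoc join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# A question nobody planned for: total spend per customer.
spend_by_customer = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Key-value side: items are denormalized and fetched by a key chosen up front.
orders_by_user = {
    "user#1": [{"orderId": "10", "total": 25.0}, {"orderId": "11", "total": 40.0}],
    "user#2": [{"orderId": "12", "total": 15.0}],
}
ada_orders = orders_by_user["user#1"]  # one direct lookup, no join

print(spend_by_customer)
print(ada_orders)
```

The join answers a new question without any redesign. The dictionary lookup is fast and simple, but "total spend per customer" would require reading every item unless the data had been modeled for that question in advance.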
How they differ
| Aspect | SQL (RDS / Aurora) | NoSQL (DynamoDB) |
|---|---|---|
| Data model | Tables with a fixed schema | Items in tables, schema-flexible |
| Relationships | Joins across tables | Denormalized; no joins |
| Querying | Flexible, ad-hoc SQL | Only by designed keys/indexes |
| Transactions | Strong, multi-row ACID | Supported, but limited (up to 100 items per transaction) |
| Scaling | Mostly vertical (bigger instance), plus read replicas | Horizontal, near-limitless |
| Performance at scale | Can degrade as data and load grow | Single-digit ms for key-based access at any size |
| Pricing model | Pay for instance hours plus storage | Pay per request or provisioned capacity |
ACID stands for Atomicity, Consistency, Isolation, Durability — the guarantees that keep multi-step changes (like a bank transfer) correct.
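Atomicity is the easiest of these guarantees to see in action. The sketch below uses Python's built-in `sqlite3` as a stand-in for a relational engine; the `accounts` table, the `transfer` helper, and the balances are hypothetical. The credit is applied before the debit on purpose, so a failed debit must undo the credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok_small = transfer(conn, "alice", "bob", 30)   # succeeds
ok_large = transfer(conn, "alice", "bob", 500)  # debit fails, credit is rolled back
final_balances = balances(conn)
print(ok_small, ok_large, final_balances)
```

After the failed transfer, bob's balance does not include the 500 that was briefly credited: both updates succeed together or neither is kept.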
When to use SQL
Choose a relational database when:
- You need joins or ad-hoc reporting (questions you haven’t thought of yet).
- You don’t fully know your access patterns up front.
- You need strong multi-row transactions (orders, payments, inventory).
- Your data is naturally relational (customers, orders, products linked together).
- Your scale is moderate — gigabytes to a few terabytes, thousands of requests per second.
Relational is the safer default. When query patterns are unknown or you need flexibility, start with RDS or Aurora. You can always evolve later, and SQL forgives design mistakes far more easily than DynamoDB does.
When NOT to use SQL: avoid it when you truly need internet-scale write throughput (millions of requests per second) with flat, predictable latency, because scaling a single relational instance hits a ceiling.
Creating a relational database
AWS Management Console:
- Open the RDS console and choose Databases > Create database.
- Pick Standard create, then choose an engine such as Aurora (PostgreSQL Compatible) or PostgreSQL.
- Set a DB instance identifier, master username, and password.
- Choose an instance class (for example db.t4g.medium) and your VPC.
- Click Create database.
AWS CLI (v2):
aws rds create-db-instance \
--db-instance-identifier shop-prod \
--engine postgres \
--db-instance-class db.t4g.medium \
--allocated-storage 50 \
--master-username postgres \
--master-user-password 'ChangeMe123!' \
--vpc-security-group-ids sg-0a1b2c3d \
--db-subnet-group-name prod-subnets
Output (abbreviated):
{
"DBInstance": {
"DBInstanceIdentifier": "shop-prod",
"DBInstanceClass": "db.t4g.medium",
"Engine": "postgres",
"DBInstanceStatus": "creating",
"AllocatedStorage": 50
}
}
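A db.t4g.medium running PostgreSQL costs roughly $0.065/hour on-demand in us-east-1 (about $47/month at 730 hours), plus gp3 storage at about $0.115/GB-month. Prices vary by region and change over time, so check the RDS pricing page before budgeting. Multi-AZ roughly doubles the instance cost in exchange for high availability, and storage for the standby is billed as well.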
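The arithmetic behind that estimate can be sketched in a few lines. The rates are the approximate figures quoted above, used as placeholders rather than current prices. The `rds_monthly_cost` helper is hypothetical, and it assumes Multi-AZ doubles both the instance and the storage charge, which you should confirm against the pricing page for your engine and region.

```python
HOURS_PER_MONTH = 730  # common billing approximation (8,760 hours / 12)


def rds_monthly_cost(hourly_rate, storage_gb, storage_rate_per_gb, multi_az=False):
    """Estimate the monthly cost in USD of one RDS instance plus its storage."""
    # Assumption: Multi-AZ bills a standby instance and a second copy of storage.
    az_factor = 2 if multi_az else 1
    instance = hourly_rate * HOURS_PER_MONTH * az_factor
    storage = storage_gb * storage_rate_per_gb * az_factor
    return round(instance + storage, 2)


# Placeholder rates: $0.065/hour instance, 50 GB at $0.115/GB-month.
single_az_cost = rds_monthly_cost(0.065, 50, 0.115)
multi_az_cost = rds_monthly_cost(0.065, 50, 0.115, multi_az=True)
print(single_az_cost, multi_az_cost)
```

The estimate leaves out backups beyond the free allowance, data transfer, and provisioned IOPS, so treat it as a floor rather than a quote.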
When to use NoSQL
Choose DynamoDB when:
- You know your access patterns up front (e.g. “get a user’s orders by user ID”).
- You need limitless, predictable scale with single-digit-millisecond latency.
- Your workload is high-volume key lookups (user sessions, shopping carts, IoT events, gaming leaderboards).
- You want a fully serverless database with no instances to size or patch.
When NOT to use NoSQL: avoid DynamoDB when you need ad-hoc queries, joins, or complex reporting. If a stakeholder will ask new questions of the data later, DynamoDB will fight you: you would have to redesign tables, add indexes, or export the data to an analytics system such as Amazon Redshift, or to Amazon S3 for querying with Athena.
Model access patterns first. In DynamoDB you design the table around the exact queries your app will run. Get the partition key wrong and a lookup that should be a single Query against one partition becomes a Scan of the whole table, which is slow and expensive.
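The difference between a keyed query and a scan can be illustrated with a toy in-memory model. This is a sketch of the idea only, not DynamoDB's API or internals. The `OrdersTable` class, its `items_read` counter, and the sample data are all hypothetical.

```python
from collections import defaultdict


class OrdersTable:
    """Toy table keyed by userId (partition key) and orderId (sort key)."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # userId -> {orderId: item}
        self.items_read = 0  # rough stand-in for read cost

    def put(self, item):
        self._partitions[item["userId"]][item["orderId"]] = item

    def query(self, user_id):
        """Read only the single partition named by the key."""
        partition = self._partitions.get(user_id, {})
        self.items_read += len(partition)
        return [partition[order_id] for order_id in sorted(partition)]

    def scan(self, predicate):
        """Read every item in every partition, then filter."""
        hits = []
        for partition in self._partitions.values():
            for item in partition.values():
                self.items_read += 1
                if predicate(item):
                    hits.append(item)
        return hits


table = OrdersTable()
for user in range(100):
    for order in range(10):
        table.put({
            "userId": f"user-{user}",
            "orderId": f"order-{order:03d}",
            "status": "SHIPPED" if order == 0 else "OPEN",
        })

# Access pattern the key was designed for: one user's orders.
user_orders = table.query("user-7")
query_reads = table.items_read

# Access pattern the key was NOT designed for: all shipped orders.
table.items_read = 0
shipped = table.scan(lambda item: item["status"] == "SHIPPED")
scan_reads = table.items_read

print(query_reads, scan_reads)
```

The keyed query touches only the 10 items in one partition, while the unplanned question touches all 1,000. In a real table, a pattern like "orders by status" would usually be served by an index designed for it rather than by a scan.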
Creating a DynamoDB table
AWS Management Console:
- Open the DynamoDB console and choose Tables > Create table.
- Enter a table name and a Partition key (e.g. userId).
- Optionally add a Sort key (e.g. orderId) to store many related items per key.
- Choose On-demand capacity so you pay per request.
- Click Create table.
AWS CLI (v2):
aws dynamodb create-table \
--table-name Orders \
--attribute-definitions \
AttributeName=userId,AttributeType=S \
AttributeName=orderId,AttributeType=S \
--key-schema \
AttributeName=userId,KeyType=HASH \
AttributeName=orderId,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST
Output (abbreviated):
{
"TableDescription": {
"TableName": "Orders",
"TableStatus": "CREATING",
"BillingModeSummary": {
"BillingMode": "PAY_PER_REQUEST"
}
}
}
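On-demand DynamoDB has been priced at about $1.25 per million write request units and $0.25 per million read request units in us-east-1, plus storage. AWS announced a reduction in on-demand rates in November 2024, so check the current pricing page. Either way, a low-traffic table can cost cents per month and scale to massive volume without re-provisioning.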
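A rough request-cost estimate is simple arithmetic. The sketch below uses the per-million rates quoted above as placeholder defaults; substitute the current rates for your region. The `on_demand_monthly_cost` helper is hypothetical. It assumes every request consumes exactly one request unit (a write of up to 1 KB, or a strongly consistent read of up to 4 KB) and it ignores storage.

```python
def on_demand_monthly_cost(
    writes,
    reads,
    write_rate_per_million=1.25,  # placeholder rate, USD
    read_rate_per_million=0.25,   # placeholder rate, USD
):
    """Estimate monthly on-demand request cost in USD (storage excluded)."""
    write_cost = writes / 1_000_000 * write_rate_per_million
    read_cost = reads / 1_000_000 * read_rate_per_million
    return round(write_cost + read_cost, 2)


# A low-traffic table: 100k writes and 500k reads per month.
low_traffic_cost = on_demand_monthly_cost(100_000, 500_000)

# A busier table: 2 million writes and 10 million reads per month.
busy_cost = on_demand_monthly_cost(2_000_000, 10_000_000)

print(low_traffic_cost, busy_cost)
```

Larger items consume more than one unit per request, so real bills depend on item size as well as request count.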
Decide by access pattern, not hype
The single most useful question is: do I know exactly how this data will be read and written?
- If the answer is “no” or “it will change” → use SQL (RDS/Aurora). Flexibility wins.
- If the answer is “yes, and I need it to scale forever” → use DynamoDB. Predictability wins.
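The rule above can be written down as a small decision helper. This is a deliberately simplified sketch of this page's guidance, not an official AWS decision tree. The function name and its three boolean inputs are hypothetical.

```python
def choose_database(patterns_known, needs_joins_or_adhoc, needs_massive_scale):
    """Return a suggested database family based on access patterns and scale."""
    # Unknown or changing access patterns, joins, or ad-hoc reporting:
    # flexibility wins.
    if needs_joins_or_adhoc or not patterns_known:
        return "SQL (RDS/Aurora)"
    # Known patterns plus very large, predictable scale: predictability wins.
    if needs_massive_scale:
        return "NoSQL (DynamoDB)"
    # Known patterns at moderate scale: relational stays the safer default.
    return "SQL (RDS/Aurora)"


session_store = choose_database(
    patterns_known=True, needs_joins_or_adhoc=False, needs_massive_scale=True
)
reporting_app = choose_database(
    patterns_known=False, needs_joins_or_adhoc=True, needs_massive_scale=False
)
print(session_store, reporting_app)
```

Real decisions also weigh team experience, cost, and operational constraints, so treat the result as a starting point for discussion.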
Many real systems use both: a relational database for core transactional and reporting data, and DynamoDB for a specific high-throughput feature like session storage or a feed.
Best practices
- Default to relational (RDS/Aurora) when query patterns are uncertain or you need joins and reporting.
- Choose DynamoDB only after you have listed and modeled every access pattern.
- Use Aurora when you want PostgreSQL/MySQL compatibility with better performance and easier scaling than plain RDS.
- For DynamoDB, design the partition key to spread load evenly and avoid hot keys.
- Don’t force relational data into DynamoDB to “be modern” — denormalization without clear access patterns causes pain.
- For analytics on either store, move data into Amazon Redshift rather than running heavy reports on your operational database.
- Enable backups and Multi-AZ (or DynamoDB point-in-time recovery) before going to production.
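One common way to avoid a hot key, sometimes called write sharding, is to append a computed suffix to a heavily written partition key so that writes spread over several partitions. The sketch below shows the idea only. The shard count, key format, and helper names are assumptions for illustration, and reading the data back means querying every shard key and merging the results.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 8  # hypothetical; size it to the write rate of the hottest key


def sharded_key(base_key, item_id, shard_count=SHARD_COUNT):
    """Derive a stable shard suffix from the item ID and append it to the key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """List every shard key a reader must query to see all items."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


# Without sharding, all 8,000 writes would land on the single key "event#2024-06-01".
write_counts = Counter(
    sharded_key("event#2024-06-01", f"vote-{i}") for i in range(8000)
)
print(sorted(write_counts.items()))
```

The cost of this design is on the read side: one logical read becomes `SHARD_COUNT` queries, so it suits write-heavy keys far better than read-heavy ones.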