Interview Questions: Serverless, Security & Architecture
This page covers the questions where interviewers stop checking facts and start checking judgment. Serverless, messaging, encryption, and architecture rounds are won by candidates who explain tradeoffs: when serverless fits and when it doesn’t, which messaging service matches the problem, and how recovery goals drive disaster-recovery design. Below are model answers with the reasoning that earns the strong-hire signal.
The Lambda model and cold starts
Question: How does AWS Lambda work, and what is a cold start?
Lambda is AWS’s serverless compute service — you upload a function, and AWS runs it on demand without you managing any servers. You pay only for the time your code actually runs, billed per millisecond, plus a tiny per-request fee. When idle, it costs nothing.
A cold start is the delay the first time a function runs after being idle (or when scaling up to handle more requests at once). AWS has to download your code, start a new execution environment, and initialize your runtime before your handler runs. Subsequent calls reuse that warm environment and are fast.
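One way to picture the warm-environment reuse described above: in a Python function, module-level code runs once per execution environment (during the cold start), while the handler runs on every invocation. The sketch below is illustrative only; `_load_config`, the 50 ms sleep, and the `example-table` value are hypothetical stand-ins for real setup work such as creating SDK clients.

```python
import time

# Module scope: runs once per execution environment, i.e. during a cold start.
INIT_COUNT = 0


def _load_config():
    """Hypothetical stand-in for slow setup (SDK clients, config, model loading)."""
    global INIT_COUNT
    INIT_COUNT += 1
    time.sleep(0.05)  # assumed 50 ms of initialization work
    return {"table": "example-table"}  # hypothetical setting


CONFIG = _load_config()


def handler(event, context):
    # Warm invocations reuse CONFIG; nothing is re-initialized here.
    return {"table": CONFIG["table"], "init_count": INIT_COUNT}
```

Calling `handler` repeatedly in the same process returns the same `init_count`, which mirrors why only the first request in a new environment pays the initialization cost.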
| Limit | Value (2026) | Why it matters in an answer |
|---|---|---|
| Max runtime | 15 minutes | Long jobs need ECS/EC2 instead |
| Memory | 128 MB to 10,240 MB | CPU scales with memory; more memory can run cheaper if it finishes faster |
| Deployment package | 250 MB unzipped (50 MB zipped) | Big dependencies push you to container images (up to 10 GB) |
| Concurrent executions | 1,000 per account per Region by default (soft limit) | Spiky traffic can hit this; request an increase |
Model answer on cold starts: keep packages small, choose a fast-starting runtime, and for latency-sensitive APIs use provisioned concurrency (pre-warmed environments). Mention that putting Lambda inside a VPC used to add seconds of cold-start latency, but AWS fixed that with shared network interfaces — a detail that signals you are current.
Tip: Adding memory to a Lambda also adds CPU. A function that runs at 128 MB for 5 seconds may run at 512 MB for 0.8 seconds — and cost less overall while feeling faster. Always tune memory by measuring, not guessing.
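The arithmetic behind that tip can be checked with a few lines. This is a rough sketch, assuming a compute price of $0.0000166667 per GB-second (an x86 figure for us-east-1; verify against current pricing) and ignoring the per-request fee, which is the same in both cases.

```python
# Assumed price in USD per GB-second; check the current Lambda pricing page.
PRICE_PER_GB_SECOND = 0.0000166667


def compute_cost(memory_mb: int, duration_s: float, invocations: int = 1) -> float:
    """Compute charge only: memory (GB) x duration (s) x invocations x price."""
    gb_seconds = (memory_mb / 1024) * duration_s * invocations
    return gb_seconds * PRICE_PER_GB_SECOND


small = compute_cost(128, 5.0, invocations=1_000_000)
large = compute_cost(512, 0.8, invocations=1_000_000)
print(f"128 MB x 5.0 s: ${small:.2f} per million invocations")
print(f"512 MB x 0.8 s: ${large:.2f} per million invocations")
```

Under these assumptions the 128 MB configuration uses 0.625 GB-seconds per call and the 512 MB one uses 0.4, so the larger setting comes out roughly a third cheaper while also responding faster. The 5 s and 0.8 s durations are the tip's illustrative numbers, not measurements.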
When serverless fits — and when it doesn’t
Question: When would you choose serverless over containers or EC2?
This is a tradeoff question, so reason out loud. Lambda shines for event-driven, spiky, short-lived work: API backends, file-processing triggers, scheduled jobs, and glue between services. It is a poor fit for steady high-throughput workloads (you pay for every request and every millisecond of run time, which adds up under constant load), tasks over 15 minutes, latency-critical paths that can’t tolerate cold starts, or anything needing special OS access or a GPU.
The crisp framing interviewers reward: serverless trades control and steady-state cost for zero operations and automatic scaling. A quiet API hit a few hundred times a day costs cents a month on Lambda; the same traffic on an always-on t3.micro costs roughly $7.50/month. But at millions of sustained requests, a right-sized container fleet is usually cheaper.
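A back-of-the-envelope version of that comparison, as a sketch only. It assumes $0.20 per million requests and $0.0000166667 per GB-second for Lambda, a 128 MB function that runs 200 ms per request, and the $7.50/month instance figure quoted above. It ignores the free tier, API Gateway charges, and whether one small instance could actually serve the load.

```python
PRICE_PER_GB_SECOND = 0.0000166667    # assumed Lambda compute price (USD)
PRICE_PER_REQUEST = 0.20 / 1_000_000  # assumed Lambda request price (USD)
T3_MICRO_MONTHLY = 7.50               # always-on instance figure from the text


def lambda_monthly_cost(requests: int, memory_mb: int = 128, duration_s: float = 0.2) -> float:
    """Monthly Lambda bill for a given request count (compute + request fees)."""
    compute = requests * (memory_mb / 1024) * duration_s * PRICE_PER_GB_SECOND
    return compute + requests * PRICE_PER_REQUEST


def break_even_requests(memory_mb: int = 128, duration_s: float = 0.2) -> int:
    """Requests per month at which Lambda costs as much as the instance."""
    per_request = lambda_monthly_cost(1, memory_mb, duration_s)
    return int(T3_MICRO_MONTHLY / per_request)


quiet_api = lambda_monthly_cost(300 * 30)  # about 300 requests a day
print(f"Quiet API on Lambda: ${quiet_api:.4f}/month")
print(f"Break-even vs instance: {break_even_requests():,} requests/month")
```

With these inputs the quiet API costs well under a cent a month, and the break-even lands around twelve million requests a month. Heavier functions (more memory, longer duration) move that break-even point much lower, which is why the answer depends on measuring your own workload.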
SQS vs SNS vs EventBridge
Question: How do you choose between SQS, SNS, and EventBridge?
All three decouple services, but they solve different shapes of problem. Define each: SQS (Simple Queue Service) is a queue — one consumer group pulls and processes messages. SNS (Simple Notification Service) is pub/sub — one message is pushed to many subscribers (fan-out). EventBridge is an event bus that routes events based on their content to many targets.
| Service | Pattern | Best for | Watch out for |
|---|---|---|---|
| SQS | Queue (pull) | Buffering work, smoothing spikes, retry with a dead-letter queue | One logical consumer; no fan-out by itself |
| SNS | Pub/sub (push) | Fan-out the same message to many subscribers | No buffering; a subscriber that is down or slow can miss messages once delivery retries run out |
| EventBridge | Event bus (routing) | Routing by content, SaaS/AWS-service events, schemas | Slightly higher latency than SNS |
Model answer: use SQS when you need to buffer and reliably process work one item at a time with retries. Use SNS when one event must notify many systems at once. Use EventBridge when you need rich routing rules (e.g. “send only order.cancelled events to this Lambda”) or you’re reacting to AWS or third-party SaaS events. A senior touch: the common production pattern is SNS fan-out to multiple SQS queues, so each consumer gets its own buffered, independently retried copy.
aws sqs create-queue \
--queue-name orders-queue \
--attributes '{"VisibilityTimeout":"60","MessageRetentionPeriod":"345600"}'
Output:
{
"QueueUrl": "https://sqs.us-east-1.amazonaws.com/111122223333/orders-queue"
}
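The SNS-to-SQS fan-out pattern from the model answer can be modeled in memory to show why each consumer ends up with its own buffered copy. This is a toy sketch, not the AWS API: the `Topic` and `Queue` classes and the `billing` and `shipping` names are hypothetical.

```python
from collections import deque


class Queue:
    """Toy stand-in for an SQS queue: buffers messages until a consumer pulls."""

    def __init__(self, name):
        self.name = name
        self._messages = deque()

    def send(self, message):
        self._messages.append(dict(message))  # each queue keeps its own copy

    def receive(self):
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


class Topic:
    """Toy stand-in for an SNS topic: pushes every message to all subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, queue):
        self._subscribers.append(queue)

    def publish(self, message):
        for queue in self._subscribers:
            queue.send(message)


orders = Topic()
billing, shipping = Queue("billing"), Queue("shipping")
orders.subscribe(billing)
orders.subscribe(shipping)
orders.publish({"type": "order.created", "order_id": 42})

# Billing consumes right away; shipping is slow, so its copy stays buffered.
billing_msg = billing.receive()
```

After this runs, the billing queue is empty while the shipping queue still holds its copy, so a slow or failing consumer does not affect the others.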
KMS and Secrets Manager
Question: How do you handle encryption keys and application secrets?
Two different jobs. KMS (Key Management Service) manages encryption keys and performs encrypt/decrypt operations; most AWS services integrate with it to encrypt data at rest with one checkbox. Secrets Manager stores secrets like database passwords and API keys, and can automatically rotate them on a schedule.
The interview discriminator: KMS protects keys and data; Secrets Manager protects credentials and adds rotation. People often misuse Parameter Store (part of Systems Manager) here — it’s fine for plain config and cheaper, but Secrets Manager’s built-in rotation is the reason to pay extra (about $0.40 per secret per month).
Encrypt an S3 bucket with KMS — console steps:
- Open the S3 console and select your bucket.
- Go to Properties > Default encryption > Edit.
- Choose SSE-KMS and pick an existing KMS key (or create one).
- Save changes.
CLI equivalent:
aws s3api put-bucket-encryption \
--bucket my-app-data \
--server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms","KMSMasterKeyID":"arn:aws:kms:us-east-1:111122223333:key/0a1b2c3d-4e5f-6789-abcd-ef0123456789"}}]}'
Output:
(no output on success; verify with get-bucket-encryption)
Least privilege and IAM
Question: What does least privilege mean, and how do you achieve it?
Least privilege means giving each identity only the permissions it needs and nothing more. IAM (Identity and Access Management) is how you do it. The strong answer covers the mechanics:
- Attach policies to roles, not to individual users; have services and people assume roles for temporary credentials.
- Grant EC2 and Lambda permissions through IAM roles, never hardcoded access keys.
- Start from deny by default (IAM denies any request that no policy allows) and add narrow "Effect": "Allow" statements scoped to specific resource ARNs.
- Use IAM Access Analyzer to spot overly broad or unused permissions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-app-data/uploads/*"
    }
  ]
}
Warning: never attach AdministratorAccess or s3:* on Resource: "*" to a production workload “to make it work.” That single shortcut is the root cause of a large share of real-world breaches. Scope actions and resources tightly, then widen only when something genuinely fails.
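To make the deny-by-default behavior concrete, here is a deliberately simplified evaluator for a single identity policy like the one above. It is a sketch under loud assumptions: real IAM evaluation also involves conditions, principals, resource policies, permission boundaries, and SCPs, none of which are modeled here, and `is_allowed` is a hypothetical helper, not an AWS API.

```python
from fnmatch import fnmatchcase

POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::my-app-data/uploads/*",
        }
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Simplified check: explicit Deny wins, else any Allow, else implicit deny."""
    allowed = False
    for stmt in policy["Statement"]:
        # Action names are matched case-insensitively; ARNs are case-sensitive.
        action_match = any(
            fnmatchcase(action.lower(), a.lower()) for a in _as_list(stmt["Action"])
        )
        resource_match = any(
            fnmatchcase(resource, r) for r in _as_list(stmt["Resource"])
        )
        if action_match and resource_match:
            if stmt["Effect"] == "Deny":
                return False  # an explicit deny always wins
            allowed = True
    return allowed  # stays False when nothing matched: the implicit deny
```

With the policy above, reading `uploads/a.png` is allowed, while writing to the same key or reading from another prefix falls through to the implicit deny. That is the behavior you want to describe when explaining least privilege.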
Designing HA, scalable, and DR architectures
Question: How do you design a system to survive failures?
Frame the answer around two goals interviewers listen for. RTO (Recovery Time Objective) is how long you can be down. RPO (Recovery Point Objective) is how much data you can afford to lose. These numbers drive every DR (disaster recovery) decision and cost.
| DR strategy | RTO / RPO | Cost | When to use |
|---|---|---|---|
| Backup & restore | Hours / hours | Lowest | Non-critical apps; tight budgets |
| Pilot light | Tens of minutes / minutes | Low | Core systems kept minimal in a second Region |
| Warm standby | Minutes / seconds | Medium | Important apps needing fast recovery |
| Multi-Region active-active | Near zero / near zero | Highest | Mission-critical, no tolerable downtime |
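The table can be read as a lookup: pick the cheapest strategy whose achievable RTO and RPO both meet the requirement. The sketch below encodes that idea. The numeric values are illustrative assumptions standing in for the table's qualitative ranges ("hours", "tens of minutes", "near zero"), not AWS guarantees.

```python
# (name, achievable RTO in seconds, achievable RPO in seconds), cheapest first.
# All numbers are assumed placeholders for the table's qualitative ranges.
STRATEGIES = [
    ("Backup & restore", 4 * 3600, 4 * 3600),
    ("Pilot light", 30 * 60, 5 * 60),
    ("Warm standby", 5 * 60, 30),
    ("Multi-Region active-active", 30, 1),
]


def pick_strategy(required_rto_s: float, required_rpo_s: float):
    """Return the cheapest strategy meeting both targets, or None if none does."""
    for name, rto_s, rpo_s in STRATEGIES:
        if rto_s <= required_rto_s and rpo_s <= required_rpo_s:
            return name
    return None


print(pick_strategy(8 * 3600, 8 * 3600))  # relaxed targets
print(pick_strategy(10 * 60, 60))         # 10-minute RTO, 1-minute RPO
```

Because the list is ordered by cost, the first match is also the cheapest acceptable option, which is the point of letting RTO and RPO drive the decision instead of defaulting to the most expensive design.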
For high availability within a Region, the rule is: remove single points of failure. Spread across at least two Availability Zones (isolated data centers), front compute with a load balancer, use an Auto Scaling group so failed instances are replaced, and run databases in Multi-AZ so a standby takes over automatically.
Open-ended system-design prompt: design a scalable image-upload and processing service.
A model walkthrough: users upload through CloudFront (a global content delivery network) and API Gateway to an S3 bucket. Each upload event goes to an SQS queue, which absorbs spikes, and a Lambda consumes the queue to create thumbnails; a dead-letter queue catches messages that keep failing, so no upload is lost if processing errors out. Metadata lives in DynamoDB (a managed NoSQL database that scales horizontally). For HA, everything is multi-AZ and serverless services are Regional by default. For DR, enable S3 Cross-Region Replication and DynamoDB global tables if the RPO requires it. Call out the single points of failure you removed at each step; that narration is what wins the round.
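A minimal sketch of the processing Lambda in this design, assuming S3 event notifications are delivered straight to the SQS queue (so each message body is the S3 event JSON) and that the event source mapping has partial batch responses (`ReportBatchItemFailures`) enabled. `make_thumbnail` is a hypothetical placeholder; real image work would go there.

```python
import json
from urllib.parse import unquote_plus


def make_thumbnail(bucket: str, key: str) -> None:
    """Hypothetical placeholder: rejects non-image keys, does no real resizing."""
    if not key.lower().endswith((".jpg", ".jpeg", ".png")):
        raise ValueError(f"unsupported file type: {key}")


def handler(event, context):
    failures = []
    for record in event["Records"]:  # one entry per SQS message in the batch
        try:
            s3_event = json.loads(record["body"])
            for s3_record in s3_event["Records"]:
                bucket = s3_record["s3"]["bucket"]["name"]
                # Object keys arrive URL-encoded in S3 event notifications.
                key = unquote_plus(s3_record["s3"]["object"]["key"])
                make_thumbnail(bucket, key)
        except Exception:
            # Report only this message as failed so the rest of the batch is deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Messages listed in `batchItemFailures` become visible on the queue again and are retried. Once a message exceeds the queue's configured receive count, SQS moves it to the dead-letter queue, which is what keeps a bad upload from blocking or losing the others.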
Best Practices
- Right-size Lambda memory by measuring; more memory often means lower total cost and latency.
- Use provisioned concurrency only for latency-critical paths — it costs even when idle.
- Default to SNS-to-SQS fan-out so each consumer gets a buffered, independently retried copy.
- Store credentials in Secrets Manager with rotation; use KMS for data-at-rest encryption.
- Grant permissions through IAM roles with least-privilege policies scoped to specific ARNs.
- Let RTO and RPO drive your DR strategy — don’t pay for active-active when backup-and-restore meets the requirement.
- Always design across at least two Availability Zones and name the single points of failure you eliminated.