What is CloudWatch?

Amazon CloudWatch is the central observability service in AWS. It collects and stores metrics (numbers that change over time, like CPU usage), logs (text output from your apps and AWS services), and events (signals that something happened), and lets you turn all of that into alarms and dashboards. If you want to know whether your systems are healthy, why something broke, or when to scale, CloudWatch is almost always the first place you look. Think of it as the eyes and ears for everything running in your AWS account.

What CloudWatch actually does

CloudWatch is really five tightly connected features. Understanding the pieces helps you know which one to reach for.

Feature | What it stores | Typical use
Metrics | Time-series numbers (e.g. CPUUtilization) | Track health, trigger scaling
Logs | Raw text/JSON log lines | Debugging, audit, search
Alarms | Rules on a metric | Get paged or auto-act when a threshold is crossed
Dashboards | Visual charts | A single screen showing system health
Events (EventBridge) | Notifications of changes | “Run this when an instance stops”

A metric in CloudWatch lives inside a namespace (a folder, like AWS/EC2) and is identified by dimensions (key-value labels, like InstanceId=i-0a1b2c3d4e5f). Metrics are summarized into statistics (Average, Sum, Maximum, etc.) over a chosen period (the time bucket, e.g. 60 seconds).
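
To make "statistics over a period" concrete, here is a minimal sketch of the idea in plain Python. CloudWatch does this aggregation on the service side; the `summarize` function and the sample values below are hypothetical and only illustrate how raw samples collapse into one Average, Sum, and Maximum per time bucket.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize(points, period_seconds=60):
    """Group (timestamp, value) samples into fixed periods and compute statistics."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each sample to the start of the period it falls in.
        start = int(ts.timestamp()) // period_seconds * period_seconds
        buckets[start].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): {
            "Average": sum(vals) / len(vals),
            "Sum": sum(vals),
            "Maximum": max(vals),
            "SampleCount": len(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Illustrative CPU samples: two land in the first minute, one in the second.
samples = [
    (datetime(2026, 6, 15, 0, 0, 5, tzinfo=timezone.utc), 20.0),
    (datetime(2026, 6, 15, 0, 0, 35, tzinfo=timezone.utc), 40.0),
    (datetime(2026, 6, 15, 0, 1, 10, tzinfo=timezone.utc), 90.0),
]
for start, stats in summarize(samples).items():
    print(start.isoformat(), stats)
```

With a 60-second period the first two samples share a bucket, so a short spike can disappear into an Average while still showing up in Maximum. That is one reason to think about which statistic you chart or alarm on.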

What is collected automatically vs what needs the agent

This is the single most important thing beginners miss. Many AWS services publish some metrics to CloudWatch for free, with no setup. But the metrics you most want during an incident are often not there by default.

Collected automatically (no agent):

  • EC2 (a virtual server) basic metrics: CPUUtilization, NetworkIn/Out, disk I/O, status checks — at 5-minute intervals by default (or 1-minute “detailed monitoring” for a small fee).
  • Managed services like RDS (managed databases), Lambda (run code without servers), ELB (load balancers), S3 (object storage), and DynamoDB publish rich metrics on their own.

NOT collected by default — you must add it yourself:

  • Memory usage and disk space used on an EC2 instance. AWS cannot see inside the operating system, so these require the CloudWatch agent (a small program you install on the server).
  • Custom application metrics (orders per minute, queue depth, login failures). You publish these yourself with the PutMetricData API.
  • Application and system logs from inside an instance also need the agent (or the older awslogs agent) to ship them to CloudWatch Logs.

Gotcha: A brand-new EC2 instance will happily report 5% CPU while it is actually out of memory and swapping to death. Without the CloudWatch agent, you are blind to memory and disk. Install the agent on anything you run yourself.
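
As a rough idea of what "installing the agent" involves, the agent reads a JSON configuration file that says which OS metrics and log files to collect. The sketch below generates a minimal one from Python. The helper name `build_agent_config`, the log file path, and the log group name are assumptions for illustration; check the agent documentation for the full schema before using it.

```python
import json


def build_agent_config(log_file, log_group):
    """Return a minimal CloudWatch agent config (memory, disk, one log file) as a dict."""
    return {
        "metrics": {
            "metrics_collected": {
                # Memory usage, which EC2 does not report on its own.
                "mem": {"measurement": ["mem_used_percent"]},
                # Disk space used on the root filesystem.
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            }
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": log_file,
                            "log_group_name": log_group,
                            # Placeholder the agent fills in with the instance ID.
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        },
    }


# Hypothetical path and log group name, for illustration only.
config = build_agent_config("/var/log/myapp/app.log", "myapp-prod")
print(json.dumps(config, indent=2))
```

The printed JSON is what you would save as the agent's configuration file. Generating it from code is optional; many teams simply keep the JSON in version control.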

When to use the agent vs PutMetricData

  • Use the CloudWatch agent when you need OS-level signals (memory, disk, processes) or want to ship log files — it’s configuration, not code.
  • Use PutMetricData when your application knows something AWS can’t, like business KPIs. You call it directly from your code.
  • Do NOT publish every tiny thing as a custom metric — see the cost note below.
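
For the PutMetricData case, calling it "directly from your code" might look roughly like the sketch below. The helper names are hypothetical. `build_put_metric_request` only assembles the request, and `publish` assumes the third-party boto3 SDK is installed and AWS credentials are configured.

```python
from datetime import datetime, timezone


def build_put_metric_request(namespace, metric_name, value, unit="Count", dimensions=None):
    """Build keyword arguments shaped like the PutMetricData API input."""
    datum = {
        "MetricName": metric_name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
        # The API takes dimensions as a list of {"Name": ..., "Value": ...} pairs.
        "Dimensions": [
            {"Name": name, "Value": val}
            for name, val in sorted((dimensions or {}).items())
        ],
    }
    return {"Namespace": namespace, "MetricData": [datum]}


def publish(request):
    """Send the request. Assumes boto3 is installed and credentials are available."""
    import boto3  # imported lazily so the builder above works without the SDK

    boto3.client("cloudwatch").put_metric_data(**request)


request = build_put_metric_request(
    "MyApp/Orders",
    "OrdersPlaced",
    1,
    dimensions={"Environment": "prod", "Service": "checkout"},
)
print(request["MetricData"][0]["Dimensions"])
```

Keeping the request builder separate from the network call makes it easy to unit test the metric names and dimensions your application emits.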

Publishing a custom metric

AWS CLI

aws cloudwatch put-metric-data \
  --namespace "MyApp/Orders" \
  --metric-name OrdersPlaced \
  --unit Count \
  --value 1 \
  --dimensions Environment=prod,Service=checkout

Output:

(no output on success — exit code 0)

You can then read it back:

aws cloudwatch get-metric-statistics \
  --namespace "MyApp/Orders" \
  --metric-name OrdersPlaced \
  --dimensions Name=Environment,Value=prod Name=Service,Value=checkout \
  --start-time 2026-06-15T00:00:00Z \
  --end-time 2026-06-15T01:00:00Z \
  --period 300 \
  --statistics Sum

Output:

{
    "Label": "OrdersPlaced",
    "Datapoints": [
        {
            "Timestamp": "2026-06-15T00:30:00Z",
            "Sum": 42.0,
            "Unit": "Count"
        }
    ]
}
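
If you script against this output, keep in mind that datapoints are not guaranteed to come back in time order. A minimal sketch of reading the JSON and sorting it is shown below. The function name `sum_by_time` is hypothetical, and the second datapoint in the sample is invented for illustration.

```python
import json
from datetime import datetime


def sum_by_time(response_json):
    """Return (timestamp, Sum) pairs from get-metric-statistics output, oldest first."""
    response = json.loads(response_json)
    points = [
        # Accept both "Z" and "+00:00" style UTC timestamps.
        (datetime.fromisoformat(p["Timestamp"].replace("Z", "+00:00")), p["Sum"])
        for p in response["Datapoints"]
    ]
    # Sort explicitly rather than relying on the order in the response.
    return sorted(points)


# Sample output with the datapoints deliberately out of order.
output = """{
    "Label": "OrdersPlaced",
    "Datapoints": [
        {"Timestamp": "2026-06-15T00:35:00Z", "Sum": 7.0, "Unit": "Count"},
        {"Timestamp": "2026-06-15T00:30:00Z", "Sum": 42.0, "Unit": "Count"}
    ]
}"""
for ts, total in sum_by_time(output):
    print(ts.isoformat(), total)
```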

Console steps

  1. Open the CloudWatch console.
  2. In the left menu choose Metrics > All metrics.
  3. Pick your custom namespace (e.g. MyApp/Orders) to browse the metrics you published.
  4. Select a metric to chart it, then use Actions > Add to dashboard or Create alarm.
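
The "Create alarm" step can also be done from code. The sketch below assembles a request shaped like the PutMetricAlarm API input for the OrdersPlaced metric. The helper name, the alarm name, the thresholds, and the SNS topic ARN are all assumptions for illustration; tune the period and evaluation count to your own traffic.

```python
def build_alarm_request(alarm_name, namespace, metric_name, threshold, dimensions, sns_topic_arn):
    """Build keyword arguments shaped like the PutMetricAlarm API input."""
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": k, "Value": v} for k, v in sorted(dimensions.items())],
        "Statistic": "Sum",
        "Period": 300,  # seconds per evaluation bucket
        "EvaluationPeriods": 3,  # consecutive periods that must breach
        "Threshold": float(threshold),
        "ComparisonOperator": "LessThanThreshold",
        # Treat "no data at all" as a problem too, since silence may mean an outage.
        "TreatMissingData": "breaching",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical alarm: fire when fewer than 1 order is placed for 15 minutes.
alarm = build_alarm_request(
    alarm_name="checkout-no-orders",
    namespace="MyApp/Orders",
    metric_name="OrdersPlaced",
    threshold=1,
    dimensions={"Environment": "prod", "Service": "checkout"},
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:oncall",  # placeholder ARN
)
print(alarm["AlarmName"], alarm["ComparisonOperator"], alarm["Threshold"])
```

Assuming boto3 is installed and credentials are configured, the dict could be passed as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`.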

Tip: Custom metrics show up in the console only after the first data point is published. There is no “create metric” button; you create a metric simply by sending data to a new combination of namespace, metric name, and dimensions. Every distinct combination counts as its own metric.

Watching the cost

CloudWatch is cheap to start with but bills can creep up quietly because it charges per metric, per dashboard, and per gigabyte of logs. Rough 2026 US pricing to keep in mind:

Item | Approx. cost
Custom metric | ~$0.30 per metric per month
Dashboard | ~$3 per dashboard per month (first 3 free)
Logs ingestion | ~$0.50 per GB ingested
Logs storage | ~$0.03 per GB per month
API requests (PutMetricData) | ~$0.01 per 1,000 calls

The classic mistake is creating a custom metric per user or per request ID by putting a high-cardinality value in a dimension. Ten thousand users can quietly become ten thousand metrics, which is ~$3,000/month. Keep dimensions low-cardinality (environment, service, region) and put the variable detail in logs instead.
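
The arithmetic behind that number is easy to check. The sketch below uses the approximate prices from the table and hypothetical helper names. It ignores free-tier allowances, volume discounts, and regional differences, so treat the result as a ballpark figure rather than a bill.

```python
import math

# Approximate prices from the table above (USD); assumptions, not a price list.
PRICE_PER_CUSTOM_METRIC = 0.30
PRICE_PER_DASHBOARD = 3.00
FREE_DASHBOARDS = 3
PRICE_PER_GB_INGESTED = 0.50
PRICE_PER_GB_STORED = 0.03
PRICE_PER_1000_PUT_CALLS = 0.01


def metric_count(*dimension_cardinalities):
    """Worst-case number of metrics for one metric name: the product of
    the number of distinct values in each dimension."""
    return math.prod(dimension_cardinalities)


def estimate_monthly_cost(custom_metrics=0, dashboards=0, log_gb_ingested=0.0,
                          log_gb_stored=0.0, put_metric_calls=0):
    """Rough monthly estimate in USD."""
    billable_dashboards = max(dashboards - FREE_DASHBOARDS, 0)
    total = (
        custom_metrics * PRICE_PER_CUSTOM_METRIC
        + billable_dashboards * PRICE_PER_DASHBOARD
        + log_gb_ingested * PRICE_PER_GB_INGESTED
        + log_gb_stored * PRICE_PER_GB_STORED
        + put_metric_calls / 1000 * PRICE_PER_1000_PUT_CALLS
    )
    return round(total, 2)


# Low cardinality: 2 environments x 5 services = 10 metrics.
print(estimate_monthly_cost(custom_metrics=metric_count(2, 5)))  # 3.0
# High cardinality: one metric per user, 10,000 users.
print(estimate_monthly_cost(custom_metrics=metric_count(10_000)))  # 3000.0
```

Note how dimensions multiply: adding a third dimension with 100 values to the first example would turn 10 metrics into 1,000.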

Cost warning: High-cardinality dimensions are the number-one source of surprise CloudWatch bills. Use logs (and CloudWatch Logs Insights to query them) for per-request detail, not metrics.
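
One way to keep per-request detail out of metrics is to write it as a structured JSON log line. Here is a minimal sketch using only the standard library. The logger name and field names are hypothetical, and it assumes the agent (or your runtime) ships stdout or the log file to CloudWatch Logs.

```python
import json
import logging
import sys

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(message)s"))  # emit the raw JSON line
logger.addHandler(handler)


def format_request_log(request_id, user_id, latency_ms, status):
    """Serialize high-cardinality request detail as one JSON log line."""
    return json.dumps(
        {
            "event": "request_completed",
            "request_id": request_id,
            "user_id": user_id,
            "latency_ms": latency_ms,
            "status": status,
        },
        sort_keys=True,
    )


def log_request(request_id, user_id, latency_ms, status):
    logger.info(format_request_log(request_id, user_id, latency_ms, status))


# Illustrative values only.
log_request("req-123", "user-42", 87, 200)
```

Because each line is JSON, Logs Insights can work with the fields directly, for example with a query along the lines of `stats avg(latency_ms) by user_id`, without a single extra custom metric.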

Best practices

  • Install the CloudWatch agent on every self-managed EC2 instance so you capture memory and disk, not just CPU.
  • Keep metric dimensions low-cardinality — never use user IDs, request IDs, or timestamps as dimension values.
  • Set log retention explicitly on every log group; the default is “never expire,” which silently grows your storage bill.
  • Put business detail in logs, system health in metrics — query logs with Logs Insights instead of minting thousands of custom metrics.
  • Build a small number of focused dashboards (the first three are free) rather than one giant catch-all.
  • Turn on detailed (1-minute) monitoring only where faster reaction time justifies the per-instance cost.
  • Always pair an important metric with an alarm so a human or automation is notified — a metric nobody watches is wasted money.
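
For the log retention point above, a minimal sketch is shown below. CloudWatch Logs only accepts certain retention values, so the hypothetical helper rounds a desired number of days up to the next allowed one. The list of allowed values reflects the API documentation as best understood here; verify it against the current reference before relying on it.

```python
# Retention values (in days) accepted by CloudWatch Logs; verify against current docs.
ALLOWED_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def pick_retention_days(desired_days):
    """Round up to the nearest allowed retention, capped at the maximum."""
    for allowed in ALLOWED_RETENTION_DAYS:
        if allowed >= desired_days:
            return allowed
    return ALLOWED_RETENTION_DAYS[-1]


def build_retention_request(log_group, desired_days):
    """Build keyword arguments shaped like the PutRetentionPolicy API input."""
    return {
        "logGroupName": log_group,
        "retentionInDays": pick_retention_days(desired_days),
    }


# Hypothetical log group; 45 days is not an allowed value, so it rounds up to 60.
print(build_retention_request("myapp-prod", 45))
```

Assuming boto3 is installed and credentials are configured, the dict could be passed as `boto3.client("logs").put_retention_policy(**request)`, typically in a loop over every log group in the account.
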
Last updated June 15, 2026