The CloudWatch Agent
When you launch an EC2 instance (a virtual server on AWS), CloudWatch automatically shows you graphs for CPU, network, and disk I/O. But notice what is missing: there is no graph for memory (RAM) usage and no graph for how full your disk actually is. That is not a bug. AWS measures the default metrics from outside the virtual machine, but it cannot see inside the operating system to read RAM or filesystem usage. The CloudWatch agent (a small program you install on the server) fills that gap. It runs inside the OS, reads memory, disk space, and any log files you point it at, and ships them to CloudWatch.
Why the agent exists
CloudWatch metrics come from two places. Some are gathered by the AWS infrastructure itself (the hypervisor, the host that runs your virtual machine). The hypervisor can measure CPU and network because it controls those resources, but it has no way to know how much RAM a process inside your server is using or how full /var is. Those numbers exist only inside the operating system, so they have to be reported from there.
The CloudWatch agent is the official tool that runs inside the server, collects OS-level metrics and log files, and pushes them to CloudWatch through the same public APIs (such as PutMetricData) that any other custom metric uses.
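To make the "inside the OS" point concrete, here is a rough sketch of what any in-guest collector has to do: read a number only the kernel knows, then push it with PutMetricData. This is not the agent's implementation. The function names, the Custom/Demo namespace, and the "MemTotal minus MemAvailable" formula are assumptions made for illustration, and the real agent may compute mem_used_percent differently.

```shell
# Compute used-memory percentage from a meminfo-style file (Linux only).
# Assumption: "used" = MemTotal - MemAvailable; the real agent may differ.
mem_used_percent() {
  awk '/^MemTotal:/ {total=$2} /^MemAvailable:/ {avail=$2}
       END { if (total > 0) printf "%.1f\n", (total - avail) * 100 / total }' \
      "${1:-/proc/meminfo}"
}

# Push one data point through the public PutMetricData API.
# "Custom/Demo" is a hypothetical namespace chosen for this sketch.
publish_mem_metric() {
  aws cloudwatch put-metric-data \
    --namespace "Custom/Demo" \
    --metric-name mem_used_percent \
    --unit Percent \
    --value "$(mem_used_percent)" \
    --dimensions "InstanceId=$1"
}
```

Calling publish_mem_metric with an instance ID from a cron job would work in principle, but the agent also handles batching, retries, and log shipping, which is why it is the recommended route.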
The single most common gotcha: if a server shows no memory or disk graphs, it is almost never because “the metric doesn’t exist.” It is because the agent is not installed, or the instance’s IAM role lacks cloudwatch:PutMetricData and the CloudWatch Logs permissions. Check the agent first, not the metric name.
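A quick way to "check the agent first" is to ask the agent's control script for its status on the instance itself. The sketch below assumes a Linux install at the package's usual path and that the status command prints JSON with a "status" field; the function names are made up for this example, and the command may need sudo on your system.

```shell
# Usual location of the control script on Linux; adjust if yours differs.
CTL=/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl

# Extract the "status" field from the JSON that `-a status` prints (stdin).
agent_state() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Print "not-installed", the reported state, or "unknown".
# Succeeds only when the agent reports that it is running.
check_agent() {
  local state
  if [ ! -x "$CTL" ]; then
    echo "not-installed"
    return 1
  fi
  state=$("$CTL" -m ec2 -a status | agent_state)
  echo "${state:-unknown}"
  [ "$state" = "running" ]
}
```

If check_agent prints "running" and graphs are still missing, the IAM role is the next thing to inspect.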
When to use this (and when not to)
| Need | Use the agent? |
|---|---|
| Memory / swap usage | Yes, defaults do not cover it |
| Disk space used (% full) | Yes, defaults only give disk I/O, not free space |
| Application or system log files (/var/log/...) | Yes |
| CPU, network, EBS I/O | No, already collected for free |
| Containers on ECS/EKS | Use Container Insights instead, not the raw agent |
| Short-lived Lambda functions | No, Lambda emits metrics and logs automatically |
Step 1 — Give the server permission
The agent talks to CloudWatch using an IAM role (a set of permissions attached to the instance, so you never store keys on the box). AWS provides a managed policy named CloudWatchAgentServerPolicy that grants what the agent needs: PutMetricData, permission to create log groups and streams and write log events, and read access to agent configs stored in SSM Parameter Store.
Console:
- Go to IAM → Roles → Create role.
- Trusted entity: AWS service → EC2.
- Attach the policy CloudWatchAgentServerPolicy.
- Name it CloudWatchAgentRole and create it.
- Go to EC2 → Instances, select your instance, then Actions → Security → Modify IAM role, and attach the role.
CLI:
# Create a role that the EC2 service is allowed to assume.
aws iam create-role \
--role-name CloudWatchAgentRole \
--assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
# Attach the AWS managed policy for the agent.
aws iam attach-role-policy \
--role-name CloudWatchAgentRole \
--policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
# EC2 attaches roles through an instance profile, so wrap the role in one.
aws iam create-instance-profile --instance-profile-name CloudWatchAgentRole
aws iam add-role-to-instance-profile \
--instance-profile-name CloudWatchAgentRole --role-name CloudWatchAgentRole
# Associate the profile with the running instance.
aws ec2 associate-iam-instance-profile \
--instance-id i-0a1b2c3d4e5f \
--iam-instance-profile Name=CloudWatchAgentRole
Output:
{
  "IamInstanceProfileAssociation": {
    "AssociationId": "iip-assoc-0a1b2c3d4e5f",
    "InstanceId": "i-0a1b2c3d4e5f",
    "IamInstanceProfile": {
      "Arn": "arn:aws:iam::123456789012:instance-profile/CloudWatchAgentRole",
      "Id": "AIPA0A1B2C3D4E5F"
    },
    "State": "associating"
  }
}
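Because a missing or wrong role is the usual culprit, it can help to verify Step 1 from the CLI before moving on. The following is a minimal sketch; the two function names are invented for this example, and it assumes your CLI credentials are allowed to call the IAM and EC2 read APIs used here.

```shell
# Succeeds if the named managed policy is attached to the role.
role_has_policy() {
  local role="$1" policy="$2"
  aws iam list-attached-role-policies \
    --role-name "$role" \
    --query 'AttachedPolicies[].PolicyName' \
    --output text | tr '\t' '\n' | grep -qxF "$policy"
}

# Prints the instance profile ARN attached to the instance
# (the CLI prints "None" when no profile is attached).
instance_profile_arn() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' \
    --output text
}
```

For example, `role_has_policy CloudWatchAgentRole CloudWatchAgentServerPolicy && echo ok` confirms the attachment, and `instance_profile_arn i-0a1b2c3d4e5f` shows what the instance is actually using.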
Step 2 — Install the agent
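The easiest way to install and manage the agent at scale is AWS Systems Manager (SSM), a service that runs commands on your servers without you logging in via SSH. Modern Amazon Linux and Ubuntu AMIs already have the SSM agent. Run Command only reaches instances registered with Systems Manager, which usually means the instance role also needs the AmazonSSMManagedInstanceCore managed policy; CloudWatchAgentServerPolicy alone does not cover that. SSM provides a ready-made document, AWS-ConfigureAWSPackage, that installs the CloudWatch agent package.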
Console:
- Go to Systems Manager → Run Command.
- Choose the document AWS-ConfigureAWSPackage.
- Action: Install, Name: AmazonCloudWatchAgent.
- Pick your instance as the target and click Run.
CLI:
aws ssm send-command \
--document-name "AWS-ConfigureAWSPackage" \
--instance-ids i-0a1b2c3d4e5f \
--parameters '{"action":["Install"],"name":["AmazonCloudWatchAgent"]}'
Tip: If you cannot use SSM, you can install manually with yum install amazon-cloudwatch-agent (Amazon Linux) or by downloading the .deb/.rpm from the regional S3 bucket. SSM is preferred because it scales to hundreds of servers and keeps them consistent.
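The "scales to hundreds of servers" point usually comes down to targeting by tag rather than listing instance IDs. Here is a sketch of the same install command using the --targets option of send-command; the function name and the Role=web tag in the usage note are placeholders, not anything the tutorial defines.

```shell
# Install the CloudWatch agent on every managed instance carrying a tag.
# Prints the command ID so the run can be tracked afterwards.
install_agent_by_tag() {
  local tag_key="$1" tag_value="$2"
  aws ssm send-command \
    --document-name "AWS-ConfigureAWSPackage" \
    --targets "Key=tag:${tag_key},Values=${tag_value}" \
    --parameters '{"action":["Install"],"name":["AmazonCloudWatchAgent"]}' \
    --query 'Command.CommandId' \
    --output text
}
```

Running `install_agent_by_tag Role web` would target every SSM-managed instance tagged Role=web, assuming that is how your fleet is tagged.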
Step 3 — Write a config file
The agent does nothing until you tell it what to collect. The config is a JSON file. Below is a realistic example that collects memory, disk usage, and an application log file.
{
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"]
      },
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/"]
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/myapp/app.log",
            "log_group_name": "/myapp/app",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
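Store this config in the SSM Parameter Store (a key-value store for config; standard parameters are free) so every server can pull the same copy. Keep the AmazonCloudWatch- prefix in the parameter name, because CloudWatchAgentServerPolicy only grants read access to parameters that start with it: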
aws ssm put-parameter \
--name "AmazonCloudWatch-myapp-config" \
--type String \
--value file://amazon-cloudwatch-agent.json
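A typo in the JSON is easy to upload and annoying to debug on the server, so a syntax check before put-parameter is cheap insurance. This sketch assumes python3 is available locally for the check; the function names are hypothetical, and the check covers JSON syntax only, not the agent's config schema.

```shell
# Fail early if the config file is not well-formed JSON.
# Assumes python3 is on PATH; validates syntax only, not the agent schema.
validate_agent_config() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Upload (or update) the parameter only when the syntax check passes.
upload_agent_config() {
  local name="$1" file="$2"
  if validate_agent_config "$file"; then
    aws ssm put-parameter --name "$name" --type String \
      --value "file://$file" --overwrite
  else
    echo "invalid JSON in $file" >&2
    return 1
  fi
}
```

The --overwrite flag lets the same call serve for later config updates, for example `upload_agent_config AmazonCloudWatch-myapp-config amazon-cloudwatch-agent.json`.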
Step 4 — Start the agent with that config
Use SSM Run Command with the AmazonCloudWatch-ManageAgent document and point it at the parameter you just stored.
aws ssm send-command \
--document-name "AmazonCloudWatch-ManageAgent" \
--instance-ids i-0a1b2c3d4e5f \
--parameters '{"action":["configure"],"mode":["ec2"],"optionalConfigurationSource":["ssm"],"optionalConfigurationLocation":["AmazonCloudWatch-myapp-config"],"optionalRestart":["yes"]}'
Output:
{
  "Command": {
    "CommandId": "9a8b7c6d-0a1b-2c3d-4e5f-6a7b8c9d0e1f",
    "DocumentName": "AmazonCloudWatch-ManageAgent",
    "Status": "Pending",
    "TargetCount": 1
  }
}
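A "Pending" status only means the command was accepted, not that it succeeded. One way to follow it through is to poll get-command-invocation with the CommandId from the output. The sketch below is an illustration rather than an official pattern: the function name, retry count, and delay are arbitrary choices, and an empty result (for instance when the invocation is not visible yet) is treated as still pending.

```shell
# Poll an SSM command until it leaves the pending/in-progress states.
# Prints the final status and succeeds only on "Success".
# Args: command ID, instance ID, max polls (default 30), delay seconds (default 5).
wait_for_command() {
  local command_id="$1" instance_id="$2"
  local max_tries="${3:-30}" delay="${4:-5}"
  local i=0 status
  while [ "$i" -lt "$max_tries" ]; do
    status=$(aws ssm get-command-invocation \
      --command-id "$command_id" \
      --instance-id "$instance_id" \
      --query 'Status' --output text 2>/dev/null) || status=""
    case "$status" in
      Pending|InProgress|Delayed|"") sleep "$delay" ;;
      *) echo "$status"; [ "$status" = "Success" ]; return ;;
    esac
    i=$((i + 1))
  done
  echo "GaveUp"
  return 1
}
```

With the values from the output above, `wait_for_command 9a8b7c6d-0a1b-2c3d-4e5f-6a7b8c9d0e1f i-0a1b2c3d4e5f` would block until the run finishes or the polls run out.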
Within a minute or two you will see a new metric namespace called CWAgent in CloudWatch, with mem_used_percent and disk_used_percent, plus a log group named /myapp/app.
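If you would rather confirm this from a terminal than from the console, the check can be sketched with two read-only CLI calls. The function names are invented for this example; the namespace and log group name are the ones from the config above.

```shell
# List the distinct metric names published to the CWAgent namespace.
list_agent_metrics() {
  aws cloudwatch list-metrics \
    --namespace CWAgent \
    --query 'Metrics[].MetricName' \
    --output text | tr '\t' '\n' | sort -u
}

# Succeeds if a log group with exactly this name exists.
log_group_exists() {
  aws logs describe-log-groups \
    --log-group-name-prefix "$1" \
    --query 'logGroups[].logGroupName' \
    --output text | tr '\t' '\n' | grep -qxF "$1"
}
```

An empty result from list_agent_metrics after a few minutes points back to the agent status or the IAM role rather than to the metric names.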
Cost note
The agent software is free. You pay only for what it sends: custom metrics cost about $0.30 per metric per month, and ingested logs cost about $0.50 per GB. Collecting mem and disk on 10 servers is about 20 metrics, or roughly $6 a month. The expensive mistake is shipping verbose debug logs at high volume; filter logs at the source before they reach CloudWatch.
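The arithmetic is simple enough to script when you want to sanity-check a rollout. This sketch hard-codes the approximate prices quoted above ($0.30 per metric, $0.50 per GB); actual prices vary by region and tier, and the function name is made up for illustration.

```shell
# Rough monthly cost from the approximate prices quoted in this section.
# Args: servers, metrics per server, log GB ingested per month.
estimate_monthly_cost() {
  awk -v s="$1" -v m="$2" -v gb="$3" \
    'BEGIN { printf "%.2f\n", s * m * 0.30 + gb * 0.50 }'
}

estimate_monthly_cost 10 2 0     # mem + disk on 10 servers: prints 6.00
estimate_monthly_cost 10 2 200   # same fleet plus 200 GB of logs: prints 106.00
```

The second line shows why logs, not metrics, are usually the part of the bill worth watching.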
Best practices
- Always attach an IAM role with CloudWatchAgentServerPolicy before installing; most “missing graph” tickets trace back to a missing or wrong role.
- Store the config in SSM Parameter Store so every server is identical and updates are one command away.
- Set the disk resources list to "*" (all mount points), or list every mount point explicitly, if instances have multiple volumes, so you do not silently miss a full data disk.
- Set log group retention (for example 30 days) on every log group the agent creates; logs default to “never expire” and quietly grow your bill.
- Bake the agent and config into your AMI or launch template so new instances report from the first boot, not after a manual fix.
- Alarm on mem_used_percent and disk_used_percent; a server that runs out of disk fails in confusing ways, and these are the metrics that warn you first.
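The retention and alarm practices can each be sketched as a single CLI call. The function names, the 15-minute evaluation window, and the example hostname are assumptions for illustration. An alarm only matches a metric when its dimensions are exactly the ones the agent published, so copy them from `aws cloudwatch list-metrics --namespace CWAgent` rather than guessing. No alarm action is set here, so the alarm changes state without notifying anyone until you add one.

```shell
# Cap retention on a log group the agent created (default: 30 days).
set_log_retention() {
  aws logs put-retention-policy \
    --log-group-name "$1" \
    --retention-in-days "${2:-30}"
}

# Alarm when a CWAgent metric averages above a threshold for 3 x 5 minutes.
# $4 lists the published dimensions, space separated, for example:
#   "Name=host,Value=my-hostname"
create_agent_alarm() {
  local alarm_name="$1" metric="$2" threshold="$3" dimensions="$4"
  # $dimensions is left unquoted on purpose so several
  # "Name=...,Value=..." pairs split into separate arguments.
  aws cloudwatch put-metric-alarm \
    --alarm-name "$alarm_name" \
    --namespace CWAgent \
    --metric-name "$metric" \
    --dimensions $dimensions \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --threshold "$threshold" \
    --comparison-operator GreaterThanThreshold
}
```

For example, `set_log_retention /myapp/app` applies the 30-day default, and `create_agent_alarm high-mem mem_used_percent 90 "Name=host,Value=my-hostname"` would watch memory on one host, assuming that is the dimension your agent reports.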