Event-Driven Monitoring & Alerting
Most AWS services announce important things as they happen: an EC2 instance changed state, GuardDuty found a threat, AWS Health flagged a maintenance window. Amazon EventBridge is a serverless event bus that listens for these announcements and lets you act on them the instant they occur. This page shows how to catch those events with rules and route them to SNS for alerts, Lambda for custom logic, or Systems Manager (SSM) for automatic fixes. EventBridge gives you push-based, event-driven monitoring that complements the polling-based metric alarms in CloudWatch.
What EventBridge is and how it fits
Amazon EventBridge (formerly CloudWatch Events) is a fully managed event router. AWS services continuously emit events (small JSON documents describing “something happened”) onto your account's default event bus. You write rules that match a subset of those events using an event pattern, and each rule sends matching events to one or more targets (the things that do the work, such as an SNS topic or a Lambda function).
The key idea: this is push-based. The moment GuardDuty raises a finding, the event lands on the bus and your rule fires within seconds. You are not polling or waiting for a metric to be aggregated.
Why this matters: CloudWatch alarms watch trends in numeric metrics (e.g. “CPU above 80% for 5 minutes”) on a polling cycle. EventBridge reacts to discrete events (e.g. “an instance was terminated”, “a finding was created”) immediately. They are not competitors — you use both.
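To make the rule-matching idea concrete, here is a deliberately simplified sketch of how an event pattern selects events. It is an illustration only, not EventBridge's actual matching engine: it handles nested fields, "any of these values" lists, and the numeric comparison operator, and ignores the rest of the pattern language (prefix, exists, anything-but, array-valued event fields). The function names and the sample events are hypothetical.

```python
import operator
from typing import Any

# Comparison operators accepted inside {"numeric": [op, value, ...]}.
_NUMERIC_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
}


def _value_matches(candidate: Any, actual: Any) -> bool:
    """Check one pattern entry (a literal or a numeric filter) against a value."""
    if isinstance(candidate, dict) and "numeric" in candidate:
        if isinstance(actual, bool) or not isinstance(actual, (int, float)):
            return False
        terms = candidate["numeric"]  # e.g. [">=", 7] or [">", 0, "<=", 5]
        pairs = zip(terms[0::2], terms[1::2])
        return all(_NUMERIC_OPS[op](actual, bound) for op, bound in pairs)
    return candidate == actual


def matches(pattern: dict, event: dict) -> bool:
    """Return True when every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # A nested object in the pattern descends into the event.
            if not (isinstance(actual, dict) and matches(expected, actual)):
                return False
        elif not any(_value_matches(c, actual) for c in expected):
            # A list in the pattern means "any one of these".
            return False
    return True


EC2_STOPPED_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}

sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0a1b2c3d4e5f", "state": "stopped"},
}

print(matches(EC2_STOPPED_PATTERN, sample_event))
```

Fields that the pattern does not mention are ignored, which is why a narrow pattern can still match events that carry many extra attributes.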
EventBridge vs CloudWatch alarms — when to use which
| Need | Use EventBridge | Use CloudWatch alarm |
|---|---|---|
| React to a one-time event (a finding fired, an instance stopped) | Yes | No |
| React to a sustained numeric trend (CPU high for 5 min) | No | Yes |
| Trigger auto-remediation (run an SSM document) | Yes (native target) | Indirectly (via SNS → Lambda) |
| Detection model | Push (instant) | Poll (per evaluation period) |
| Typical latency | Seconds | One or more evaluation periods |
When to use this: you want to act on a specific thing happening — a security finding, a config drift, a scheduled health event. When NOT to: you want to know if a value has been too high or too low over time — that is a CloudWatch alarm.
Catching an EC2 state change and alerting via SNS
Say you want an email the moment any EC2 instance enters the stopped state. The target is an SNS topic (Simple Notification Service — a pub/sub service that fans out messages to email, SMS, or other endpoints).
Console steps
- Open the EventBridge console and choose Rules → Create rule.
- Name it ec2-stopped-alert, leave Event bus as default, and keep rule type as Rule with an event pattern.
- Under Event source, choose AWS events. For the event pattern, pick AWS services → EC2 → EC2 Instance State-change Notification, then set state to stopped.
- For Target, choose AWS service → SNS topic and select your topic (e.g. arn:aws:sns:us-east-1:111122223333:ops-alerts).
- Choose Create rule.
CLI equivalent
aws events put-rule \
  --name ec2-stopped-alert \
  --event-pattern '{
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": { "state": ["stopped"] }
  }'

aws events put-targets \
  --rule ec2-stopped-alert \
  --targets "Id"="1","Arn"="arn:aws:sns:us-east-1:111122223333:ops-alerts"
Output:
{
  "RuleArn": "arn:aws:events:us-east-1:111122223333:rule/ec2-stopped-alert"
}
{
  "FailedEntryCount": 0,
  "FailedEntries": []
}
The topic policy must allow EventBridge to publish. Add this statement to the SNS topic’s access policy so the rule can deliver:
{
  "Sid": "AllowEventBridgePublish",
  "Effect": "Allow",
  "Principal": { "Service": "events.amazonaws.com" },
  "Action": "sns:Publish",
  "Resource": "arn:aws:sns:us-east-1:111122223333:ops-alerts"
}
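If you prefer to apply that statement from a script rather than the console, the following is a minimal sketch of one way to do it with boto3. The helper names (with_statement, allow_eventbridge_publish) are hypothetical. The merge replaces any existing statement with the same Sid so that re-running the script does not create duplicates. boto3 is imported lazily so the merge helper can be exercised without the AWS SDK or credentials.

```python
import copy
import json

EVENTBRIDGE_STATEMENT = {
    "Sid": "AllowEventBridgePublish",
    "Effect": "Allow",
    "Principal": {"Service": "events.amazonaws.com"},
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:us-east-1:111122223333:ops-alerts",
}


def with_statement(policy: dict, statement: dict) -> dict:
    """Return a copy of `policy` that contains `statement` exactly once."""
    merged = copy.deepcopy(policy)
    existing = merged.get("Statement", [])
    if isinstance(existing, dict):  # a policy may hold a single statement object
        existing = [existing]
    # Drop any earlier statement with the same Sid, then append the new one.
    kept = [s for s in existing if s.get("Sid") != statement["Sid"]]
    merged["Statement"] = kept + [copy.deepcopy(statement)]
    return merged


def allow_eventbridge_publish(topic_arn: str) -> None:
    """Read the topic's access policy, merge the statement, and write it back."""
    import boto3  # lazy import: only needed when talking to AWS

    sns = boto3.client("sns")
    attributes = sns.get_topic_attributes(TopicArn=topic_arn)["Attributes"]
    policy = json.loads(attributes["Policy"])
    statement = {**EVENTBRIDGE_STATEMENT, "Resource": topic_arn}
    sns.set_topic_attributes(
        TopicArn=topic_arn,
        AttributeName="Policy",
        AttributeValue=json.dumps(with_statement(policy, statement)),
    )
```

Calling allow_eventbridge_publish("arn:aws:sns:us-east-1:111122223333:ops-alerts") would update the example topic; the caller needs sns:GetTopicAttributes and sns:SetTopicAttributes on it.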
Auto-remediating a GuardDuty finding with Lambda
GuardDuty is AWS’s threat-detection service. When it raises a finding (for example, an EC2 instance making suspicious outbound calls), you can isolate that instance automatically instead of waiting for a human.
The pattern: GuardDuty finding → EventBridge rule → Lambda function that swaps the instance’s security group for a locked-down “quarantine” group.
aws events put-rule \
  --name guardduty-high-severity \
  --event-pattern '{
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": { "severity": [{ "numeric": [">=", 7] }] }
  }'

aws events put-targets \
  --rule guardduty-high-severity \
  --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:111122223333:function:quarantine-instance"

aws lambda add-permission \
  --function-name quarantine-instance \
  --statement-id eventbridge-guardduty \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:us-east-1:111122223333:rule/guardduty-high-severity
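Output (from add-permission; put-rule and put-targets return the same shapes as in the EC2 example):
{
  "Statement": "{\"Sid\":\"eventbridge-guardduty\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"events.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:us-east-1:111122223333:function:quarantine-instance\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:events:us-east-1:111122223333:rule/guardduty-high-severity\"}}}"
}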
The Lambda receives the finding as JSON; it reads detail.resource.instanceDetails.instanceId (e.g. i-0a1b2c3d4e5f) and calls ec2:ModifyInstanceAttribute to apply the quarantine security group sg-0a1b2c3d.
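A handler along these lines could implement that step. This is a sketch under stated assumptions, not a hardened implementation: the QUARANTINE_SG_ID environment variable and the helper name extract_instance_id are hypothetical, and the default group ID is the example value from the text. Findings that do not concern an EC2 instance are skipped rather than treated as errors.

```python
import os

# Assumption: the quarantine group is configured through an environment
# variable; the fallback is the example ID used in the text.
QUARANTINE_SG = os.environ.get("QUARANTINE_SG_ID", "sg-0a1b2c3d")


def extract_instance_id(event: dict):
    """Return the instance ID from a GuardDuty finding event, or None."""
    detail = event.get("detail") or {}
    resource = detail.get("resource") or {}
    instance_details = resource.get("instanceDetails") or {}
    return instance_details.get("instanceId")


def lambda_handler(event, context):
    instance_id = extract_instance_id(event)
    if not instance_id:
        # Findings about other resource types carry no instanceDetails.
        return {"quarantined": False, "reason": "no EC2 instance in finding"}

    import boto3  # bundled with the Lambda Python runtime

    ec2 = boto3.client("ec2")
    # Groups replaces the instance's current security groups with this list.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[QUARANTINE_SG])
    return {"quarantined": True, "instanceId": instance_id}
```

The function's execution role needs ec2:ModifyInstanceAttribute. Replacing the whole group list, rather than appending the quarantine group, is what actually removes the instance's existing network access.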
Gotcha: by default EventBridge retries delivery to a failing target for up to 24 hours (and up to 185 attempts); after that the event is dropped. For finding-driven security automation, add a dead-letter queue (an SQS queue that captures undelivered events) to every critical target so nothing silently disappears.
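As a rough sketch of attaching a dead-letter queue, the target definition passed to PutTargets can carry DeadLetterConfig and RetryPolicy entries. The queue ARN below (eventbridge-dlq) and the helper names are hypothetical; the retry values shown are the service defaults spelled out explicitly.

```python
def target_with_dlq(
    target_id: str,
    target_arn: str,
    dlq_arn: str,
    max_attempts: int = 185,
    max_age_seconds: int = 86400,
) -> dict:
    """Build a PutTargets entry that sends undeliverable events to an SQS queue."""
    return {
        "Id": target_id,
        "Arn": target_arn,
        "DeadLetterConfig": {"Arn": dlq_arn},
        "RetryPolicy": {
            "MaximumRetryAttempts": max_attempts,
            "MaximumEventAgeInSeconds": max_age_seconds,
        },
    }


def attach_target(rule_name: str, target: dict) -> int:
    """Attach the target to the rule and return the number of failed entries."""
    import boto3  # lazy import: only needed when talking to AWS

    events = boto3.client("events")
    response = events.put_targets(Rule=rule_name, Targets=[target])
    return response["FailedEntryCount"]


quarantine_target = target_with_dlq(
    "1",
    "arn:aws:lambda:us-east-1:111122223333:function:quarantine-instance",
    "arn:aws:sqs:us-east-1:111122223333:eventbridge-dlq",  # hypothetical queue
)
```

Passing quarantine_target to attach_target("guardduty-high-severity", ...) would update the existing target in place because the Id matches. The queue's resource policy must also allow events.amazonaws.com to call sqs:SendMessage, otherwise the dead-letter delivery itself fails.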
Running SSM automation directly (no Lambda)
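For many fixes you do not need code at all. EventBridge can invoke an SSM Automation document (a predefined runbook) as a native target. For an AWS Health scheduled-maintenance event, you might trigger the AWS-managed AWS-StopEC2Instance document, or a custom one that drains a load balancer first. The snippet below shows only the rule and target wiring; AWS-StopEC2Instance also requires an InstanceId parameter, which a production target would supply through Input or InputTransformer.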
CloudFormation snippet
Resources:
  HealthEventRule:
    Type: AWS::Events::Rule
    Properties:
      Name: health-scheduled-change
      EventPattern:
        source:
          - aws.health
        detail-type:
          - AWS Health Event
        detail:
          eventTypeCategory:
            - scheduledChange
      Targets:
        - Id: ssm-runbook
          Arn: arn:aws:ssm:us-east-1:111122223333:automation-definition/AWS-StopEC2Instance:$DEFAULT
          RoleArn: arn:aws:iam::111122223333:role/EventBridgeSSMRole
Cost note: EventBridge charges nothing for AWS service events on the default bus — you only pay for custom and partner events ($1.00 per million in 2026). The cost lives in the targets: Lambda invocations, SNS messages, and SSM automation steps. For most alerting workloads this is cents per month.
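The arithmetic behind that note is small enough to sketch. The function below uses only the rate quoted above and deliberately leaves out target costs (Lambda, SNS, SSM), since those depend on pricing not covered here; the function name is hypothetical.

```python
CUSTOM_EVENT_PRICE_PER_MILLION = 1.00  # USD, the rate quoted in the text


def monthly_custom_event_cost(custom_events: int) -> float:
    """Cost of custom/partner events; AWS service events on the default bus are free."""
    return custom_events / 1_000_000 * CUSTOM_EVENT_PRICE_PER_MILLION


# A workload that only reacts to AWS service events publishes no custom events.
print(monthly_custom_event_cost(0))
# Publishing 2.5 million custom events in a month.
print(monthly_custom_event_cost(2_500_000))
```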
Best practices
- Match narrowly. Add detail filters (state, severity, region) so a rule only fires on what you care about, reducing noise and target cost.
- Attach a dead-letter queue to every important target so retries that fail after 24 hours are not lost.
- Use SNS for human notification, Lambda for custom logic, and SSM Automation for parameterized remediation that you want to audit.
- Pair EventBridge with CloudWatch alarms: EventBridge for discrete events, alarms for sustained trends.
- Test event patterns with the Sandbox in the EventBridge console (paste a sample event and confirm it matches) before going live.
- Grant least-privilege IAM roles to targets — the SSM/Lambda role should only allow the exact remediation actions needed.
- Send a copy of critical events to CloudWatch Logs as an extra target for a durable audit trail.
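Pattern testing can also be scripted, which is handy in a deployment pipeline. The sketch below uses the EventBridge TestEventPattern API through boto3; the sample event values and helper names are hypothetical. The API expects the test event to include the standard envelope fields, so the helper checks for them locally before making the call.

```python
import json

# Envelope fields the TestEventPattern API requires in the test event.
REQUIRED_FIELDS = ("id", "account", "source", "time", "region", "resources", "detail-type")

SAMPLE_EVENT = {
    "id": "00000000-0000-0000-0000-000000000000",
    "account": "111122223333",
    "source": "aws.ec2",
    "time": "2026-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": ["arn:aws:ec2:us-east-1:111122223333:instance/i-0a1b2c3d4e5f"],
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0a1b2c3d4e5f", "state": "stopped"},
}


def missing_fields(event: dict) -> list:
    """List the required envelope fields that the event does not contain."""
    return [field for field in REQUIRED_FIELDS if field not in event]


def pattern_matches(pattern: dict, event: dict) -> bool:
    """Ask EventBridge whether `pattern` would match `event`."""
    missing = missing_fields(event)
    if missing:
        raise ValueError(f"test event is missing fields: {missing}")

    import boto3  # lazy import: only needed when talking to AWS

    events = boto3.client("events")
    response = events.test_event_pattern(
        EventPattern=json.dumps(pattern),
        Event=json.dumps(event),
    )
    return response["Result"]
```

A pipeline step could call pattern_matches with the rule's pattern and SAMPLE_EVENT and fail the build when the result is False. The caller needs the events:TestEventPattern permission.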