Lab: Troubleshooting Security Monitoring and Logging in AWS
Troubleshoot security monitoring, logging, and alerting solutions
This hands-on lab focuses on identifying and remediating common misconfigurations in AWS logging and alerting pipelines. You will take on the role of a Security Engineer tasked with fixing a broken monitoring chain where logs are failing to ingest and alerts are not reaching the security team.
[!WARNING] This lab involves creating resources that may incur costs. Remember to run the teardown commands at the end to avoid ongoing charges.
Prerequisites
- An AWS Account with Administrator access.
- AWS CLI installed and configured with appropriate credentials.
- Basic familiarity with JSON and Bash/PowerShell.
- Access to a terminal or shell environment.
Learning Objectives
- Analyze IAM permissions causing "Access Denied" errors in CloudWatch Logs.
- Remediate misconfigured CloudWatch Logs metric filters by correcting the filter pattern.
- Troubleshoot SNS Access Policies that prevent CloudWatch Alarms from sending notifications.
- Validate the end-to-end security alerting pipeline.
Architecture Overview
The following diagram illustrates the logging pipeline you will troubleshoot. The lab starts with a "broken" state where several points of failure prevent the alert from reaching the final destination.
Step-by-Step Instructions
Step 1: Create the Broken Logging Infrastructure
Before we can troubleshoot, we must deploy the resources. This step creates the log group and the SNS topic; the misconfigurations themselves are introduced and fixed in the steps that follow.
# 1. Create a Log Group
aws logs create-log-group --log-group-name "/aws/security/lab-errors"
# 2. Create an SNS Topic for Alerts
aws sns create-topic --name "SecurityLabAlerts"
Console alternative
- Open the CloudWatch Console.
- In the left navigation, choose Logs > Log Groups.
- Click Create log group and name it /aws/security/lab-errors.
- Open the SNS Console, choose Topics > Create topic, select Standard, and name it SecurityLabAlerts.
Step 2: Identify and Fix IAM Logging Permissions
In this scenario, a service (simulated by the CLI) is unable to create log streams.
The Problem: You try to push a log entry, but it fails.
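aws logs put-log-events --log-group-name "/aws/security/lab-errors" --log-stream-name "TestStream" --log-events timestamp=$(date +%s%3N),message="UNAUTHORIZED_ACCESS_ATTEMPT"
Note: If the stream doesn't exist, this command fails with a ResourceNotFoundException; create it first with aws logs create-log-stream --log-group-name "/aws/security/lab-errors" --log-stream-name "TestStream". Even once the stream exists, the IAM user/role must have logs:PutLogEvents. The %3N millisecond format requires GNU date.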
The Fix: Attach a policy to your lab user/role that allows logs:CreateLogStream and logs:PutLogEvents for the specific log group.
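The policy file created next in this step uses wildcards (*) for the region and the account. As a minimal sketch of a tighter variant, the helper below prints the same statement scoped to one region and one account. The function name make_logging_policy and the example region and account ID are assumptions for illustration and do not come from the lab.

```shell
# Hypothetical helper (not part of the lab): prints a logging policy scoped
# to a single region, account, and log group instead of wildcards.
make_logging_policy() {
  local region="$1" account_id="$2" log_group="$3"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:${region}:${account_id}:log-group:${log_group}:*"
    }
  ]
}
EOF
}

# Example values are placeholders; substitute your own region and account ID.
make_logging_policy "us-east-1" "123456789012" "/aws/security/lab-errors"
```

Redirecting the output to logging-policy.json would let the put-user-policy command in this step consume it unchanged.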
# Create the policy file
echo '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:*:*:log-group:/aws/security/lab-errors:*"
    }
  ]
}' > logging-policy.json
# Apply the policy (Replace <YOUR_USER_NAME>)
aws iam put-user-policy --user-name <YOUR_USER_NAME> --policy-name "LoggingFix" --policy-document file://logging-policy.json
Step 3: Troubleshoot the Metric Filter
A Metric Filter scans logs for patterns. We want to catch "UNAUTHORIZED". A common mistake is a pattern whose letter case does not match the log text, or a pattern with incorrect syntax.
Faulty Filter Creation:
aws logs put-metric-filter \
  --log-group-name "/aws/security/lab-errors" \
  --filter-name "UnauthorizedAccess" \
  --filter-pattern "unauthorized" \
  --metric-transformations metricName=UnauthorizedCount,metricNamespace=SecurityLab,metricValue=1
[!IMPORTANT] CloudWatch log patterns are case-sensitive. The log we pushed was UNAUTHORIZED_ACCESS_ATTEMPT. The filter "unauthorized" will not match.
The Fix: Update the filter so the pattern uses the same case as the log text: "UNAUTHORIZED".
aws logs put-metric-filter \
  --log-group-name "/aws/security/lab-errors" \
  --filter-name "UnauthorizedAccess" \
  --filter-pattern "UNAUTHORIZED" \
  --metric-transformations metricName=UnauthorizedCount,metricNamespace=SecurityLab,metricValue=1
Step 4: Resolve SNS Access Policy Restrictions
Even if the Alarm triggers, it cannot notify SNS if the SNS Topic Policy doesn't allow cloudwatch.amazonaws.com to publish.
The Fix: Update the SNS Topic Policy to allow the CloudWatch service principal.
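The topic policy applied in this step lets the CloudWatch service principal publish without restricting which account's alarms may do so. As a minimal sketch of a narrower variant, the helper below adds an aws:SourceAccount condition, which is an addition not present in the lab's policy. The function name make_sns_policy and the example ARN are assumptions for illustration; the account ID is read from the fifth colon-separated field of the topic ARN.

```shell
# Hypothetical helper (not part of the lab): prints an SNS topic policy that
# allows CloudWatch to publish only on behalf of the topic's own account.
make_sns_policy() {
  local topic_arn="$1"
  local account_id
  # ARN layout: arn:partition:sns:region:account-id:topic-name
  account_id="$(printf '%s' "$topic_arn" | cut -d: -f5)"
  cat <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "cloudwatch.amazonaws.com"},
      "Action": "SNS:Publish",
      "Resource": "${topic_arn}",
      "Condition": {
        "StringEquals": {"aws:SourceAccount": "${account_id}"}
      }
    }
  ]
}
EOF
}

# Example ARN is a placeholder; use the ARN returned by create-topic.
make_sns_policy "arn:aws:sns:us-east-1:123456789012:SecurityLabAlerts"
```

The printed document could be passed as the --attribute-value of the set-topic-attributes command in this step, which also avoids substituting the placeholder by hand in two places.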
# Replace both occurrences of <SNS_TOPIC_ARN> with the ARN returned by create-topic
aws sns set-topic-attributes --topic-arn <SNS_TOPIC_ARN> --attribute-name Policy --attribute-value '{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "cloudwatch.amazonaws.com"},
      "Action": "SNS:Publish",
      "Resource": "<SNS_TOPIC_ARN>"
    }
  ]
}'
Visual Breakdown: Log Processing Logic
The following TikZ diagram shows the internal logic gate used by CloudWatch when evaluating incoming log events against your filters.
\begin{tikzpicture}[node distance=2cm, auto]
  \draw[thick, fill=blue!10] (0,0) rectangle (3,1) node[pos=.5] {Log Event};
  \draw[->, thick] (3,0.5) -- (4.5,0.5);
  \draw[thick, fill=green!10] (4.5,-0.5) rectangle (7.5,1.5) node[pos=.5, text width=2.5cm, align=center] {Pattern Matcher\\ (Case Sensitive)};
  \draw[->, thick] (7.5,0.5) -- (9,1.5) node[above] {Match!};
  \draw[->, thick] (7.5,0.5) -- (9,-0.5) node[below] {Discard};
  \draw[thick] (9,1.5) -- (11,1.5) node[right] {Metric +1};
  \draw[thick] (9,-0.5) -- (11,-0.5) node[right] {No Action};
\end{tikzpicture}
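The match-or-discard flow in the diagram can be approximated locally before touching the real filter. This is a rough sketch only: grep -F performs a case-sensitive fixed-string search, which mimics the case-sensitive behaviour described in this lab but is not the CloudWatch matching engine. The function name count_matches and the sample log lines are assumptions for illustration.

```shell
# Local approximation: counts how many log lines on stdin contain the pattern.
# Each counted line corresponds to one "Metric +1" in the diagram.
count_matches() {
  local pattern="$1"
  # grep -c exits non-zero when the count is 0, so neutralise the exit status.
  grep -cF -- "$pattern" || true
}

# Illustrative sample events; only the first and third contain the keyword.
sample_logs='UNAUTHORIZED_ACCESS_ATTEMPT
login ok
UNAUTHORIZED_ACCESS_ATTEMPT'

printf '%s\n' "$sample_logs" | count_matches "unauthorized"   # prints 0
printf '%s\n' "$sample_logs" | count_matches "UNAUTHORIZED"   # prints 2
```

For a check against the service itself, the AWS CLI provides aws logs test-metric-filter, which takes a --filter-pattern and a list of --log-event-messages and reports which messages match.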
Checkpoints
- Verify Log Ingestion: Run the following and ensure you see results without error.
  aws logs describe-log-streams --log-group-name "/aws/security/lab-errors"
- Verify Metric Generation: Push a log event and check if the metric count increases in the CloudWatch Metrics console.
- Verify Alarm State:
  aws cloudwatch describe-alarms --alarm-names "SecurityLabAlarm"
  The StateValue should eventually move from OK or INSUFFICIENT_DATA to ALARM if you send multiple failed attempts.
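The alarm checkpoint can be turned into a small decision helper. The sketch below assumes an alarm named SecurityLabAlarm already exists on the UnauthorizedCount metric, which the steps in this lab do not create explicitly. The function name next_check_for_state and its messages are hypothetical; the state names are the ones listed in the checkpoint.

```shell
# Hypothetical helper: maps an alarm StateValue to the next thing to verify.
next_check_for_state() {
  case "$1" in
    ALARM)
      echo "Alarm fired: confirm the SNS email arrived; if not, review the topic policy and subscription." ;;
    OK)
      echo "Metric is below threshold: push more matching log events and re-check." ;;
    INSUFFICIENT_DATA)
      echo "No datapoints yet: verify log ingestion and the metric filter pattern." ;;
    *)
      echo "Unknown or missing state '$1': confirm the alarm name exists." ;;
  esac
}

# With live credentials the state could be read like this (not run here):
#   state=$(aws cloudwatch describe-alarms --alarm-names "SecurityLabAlarm" \
#             --query 'MetricAlarms[0].StateValue' --output text)
#   next_check_for_state "$state"
next_check_for_state "INSUFFICIENT_DATA"
```

Each branch points back to one row of the troubleshooting table, so the helper mainly saves a lookup while iterating on the pipeline.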
Troubleshooting
| Problem | Potential Cause | Fix |
|---|---|---|
| An error occurred (AccessDeniedException) | Missing IAM permissions on the User/Role. | Attach a policy with logs:PutLogEvents and logs:CreateLogStream. |
| Metric stays at 0 | Filter pattern does not match the log text. | Check for case sensitivity and quotes in the pattern. |
| Alarm is ALARM but no SNS email | SNS Topic Policy or Subscription issue. | Ensure the SNS policy allows cloudwatch.amazonaws.com and your email is confirmed. |
Clean-Up / Teardown
To avoid costs, delete the resources created in this lab.
# Delete Log Group
aws logs delete-log-group --log-group-name "/aws/security/lab-errors"
# Delete SNS Topic
aws sns delete-topic --topic-arn <YOUR_SNS_TOPIC_ARN>
# Delete the inline IAM policy (inline policies are deleted directly; there is nothing to detach)
aws iam delete-user-policy --user-name <YOUR_USER_NAME> --policy-name "LoggingFix"