Lab: Troubleshooting & Analyzing AWS Network Traffic Patterns
Monitor and analyze network traffic to troubleshoot and optimize connectivity patterns
In this lab, you will learn how to monitor and analyze network traffic using AWS-native tools. You will enable VPC Flow Logs, perform automated connectivity testing using Reachability Analyzer, and query traffic patterns using CloudWatch Logs Insights.
[!WARNING] Remember to run the teardown commands at the end of the lab to avoid ongoing charges for CloudWatch and VPC Flow Logs.
Prerequisites
- An AWS Account with permissions to manage VPC, EC2, and CloudWatch Logs.
- AWS CLI installed and configured with credentials for <YOUR_REGION>.
- A pre-existing VPC (you can use your Default VPC for this lab).
- Basic familiarity with CIDR notation and Security Groups.
Learning Objectives
- Configure and enable VPC Flow Logs to a CloudWatch destination.
- Use Reachability Analyzer to diagnose connectivity gaps without sending actual packets.
- Execute CloudWatch Logs Insights queries to identify top talkers and rejected traffic.
- Understand how to map network topology to physical flow constraints.
Architecture Overview
This lab uses a simple inspection architecture where traffic flow from a source to a destination is captured and analyzed.
Step-by-Step Instructions
Step 1: Create a CloudWatch Log Group
Before enabling Flow Logs, you need a destination to store the data.
aws logs create-log-group --log-group-name "brainybee-lab-flowlogs"
Console alternative:
- Navigate to CloudWatch > Logs > Log groups.
- Click Create log group.
- Name it brainybee-lab-flowlogs and click Create.
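Because the lab warns about ongoing CloudWatch charges, it can help to cap retention on the new log group so that any data left behind expires on its own. The following is a minimal sketch, not part of the original lab steps: the helper name set_lab_retention and the 1-day retention value are assumptions chosen for a short-lived lab.

```shell
# Sketch: cap retention on the lab's log group.
# The function name and the 1-day default are illustrative assumptions.
set_lab_retention() {
  log_group="${1:-brainybee-lab-flowlogs}"
  days="${2:-1}"
  aws logs put-retention-policy \
    --log-group-name "$log_group" \
    --retention-in-days "$days"
}

# Usage (uncomment to run against your account):
# set_lab_retention "brainybee-lab-flowlogs" 1
```

Retention only limits how long stored events are kept; the teardown commands at the end of the lab are still needed to stop the flow log itself.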
Step 2: Create an IAM Role for Flow Logs
Flow Logs require permission to publish to CloudWatch Logs.
# Note: This is a simplified command sequence for the lab
aws iam create-role --role-name FlowLogRole --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"vpc-flow-logs.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam put-role-policy --role-name FlowLogRole --policy-name FlowLogPolicy --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":["logs:CreateLogGroup","logs:CreateLogStream","logs:PutLogEvents","logs:DescribeLogGroups","logs:DescribeLogStreams"],"Resource":"*"}]}'
Step 3: Enable VPC Flow Logs
We will now capture all traffic (Accept and Reject) for your VPC.
# Get your VPC ID first
VPC_ID=$(aws ec2 describe-vpcs --filters "Name=is-default,Values=true" --query "Vpcs[0].VpcId" --output text)
# Enable the Flow Log
aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids $VPC_ID \
--traffic-type ALL \
--log-group-name "brainybee-lab-flowlogs" \
--deliver-logs-permission-arn arn:aws:iam::<YOUR_ACCOUNT_ID>:role/FlowLogRole
[!TIP] In production, you might only log REJECT traffic to save on storage costs while troubleshooting.
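The role ARN passed to create-flow-logs can be looked up rather than typed by hand, and the new flow log's status can be checked before waiting for data. The sketch below assumes the FlowLogRole name from Step 2; the two helper function names are hypothetical and exist only for this example.

```shell
# Sketch: print the ARN of the role created in Step 2.
# The function names here are illustrative, not part of the AWS CLI.
flow_log_role_arn() {
  role_name="${1:-FlowLogRole}"
  aws iam get-role --role-name "$role_name" --query "Role.Arn" --output text
}

# Sketch: list flow logs attached to a VPC with their status fields.
# Note that describe-flow-logs takes --filter (singular).
flow_log_status() {
  vpc_id="$1"
  aws ec2 describe-flow-logs \
    --filter "Name=resource-id,Values=$vpc_id" \
    --query "FlowLogs[].[FlowLogId,FlowLogStatus,DeliverLogsStatus]" \
    --output text
}

# Usage (uncomment to run against your account):
# ROLE_ARN=$(flow_log_role_arn)
# flow_log_status "$VPC_ID"
```

If DeliverLogsStatus reports a failure, the usual suspects are the ones listed in the Troubleshooting table: a missing log group or a role the service cannot assume.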
Step 4: Perform a Reachability Analysis
If you have a connectivity issue (e.g., an EC2 instance cannot be reached on port 80), use the Reachability Analyzer to find the root cause.
aws ec2 create-network-insights-path \
--source <INSTANCE_ID_A> \
--destination <INSTANCE_ID_B> \
--protocol tcp \
--destination-port 80
Console alternative:
- Navigate to VPC Console > Network Analysis > Reachability Analyzer.
- Click Create and analyze path.
- Select your source and destination instances.
- Click Create and analyze.
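On the CLI, create-network-insights-path only defines the path; the analysis is started separately. The sketch below shows one way to start an analysis and wait for it to finish. It assumes you pass in the path ID (nip-...) returned by the create command; the function name and the POLL_SECONDS variable are illustrative, not AWS names.

```shell
# Sketch: run an analysis for an existing path and report the result.
# $1 is the NetworkInsightsPathId returned by create-network-insights-path.
run_reachability_analysis() {
  path_id="$1"
  analysis_id=$(aws ec2 start-network-insights-analysis \
    --network-insights-path-id "$path_id" \
    --query "NetworkInsightsAnalysis.NetworkInsightsAnalysisId" \
    --output text) || return 1

  # Poll until the analysis leaves the "running" state.
  while :; do
    status=$(aws ec2 describe-network-insights-analyses \
      --network-insights-analysis-ids "$analysis_id" \
      --query "NetworkInsightsAnalyses[0].Status" \
      --output text) || return 1
    [ "$status" != "running" ] && break
    sleep "${POLL_SECONDS:-10}"
  done

  # Print the final status and whether a network path was found.
  aws ec2 describe-network-insights-analyses \
    --network-insights-analysis-ids "$analysis_id" \
    --query "NetworkInsightsAnalyses[0].[Status,NetworkPathFound]" \
    --output text
}

# Usage (uncomment and substitute your own path ID):
# run_reachability_analysis <PATH_ID>
```

When no path is found, the same describe call also returns an Explanations list that names the component blocking the traffic.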
Checkpoints
- Log Ingestion: Go to CloudWatch Logs. Do you see log streams appearing in brainybee-lab-flowlogs? (Wait 5-10 minutes for the first batch).
- Analysis Result: In Reachability Analyzer, did the status change to Reachable or Unreachable? If unreachable, look at the visual diagram provided by AWS to see which Security Group or NACL is blocking traffic.
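The log ingestion checkpoint can also be verified from the CLI. This is a small sketch assuming the log group name from Step 1; the function name is hypothetical.

```shell
# Sketch: list the most recently active log streams in the lab's log group.
recent_flow_log_streams() {
  log_group="${1:-brainybee-lab-flowlogs}"
  aws logs describe-log-streams \
    --log-group-name "$log_group" \
    --order-by LastEventTime \
    --descending \
    --max-items 5 \
    --query "logStreams[].logStreamName" \
    --output text
}

# Usage (uncomment to run against your account):
# recent_flow_log_streams
```

An empty result in the first few minutes is expected; it only indicates a problem if no streams appear after the wait noted above.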
Analysis & Insights
To identify "Top Talkers" in your VPC, use this query in CloudWatch Logs Insights:
filter action="ACCEPT"
| stats sum(bytes) as totalBytes by srcAddr, dstAddr
| sort totalBytes desc
| limit 10
Concept Review
| Tool | Primary Use Case | Layer / Scope |
|---|---|---|
| VPC Flow Logs | Historical analysis, security auditing, "Top Talker" identification | Layer 3 / 4 |
| Reachability Analyzer | Troubleshooting misconfigurations (SGs, NACLs, Route Tables) | Control Plane |
| Traffic Mirroring | Deep packet inspection (DPI), IDS/IPS integration | Layer 2 - 7 |
| CloudWatch Metrics | Monitoring bandwidth utilization and packet drops | Aggregated |
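As a companion to the VPC Flow Logs row above, rejected traffic can be summarized with a Logs Insights query started from the CLI. The sketch below assumes the log group from Step 1 and a one-hour lookback window; the function name and the exact query are illustrative rather than prescribed by the lab.

```shell
# Sketch: start a Logs Insights query for the top rejected flows
# over the last hour and print the query ID.
start_rejected_traffic_query() {
  log_group="${1:-brainybee-lab-flowlogs}"
  end_time=$(date +%s)
  start_time=$((end_time - 3600))   # assumed one-hour window
  aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --query-string 'filter action="REJECT" | stats count(*) as rejects by srcAddr, dstAddr, dstPort | sort rejects desc | limit 10' \
    --query "queryId" \
    --output text
}

# Usage (uncomment to run against your account):
# QUERY_ID=$(start_rejected_traffic_query)
# aws logs get-query-results --query-id "$QUERY_ID"
```

Queries run asynchronously, so get-query-results may need to be called again until its status field shows the query is complete.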
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| Log Group not found | Flow Log created before Log Group | Create the Log Group first in the same region. |
| Access Denied to Logs | IAM Role missing permissions | Verify the Trust Policy allows vpc-flow-logs.amazonaws.com. |
| Reachability "Pending" | AWS back-end analysis | Wait 2-3 minutes; complex paths take longer. |
Challenge
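The "Stealthy Dropper" Challenge: Create a Network ACL that blocks all outbound traffic on port 443. Then run a Reachability Analysis on port 443 from your instance to an internet-facing destination, such as your VPC's internet gateway or a specific public IP address (Reachability Analyzer expects a resource or a single IP address as the destination, not a CIDR block like 0.0.0.0/0). Can you identify the exact NACL rule number that causes the failure using only the Reachability Analyzer output?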
Cost Estimate
- VPC Flow Logs: $0.50 per GB of data collected (Varies by region).
- CloudWatch Logs Storage: $0.03 per GB per month.
- Reachability Analyzer: $0.10 per analysis.
- Total for this lab: Likely < $0.20 if deleted immediately.
Clean-Up / Teardown
# 1. Delete Flow Logs (Find the ID first)
FLOW_LOG_ID=$(aws ec2 describe-flow-logs --query "FlowLogs[?LogGroupName=='brainybee-lab-flowlogs'].FlowLogId" --output text)
aws ec2 delete-flow-logs --flow-log-ids $FLOW_LOG_ID
# 2. Delete Log Group
aws logs delete-log-group --log-group-name "brainybee-lab-flowlogs"
# 3. Delete IAM Role & Policy
aws iam delete-role-policy --role-name FlowLogRole --policy-name FlowLogPolicy
aws iam delete-role --role-name FlowLogRole
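The teardown above does not cover the Reachability Analyzer path from Step 4. A minimal sketch for removing it follows, assuming you still have the path ID (nip-...) from that step; the function name is hypothetical. Analyses are deleted first on the assumption that a path with existing analyses cannot be removed.

```shell
# Sketch: delete all analyses for a path, then the path itself.
# $1 is the NetworkInsightsPathId from Step 4.
delete_reachability_resources() {
  path_id="$1"
  analysis_ids=$(aws ec2 describe-network-insights-analyses \
    --network-insights-path-id "$path_id" \
    --query "NetworkInsightsAnalyses[].NetworkInsightsAnalysisId" \
    --output text) || return 1

  # Word splitting is intentional: text output separates IDs with whitespace.
  for analysis_id in $analysis_ids; do
    aws ec2 delete-network-insights-analysis \
      --network-insights-analysis-id "$analysis_id" || return 1
  done

  aws ec2 delete-network-insights-path --network-insights-path-id "$path_id"
}

# Usage (uncomment and substitute your own path ID):
# delete_reachability_resources <PATH_ID>
```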