AWS Network Monitoring and Logging: Comprehensive Study Guide
Configuring network monitoring and logging by using AWS solutions
This guide covers the essential tools and strategies for maintaining visibility, security, and performance across AWS networking environments, focusing on logging, real-time monitoring, and troubleshooting tools.
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between VPC Flow Logs and VPC Traffic Mirroring use cases.
- Configure CloudWatch Alarms to trigger automated responses to network anomalies.
- Utilize Reachability Analyzer and Transit Gateway Network Manager for connectivity troubleshooting.
- Implement a centralized logging architecture for multi-account environments.
- Analyze VPC Flow Log fields to identify security threats and performance bottlenecks.
Key Terms & Glossary
- VPC Flow Logs: A feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC.
- Traffic Mirroring: A feature that allows you to copy network traffic from an elastic network interface (ENI) and send it to out-of-band security and monitoring appliances for deep packet inspection (DPI).
- CloudWatch Logs Insights: An interactive query tool used to analyze log data stored in CloudWatch Logs using a purpose-built query language.
- Reachability Analyzer: A configuration analysis tool that enables you to perform connectivity testing between a source resource and a destination resource in your VPCs.
- AWS CloudTrail: A service that records API calls for your account, providing a history of "who did what" in your network infrastructure.
The "Big Idea"
Visibility underpins the Well-Architected Framework, particularly its security and operational excellence pillars. Without granular monitoring, network administrators are "blind" to silent failures (like routing loops) or stealthy threats (like data exfiltration). Effective AWS network monitoring moves from reactive troubleshooting to proactive optimization by correlating metadata (Flow Logs), full packet data (Mirroring), and configuration state (Reachability Analyzer).
Formula / Concept Box
| Feature | Key Metadata / Output | Typical Destination |
|---|---|---|
| VPC Flow Logs | 5-tuple (Src/Dest IP, Src/Dest Port, Protocol), Action, Log Status | CloudWatch Logs, S3, Kinesis Data Firehose |
| Traffic Mirroring | Full packets, headers and payload (VXLAN encapsulated) | Monitoring Appliance (EC2 ENI), NLB |
| CloudTrail | API Caller Identity, Source IP, Request Parameters | S3, CloudWatch Logs |
| Route 53 Logs | Query Name, Type, Resolver IP, Response Code | CloudWatch Logs |
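To make the Flow Logs row concrete, here is a minimal sketch of parsing one record in the default (version 2) format, which is a space-separated line of 14 fields. The helper name `parse_flow_log` and the sample values are illustrative assumptions, not part of any AWS SDK; custom log formats would need a different field list.

```python
# Field order of the default (version 2) VPC Flow Log format.
DEFAULT_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)
INT_FIELDS = {"version", "srcport", "dstport", "protocol",
              "packets", "bytes", "start", "end"}


def parse_flow_log(line: str) -> dict:
    """Split one default-format record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}")
    record = {}
    for name, value in zip(DEFAULT_FIELDS, values):
        # NODATA and SKIPDATA records carry "-" in fields that have no value.
        if name in INT_FIELDS and value != "-":
            record[name] = int(value)
        else:
            record[name] = value
    return record


# Illustrative record: accepted SSH traffic (TCP is protocol 6, destination port 22).
sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
record = parse_flow_log(sample)
print(record["srcaddr"], record["dstport"], record["action"])
```

Keeping the field order in one tuple means the same function can be adapted to a custom format by swapping the tuple.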
Hierarchical Outline
- I. Passive Monitoring (Metadata-based)
  - VPC Flow Logs
    - Base Fields: Basic traffic info (IPs, Ports, Protocol, Packet/Byte counts, Action).
    - Extended Fields: TCP flags, Flow direction, original packet addresses (pkt-srcaddr, pkt-dstaddr).
  - Transit Gateway Network Manager: Global view of private networks.
- II. Active/Deep Monitoring (Packet-level)
  - VPC Traffic Mirroring
    - Source: The ENI to monitor.
    - Target: The ENI or NLB receiving the copy.
    - Filter: Rules defining which traffic to mirror (Inbound/Outbound).
- III. Analysis & Troubleshooting Tools
  - Reachability Analyzer: Logic-based path validation (does not send packets).
  - CloudWatch Alarms: Threshold-based alerting via SNS.
  - CloudWatch Contributor Insights: Identifying "top talkers" or outliers.
- IV. Governance & Compliance
  - AWS Config: Tracking configuration changes over time.
  - AWS Trusted Advisor: Identifying security gaps (e.g., wide-open Security Groups).
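The "top talkers" idea from the outline can be illustrated offline. The sketch below ranks source addresses by bytes sent across a handful of default-format flow log records. It only mimics the kind of ranking Contributor Insights produces and is not the service's API; the function name and the sample records are hypothetical.

```python
from collections import Counter


def top_talkers(lines, n=3):
    """Return the n source addresses that sent the most bytes."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format: srcaddr is the 4th field, bytes is the 10th.
        if len(fields) != 14 or fields[9] == "-":
            continue  # skip malformed, NODATA, or SKIPDATA records
        totals[fields[3]] += int(fields[9])
    return totals.most_common(n)


# Hypothetical records in the default flow log format.
records = [
    "2 111122223333 eni-0a1 10.0.1.10 10.0.2.20 49152 443 6 10 5000 1700000000 1700000060 ACCEPT OK",
    "2 111122223333 eni-0a1 10.0.1.11 10.0.2.20 49153 443 6 4 1200 1700000000 1700000060 ACCEPT OK",
    "2 111122223333 eni-0a1 10.0.1.10 10.0.2.21 49154 443 6 8 3000 1700000060 1700000120 ACCEPT OK",
    "2 111122223333 eni-0a1 - - - - - - - 1700000120 1700000180 - NODATA",
]
print(top_talkers(records, n=2))
```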
Visual Anchors
Figure: Choosing a Monitoring Tool
Log Delivery Architecture
\begin{tikzpicture}[node distance=2cm, every node/.style={fill=white, font=\small}, align=center]
  % Define styles
  \draw[thick] (0,0) rectangle (2,1.5) node[pos=.5] {EC2 Instance\\(ENI)};
  \draw[->, thick] (2,0.75) -- (4,0.75) node[midway, above] {Log Flow};
  \draw[thick] (4,0) rectangle (6,1.5) node[pos=.5] {CloudWatch\\Logs};
  \draw[->, thick] (6,1) -- (8,2) node[above] {Metric Filter};
  \draw[->, thick] (6,0.5) -- (8,-0.5) node[below] {Subscription Filter};
  \draw[thick] (8,1.5) rectangle (10,2.5) node[pos=.5] {CloudWatch\\Alarm};
  \draw[thick] (8,-1) rectangle (10,0) node[pos=.5] {Amazon S3\\(Long-term)};
  \draw[->, thick] (10,2) -- (11,2) node[right] {SNS / Lambda};
\end{tikzpicture}
Definition-Example Pairs
- Metric Filter → A mechanism to search for and transform log data into numerical metrics.
  - Example: Creating a filter that counts every occurrence of "REJECT" in VPC Flow Logs to monitor for blocked connection attempts.
- VPC Traffic Mirroring Filter → A set of rules that determine which traffic is mirrored from the source to the target.
  - Example: Mirroring only traffic on Port 443 (HTTPS) to a security appliance while ignoring other ports to save costs. Because the payload is TLS-encrypted, the appliance sees handshake metadata and traffic patterns unless it is able to decrypt the session.
- Reachability Analyzer Path → A logical analysis of the path between a source and destination.
  - Example: Confirming that a route exists in the Route Table and an 'Allow' rule exists in the NACL between an EC2 instance and an Internet Gateway.
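The mirroring filter example can be sketched as a small rule evaluator. This is a simplified model under stated assumptions: rules are checked in ascending rule number, the first match decides, and traffic matching no rule is not mirrored. The `MirrorRule` class and `is_mirrored` function are hypothetical names for illustration, not AWS API objects.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class MirrorRule:
    number: int
    action: str                        # "accept" mirrors the packet, "reject" skips it
    protocol: Optional[int] = None     # IANA protocol number; None matches any
    dst_ports: Optional[range] = None  # None matches any destination port
    src_cidr: str = "0.0.0.0/0"
    dst_cidr: str = "0.0.0.0/0"


def is_mirrored(rules, protocol, src, dst, dst_port):
    """Return True if the first matching rule (lowest number) accepts the packet."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        if rule.dst_ports is not None and dst_port not in rule.dst_ports:
            continue
        if ip_address(src) not in ip_network(rule.src_cidr):
            continue
        if ip_address(dst) not in ip_network(rule.dst_cidr):
            continue
        return rule.action == "accept"
    return False  # assumption: nothing is mirrored unless a rule accepts it


# Mirror only TCP (protocol 6) traffic destined for port 443.
https_only = [MirrorRule(number=100, action="accept", protocol=6, dst_ports=range(443, 444))]
print(is_mirrored(https_only, 6, "10.0.1.5", "10.0.2.9", 443))
print(is_mirrored(https_only, 6, "10.0.1.5", "10.0.2.9", 80))
```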
Worked Examples
Scenario: Detecting SSH Brute Force Attacks
Goal: Alert when there are more than 50 rejected SSH attempts from a single IP in 5 minutes.
- Enable VPC Flow Logs: Ensure logs are flowing to CloudWatch Logs for the target subnet.
- Create Metric Filter:
  - Filter Pattern:
    `[version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport=22, protocol, packets, bytes, start, end, action="REJECT", log_status]`
- Define Metric: Name the metric `SSH_Rejections`. As written, the filter counts rejections from all sources combined; to meet the per-IP goal, publish `srcaddr` as a metric dimension or rank sources with a Contributor Insights rule.
- Configure Alarm: Set the threshold to `> 50` within a 5-minute period.
- Action: Set the alarm to notify an SNS Topic that emails the Security Operations Center (SOC).
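The detection logic in this scenario can be prototyped offline before any alarm is configured. The sketch below counts rejected SSH flow records per source address inside one 5-minute window, which is the per-IP behaviour the goal describes. It assumes default-format records, additionally restricts matches to TCP (protocol 6), and counts records rather than packets, as a metric filter does. The function and constant names are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute evaluation period
THRESHOLD = 50        # flag sources with more rejections than this


def ssh_brute_force_sources(lines, window_start):
    """Return source IPs with more than THRESHOLD rejected SSH records in the window."""
    rejected = defaultdict(int)
    window_end = window_start + WINDOW_SECONDS
    for line in lines:
        f = line.split()
        if len(f) != 14:
            continue  # not a default-format record
        srcaddr, dstport, protocol, start, action = f[3], f[6], f[7], f[10], f[12]
        # Same match as the filter pattern: dstport=22 and action="REJECT".
        if dstport != "22" or protocol != "6" or action != "REJECT":
            continue
        if window_start <= int(start) < window_end:
            rejected[srcaddr] += 1
    return sorted(ip for ip, count in rejected.items() if count > THRESHOLD)


def make_record(src, start, action="REJECT"):
    """Build a hypothetical default-format record for testing."""
    return (f"2 111122223333 eni-0a1 {src} 10.0.2.20 50000 22 6 1 60 "
            f"{start} {start + 10} {action} OK")


t0 = 1700000000
logs = [make_record("203.0.113.7", t0 + i) for i in range(51)]
logs += [make_record("198.51.100.2", t0 + i) for i in range(10)]
print(ssh_brute_force_sources(logs, t0))
```

In a live deployment the returned list would be replaced by a CloudWatch alarm action; the sketch is only meant for checking the matching and threshold logic against sample data.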
Checkpoint Questions
- Does Reachability Analyzer send actual data packets between resources to test connectivity?
- Which Flow Log field would you check to see if a packet was blocked by a Security Group vs. a NACL? (Trick question: Look at the `action` field combined with `src`/`dst` info and the knowledge that NACLs are stateless.)
- What is the main advantage of using Kinesis Data Firehose as a destination for VPC Flow Logs?
- What protocol is used to encapsulate mirrored traffic in VPC Traffic Mirroring?
Answers
- No, it uses automated reasoning to analyze the configuration path logically.
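- No single field identifies the blocking component; a blocked flow simply shows 'REJECT'. Because Security Groups are stateful and NACLs are stateless, an inbound 'ACCEPT' record paired with an outbound 'REJECT' record for the response points to a NACL. A lone inbound 'REJECT' could come from either, so detailed analysis often requires Reachability Analyzer.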
- It allows for real-time streaming to third-party providers (Splunk, Datadog) or transformation via Lambda before storage.
- VXLAN (UDP port 4789).
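For the last answer, a monitoring appliance has to strip the VXLAN header before it can inspect the mirrored packet. The sketch below decodes the 8-byte header defined in RFC 7348 from the payload of a UDP/4789 datagram. The function name is hypothetical, and the sample bytes are constructed for illustration rather than captured from a real mirror session.

```python
import struct

VXLAN_PORT = 4789      # UDP destination port used for VXLAN
VXLAN_HEADER_LEN = 8   # flags (1 byte), reserved (3), VNI (3), reserved (1)


def parse_vxlan(udp_payload: bytes):
    """Return (vni, inner_frame) from the payload of a VXLAN datagram."""
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", udp_payload[:VXLAN_HEADER_LEN])
    # The I flag (0x08 in the first byte) must be set for the VNI to be valid.
    if not (flags_word >> 24) & 0x08:
        raise ValueError("VNI flag (I) not set")
    vni = vni_word >> 8  # the upper 24 bits of the second word hold the VNI
    return vni, udp_payload[VXLAN_HEADER_LEN:]


# Constructed example: VNI 1234 followed by a placeholder inner frame.
header = struct.pack("!II", 0x08 << 24, 1234 << 8)
vni, inner = parse_vxlan(header + b"inner-ethernet-frame")
print(vni, inner)
```

In Traffic Mirroring the VNI identifies the mirror session, so an appliance receiving several sessions can use it to tell the sources apart.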
Muddy Points & Cross-Refs
- Flow Logs vs. CloudTrail: Beginners often confuse these. Remember: CloudTrail is about the Control Plane (Who called the API to delete the VPC?), while Flow Logs are about the Data Plane (What IP is talking to my instance?).
- Traffic Mirroring Costs: Traffic Mirroring is billed hourly for each source ENI with mirroring enabled, and mirrored traffic counts toward the source instance's bandwidth. For high-throughput environments, this can be expensive. Always use Filters to limit mirrored traffic to only what is necessary.
- Reachability Analyzer Limitations: It cannot "see" inside your EC2 instance OS. If your Windows Firewall or Linux `iptables` is blocking traffic, Reachability Analyzer will say "Reachable" because the AWS infrastructure is configured correctly.
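To show what "analyzing configuration without sending packets" means, here is a toy model of static path analysis. It covers only two checks, longest-prefix route selection and outbound NACL evaluation (ascending rule number, first match wins, implicit deny). The real Reachability Analyzer evaluates far more (security groups, gateways, peering, and so on), and every name and resource ID below is hypothetical.

```python
from ipaddress import ip_address, ip_network


def pick_route(route_table, dst):
    """Longest-prefix match: return the target of the most specific matching route."""
    matches = [(ip_network(cidr), target) for cidr, target in route_table
               if ip_address(dst) in ip_network(cidr)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def nacl_allows(rules, dst, port):
    """Evaluate outbound NACL rules in ascending rule number; first match wins."""
    for number, cidr, port_range, action in sorted(rules, key=lambda r: r[0]):
        if ip_address(dst) in ip_network(cidr) and port in port_range:
            return action == "allow"
    return False  # implicit deny when no rule matches


def analyze(route_table, outbound_nacl, dst, port):
    """Decide reachability from configuration alone; no packet is sent."""
    target = pick_route(route_table, dst)
    if target is None:
        return "Not reachable: no route"
    if not nacl_allows(outbound_nacl, dst, port):
        return "Not reachable: NACL deny"
    return f"Reachable via {target}"


# Hypothetical configuration for a public subnet.
routes = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-hypothetical")]
nacl = [(100, "0.0.0.0/0", range(443, 444), "allow"),
        (200, "0.0.0.0/0", range(0, 65536), "deny")]
print(analyze(routes, nacl, "93.184.216.34", 443))
print(analyze(routes, nacl, "93.184.216.34", 22))
```

Nothing in this model inspects the instance itself, which mirrors the limitation noted above: a host firewall can still drop traffic on a path the analysis reports as reachable.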
Comparison Tables
Troubleshooting Tool Comparison
| Feature | Reachability Analyzer | Network Manager (TGW) | VPC Flow Logs |
|---|---|---|---|
| Primary Use | Connectivity Debugging | Global Topology Visibility | Traffic Pattern Analysis |
| Method | Static Configuration Analysis | Route/Metric Aggregation | Metadata Capture |
| Real-time? | On-demand Request | Yes (Dashboard) | Near real-time (1-10 min lag) |
| Cross-Account? | Yes | Yes (via Cloud WAN/TGW) | Yes (via Centralized S3/Logs) |