Capturing Baseline Network Performance

Establishing a performance baseline is a critical task for AWS Network Engineers. It provides the "normal" profile of network behavior, allowing for proactive troubleshooting, capacity planning, and SLA validation.

Learning Objectives

  • Define the role of baselining in network monitoring and capacity planning.
  • Identify key AWS services used to collect performance data, including CloudWatch, VPC Flow Logs, and Transit Gateway Network Manager.
  • Explain the process and requirements for VPC Traffic Mirroring and deep packet inspection.
  • Compare different monitoring tools based on their level of visibility (flow-level vs. packet-level).

Key Terms & Glossary

  • Network Baseline: A set of metrics representing the normal operating state of a network over a specific period.
  • Promiscuous Mode: A configuration of a network interface that allows it to receive all traffic on a network segment, regardless of the destination MAC address.
  • Jitter: The variation in the delay of received packets, often critical for voice and video traffic.
  • Throughput: The actual amount of data successfully transferred over the network in a given time period.
  • Flow Logs: Metadata records that capture information about IP traffic going to and from network interfaces in a VPC.
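
As a minimal sketch of how the Jitter and Throughput definitions above turn into numbers, the following assumes you already have a list of packet delays in milliseconds and a byte count for a known interval. The function names and sample values are illustrative, and jitter is computed here as the simple mean absolute difference between consecutive delays, which is only one of several definitions in use.

```python
from statistics import mean


def mean_jitter_ms(delays_ms):
    """Mean absolute difference between consecutive packet delays (ms)."""
    if len(delays_ms) < 2:
        return 0.0  # jitter is undefined for a single packet; report zero
    return mean(abs(b - a) for a, b in zip(delays_ms, delays_ms[1:]))


def throughput_mbps(bytes_transferred, seconds):
    """Average throughput in megabits per second over the interval."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred * 8 / seconds / 1_000_000


# Hypothetical sample: four packet delays and 75 MB moved in 60 seconds.
delays = [20.0, 22.0, 19.0, 25.0]
print(mean_jitter_ms(delays))           # average of the gaps 2, 3 and 6 ms
print(throughput_mbps(75_000_000, 60))  # 10.0
```

Recording both values at regular intervals over days or weeks is what turns them from one-off readings into a baseline.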

The "Big Idea"

[!IMPORTANT] Baselines are the yardsticks of the cloud. Without knowing what is "normal," you cannot identify what is "broken." A baseline transforms raw metrics into actionable intelligence by highlighting anomalies that signify security breaches, misconfigurations, or the need for increased capacity.

Formula / Concept Box

| Concept | Metric / Requirement | Purpose |
| --- | --- | --- |
| Utilization | $\frac{\text{Current Traffic}}{\text{Max Bandwidth}} \times 100$ | Identify saturation points and bottlenecks. |
| Packet Loss | $\frac{\text{Packets Sent} - \text{Packets Received}}{\text{Packets Sent}}$ | Measure link reliability and congestion. |
| Mirror Target | ENI in Promiscuous Mode | Required for receiving mirrored packet data. |
| CloudWatch Alarms | $\text{Metric} > \text{Threshold}$ | Automate response to baseline deviations. |
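
The utilization, packet loss, and alarm formulas in this box can be sketched directly in code. This is an illustration only: the function names are hypothetical, the inputs are assumed to share the same units, and the alarm check mirrors the simple "metric greater than threshold" comparison rather than any specific CloudWatch API.

```python
def utilization_pct(current_traffic, max_bandwidth):
    """Utilization = current traffic / max bandwidth * 100."""
    if max_bandwidth <= 0:
        raise ValueError("max_bandwidth must be positive")
    return current_traffic / max_bandwidth * 100


def packet_loss_ratio(packets_sent, packets_received):
    """Packet loss = (sent - received) / sent, as a fraction of 1."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return (packets_sent - packets_received) / packets_sent


def alarm_fires(metric, threshold):
    """Alarm condition from the table: metric > threshold."""
    return metric > threshold


# Hypothetical readings: 750 Mbps on a 1000 Mbps link, 50 of 10,000 packets lost.
util = utilization_pct(750, 1000)
loss = packet_loss_ratio(10_000, 9_950)
print(util)                   # 75.0
print(loss)
print(alarm_fires(util, 80))  # False: 75% is below an assumed 80% threshold
```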

Hierarchical Outline

  • I. The Importance of Baselines
    • Usage Tracking: Understanding usage patterns over time (daily, weekly, monthly).
    • Anomaly Detection: Identifying metrics that exceed baseline ranges to trigger resolutions.
    • Predictive Maintenance: Addressing issues before they become critical failures.
  • II. AWS Native Monitoring Tools
    • Amazon CloudWatch: Collects the EC2 metrics NetworkIn, NetworkOut, NetworkPacketsIn, and NetworkPacketsOut.
    • Transit Gateway Network Manager: Provides visibility into packet loss, latency, and global topology.
    • Route 53 Resolver Logs: Monitors DNS query latency and resolution failure rates.
  • III. Deep Packet Inspection (DPI)
    • VPC Traffic Mirroring: Copies L2 traffic from a source ENI to a target device.
    • Analysis Tools: Using Wireshark for inspecting source/destination IPs and protocols.
    • QoS Adjustments: Using findings to prioritize delay-sensitive traffic (e.g., Voice vs. Storage).

Visual Anchors

Traffic Mirroring Architecture

[Diagram: Traffic Mirroring Architecture]

Visualizing Performance Spikes

\begin{tikzpicture}[scale=0.8]
  % Axes
  \draw[->] (0,0) -- (6,0) node[right] {Time};
  \draw[->] (0,0) -- (0,4) node[above] {Traffic Volume};
  % Baseline (dashed)
  \draw[dashed, blue, thick] (0,1) .. controls (1,1.2) and (2,0.8) .. (3,1) .. controls (4,1.2) and (5,0.8) .. (6,1);
  \node[blue] at (5,0.5) {Baseline};
  % Actual traffic (solid)
  \draw[red, thick] (0,0.8) -- (2,0.9) -- (2.5,3.5) -- (3,1.2) -- (4,1.1) -- (6,1);
  \node[red] at (2.5,3.8) {Anomaly};
  % Threshold line
  \draw[thick, gray] (0,2.5) -- (6,2.5);
  \node[gray] at (5.5,2.7) {SLA};
\end{tikzpicture}

Definition-Example Pairs

  • VPC Flow Logs: Metadata capture of IP traffic flows.
    • Example: Checking if a specific Security Group is dropping traffic by looking for REJECT records in the flow logs.
  • Transit Gateway Network Manager: A centralized dashboard for global network health.
    • Example: Visualizing a 50ms latency spike between a VPC in us-east-1 and an on-premises data center via Direct Connect.
  • Packet Shaping: Modifying the flow of data to optimize performance.
    • Example: Applying Quality of Service (QoS) rules to ensure VoIP packets are processed before background database backups.
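
The REJECT lookup in the VPC Flow Logs example can be sketched as a small parser. This assumes records in the default (version 2) space-separated flow log format; the sample lines, account ID, and ENI ID are made up for illustration, and custom log formats would need a different field list.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_record(line):
    """Split one default-format flow log line into a field dictionary."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def rejected_flows(lines):
    """Return only the records whose action is REJECT."""
    return [r for r in map(parse_flow_record, lines) if r["action"] == "REJECT"]


# Illustrative records only; the IDs and addresses are not real.
SAMPLE = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 "
    "49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]

for record in rejected_flows(SAMPLE):
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```

A REJECT record shows that traffic was denied, but not which rule denied it, so the matching security group and network ACL rules still have to be checked separately.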

Worked Examples

Example 1: Calculating Baseline Deviation

Scenario: An EC2 instance usually has a NetworkOut average of 500 MB/hour. Suddenly, CloudWatch reports 5 GB/hour.

  1. Identify Baseline: 500 MB/hour.
  2. Compare Current: 5 GB/hour = 5000 MB/hour (decimal units).
  3. Calculation: $5000 / 500 = 10$, so the current load is $10\times$ the baseline.
  4. Action: Investigate for data exfiltration or a misconfigured backup job.
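
The comparison in Example 1 can be sketched as follows. The hourly history values are hypothetical, chosen to average 500 MB/hour, and the three-standard-deviation rule is an assumed threshold for illustration, not something the scenario prescribes.

```python
from statistics import mean, stdev


def deviation_factor(current, baseline):
    """How many times larger the current value is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return current / baseline


def is_anomalous(current, history, k=3.0):
    """Flag values more than k standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    return abs(current - mu) > k * sigma


# Hypothetical NetworkOut history in MB/hour; the mean is 500.
history_mb = [480, 510, 495, 520, 505, 490]
baseline_mb = mean(history_mb)

print(deviation_factor(5000, baseline_mb))  # 10.0
print(is_anomalous(5000, history_mb))       # True
print(is_anomalous(520, history_mb))        # False: within normal variation
```

Using a spread-based test alongside the raw ratio helps avoid flagging ordinary hour-to-hour variation as an incident.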

Example 2: Configuring Traffic Mirroring

Scenario: You need to inspect packets for an application that is intermittently dropping connections.

  1. Create Target: Deploy an EC2 instance with an ENI in the same VPC.
  2. Create Filter: Define a filter for the specific port and protocol used by the app.
  3. Create Session: Map the Source ENI to the Target ENI using the filter.
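  4. Capture: Run tcpdump or Wireshark on the target instance to see the raw payloads. Mirrored packets arrive VXLAN-encapsulated on UDP port 4789, so the target's security group must allow that port.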
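
Steps 1 to 3 above might be scripted roughly like this. It is a sketch under assumptions: `ec2` is expected to be a boto3 EC2 client created elsewhere, the target ENI is assumed to exist already, the helper name and descriptions are hypothetical, and only a single ingress TCP rule is created. Verify parameter names against the current boto3 documentation before relying on it.

```python
def create_mirror_session(ec2, source_eni_id, target_eni_id, port, protocol=6):
    """Create a mirror target, a filter with one rule, and a session.

    `ec2` is assumed to be a boto3 EC2 client; protocol 6 is TCP.
    Returns the ID of the new traffic mirror session.
    """
    # Step 1: register the existing target ENI as a mirror target.
    target = ec2.create_traffic_mirror_target(
        NetworkInterfaceId=target_eni_id,
        Description="baseline capture target",
    )
    target_id = target["TrafficMirrorTarget"]["TrafficMirrorTargetId"]

    # Step 2: create a filter and accept inbound traffic to the app port only.
    filt = ec2.create_traffic_mirror_filter(Description="app traffic only")
    filter_id = filt["TrafficMirrorFilter"]["TrafficMirrorFilterId"]
    ec2.create_traffic_mirror_filter_rule(
        TrafficMirrorFilterId=filter_id,
        TrafficDirection="ingress",
        RuleNumber=100,
        RuleAction="accept",
        Protocol=protocol,
        DestinationPortRange={"FromPort": port, "ToPort": port},
        SourceCidrBlock="0.0.0.0/0",
        DestinationCidrBlock="0.0.0.0/0",
    )

    # Step 3: map the source ENI to the target through the filter.
    session = ec2.create_traffic_mirror_session(
        NetworkInterfaceId=source_eni_id,
        TrafficMirrorTargetId=target_id,
        TrafficMirrorFilterId=filter_id,
        SessionNumber=1,
    )
    return session["TrafficMirrorSession"]["TrafficMirrorSessionId"]
```

With real credentials, a call could look like `create_mirror_session(boto3.client("ec2"), source_eni, target_eni, 443)`, where both ENI IDs come from your own account. Capturing return traffic as well would need an additional egress rule on the same filter.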

Checkpoint Questions

  1. What is the main difference between VPC Flow Logs and VPC Traffic Mirroring?
  2. Why must the destination interface for Traffic Mirroring be in promiscuous mode?
  3. Which tool would you use to map the global topology of your AWS Transit Gateways?
  4. If an application is latency-sensitive, which metric in Transit Gateway Network Manager is most critical?

Muddy Points & Cross-Refs

  • Flow Logs vs. Mirroring: Flow logs are cheap and capture metadata (IP/Port), whereas Mirroring is more expensive/complex but captures the actual data inside the packets.
  • Promiscuous Mode: Many students forget that the target instance OS must also support promiscuous mode to "see" the traffic redirected to it.
  • Cross-Refs: See Chapter 6: Security for using Flow Logs in threat detection, and Unit 1: Design for implementing Direct Connect.

Comparison Tables

| Feature | VPC Flow Logs | VPC Traffic Mirroring | CloudWatch Metrics |
| --- | --- | --- | --- |
| Data Type | Metadata (Flows) | Full Packet (Payload) | Aggregated Metrics |
| Granularity | 1 min / 10 min | Real-time | 1 min (Standard) |
| Use Case | Security / Connectivity | Deep Troubleshooting | Capacity Planning |
| Cost | Low | High | Medium |
| Analysis Tool | CloudWatch Logs Insights | Wireshark / Suricata | CloudWatch Dashboards |
