AWS Global Infrastructure: Design for Reliability and Performance
This study guide explores the foundational layers of the AWS Global Infrastructure, emphasizing how to leverage its hierarchical structure to achieve fault isolation, low latency, and high availability as required for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.
Learning Objectives
- Analyze the hierarchy of AWS infrastructure from Regions down to Edge Locations.
- Evaluate the difference between zonal and regional services and their impact on fate-sharing.
- Design architectures that minimize blast radius through fault-isolated boundaries.
- Compare latency-reduction services like CloudFront, Global Accelerator, and Local Zones.
- Formulate a multi-Region disaster recovery strategy based on RTO and RPO requirements.
Key Terms & Glossary
- Region: A physical location around the world where AWS clusters data centers (e.g., us-east-1).
- Availability Zone (AZ): One or more discrete data centers with redundant power, networking, and connectivity within an AWS Region.
- Local Zone: An extension of a Region that places compute, storage, and database services closer to large population or industry centers.
- Edge Location: A site that Amazon CloudFront uses to cache copies of your content for faster delivery to users at any location.
- Blast Radius: The maximum impact of a service failure or security breach within a system.
- Zonal Service: A service where resources are tied to a specific AZ (e.g., Amazon EC2, EBS volumes).
- Regional Service: A service that abstracts multiple AZs to provide built-in high availability (e.g., Amazon S3, DynamoDB).
The "Big Idea"
The "Big Idea" of AWS Global Infrastructure is Fault Isolation through Redundancy. By nesting infrastructure (Data Centers → AZs → Regions), AWS allows architects to build workloads where a single failure (like a power outage in one building) does not compromise the entire application. The goal is to move from a single point of failure to a distributed model where the "blast radius" is strictly contained.
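To make the redundancy argument concrete, here is a minimal sketch of the standard parallel-availability calculation. It assumes each copy of the workload fails independently and that the application stays up as long as one copy is healthy. The 99% figure and the function name `combined_availability` are illustrative assumptions, not AWS-published numbers.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent replicas where any one suffices.

    Assumption: failures are statistically independent, which real
    infrastructure only approximates.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    # Illustrative: one copy per AZ, each assumed to be 99% available.
    for n in (1, 2, 3):
        print(f"{n} AZ(s): {combined_availability(0.99, n):.6f}")
```

Under these assumptions, spreading a workload over three AZs moves availability from 99% to roughly 99.9999%. Real failures are never perfectly independent, so treat the result as an upper bound rather than a guarantee.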
Formula / Concept Box
| Attribute | Availability Zone (AZ) | AWS Region | Local Zone |
|---|---|---|---|
| Fault isolation | Shared fate within AZ | Isolated from other Regions | Shared fate with parent Region |
| Distance | Physically separated, but within 100 km of the other AZs in the Region | Typically hundreds to thousands of km apart | Close to end-user metros |
| Connectivity | Low-latency, redundant fiber | AWS global backbone between Regions; internet, VPN, or Direct Connect from on premises | High-bandwidth to parent Region |
| Quantity | Minimum of 3 per Region | 30+ globally | Specific metro areas |
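The Distance row translates directly into a latency floor. The sketch below estimates the minimum round-trip time over fiber for a given distance. It assumes light travels at roughly 200,000 km/s in fiber (about two thirds of its speed in a vacuum) and ignores routing, queuing, and processing delays, so real latencies will be higher. The constant and function names are illustrative.

```python
# Assumption: signal speed in optical fiber is about 200,000 km/s,
# which is 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for a straight fiber run."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    # Out and back, so the signal covers the distance twice.
    return 2.0 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    print(min_round_trip_ms(100))   # AZ-to-AZ scale
    print(min_round_trip_ms(6000))  # Region-to-Region scale (illustrative)
```

At AZ scale (100 km) the floor is about 1 ms, which is why synchronous replication between AZs is practical. At inter-Region scale (thousands of km) the floor is tens of milliseconds, which is why cross-Region replication is usually asynchronous.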
Hierarchical Outline
- I. Global Infrastructure Core Components
- AWS Regions: Geographic isolation, data sovereignty compliance.
- Availability Zones (AZs): Fault-isolated boundaries; connected via low-latency links.
- Data Centers: The physical foundation; redundant power and cooling.
- II. Edge and Latency Services
- Local Zones: For < 10ms latency requirements (e.g., media production, gaming).
- Edge Locations: Entry points for the Amazon Global Edge Network.
- CloudFront vs. Global Accelerator: Content caching vs. network path optimization.
- III. Service Scopes and Fate Sharing
- Zonal Resources: Share the fate of the AZ (e.g., an EC2 instance in us-east-1a).
- Regional Resources: High availability by default (e.g., S3 Standard stores objects across at least three AZs).
- IV. Disaster Recovery (DR) and Multi-Region
- Cross-Region Replication: S3 Cross-Region Replication (CRR), RDS cross-Region read replicas, DynamoDB global tables.
- Traffic Management: Route 53 (DNS) and Global Accelerator for failover.
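The replication and traffic-management items in section IV map onto the two DR targets: replication lag bounds the recovery point (RPO), and detection plus failover time bounds the recovery time (RTO). The sketch below checks a set of measurements against targets. The `DrMeasurement` type, its field names, and the sample numbers are hypothetical, and the model assumes asynchronous replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrMeasurement:
    replication_lag_s: float  # how far the secondary Region trails the primary
    detection_s: float        # time for health checks to notice the outage
    failover_s: float         # time to redirect traffic and promote replicas


def estimated_rpo_s(m: DrMeasurement) -> float:
    """With asynchronous replication, unreplicated writes are lost on failover."""
    return m.replication_lag_s


def estimated_rto_s(m: DrMeasurement) -> float:
    """Recovery time is detection time plus the time to complete failover."""
    return m.detection_s + m.failover_s


def meets_targets(m: DrMeasurement, rpo_target_s: float, rto_target_s: float) -> bool:
    """True when both the estimated RPO and RTO are within their targets."""
    return estimated_rpo_s(m) <= rpo_target_s and estimated_rto_s(m) <= rto_target_s


if __name__ == "__main__":
    sample = DrMeasurement(replication_lag_s=5, detection_s=30, failover_s=120)
    print(meets_targets(sample, rpo_target_s=60, rto_target_s=300))
```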
Visual Anchors
Infrastructure Hierarchy
Regional vs Zonal Scope
\begin{tikzpicture}[node distance=2cm]
  \draw[thick, blue, dashed] (0,0) rectangle (8,4) node[pos=0.5, above=1.8cm] {AWS Region};
  \draw[thick, gray] (0.5,0.5) rectangle (2.5,3.5) node[below=3.1cm, pos=0.5] {AZ 1};
  \draw[thick, gray] (3,0.5) rectangle (5,3.5) node[below=3.1cm, pos=0.5] {AZ 2};
  \draw[thick, gray] (5.5,0.5) rectangle (7.5,3.5) node[below=3.1cm, pos=0.5] {AZ 3};
  \node[draw, fill=orange!20] at (1.5,2) {EC2 (Zonal)};
  \node[draw, fill=green!20, minimum width=6cm] at (4,1) {S3 / DynamoDB (Regional)};
\end{tikzpicture}
Definition-Example Pairs
- Fault Isolation Boundary: A logical or physical separation that prevents a failure in one area from spreading to another.
- Example: Using three AZs for a web fleet so that a fire in one data center doesn't take down the entire website.
- Edge Computing: Processing data closer to the source/user rather than in a centralized cloud Region.
- Example: Running a Lambda@Edge function to customize content for a user based on their geographic location before the request hits the origin.
- Fate Sharing: When a component's availability is tied directly to the availability of the infrastructure it resides on.
- Example: An EBS volume shares the fate of its AZ; if the AZ is unreachable, the volume is unreachable.
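Fate sharing can be quantified per AZ. The sketch below takes a mapping of instance IDs to AZ names and reports the fraction of the fleet lost if each AZ fails. The function names and the sample fleet are hypothetical, and every instance is assumed to carry equal weight.

```python
from collections import Counter
from typing import Dict


def az_blast_radius(placement: Dict[str, str]) -> Dict[str, float]:
    """Map each AZ to the fraction of instances lost if that AZ fails."""
    if not placement:
        raise ValueError("placement must contain at least one instance")
    counts = Counter(placement.values())
    total = len(placement)
    return {az: count / total for az, count in counts.items()}


def worst_case_loss(placement: Dict[str, str]) -> float:
    """Largest share of the fleet that a single AZ failure can remove."""
    return max(az_blast_radius(placement).values())


if __name__ == "__main__":
    fleet = {
        "i-a1": "us-east-1a", "i-a2": "us-east-1a",
        "i-b1": "us-east-1b", "i-b2": "us-east-1b",
        "i-c1": "us-east-1c", "i-c2": "us-east-1c",
    }
    print(az_blast_radius(fleet))
    print(worst_case_loss(fleet))
```

An even spread over three AZs caps the loss from any single AZ failure at one third of the fleet, whereas a single-AZ deployment has a worst case of 100%.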
Worked Examples
Problem: Designing for "Extreme Availability"
Scenario: A financial application requires 99.999% availability and must survive the total failure of an entire AWS Region (e.g., due to a catastrophic natural disaster).
Step-by-Step Solution:
- Identify Regional Services: Use Amazon S3 with Cross-Region Replication (CRR) and DynamoDB global tables so that data is asynchronously replicated to a secondary Region.
- Infrastructure as Code (IaC): Use AWS CloudFormation to define the environment so it can be deployed identically in the secondary Region.
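- Cross-Region Routing: Deploy AWS Global Accelerator, which provides static anycast IP addresses. Create an endpoint group in each Region and configure the accelerator to route traffic to the primary Region, with automatic health-check-based failover to the secondary Region.
- Database Strategy: Configure RDS cross-Region read replicas. In a failover scenario, promote the read replica to a standalone primary DB instance. Because replication is asynchronous, writes that have not yet reached the replica are lost, so replica lag bounds the achievable RPO.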
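The health-check-based failover in the routing step can be modeled in a few lines. This is a simplified sketch of the decision only, not the Global Accelerator API: the `EndpointGroup` type and `pick_region` function are hypothetical, and the model assumes a strict priority order with a single health flag per Region.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EndpointGroup:
    region: str
    healthy: bool  # result of the most recent health check (assumed boolean)


def pick_region(groups_by_priority: List[EndpointGroup]) -> Optional[str]:
    """Return the first healthy Region in priority order, or None if all fail."""
    for group in groups_by_priority:
        if group.healthy:
            return group.region
    return None


if __name__ == "__main__":
    normal = [EndpointGroup("us-east-1", True), EndpointGroup("us-west-2", True)]
    outage = [EndpointGroup("us-east-1", False), EndpointGroup("us-west-2", True)]
    print(pick_region(normal))  # primary serves traffic
    print(pick_region(outage))  # traffic shifts to the secondary
```

Because clients connect to the same static IP addresses in both cases, the shift does not depend on DNS caches expiring, which is the main reason to prefer this approach over DNS-only failover when RTO is tight.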
Checkpoint Questions
- What is the minimum number of Availability Zones in a standard AWS Region?
- How does a Local Zone differ from an Availability Zone in terms of geographic placement?
- True/False: Amazon Route 53 is a zonal service.
- Which service would you use to deliver a non-HTTP application to global users via static IP addresses with low latency?
Answers
- Three (3).
- AZs are clustered within the Region's geographic area (within about 100 km of each other); Local Zones extend a Region by placing infrastructure away from its AZs, closer to specific metro areas.
- False. Route 53 is a global service, highly available by design.
- AWS Global Accelerator.
Muddy Points & Cross-Refs
- Zonal vs. Regional API Scope: The EC2 API endpoint is regional, but the resources it returns are zonal, so you can filter on a specific AZ in the CLI (e.g., aws ec2 describe-instances --filters Name=availability-zone,Values=eu-west-1a).
- Global Accelerator vs. CloudFront: Users often confuse these. Remember: CloudFront caches content (images, video) at edge locations. Global Accelerator routes traffic over the AWS private network to improve performance for any TCP or UDP application.
- Deep Dive: For more on how this impacts networking, see Chapter 6: Meeting Reliability Requirements.
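The AZ filter from the first muddy point can also be expressed through the SDK. The sketch below uses the same `availability-zone` filter with the boto3 EC2 client's `describe_instances` call. The helper name `instance_ids_in_az` is hypothetical, the client is passed in so the function can be exercised without AWS credentials, and pagination is omitted for brevity.

```python
from typing import Any, List


def instance_ids_in_az(ec2_client: Any, az: str) -> List[str]:
    """List instance IDs in one AZ using a regional EC2 client.

    `ec2_client` is expected to behave like boto3.client("ec2").
    Only the first page of results is read (assumption: small fleet).
    """
    response = ec2_client.describe_instances(
        Filters=[{"Name": "availability-zone", "Values": [az]}]
    )
    return [
        instance["InstanceId"]
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]
```

With credentials configured, a call might look like `instance_ids_in_az(boto3.client("ec2", region_name="eu-west-1"), "eu-west-1a")`. The client is created per Region, while the filter narrows the result to one zone.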
Comparison Tables
Latency Mitigation Strategies
| Feature | Amazon CloudFront | AWS Global Accelerator | AWS Local Zones |
|---|---|---|---|
| Primary Use Case | Content Delivery (CDN) | Network Path Optimization | Proximity Compute/Storage |
| Protocols | HTTP/HTTPS | TCP / UDP | Any supported by the services deployed there |
| Caching | Yes (Edge caching) | No | No |
| Static IPs | No | Yes (Anycast) | No |
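The table can be read as a small decision procedure. The sketch below encodes it as a helper for exam-style scenarios. The function name, parameters, and rule order are assumptions made for illustration, and real designs often combine these services rather than choosing one.

```python
def suggest_latency_service(
    protocol: str,
    needs_static_ips: bool = False,
    needs_local_compute: bool = False,
) -> str:
    """Pick the service from the comparison table that best fits the inputs."""
    # Compute or storage that must sit near a metro area points to Local Zones.
    if needs_local_compute:
        return "AWS Local Zones"
    # Static anycast IPs or non-HTTP traffic point to Global Accelerator.
    if needs_static_ips or protocol.lower() not in ("http", "https"):
        return "AWS Global Accelerator"
    # Remaining case: HTTP(S) delivery that benefits from edge caching.
    return "Amazon CloudFront"


if __name__ == "__main__":
    print(suggest_latency_service("https"))
    print(suggest_latency_service("udp"))
    print(suggest_latency_service("tcp", needs_local_compute=True))
```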