Mastery Guide: Distributed Compute Strategies and Edge Processing
This guide explores how to design high-performance, cost-effective architectures using AWS distributed compute strategies. We cover everything from tightly coupled High-Performance Computing (HPC) to low-latency edge processing with Amazon CloudFront.
Learning Objectives
- Distinguish between loosely coupled and tightly coupled distributed systems.
- Evaluate the use of Elastic Fabric Adapter (EFA) for low-latency networking.
- Explain the role of edge locations and CloudFront in reducing latency.
- Determine appropriate compute options (EC2, Lambda, Fargate) for distributed workloads.
- Analyze the impact of cluster placement groups on network performance.
Key Terms & Glossary
- Edge Processing: The practice of performing data processing at the edge of the network, closer to the source of the data or the user, to reduce latency.
- Tightly Coupled: An architecture where instances must work in concert as a single unit, requiring high-speed, low-latency interconnects.
- Loosely Coupled: An architecture where components are independent; if one fails, others continue to work. Often managed via queues (e.g., SQS).
- Elastic Fabric Adapter (EFA): A network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications.
- Cluster Placement Group: A logical grouping of instances within a single Availability Zone that provides low-latency network performance.
- Libfabric: An API that allows HPC applications to bypass the OS kernel and communicate directly with the network hardware (EFA).
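The loose-coupling pattern defined above can be sketched with Python's standard-library queue standing in for SQS. This is a local simulation, not an AWS API call; the worker logic, message names, and simulated failure are illustrative. The key behavior it demonstrates is that a failed consumer returns its message to the queue (like an SQS visibility timeout expiring) so a healthy consumer can finish the work.

```python
import queue
import threading

# In-memory stand-in for an SQS queue: producers and consumers
# never talk to each other directly, only through the queue.
jobs = queue.Queue()
results = []
results_lock = threading.Lock()

def worker(worker_id, fail_first=False):
    failed_once = False
    while True:
        try:
            msg = jobs.get(timeout=0.5)
        except queue.Empty:
            return  # queue drained, worker exits
        try:
            if fail_first and not failed_once:
                failed_once = True
                raise RuntimeError("simulated crash")
            with results_lock:
                results.append((worker_id, msg))
        except RuntimeError:
            # Like an SQS visibility timeout: the unprocessed
            # message becomes visible to other consumers again.
            jobs.put(msg)
        finally:
            jobs.task_done()

for i in range(5):
    jobs.put(f"image-{i}.png")

# Worker 0 fails on its first message; the message is re-queued
# and eventually processed by a healthy worker.
threads = [threading.Thread(target=worker, args=(0, True)),
           threading.Thread(target=worker, args=(1,))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(msg for _, msg in results))
```

Because no component holds a direct reference to another, the crash of one worker delays a single message rather than stalling the whole pipeline, which is the fault-isolation property the table below attributes to loosely coupled systems.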
The "Big Idea"
Distributed computing is the shift from a single "super-server" to a swarm of coordinated resources. By strategically placing compute power, whether physically close together for massive calculations (HPC) or geographically close to the user (edge), architects can work around the hard limits of signal propagation and single-machine hardware to deliver seamless global experiences.
Formula / Concept Box
| Feature | Tightly Coupled (HPC) | Loosely Coupled (Distributed) |
|---|---|---|
| Network Need | Ultra-low latency / High throughput | Standard network / Asynchronous |
| AWS Feature | EFA / Cluster Placement Groups | SQS / SNS / Auto Scaling |
| Failure Impact | High (one node can stall the cluster) | Low (isolated failures) |
| Use Case | Weather modeling, CFD, ML training | Web apps, image processing, microservices |
| Interface | Libfabric / MPI | REST API / Message Queues |
Hierarchical Outline
- Distributed Compute Fundamentals
- Loose Coupling: Use of asynchronous messaging; independent scaling of components.
- Tight Coupling: Synchronous inter-dependency; requires Cluster Placement Groups.
- High-Performance Computing (HPC)
- Elastic Fabric Adapter (EFA): Bypasses TCP/IP stack for better throughput.
- Infrastructure Requirements: All instances must be in the same Subnet and Security Group for EFA.
- Edge Processing Strategies
- Amazon CloudFront: Caching static and dynamic content at Edge Locations.
- Lambda@Edge / CloudFront Functions: Executing code closer to the user to modify requests/responses.
- Compute Optimization
- Instance Selection: Matching workload to family (e.g., M5 for general, P2 for GPU/ML).
- Serverless: Using Lambda or Fargate to maximize "server density" and cost-efficiency.
Visual Anchors
Edge Processing Flow
Tight vs. Loose Coupling
\begin{tikzpicture}
  % Tight coupling
  \draw[thick, fill=blue!10] (-1,0) rectangle (1,1) node[midway] {Node A};
  \draw[thick, fill=blue!10] (2,0) rectangle (4,1) node[midway] {Node B};
  \draw[<->, ultra thick, red] (1,0.5) -- (2,0.5) node[midway, above] {\small EFA / Low Latency};
  \node at (1.5, -0.5) {\textbf{Tightly Coupled}};
  % Loose coupling
  \draw[thick, fill=green!10] (6,0) rectangle (8,1) node[midway] {Node C};
  \draw[thick, fill=orange!10] (9,0.5) circle (0.5) node {SQS};
  \draw[thick, fill=green!10] (10.5,0) rectangle (12.5,1) node[midway] {Node D};
  \draw[->] (8,0.5) -- (9,0.5);
  \draw[->] (9.5,0.5) -- (10.5,0.5);
  \node at (9.25, -0.5) {\textbf{Loosely Coupled}};
\end{tikzpicture}
Definition-Example Pairs
- Term: Partitioning (Sharding)
  - Definition: Breaking a large database into smaller, faster, more easily managed parts called shards.
  - Example: Splitting a global user database so that users with IDs 1-10000 live on Database A and users with IDs 10001-20000 live on Database B, preventing a single-node performance bottleneck.
- Term: Edge Caching
  - Definition: Storing copies of data at locations geographically closer to users to reduce the distance data must travel.
  - Example: A user in Tokyo requests a video file stored in a Virginia S3 bucket; CloudFront caches the file at a Tokyo Edge Location so the next local request is served from there.
- Term: Server Density
  - Definition: Maximizing the number of applications or tasks running on a single physical or virtual host.
  - Example: Using Docker containers on an Amazon ECS cluster to run 50 microservices on 5 large EC2 instances instead of 50 small instances.
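The range-based sharding example above can be expressed directly in code. This is a minimal sketch: the shard boundaries mirror the Database A / Database B split described in the example, and the shard names are illustrative.

```python
# Route each user ID to a shard by ID range, mirroring the
# Database A / Database B split in the sharding example.
SHARD_RANGES = [
    (1, 10000, "database-a"),
    (10001, 20000, "database-b"),
]

def shard_for(user_id: int) -> str:
    """Return the shard responsible for the given user ID."""
    for low, high, name in SHARD_RANGES:
        if low <= user_id <= high:
            return name
    raise KeyError(f"no shard covers user_id {user_id}")

print(shard_for(42))      # database-a
print(shard_for(15000))   # database-b
```

Range-based routing keeps related IDs together but can create hot shards; hash-based routing is a common alternative when access patterns are skewed.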
Worked Examples
Example 1: Optimizing for Low-Latency Inter-node Communication
Scenario: A research firm needs to run a fluid dynamics simulation across 20 EC2 instances. Every instance must share its state with all others every few milliseconds.
- Solution:
- Launch instances in a Cluster Placement Group to ensure they are physically close within the data center.
- Use instance types that support Elastic Fabric Adapter (EFA).
- Attach the EFA at launch and ensure the Security Group allows all internal traffic.
- Use the Libfabric API to bypass the OS kernel for data transfer.
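The launch configuration described in the steps above can be sketched as the shape of an EC2 `RunInstances` request. The parameters are shown as plain data rather than a live boto3 call, and the AMI ID, instance type, placement-group name, subnet, and security-group IDs are all illustrative placeholders; with boto3 this dict would be passed as keyword arguments to `ec2_client.run_instances(**params)`.

```python
# Shape of an EC2 RunInstances request for EFA-enabled HPC nodes.
# All resource IDs below are illustrative placeholders.
params = {
    "ImageId": "ami-0123456789abcdef0",         # illustrative AMI
    "InstanceType": "c5n.18xlarge",             # an EFA-capable instance type
    "MinCount": 20,
    "MaxCount": 20,
    "Placement": {"GroupName": "cfd-cluster"},  # cluster placement group
    "NetworkInterfaces": [{
        "DeviceIndex": 0,
        "InterfaceType": "efa",                 # attach the EFA at launch
        "SubnetId": "subnet-0abc",              # all nodes share one subnet
        "Groups": ["sg-0abc"],                  # one SG allowing all internal traffic
    }],
}
```

Note how the request encodes the infrastructure requirements from the outline: a single placement group, a single subnet, and a single security group shared by every node.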
Example 2: Global Content Delivery with Edge Logic
Scenario: A streaming service wants to show different advertisements to users based on their country, without adding latency by routing every request to a central server.
- Solution:
- Deploy an Amazon CloudFront distribution.
- Use Lambda@Edge to intercept the request at the edge location.
- The Lambda function checks the user's header for location and dynamically modifies the request to fetch the correct localized advertisement from an S3 bucket.
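A minimal Lambda@Edge handler for the solution above might look like the following sketch. The `/ads/...` key layout in S3 is an assumption for illustration, and the `CloudFront-Viewer-Country` header only appears on the request if the distribution is configured to forward it; the fallback country here is also illustrative.

```python
def handler(event, context):
    # Lambda@Edge request event: rewrite the URI so each viewer
    # country fetches its localized ad object from the S3 origin.
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]
    # CloudFront supplies this header (lowercased) when configured
    # to forward it; fall back to "US" if it is absent.
    country = headers.get(
        "cloudfront-viewer-country", [{"value": "US"}]
    )[0]["value"]
    request["uri"] = f"/ads/{country.lower()}/banner.png"
    return request

# Local smoke test with a trimmed-down CloudFront event.
event = {"Records": [{"cf": {"request": {
    "uri": "/ads/banner.png",
    "headers": {"cloudfront-viewer-country": [{"value": "JP"}]},
}}}]}
print(handler(event, None)["uri"])  # /ads/jp/banner.png
```

Because the rewrite happens at the edge location, the localized object can also be cached per country there, so repeat viewers in the same region never reach the origin.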
Checkpoint Questions
- What is the primary difference between an Elastic Network Adapter (ENA) and an Elastic Fabric Adapter (EFA)?
- Why must all instances using an EFA be in the same security group and subnet?
- How does loose coupling improve the fault tolerance of a distributed system?
- Which AWS service would you use to automatically analyze compute resources and identify potential cost savings over a 14-day period?
- True or False: Tightly coupled workloads can easily be distributed across multiple AWS Regions for higher availability.
Answers
- EFA supports the Libfabric API and OS bypass, significantly reducing latency for HPC applications compared to standard ENA.
- EFA traffic is not routable; it requires the proximity provided by the same subnet and the specific permissions of a unified security group.
- In a loosely coupled system (e.g., using SQS), if one component fails, the messages stay in the queue until another component processes them, preventing the entire system from crashing.
- AWS Compute Optimizer.
- False. Tightly coupled workloads require the ultra-low latency found only within a single Availability Zone (Cluster Placement Group).