Design High-Performing and Elastic Compute Solutions
This guide focuses on Domain 3 of the AWS SAA-C03 exam: architecting compute resources that scale dynamically with demand while maintaining peak performance efficiency.
Learning Objectives
After studying this module, you should be able to:
- Differentiate between EC2, ECS, and Lambda for specific workload performance requirements.
- Select appropriate instance types and configurations (RAM, CPU, Network) to resolve compute bottlenecks.
- Implement horizontal and vertical scaling strategies using Amazon EC2 Auto Scaling.
- Optimize containerized workloads using AWS Fargate for serverless container orchestration.
- Configure event-driven compute patterns using AWS Lambda and API Gateway.
Key Terms & Glossary
- Elasticity: The ability of a system to dynamically grow or shrink infrastructure resources based on real-time demand.
- Vertical Scaling (Scaling Up): Increasing the capacity of a single resource, such as upgrading an EC2 instance to a larger instance type (e.g., `t3.micro` to `m5.large`).
- Horizontal Scaling (Scaling Out): Adding more resources of the same size to a pool, such as adding more instances to an Auto Scaling Group (ASG).
- Serverless: A cloud execution model where the provider (AWS) manages the server allocation and infrastructure (e.g., Lambda, Fargate).
- AMI (Amazon Machine Image): A template that contains the software configuration (operating system, application server, and applications) required to launch an instance.
- Orchestration: The automated arrangement, coordination, and management of complex computer systems, specifically containers (ECS/EKS).
The "Big Idea"
[!IMPORTANT] The core philosophy of high-performing compute in AWS is Right-Sizing and Decoupling. Performance is not just about the fastest processor; it is about matching the compute profile to the workload's specific bottleneck (Compute, Memory, or I/O) and ensuring components can scale independently to avoid single points of failure.
Formula / Concept Box
| Compute Service | Best For... | Scaling Mechanism | Management Overhead |
|---|---|---|---|
| EC2 | Long-lived, complex apps | Auto Scaling Groups (ASG) | High (OS, Patches) |
| ECS / EKS | Microservices, Docker | Task Scaling / Fargate | Medium (Container orchestration) |
| Lambda | Event-driven, short tasks | Automatic (Per request) | Low (Serverless) |
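To make the Lambda row concrete, here is a minimal sketch of an event-driven handler. The event shape mirrors what S3 delivers to Lambda, but the bucket and key names are hypothetical, and a real handler would use boto3 to fetch and process the object:

```python
# Minimal sketch of an event-driven Lambda handler (assumed S3 trigger shape).
# A real handler would import boto3 and process the object; this sketch only
# parses the event to show the pattern.

def handler(event, context):
    """Extract the bucket and key from an S3 object-created event record."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    # A real function would act on the object here (e.g., create a thumbnail).
    return {"bucket": bucket, "key": key}

# Example event in the shape S3 delivers to Lambda (names are illustrative):
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "photo-uploads"}, "object": {"key": "cat.jpg"}}}
    ]
}
print(handler(sample_event, None))  # {'bucket': 'photo-uploads', 'key': 'cat.jpg'}
```

Because the handler is triggered per event, scaling is automatic: AWS runs one concurrent execution per in-flight request, with no instances to manage.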
Hierarchical Outline
- I. AWS Compute Foundations
- A. EC2 Instances: Virtual machines with full OS control.
- B. AMIs & Image Builder: Automating the creation of optimized gold images.
- II. Serverless & Containers
- A. AWS Lambda: Code-only execution (max 15 mins); integrated with S3, DynamoDB, and API Gateway.
- B. Amazon ECS/EKS: Orchestrating Docker containers.
- C. AWS Fargate: Serverless compute engine for containers (no EC2 management).
- III. Elasticity & Scaling
- A. Auto Scaling Groups: Maintaining availability by matching instance counts to load.
- B. Scaling Metrics: CPU utilization, Memory, or Custom CloudWatch metrics.
- IV. Performance Monitoring
- A. CloudWatch: Tracking metrics and logs to identify bottlenecks.
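The scaling-metric idea in III.B can be sketched as plain logic: a CloudWatch-style alarm breaches only when the metric stays above its threshold for N consecutive evaluation periods. This is an illustrative sketch of the evaluation rule, not the CloudWatch API:

```python
# Sketch of CloudWatch-style alarm evaluation: the alarm breaches when the
# metric exceeds the threshold for `evaluation_periods` consecutive datapoints.

def alarm_breached(datapoints, threshold, evaluation_periods):
    if len(datapoints) < evaluation_periods:
        return False
    recent = datapoints[-evaluation_periods:]
    return all(dp > threshold for dp in recent)

cpu = [55.0, 62.0, 81.0, 78.0, 90.0]  # hypothetical average CPU % samples
print(alarm_breached(cpu, 70.0, 3))   # True: last three samples all exceed 70%
print(alarm_breached(cpu, 70.0, 5))   # False: the earlier samples are below 70%
```

Requiring several consecutive breaching datapoints is what prevents a single momentary spike from triggering an unnecessary scale-out.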
Visual Anchors
- Elastic Scaling Flow
- Performance Curve Visualization
Definition-Example Pairs
- Compute Optimized Instances: Instances designed for compute-bound applications that benefit from high-performance processors.
- Example: High-performance web servers, batch processing workloads, and scientific modeling.
- Event-Driven Architecture: A design pattern where actions are triggered by specific changes in state or events.
- Example: An image is uploaded to an S3 bucket (event), which triggers a Lambda function to create a thumbnail (action).
- Loose Coupling: Designing components so they can interact without being dependent on each other's internal implementation.
- Example: Using an SQS queue between a web front-end and a processing back-end so the back-end can scale independently without dropping requests.
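The loose-coupling pattern above can be sketched with a plain in-memory queue. Here `queue.Queue` stands in for Amazon SQS (a real system would call boto3's `send_message` / `receive_message`); the request IDs are hypothetical:

```python
# Sketch of loose coupling via a queue: the front-end enqueues work and
# returns immediately; the back-end drains the queue at its own pace, so the
# two tiers can scale independently. queue.Queue stands in for Amazon SQS.
import queue

work_queue = queue.Queue()

def front_end(request_id):
    """Accept a request and hand it off without waiting for processing."""
    work_queue.put(request_id)

def back_end():
    """Drain pending work; runs and scales independently of the front-end."""
    processed = []
    while not work_queue.empty():
        processed.append(work_queue.get())
    return processed

for rid in ("req-1", "req-2", "req-3"):
    front_end(rid)
print(back_end())  # ['req-1', 'req-2', 'req-3']
```

Because the queue buffers requests, a back-end slowdown delays processing instead of dropping traffic, and the back-end's ASG can scale on queue depth rather than CPU.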
Worked Examples
Scenario: Handling Spiky Web Traffic
Problem: A retail website experiences massive traffic spikes every morning at 9:00 AM but remains nearly idle at night. The current single large EC2 instance crashes during spikes.
Step-by-Step Solution:
- Identify Bottleneck: CloudWatch shows CPU hitting 100% during spikes.
- Create an AMI: Capture the current server state into a reusable Amazon Machine Image.
- Launch Template: Define the instance type (e.g., `t3.medium`) and the AMI to use.
- Auto Scaling Group (ASG): Set a minimum of 2 instances (for high availability) and a maximum of 10.
- Scaling Policy: Implement a Target Tracking policy set to "Average CPU Utilization = 70%".
- Load Balancer (ALB): Place an Application Load Balancer in front of the ASG to distribute traffic.
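The Target Tracking policy in the steps above can be sketched with the proportional rule target tracking effectively applies: scale the fleet so the per-instance metric would return to the target, clamped to the ASG's min/max. The numbers are illustrative, and the real service also applies cooldowns and alarm evaluation not modeled here:

```python
# Proportional sketch of a Target Tracking scaling decision, clamped to the
# ASG bounds from the worked example (min=2, max=10). Illustrative only: the
# real service also applies cooldowns and CloudWatch alarm evaluation.
import math

def desired_capacity(current, metric, target, minimum=2, maximum=10):
    """Capacity needed to bring the average per-instance metric back to target."""
    desired = math.ceil(current * metric / target)
    return max(minimum, min(maximum, desired))

# 4 instances averaging 95% CPU against a 70% target -> scale out to 6.
print(desired_capacity(current=4, metric=95.0, target=70.0))  # 6
# Overnight lull: 4 instances at 10% CPU -> scale in, but never below min=2.
print(desired_capacity(current=4, metric=10.0, target=70.0))  # 2
```

The minimum of 2 is what keeps the site available at night, while the maximum of 10 caps cost during the 9:00 AM spike.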
Checkpoint Questions
- What is the maximum execution time for a single AWS Lambda function? (Answer: 15 minutes)
- Which service allows you to run containers without managing the underlying EC2 instances? (Answer: AWS Fargate)
- If you need to generate artificial CPU load to test your Auto Scaling policy, what command can you run? (Answer: `while true; do true; done`)
- Which EC2 feature allows you to change instance types (e.g., from `t3` to `m5`) to address a memory bottleneck? (Answer: Vertical Scaling / Resizing)
- How do you ensure that Auto Scaling instances are accessible via a single URL? (Answer: Use an Application Load Balancer)