AWS Compute Selection: Optimizing for Performance and Cost

This study guide covers the critical decision-making process for selecting AWS compute resources, focusing on matching business requirements to EC2 instance types, purchasing models, and modern compute paradigms like containers and serverless.

Learning Objectives

By the end of this guide, you should be able to:

Differentiate between the five main EC2 instance families (General Purpose, Compute Optimized, Memory Optimized, Accelerated Computing, and Storage Optimized).
Select the most cost-effective purchasing option (On-Demand, Spot, Reserved, or Savings Plans) based on workload stability.
Evaluate when to use Amazon EC2, AWS Lambda, or AWS Fargate for specific architectural patterns.
Apply scaling strategies including horizontal/vertical scaling and EC2 hibernation.

Key Terms & Glossary

AMI (Amazon Machine Image): A template that contains the software configuration (operating system, application server, and applications) required to launch your instance.
vCPU: A virtual CPU. On most x86 instance types a vCPU maps to one hardware thread (hyper-thread) of a physical core; on AWS Graviton instances each vCPU is a full physical core.
Horizontal Scaling: Adding more instances to a fleet to handle increased load (scaling out).
Vertical Scaling: Increasing the hardware specifications (CPU/RAM) of a single existing instance (scaling up).
IOPS (Input/Output Operations Per Second): A storage performance metric that counts how many read/write operations a volume completes per second. SSD-backed volumes deliver far higher IOPS than HDD-backed volumes.

The "Big Idea"

[!IMPORTANT] The core of AWS compute selection is Right-Sizing. This is the continuous process of matching instance types and sizes to your workload performance and capacity requirements at the lowest possible cost. Think of it as the "Goldilocks" principle: you want a resource that is not too powerful (waste of money) and not too weak (performance bottleneck), but just right.
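
The Goldilocks check can be sketched as a small classifier. This is a minimal illustration only: the function name and the 20%/80% thresholds are hypothetical and are not AWS guidance.

```python
def right_size_verdict(avg_cpu_pct: float, peak_cpu_pct: float,
                       low: float = 20.0, high: float = 80.0) -> str:
    """Classify an instance as undersized, oversized, or right-sized.

    `low` and `high` are hypothetical thresholds chosen for illustration.
    """
    if peak_cpu_pct >= high:
        # Peaks near saturation suggest a performance bottleneck.
        return "undersized"
    if avg_cpu_pct < low and peak_cpu_pct < high / 2:
        # Low average and low peak suggest paying for idle capacity.
        return "oversized"
    return "right-sized"


if __name__ == "__main__":
    print(right_size_verdict(avg_cpu_pct=5, peak_cpu_pct=15))
```

A real right-sizing decision would also look at memory, network, and disk metrics over a longer window.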

Formula / Concept Box

Purchase Option	Best Use Case	Cost Benefit
On-Demand	Spiky, short-term, or unpredictable workloads	No upfront commitment; billed per second (60-second minimum) on most Linux and Windows instances.
Spot Instances	Fault-tolerant, stateless, or batch jobs	Up to 90% discount compared to On-Demand.
Reserved (RI)	Steady-state, predictable usage (1- or 3-year term)	Up to 72% discount; available as Standard or Convertible.
Savings Plans	Consistent compute usage across EC2, Lambda, and Fargate	Flexible discount (up to 66% with Compute Savings Plans) for a committed $/hour spend over a 1- or 3-year term.
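
The table can be read as a short decision rule. The sketch below is one possible encoding; the function name and the three boolean inputs are assumptions made for illustration, and real decisions usually mix several options.

```python
def choose_purchase_option(interruptible: bool, steady_state: bool,
                           spans_multiple_services: bool = False) -> str:
    """Map workload traits to a purchase option from the table above.

    All parameters are hypothetical simplifications of a real workload.
    """
    if interruptible:
        # Fault-tolerant or batch work can use spare capacity.
        return "Spot Instances"
    if steady_state:
        # A commitment only pays off when usage is predictable.
        if spans_multiple_services:
            return "Savings Plans"
        return "Reserved Instances"
    # Spiky or unpredictable work: no commitment.
    return "On-Demand"


if __name__ == "__main__":
    print(choose_purchase_option(interruptible=False, steady_state=True))
```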

Hierarchical Outline

I. Amazon EC2 Instance Families
- General Purpose (M, T): Balanced CPU, memory, and networking.
- Compute Optimized (C): High-performance processors; ideal for batch processing and web servers.
- Memory Optimized (R, X, z): High RAM capacity; used for in-memory databases (SAP HANA).
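- Accelerated Computing (P, G, Inf, F): Hardware accelerators (GPUs, AWS Inferentia chips, FPGAs) for ML and graphics.
- Storage Optimized (I, D, H): High-performance local storage. I instances target high random IOPS (ideal for NoSQL), while D and H instances target dense HDD storage with high sequential throughput.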
II. Scaling and Elasticity
- Auto Scaling Groups (ASG): Automates the addition/removal of instances based on metrics.
- Hibernation: Saves the instance's RAM contents to its encrypted EBS root volume when it stops, so it resumes with processes intact and starts faster than a cold boot.
III. Serverless and Containers
- AWS Lambda: Event-driven, zero-administration compute (runs up to 15 mins).
- AWS Fargate: Serverless engine for containers (ECS/EKS) where you don't manage the underlying EC2.
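
As a rough illustration of section III, the choice between EC2, Lambda, and Fargate can be sketched as follows. The function and its parameters are hypothetical; only the 15-minute Lambda limit comes from the outline above.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum run time per invocation


def choose_compute_service(event_driven: bool, max_runtime_seconds: int,
                           containerized: bool, needs_os_access: bool) -> str:
    """Pick a compute service from simplified, hypothetical workload traits."""
    if needs_os_access:
        # Full control of the OS and instance type implies EC2.
        return "Amazon EC2"
    if event_driven and max_runtime_seconds <= LAMBDA_MAX_SECONDS:
        return "AWS Lambda"
    if containerized:
        # Containers without managing the underlying EC2 hosts.
        return "AWS Fargate"
    return "Amazon EC2"


if __name__ == "__main__":
    print(choose_compute_service(True, 30, False, False))
```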

Visual Anchors

Instance Selection Decision Tree

[Diagram: instance selection decision tree; not rendered in this copy]
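
The original diagram is not available here, so the following is only a hypothetical decision tree built from the family descriptions in the outline. The order of the checks and the parameter names are assumptions.

```python
def choose_instance_family(needs_accelerator: bool = False,
                           memory_bound: bool = False,
                           cpu_bound: bool = False,
                           local_storage_bound: bool = False) -> str:
    """Walk a simplified decision tree over the five EC2 families."""
    if needs_accelerator:
        return "Accelerated Computing"
    if local_storage_bound:
        return "Storage Optimized"
    if memory_bound:
        return "Memory Optimized"
    if cpu_bound:
        return "Compute Optimized"
    # No dominant bottleneck: balanced CPU, memory, and networking.
    return "General Purpose"


if __name__ == "__main__":
    print(choose_instance_family(memory_bound=True))
```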

Horizontal vs. Vertical Scaling

[Diagram: horizontal vs. vertical scaling; not rendered in this copy]
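
The two strategies can be contrasted in a few lines. This sketch uses a simple proportional rule for scaling out, in the spirit of target tracking but not the exact Auto Scaling algorithm, and a hypothetical size ladder for scaling up.

```python
import math

# Hypothetical size ladder for one instance family, smallest to largest.
SIZES = ["large", "xlarge", "2xlarge", "4xlarge"]


def horizontal_scale_out(current_instances: int, metric_value: float,
                         target_value: float) -> int:
    """Scale out/in: adjust the fleet so the metric returns to its target."""
    desired = math.ceil(current_instances * metric_value / target_value)
    return max(1, desired)


def vertical_scale_up(size: str) -> str:
    """Scale up: move one step up the size ladder, capped at the largest."""
    index = SIZES.index(size)
    return SIZES[min(index + 1, len(SIZES) - 1)]


if __name__ == "__main__":
    print(horizontal_scale_out(4, metric_value=90, target_value=60))
    print(vertical_scale_up("large"))
```

Horizontal scaling changes the instance count and can continue as long as the workload distributes across instances; vertical scaling changes the instance size and stops at the largest size available.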

Definition-Example Pairs

Spot Instance: A purchasing option that allows you to use spare AWS capacity at a steep discount.
- Example: Running a 12-hour data transcoding job that can be paused and resumed if AWS needs the capacity back.
AWS Compute Optimizer: A service that uses machine learning to analyze historical utilization metrics and recommend right-sized resources.
- Example: A report suggesting you move from an m5.large to a t3.medium because your average CPU utilization is only 5%.
Placement Groups (Cluster): Logical grouping of instances within a single Availability Zone for low-latency networking.
- Example: A High-Performance Computing (HPC) workload where nodes need to communicate at 100 Gbps.
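
The Spot example depends on the job being able to resume where it stopped. A minimal simulation of that checkpoint pattern is sketched below; the function, the in-memory `checkpoint` dict, and the `interrupted_at` parameter are hypothetical stand-ins for durable storage and a real interruption signal.

```python
from typing import Optional


def run_with_checkpoints(total_chunks: int, checkpoint: dict,
                         interrupted_at: Optional[int] = None) -> bool:
    """Process chunks from the last checkpoint; return True when finished.

    `interrupted_at` simulates AWS reclaiming the Spot capacity before the
    chunk with that index is processed.
    """
    done = checkpoint.get("next_chunk", 0)
    while done < total_chunks:
        if interrupted_at is not None and done == interrupted_at:
            return False  # capacity reclaimed; progress is already saved
        # ... one unit of transcoding work would happen here ...
        done += 1
        checkpoint["next_chunk"] = done  # persist progress after each chunk
    return True


if __name__ == "__main__":
    state = {}
    print(run_with_checkpoints(10, state, interrupted_at=4), state)
    print(run_with_checkpoints(10, state), state)
```

In practice the checkpoint would live outside the instance (for example in object storage) so a replacement instance can pick it up.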

Worked Examples

Scenario 1: The Predictable Web App

Problem: A company runs a legacy web application that requires exactly 4 instances 24/7 to handle baseline traffic, but spikes to 10 instances during the first week of every month.

Solution:

Purchase Reserved Instances or a Savings Plan for the 4 baseline instances to maximize cost savings.
Use On-Demand Instances with an Auto Scaling Group for the additional 6 instances during the monthly spike.
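
The saving can be estimated with simple arithmetic. The hourly rate and discount passed in below are hypothetical placeholders, not published AWS prices, and 730 hours is used as an average month.

```python
HOURS_PER_MONTH = 730   # average hours in a month
SPIKE_HOURS = 7 * 24    # the first week of the month


def blended_monthly_cost(on_demand_rate: float, commitment_discount: float,
                         baseline: int = 4, peak: int = 10) -> float:
    """Baseline on a commitment (RI or Savings Plan), spike on On-Demand."""
    committed = baseline * HOURS_PER_MONTH * on_demand_rate * (1 - commitment_discount)
    burst = (peak - baseline) * SPIKE_HOURS * on_demand_rate
    return committed + burst


def all_on_demand_monthly_cost(on_demand_rate: float,
                               baseline: int = 4, peak: int = 10) -> float:
    """Same usage pattern with no commitment at all."""
    return blended_monthly_cost(on_demand_rate, 0.0, baseline, peak)


if __name__ == "__main__":
    rate, discount = 0.10, 0.40  # hypothetical $/hour and discount
    print(round(blended_monthly_cost(rate, discount), 2))
    print(round(all_on_demand_monthly_cost(rate), 2))
```

Only the baseline term shrinks with the discount, which is why the commitment is sized to the 4 always-on instances rather than to the peak of 10.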

Scenario 2: High-Performance In-Memory Cache

Problem: A developer needs to deploy a Redis cluster that requires 256 GB of RAM to store session data.

Solution: Select a Memory Optimized instance from the R family. A single r6g.8xlarge provides 256 GiB (an r6g.4xlarge has only 128 GiB), so choose the next size up or spread the data across several smaller nodes to leave headroom for the OS and Redis overhead. Because the cache is critical but can be reconstructed, use On-Demand unless the application can tolerate the 2-minute interruption notice of a Spot Instance.
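
Picking a size for a memory requirement can be sketched as a lookup over the R6g sizes. The memory figures below are the published R6g values as best known at the time of writing and should be verified against the current AWS instance type documentation; the function name and the `headroom` parameter are hypothetical.

```python
from typing import Optional

# R6g memory per size in GiB (verify against current AWS documentation).
R6G_MEMORY_GIB = {
    "r6g.large": 16,
    "r6g.xlarge": 32,
    "r6g.2xlarge": 64,
    "r6g.4xlarge": 128,
    "r6g.8xlarge": 256,
    "r6g.12xlarge": 384,
    "r6g.16xlarge": 512,
}


def smallest_fit(required_gib: float, headroom: float = 0.0) -> Optional[str]:
    """Return the smallest R6g size with enough memory, or None if none fits.

    `headroom` is an extra fraction reserved for the OS and Redis overhead.
    """
    needed = required_gib * (1 + headroom)
    for name, memory in sorted(R6G_MEMORY_GIB.items(), key=lambda kv: kv[1]):
        if memory >= needed:
            return name
    return None


if __name__ == "__main__":
    print(smallest_fit(256))
    print(smallest_fit(256, headroom=0.25))
```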

Checkpoint Questions

Which EC2 family is most appropriate for a high-traffic frontend web server with balanced resource needs?
- (Answer: M-family / General Purpose)
You have a stateless batch processing job that can be interrupted. Which pricing model offers the lowest cost?
- (Answer: Spot Instances)
True or False: An AMI created in the us-east-1 region can be used directly to launch an instance in eu-west-1.
- (Answer: False; AMIs are region-specific and must be copied to the target region first.)
What feature allows an EC2 instance to save its memory state to disk and stop, allowing for faster resumption later?
- (Answer: Hibernation)