AWS Compute Selection and Cost Optimization
Determining cost-effective AWS compute services with appropriate use cases (for example, AWS Lambda, Amazon EC2, AWS Fargate)
This guide covers the critical task of determining the most cost-effective AWS compute services based on specific workload requirements, focusing on Amazon EC2, AWS Fargate, and AWS Lambda.
Learning Objectives
- Evaluate the trade-offs between IaaS (EC2), CaaS (Fargate), and FaaS (Lambda).
- Identify the most cost-effective purchasing options (Spot, Reserved, Savings Plans) for different workload types.
- Select the appropriate EC2 instance family and size based on resource needs (Compute, Memory, Storage).
- Determine the optimal scaling strategy to align compute capacity with fluctuating demand.
Key Terms & Glossary
- IaaS (Infrastructure as a Service): A cloud model providing virtualized computing resources over the internet (e.g., Amazon EC2).
- Serverless: A cloud computing execution model where the provider provisions, patches, and scales the underlying servers and dynamically allocates machine resources on demand (e.g., AWS Lambda, AWS Fargate).
- Provisioned Capacity: Resources that are pre-allocated and paid for regardless of whether they are fully utilized.
- Compute Optimizer: An AWS service that uses machine learning to analyze historical utilization metrics and recommend optimal AWS resources.
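- Cold Start: The delay that occurs when a serverless function (Lambda) is invoked and no warm execution environment is available, for example after an idle period or when scaling out to handle more concurrent requests, so a new environment must be initialized first.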
The "Big Idea"
[!IMPORTANT] The goal of cost-effective compute design is to find the highest level of abstraction that meets your technical requirements while minimizing undifferentiated heavy lifting (managing servers) and idle capacity (paying for what you don't use).
Formula / Concept Box
| Service | Pricing Dimension | Best For... |
|---|---|---|
| Amazon EC2 | Per second/hour (Instance Type + OS) | Long-running, predictable, or complex legacy apps. |
| AWS Fargate | Per vCPU-hour and GB-hour of memory requested, billed per second | Containerized microservices without server management. |
| AWS Lambda | Per request + duration (memory-weighted) | Intermittent, event-driven, or short-lived tasks. |
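To make the "per request + duration (memory-weighted)" dimension concrete, here is a minimal sketch of a monthly Lambda estimate. The function name and parameters are illustrative, the two rates are placeholder values passed in by the caller (check current regional pricing), and the free tier is ignored.

```python
import math


def lambda_monthly_cost(
    invocations: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_million_requests: float,
    price_per_gb_second: float,
) -> float:
    """Estimate a monthly Lambda bill from its two pricing dimensions."""
    # Duration is billed in 1 ms increments, so round up to a whole millisecond.
    billed_ms = math.ceil(avg_duration_ms)
    # "Memory-weighted" duration: seconds multiplied by configured memory in GB.
    gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    duration_cost = gb_seconds * price_per_gb_second
    return request_cost + duration_cost


# Placeholder rates for illustration only (assumed, not authoritative).
cost = lambda_monthly_cost(
    invocations=2_000_000,
    avg_duration_ms=200,
    memory_mb=512,
    price_per_million_requests=0.20,
    price_per_gb_second=0.0000166667,
)
print(f"${cost:.2f}")  # prints "$3.73"
```

Doubling the configured memory doubles the duration charge for the same run time, which is why right-sizing memory matters as much as reducing invocation count.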
Hierarchical Outline
- Amazon EC2 (Infrastructure Control)
  - Instance Families: General Purpose (M), Compute Optimized (C), Memory Optimized (R), Storage Optimized (I/D), Accelerated (P/G).
  - Purchasing Options:
    - On-Demand: No commitment, highest price.
    - Reserved Instances (RI): 1- or 3-year commitment for steady-state workloads (up to 72% discount).
    - Savings Plans: 1- or 3-year hourly spend commitment; Compute Savings Plans apply across EC2, Fargate, and Lambda.
    - Spot Instances: Spare capacity for fault-tolerant jobs (up to 90% discount); AWS can reclaim it at short notice.
- AWS Fargate (Container Abstraction)
  - Serverless Containers: Removes the need to manage EC2 clusters for ECS or EKS.
  - Cost Efficiency: Pay only for the resources requested by the container, not the host VM.
- AWS Lambda (Function-as-a-Service)
  - Event-Driven: Triggers from S3, DynamoDB, API Gateway, etc.
  - Execution Limits: Max 15-minute runtime; 128 MB to 10 GB (10,240 MB) memory.
  - Cost Model: Billed in 1 ms increments; no compute charge when not running.
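A rough way to reason about the purchasing options above is a break-even check: a commitment is paid for every hour of the term, while On-Demand is paid only for hours actually used. The sketch below assumes a single instance that can be stopped when idle and normalises the On-Demand rate to 1.0; the function name is illustrative.

```python
def cheaper_option(utilization: float, commitment_discount: float) -> str:
    """Compare On-Demand with a full-time commitment for one instance.

    utilization: fraction of hours the instance actually needs to run (0 to 1).
    commitment_discount: discount versus On-Demand (0.40 means 40% off).
    """
    if not 0 <= utilization <= 1 or not 0 <= commitment_discount < 1:
        raise ValueError("utilization must be in [0, 1] and discount in [0, 1)")
    on_demand_cost = utilization * 1.0          # pay only for the hours used
    committed_cost = 1.0 - commitment_discount  # pay for every hour, at a discount
    return "commitment" if committed_cost < on_demand_cost else "on-demand"


print(cheaper_option(utilization=0.90, commitment_discount=0.40))  # prints "commitment"
print(cheaper_option(utilization=0.25, commitment_discount=0.40))  # prints "on-demand"
```

Under these assumptions the commitment wins only when utilization exceeds `1 - commitment_discount`, which is why commitments suit steady-state workloads and not occasional ones.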
Visual Anchors
Compute Selection Logic
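One way to sketch the selection logic in code, using only the limits and trade-offs stated in this guide. The `Workload` fields and the ordering of the checks are simplifying assumptions, not an official AWS decision tree.

```python
from dataclasses import dataclass

LAMBDA_MAX_RUNTIME_MIN = 15    # Lambda execution limit from this guide
LAMBDA_MAX_MEMORY_MB = 10_240  # 10 GB


@dataclass
class Workload:
    max_runtime_min: float
    memory_mb: int
    needs_kernel_access: bool = False
    containerized: bool = False
    event_driven: bool = False


def suggest_compute(w: Workload) -> str:
    """Return a first-pass service suggestion for a workload."""
    # OS- or kernel-level control is only available on EC2.
    if w.needs_kernel_access:
        return "EC2"
    fits_lambda = (
        w.max_runtime_min <= LAMBDA_MAX_RUNTIME_MIN
        and w.memory_mb <= LAMBDA_MAX_MEMORY_MB
    )
    # Short, event-driven work maps to the highest abstraction level.
    if w.event_driven and fits_lambda:
        return "Lambda"
    # Containers without host management map to Fargate.
    if w.containerized:
        return "Fargate"
    # Everything else falls back to EC2 in this simplified sketch.
    return "EC2"


print(suggest_compute(Workload(max_runtime_min=1, memory_mb=256, event_driven=True)))
```

A real decision would also weigh traffic shape, cold-start tolerance, and cost at expected utilization.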
Responsibility Layers
\begin{tikzpicture}
  \draw[thick, fill=blue!10] (0,0) rectangle (6,1) node[midway] {Hardware / Global Infrastructure (AWS)};
  \draw[thick, fill=green!10] (0,1.2) rectangle (6,2.2) node[midway] {Hypervisor / Virtualization (AWS)};
  \draw[thick, fill=orange!10] (0,2.4) rectangle (6,3.4) node[midway] {Guest OS / Runtime (User in EC2 / AWS in Lambda)};
  \draw[thick, fill=red!10] (0,3.6) rectangle (6,4.6) node[midway] {Application Code (User)};
  \node at (3,5.2) {\textbf{The Shared Responsibility Layers}};
\end{tikzpicture}
Definition-Example Pairs
- Spot Instances: Purchasing unused EC2 capacity at steep discounts, with the risk that AWS reclaims it at short notice.
  - Example: A data scientist running a 12-hour batch processing job that checkpoints its progress, so it can resume on a new instance if AWS needs the capacity back.
- Compute Optimized Instances (C-Family): Instances designed for compute-bound applications that benefit from high-performance processors.
  - Example: High-performance web servers, scientific modeling, or dedicated gaming servers.
- Event-Driven Computing: A model where code is executed in response to specific system state changes.
  - Example: A Lambda function that automatically generates a thumbnail whenever a user uploads a high-resolution photo to an S3 bucket.
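The thumbnail example can be sketched as a Lambda handler that reads the bucket and object key from an S3 event notification. `make_thumbnail` is a hypothetical placeholder (a real version would download, resize, and upload the image), and the `thumbnails/` prefix is an assumption for illustration.

```python
from urllib.parse import unquote_plus


def make_thumbnail(bucket: str, key: str) -> str:
    """Hypothetical placeholder: only computes where the thumbnail would go."""
    return f"thumbnails/{key}"


def handler(event, context):
    """Entry point that Lambda invokes with an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append(make_thumbnail(bucket, key))
    return {"thumbnails": results}


# Minimal hand-written event for local experimentation.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "photo-uploads"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
print(handler(sample_event, None))
```

If thumbnails are written back to the same bucket, the trigger should be scoped to the upload prefix so the function does not invoke itself on its own output.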
Worked Examples
Scenario 1: The Seasonal E-Commerce Site
Problem: A retailer has a steady baseline of 100 users but spikes to 10,000 during Black Friday.
Solution: Cover the steady-state 100-user baseline with EC2 Reserved Instances or Savings Plans. Handle the seasonal spike of roughly 9,900 extra users with EC2 Auto Scaling, adding On-Demand instances or, if the app is stateless and tolerates interruption, Spot Instances, so the extra capacity is paid for only while it is needed.
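The saving in Scenario 1 can be estimated with a small calculation that compares running peak capacity all month against a committed baseline plus a short Spot burst. Every number below (instance counts, hours, hourly rate, discounts) is a hypothetical input chosen for illustration, not AWS pricing.

```python
def seasonal_cost(
    baseline_instances: int,
    peak_instances: int,
    peak_hours: float,
    total_hours: float,
    on_demand_rate: float,
    commitment_discount: float,
    spot_discount: float,
) -> dict:
    """Compare provisioning for peak all period with baseline plus burst."""
    # Option A: keep peak capacity running On-Demand for the whole period.
    static_peak = peak_instances * total_hours * on_demand_rate
    # Option B: the committed baseline runs every hour at a discounted rate,
    baseline = (
        baseline_instances * total_hours * on_demand_rate * (1 - commitment_discount)
    )
    # and Auto Scaling adds Spot capacity only during the spike.
    burst_instances = peak_instances - baseline_instances
    burst = burst_instances * peak_hours * on_demand_rate * (1 - spot_discount)
    return {"static_peak": static_peak, "baseline_plus_burst": baseline + burst}


# Hypothetical month: 2 baseline instances, 20 at peak, a 96-hour spike.
estimate = seasonal_cost(
    baseline_instances=2,
    peak_instances=20,
    peak_hours=96,
    total_hours=720,
    on_demand_rate=0.10,
    commitment_discount=0.40,
    spot_discount=0.70,
)
print({name: round(value, 2) for name, value in estimate.items()})
```

Most of the saving in this sketch comes from not paying for peak capacity during the hours outside the spike; the discounts are a secondary effect.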
Scenario 2: High-Frequency Microservices
Problem: A startup is building a microservices architecture. They want to avoid the operational overhead of patching Linux kernels but need the environment to scale quickly.
Solution: Use AWS Fargate. It lets them define vCPU and memory at the task (container) level. They pay only for the vCPU and memory requested, for the time each task runs, which avoids the "bin-packing" problem of fitting containers onto fixed-size EC2 instances.
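The "bin-packing" problem in Scenario 2 can be illustrated by counting how many fixed-size hosts a set of identical tasks needs and how much of the paid-for vCPU is actually used. This is a simplified sketch: it ignores host overhead for the OS and container agent, and the task and host sizes are hypothetical.

```python
import math


def ec2_packing(
    tasks: int,
    task_vcpu: float,
    task_gb: float,
    host_vcpu: float,
    host_gb: float,
) -> dict:
    """Pack identical tasks onto identical hosts and report vCPU utilization."""
    # A host is full as soon as either vCPU or memory runs out.
    per_host = min(int(host_vcpu // task_vcpu), int(host_gb // task_gb))
    if per_host < 1:
        raise ValueError("a single task does not fit on the chosen host size")
    hosts = math.ceil(tasks / per_host)
    used_vcpu = tasks * task_vcpu
    paid_vcpu = hosts * host_vcpu
    return {"hosts": hosts, "vcpu_utilization": used_vcpu / paid_vcpu}


# Five tasks of 0.5 vCPU / 1 GB on hosts with 2 vCPU / 8 GB.
packing = ec2_packing(tasks=5, task_vcpu=0.5, task_gb=1, host_vcpu=2, host_gb=8)
print(packing)  # prints "{'hosts': 2, 'vcpu_utilization': 0.625}"
```

In this example the fifth task forces a second host, so 37.5% of the paid vCPU sits idle; with Fargate the same five tasks would be billed for the 2.5 vCPU and 5 GB they request.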
Checkpoint Questions
- Which compute service should you choose if your code runs for 20 minutes and requires custom kernel modules? (Answer: EC2, as Lambda has a 15-minute limit and Fargate doesn't allow kernel modification).
- What is the most cost-effective purchasing option for a stateless, fault-tolerant background processing application? (Answer: Spot Instances).
- True or False: AWS Fargate requires you to manage the underlying EC2 instances for your ECS cluster. (Answer: False; Fargate is the serverless way to run containers).
- Which AWS tool should you use to receive automated recommendations for resizing over-provisioned EC2 instances? (Answer: AWS Compute Optimizer).