AWS Compute Services: Strategic Selection & Use Cases
AWS compute services with appropriate use cases (for example, AWS Batch, Amazon EMR, AWS Fargate)
This study guide covers the essential compute services required for the AWS Certified Solutions Architect - Associate (SAA-C03) exam, focusing on selecting the right service based on performance, cost, and management overhead.
Learning Objectives
By the end of this guide, you should be able to:
- Differentiate between Infrastructure as a Service (EC2), Container Orchestration (ECS/EKS), and Serverless (Lambda/Fargate).
- Identify appropriate use cases for specialized compute services like AWS Batch and Amazon EMR.
- Select the most cost-effective purchasing option (Spot, Reserved, On-Demand) for specific workloads.
- Explain the architectural benefits of decoupling workloads using serverless and containerized patterns.
Key Terms & Glossary
- Serverless: A computing model in which the cloud provider manages the underlying servers entirely, and the user pays only for the resources consumed while code or tasks actually run (e.g., AWS Lambda, AWS Fargate).
- Container: A standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
- AMI (Amazon Machine Image): A template that contains a software configuration (operating system, application server, and applications) required to launch an EC2 instance.
- Orchestration: The automated arrangement, coordination, and management of computer systems, middleware, and software (e.g., ECS managing Docker containers).
- Spot Instance: Spare EC2 capacity offered at a deep discount (up to 90% off On-Demand pricing) that AWS can reclaim after a two-minute interruption notice.
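The two-minute Spot warning is exposed to the instance, among other channels, as a small JSON document in the instance metadata. The sketch below shows how a workload might interpret such a notice to decide how much time it has left to checkpoint. The function name `seconds_until_interruption` and the sample payload are illustrative assumptions, and fetching the document from the metadata service is deliberately left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_json: str, now: datetime) -> float:
    """Return the seconds left before the Spot action takes effect (never negative)."""
    notice = json.loads(notice_json)
    # The "time" field is a UTC timestamp such as 2030-01-01T00:02:00Z.
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return max(0.0, (deadline - now).total_seconds())


# Sample payload (assumed values) shaped like a Spot instance-action notice.
sample_notice = '{"action": "terminate", "time": "2030-01-01T00:02:00Z"}'
current_time = datetime(2030, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

print(seconds_until_interruption(sample_notice, current_time))  # 120.0
```

A stateless job can use the remaining time to flush partial results and exit cleanly, which is why Spot pairs well with restartable work.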
The "Big Idea"
[!IMPORTANT] The core of AWS architecture is the Compute Continuum. As you move from Amazon EC2 to AWS Lambda, you trade control (operating system access, networking tweaks) for agility (automatic scaling, no server management). A Solutions Architect's primary task is finding the "sweet spot" on this continuum that meets business requirements while minimizing cost and operational toil.
Formula / Concept Box
| Feature | Amazon EC2 | AWS Fargate | AWS Lambda |
|---|---|---|---|
| Management | Customer Managed (IaaS) | AWS Managed (Serverless) | AWS Managed (Function) |
| Scaling | Manual or Auto Scaling Groups | Service Auto Scaling (No Instances to Manage) | Automatic per Invocation |
| Pricing | Per Second/Hour (Instance) | Per vCPU and GB of Memory per Second (Task) | Per Request and Duration |
| Max Duration | Unlimited | Unlimited | 15 Minutes |
| Use Case | Legacy apps, deep tuning | Microservices, Docker | Event-driven, glue code |
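The Pricing row of the table can be read as three different billing formulas. The following sketch writes them out so the units are explicit. All function names are hypothetical, and every rate passed in is a placeholder, not a real AWS price; current rates should be taken from the AWS pricing pages.

```python
def ec2_cost(hours: float, hourly_rate: float) -> float:
    """EC2: billed for the time the instance runs, regardless of utilization."""
    return hours * hourly_rate


def fargate_cost(vcpus: float, memory_gb: float, seconds: float,
                 vcpu_second_rate: float, gb_second_rate: float) -> float:
    """Fargate: billed for the vCPU and memory reserved by the task while it runs."""
    return seconds * (vcpus * vcpu_second_rate + memory_gb * gb_second_rate)


def lambda_cost(requests: int, avg_duration_s: float, memory_gb: float,
                per_request_rate: float, gb_second_rate: float) -> float:
    """Lambda: a per-request charge plus a duration charge measured in GB-seconds."""
    gb_seconds = requests * avg_duration_s * memory_gb
    return requests * per_request_rate + gb_seconds * gb_second_rate


# Placeholder rates chosen only to make the arithmetic easy to follow.
print(ec2_cost(hours=10, hourly_rate=0.5))
print(fargate_cost(vcpus=1, memory_gb=2, seconds=100,
                   vcpu_second_rate=0.01, gb_second_rate=0.001))
print(lambda_cost(requests=1000, avg_duration_s=0.5, memory_gb=1.0,
                  per_request_rate=0.001, gb_second_rate=0.01))
```

The structural point is that EC2 cost depends only on uptime, while Lambda cost falls to zero when no requests arrive.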
Hierarchical Outline
- Virtual Servers (IaaS)
  - Amazon EC2: Full control over OS; suited for long-lived, complex applications.
  - Purchasing Options:
    - Spot: Best for stateless, fault-tolerant batch jobs.
    - Savings Plans/Reserved: Best for predictable, baseline workloads.
- Container Services
  - Amazon ECS/EKS: Orchestration for Docker (ECS) and Kubernetes (EKS).
  - AWS Fargate: The "Serverless" engine for containers; removes the need to manage EC2 clusters for Docker.
- Serverless Functions
  - AWS Lambda: Executes code in response to triggers (S3 uploads, API Gateway, DynamoDB changes).
- Specialized Big Data & Batch
  - Amazon EMR: Managed Hadoop/Spark; used for petabyte-scale data processing.
  - AWS Batch: Automates the execution of batch computing workloads across EC2 and Fargate.
Visual Anchors
Compute Selection Decision Tree
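One way to express a selection decision tree is as a short chain of questions built only from the comparison table and outline in this guide. The sketch below is an assumption about a reasonable question order, not an official AWS flowchart; the `Workload` fields and the `choose_compute` function are hypothetical names.

```python
from dataclasses import dataclass

LAMBDA_MAX_MINUTES = 15  # Lambda's maximum execution time, per the table above


@dataclass
class Workload:
    needs_os_access: bool = False       # root access or deep OS tuning required
    uses_spark_or_hadoop: bool = False  # big data frameworks
    is_batch_job: bool = False          # queued, run-to-completion jobs
    is_event_driven: bool = False       # triggered by uploads, API calls, etc.
    max_runtime_minutes: float = 0.0
    is_containerized: bool = False


def choose_compute(w: Workload) -> str:
    """Walk the questions from most control to least management overhead."""
    if w.needs_os_access:
        return "Amazon EC2"
    if w.uses_spark_or_hadoop:
        return "Amazon EMR"
    if w.is_batch_job:
        return "AWS Batch"
    if w.is_event_driven and w.max_runtime_minutes <= LAMBDA_MAX_MINUTES:
        return "AWS Lambda"
    if w.is_containerized:
        return "AWS Fargate"
    return "Amazon EC2"  # default when nothing more specific applies


print(choose_compute(Workload(is_event_driven=True, max_runtime_minutes=1)))
print(choose_compute(Workload(is_containerized=True, max_runtime_minutes=120)))
```

An event-driven task that exceeds 15 minutes falls through to the container branch, which mirrors the Max Duration row of the table.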
The Management vs. Control Trade-off
\begin{tikzpicture}[scale=0.8]
\draw[thick,->] (0,0) -- (10,0) node[anchor=north] {Management Responsibility (AWS)};
\draw[thick,->] (0,0) -- (0,6) node[anchor=east] {Control (Customer)};
\node[draw, fill=blue!10] at (1.5,5) {EC2};
\node[draw, fill=green!10] at (4.5,3.5) {ECS / EKS};
\node[draw, fill=orange!10] at (7.5,2) {Fargate};
\node[draw, fill=red!10] at (9,1) {Lambda};
\draw[dashed] (1.5,5) -- (9,1);
\node at (5,5) {\small \textit{Inverse Relationship}};
\end{tikzpicture}
Definition-Example Pairs
- AWS Batch
  - Definition: A regional service that simplifies running batch computing workloads by dynamically provisioning the right type of compute resource (EC2 or Fargate) based on the volume and requirements of the submitted jobs.
  - Real-World Example: A financial institution running nightly end-of-day market analyses that require massive parallel processing, but only for 2 hours every night.
- Amazon EMR (Elastic MapReduce)
  - Definition: A cloud-native big data platform that uses open-source tools like Apache Spark, Hive, and Presto to process and analyze vast amounts of data.
  - Real-World Example: A genomics research firm analyzing millions of DNA sequences to identify genetic markers.
- AWS Fargate
  - Definition: A serverless, pay-as-you-go compute engine for containers that works with both Amazon ECS and Amazon EKS.
  - Real-World Example: A web startup running a microservices-based API where they want to focus on Docker code without ever patching a Linux server.
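To make the Fargate definition concrete, the sketch below builds the kind of task definition an ECS service on Fargate would use. The field names follow the ECS task definition format, but the helper `fargate_task_definition`, the family name, and the image `example/api:latest` are hypothetical. The resulting dictionary could, as an assumption about usage, be passed as keyword arguments to `register_task_definition` on a boto3 ECS client; no AWS call is made here.

```python
def fargate_task_definition(family: str, image: str, cpu: str = "256",
                            memory: str = "512", container_port: int = 8080) -> dict:
    """Build a minimal ECS task definition that targets Fargate."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # Fargate tasks use awsvpc networking
        "cpu": cpu,               # task-level CPU units, given as a string
        "memory": memory,         # task-level memory in MiB, given as a string
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
            }
        ],
    }


# Hypothetical microservice; the image name is a placeholder.
task_def = fargate_task_definition("orders-api", "example/api:latest")
print(task_def["requiresCompatibilities"], task_def["cpu"], task_def["memory"])
```

Nothing in the definition names an instance type or AMI, which is the practical meaning of "without ever patching a Linux server".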
Worked Examples
Scenario 1: Cost-Optimizing a Fault-Tolerant Job
Requirement: A company needs to process 10,000 images every weekend. The processing is stateless and can be restarted if interrupted. They want the lowest cost possible.
- Solution: Use AWS Batch configured with EC2 Spot Instances.
- Reasoning: Spot Instances offer the lowest price (up to 90% off On-Demand). Because the processing is stateless and can be restarted, an interruption costs only a rerun of the affected images, which is an acceptable trade for the savings.
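A minimal sketch of how Scenario 1 might be configured follows. It only builds the request dictionaries that one could pass to `create_compute_environment` and `submit_job` on a boto3 Batch client; it does not call AWS. The helper names, the queue and job definition names, and the subnet, security group, and role values are all placeholders, and splitting the work into 100 images per child job is an assumption made for illustration.

```python
import math


def spot_compute_environment(name, subnets, security_group_ids, instance_role,
                             max_vcpus=256):
    """Managed Batch compute environment backed by EC2 Spot Instances."""
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "computeResources": {
            "type": "SPOT",
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "minvCpus": 0,  # scale to zero between weekends
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],
            "subnets": list(subnets),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role,
        },
    }


def array_job_request(job_name, job_queue, job_definition,
                      total_items, items_per_child):
    """Split the work into an array job; each child handles one slice."""
    size = math.ceil(total_items / items_per_child)
    if not 2 <= size <= 10_000:  # array jobs allow 2 to 10,000 children
        raise ValueError(f"array size {size} is outside the allowed range")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": size},
    }


env = spot_compute_environment(
    "weekend-images-spot", ["subnet-EXAMPLE"], ["sg-EXAMPLE"], "EXAMPLE-instance-role")
job = array_job_request("resize-images", "weekend-queue", "resize-image-job",
                        total_items=10_000, items_per_child=100)
print(env["computeResources"]["type"], job["arrayProperties"]["size"])  # SPOT 100
```

Each child job can read its position from the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable to pick its slice, so an interrupted child can be retried without affecting the others.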
Scenario 2: Event-Driven Architecture
Requirement: A user uploads a video to S3. The system must immediately trigger a process that creates a thumbnail and records the result in a database.
- Solution: AWS Lambda.
- Reasoning: Lambda is well suited as the "glue" for event-driven tasks. It scales automatically with the number of uploads and requires no infrastructure management for a simple task that completes in seconds, well within the 15-minute limit.
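The event-driven flow in Scenario 2 can be sketched as a Lambda handler that reads the bucket and object key from the S3 notification event. The thumbnail generation and database write are left as comments because the source does not specify them; `thumbnail_key`, the `thumbnails/` prefix, and the sample bucket name are hypothetical.

```python
import posixpath
from urllib.parse import unquote_plus


def thumbnail_key(source_key: str) -> str:
    """Derive a destination key for the thumbnail (naming scheme is assumed)."""
    stem = posixpath.splitext(posixpath.basename(source_key))[0]
    return f"thumbnails/{stem}.jpg"


def handler(event, context):
    """Entry point invoked by Lambda for each S3 upload notification."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        # Placeholder: generate the thumbnail and write a database record here.
        results.append({"bucket": bucket, "source": key,
                        "thumbnail": thumbnail_key(key)})
    return results


# Trimmed-down sample event with assumed values, for local testing only.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads-example"},
                "object": {"key": "videos/my+clip.mp4"}}}
    ]
}
print(handler(sample_event, None))
```

Because the handler keeps no state between invocations, Lambda can run many copies in parallel when several uploads arrive at once.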
Checkpoint Questions
- What is the maximum execution time for an AWS Lambda function?
- Which service is specifically designed for big data frameworks like Apache Spark and Hive?
- If you require full root access to the underlying operating system, which compute service should you choose?
- What is the primary difference between ECS on EC2 and ECS on Fargate?
- Which EC2 purchasing option is best suited for a steady-state database workload that will run for at least one year?
Answers
- 15 Minutes.
- Amazon EMR.
- Amazon EC2.
- With ECS on EC2, you manage the cluster of servers; with Fargate, AWS manages the underlying infrastructure.
- Reserved Instances or Savings Plans.