Optimizing Infrastructure: Selection & Rightsizing for Cost-Efficiency
Identifying opportunities to select and rightsize infrastructure for cost-effective resources
This guide covers the critical skills required for the AWS Certified Solutions Architect - Professional (SAP-C02) exam regarding cost optimization through infrastructure selection and rightsizing.
Learning Objectives
After studying this guide, you should be able to:
- Analyze resource utilization data to identify over-provisioned and under-provisioned resources.
- Select appropriate instance families and generations based on workload characteristics.
- Evaluate the cost-benefit of migrating from x86 to ARM-based (AWS Graviton) architectures.
- Implement storage tiering strategies to align costs with data access patterns.
- Utilize AWS tools like Compute Optimizer and Trusted Advisor to automate rightsizing recommendations.
Key Terms & Glossary
- Rightsizing: The process of matching instance types and sizes to your workload performance and capacity requirements at the lowest possible cost.
- AWS Graviton: Custom-built ARM-based processors that offer up to 40% better price performance than comparable x86-based instances.
- Orphaned Resources: Provisioned resources (e.g., unattached EBS volumes, idle Elastic IPs) that are no longer in use but still incur costs.
- Modernization: The shift from IaaS (EC2) to higher-level managed services (Fargate, Lambda) to reduce operational overhead and optimize costs.
- Compute Optimizer: A service that uses machine learning to analyze historical utilization metrics and recommend optimal AWS resources.
The "Big Idea"
In a traditional data center, "over-provisioning" is a safety net. In the cloud, it is a financial leak. Cost optimization is not a one-time event but a continuous lifecycle of monitoring, analyzing, and adjusting. The goal is to reach a state of "Elasticity Equilibrium," where the provisioned capacity perfectly mirrors the actual demand curve, minimizing waste without sacrificing performance.
Formula / Concept Box
| Concept | Metric / Rule of Thumb |
|---|---|
| Utilization Target | Aim for 60%–80% CPU/Memory utilization for steady-state workloads. |
| The "N+1" Rule | For cost-efficiency, use the latest generation (e.g., move C5 to C6) to get better performance for a lower hourly rate. |
| Storage Savings | Moving from S3 Standard to S3 Intelligent-Tiering can save ~20-40% if access patterns are unknown. |
| Graviton Savings | Up to ~40% better price performance vs. comparable x86-based instances. |
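The utilization target in the concept box can be turned into a simple screening check. This is an illustrative sketch (the 60%/80% thresholds and the function name are assumptions drawn from the rule of thumb above, not an AWS API):

```python
# Hypothetical helper: classify an instance against the 60-80% steady-state
# utilization target from the concept box. Thresholds are illustrative.
def classify_utilization(cpu_pct: float, mem_pct: float,
                         low: float = 60.0, high: float = 80.0) -> str:
    """Return a rightsizing signal based on peak CPU/memory utilization."""
    peak = max(cpu_pct, mem_pct)  # size for the most constrained dimension
    if peak < low:
        return "over-provisioned"   # candidate for downsizing
    if peak > high:
        return "under-provisioned"  # candidate for upsizing
    return "rightsized"

print(classify_utilization(10.0, 15.0))  # over-provisioned
```

In practice the CPU and memory percentages would come from CloudWatch metrics (memory requires the CloudWatch agent); the logic here only shows the decision rule.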
Hierarchical Outline
- I. Rightsizing Compute Resources
- Instance Family Selection: Matching workload (e.g., Memory-Intensive) to the correct family (e.g., R-family).
- Generation Upgrades: Moving from older (M4/M5) to newer (M6/M7) instances.
- Architecture Shifting: Migrating to AWS Graviton (ARM) for improved price-performance.
- II. Storage Optimization
- Tiering Strategy: Moving from EBS gp2 to gp3 (better performance/cost) or S3 Standard to Glacier.
- Volume Rightsizing: Reducing provisioned IOPS or size to match actual usage.
- III. Operational & Application Improvements
- Managed Services: Refactoring to AWS Lambda or Fargate to eliminate idle server costs.
- Decommissioning: Using Tagging and CloudWatch to identify and terminate orphaned resources.
- IV. Monitoring and Automation Tools
- AWS Compute Optimizer: ML-based rightsizing recommendations.
- AWS Trusted Advisor: Identification of idle RDS instances and unassociated Elastic IPs.
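The decommissioning step in section III can be sketched as a filter over volume records. This assumes records shaped like the EC2 `DescribeVolumes` response; in a real script the list would come from `boto3.client("ec2").describe_volumes()["Volumes"]`:

```python
# Sketch: flag orphaned (unattached) EBS volumes from DescribeVolumes-shaped
# records. An unattached volume has State "available" and no Attachments,
# and keeps billing until it is deleted.
def find_orphaned_volumes(volumes):
    """Return IDs of volumes that are provisioned but not attached."""
    return [v["VolumeId"] for v in volumes
            if v.get("State") == "available" and not v.get("Attachments")]

sample = [
    {"VolumeId": "vol-aaa", "State": "in-use",
     "Attachments": [{"InstanceId": "i-123"}]},
    {"VolumeId": "vol-bbb", "State": "available", "Attachments": []},
]
print(find_orphaned_volumes(sample))  # ['vol-bbb']
```

A production version would also check tags (e.g., an `owner` or `keep` tag) before terminating anything, since "unattached right now" is not proof a volume is safe to delete.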
Visual Anchors
The Rightsizing Lifecycle
Resource Provisioning vs. Demand
```latex
\begin{tikzpicture}[node distance=2cm]
  % Axes
  \draw[->] (0,0) -- (6,0) node[right] {Time};
  \draw[->] (0,0) -- (0,4) node[above] {Capacity};
  % Demand curve
  \draw[blue, thick] (0,1) .. controls (1,3) and (2,0.5) .. (3,2.5)
    .. controls (4,3.5) and (5,1) .. (6,2);
  \node[blue] at (5.5, 2.5) {Demand};
  % Over-provisioned line
  \draw[red, dashed] (0,3.5) -- (6,3.5);
  \node[red] at (3, 3.8) {Over-provisioned (Waste)};
  % Rightsized/elastic step function
  \draw[green, thick] (0,1.5) -- (1,1.5) -- (1,3.5) -- (2,3.5) -- (2,1.5)
    -- (3.5,1.5) -- (3.5,3.8) -- (5,3.8) -- (5,2) -- (6,2);
  \node[green] at (1, 0.5) {Rightsized/Elastic};
\end{tikzpicture}
```
Definition-Example Pairs
- Compute-Intensive (C-Family): Optimized for high CPU performance per core.
- Example: A video encoding application that performs heavy math operations but needs little memory should use c6g.xlarge.
- Memory-Intensive (R-Family): Optimized for large datasets in memory.
- Example: An in-memory database like Redis or Memcached should be placed on an r6i.2xlarge.
- Burstable Performance (T-Family): Baseline performance with the ability to burst above the baseline.
- Example: A low-traffic internal wiki or a dev environment that stays idle most of the day should use t3.micro.
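The family choices above can be sketched as a simple lookup. The mapping and names below are illustrative, drawn only from the definition-example pairs in this section, and are not an exhaustive selection guide:

```python
# Illustrative mapping from workload profile to EC2 instance family,
# following the definition-example pairs above (not exhaustive).
FAMILY_BY_PROFILE = {
    "compute-intensive": "C",   # e.g., video encoding -> c6g.xlarge
    "memory-intensive": "R",    # e.g., Redis/Memcached -> r6i.2xlarge
    "bursty-low-traffic": "T",  # e.g., internal wiki -> t3.micro
}

def pick_family(profile: str) -> str:
    # Fall back to M (general purpose) when the profile is unclassified.
    return FAMILY_BY_PROFILE.get(profile, "M")

print(pick_family("memory-intensive"))  # R
```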
Worked Examples
Case Study: The Over-Provisioned Web Server
Scenario: A company runs a fleet of 10 m5.2xlarge instances for a web app. CloudWatch shows average CPU utilization is consistently below 10% and RAM usage is below 15%.
Step 1: Analyze Metrics
- Instance: m5.2xlarge (8 vCPU, 32 GiB RAM)
- Current Cost: ~$0.384/hr per instance.
- Actual Need: < 1 vCPU and < 4 GiB RAM per instance.
Step 2: Selection
- Option A: Downsize within family to m5.large (2 vCPU, 8 GiB).
- Option B (Better): Move to t3.medium (2 vCPU, 4 GiB) if bursting is acceptable.
- Option C (Best Performance/Cost): Move to m6g.medium (1 vCPU, 4 GiB, Graviton).
Step 3: Outcome
- Moving from m5.2xlarge ($0.384/hr) to m6g.medium ($0.0385/hr) results in a ~90% cost reduction while still covering the observed per-instance CPU and memory demand.
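The case-study arithmetic can be checked directly. The hourly rates are the on-demand prices quoted in the example; the 24 × 30 month length is a simplifying assumption:

```python
# Verify the case-study savings: one instance moving from m5.2xlarge
# to m6g.medium, using the on-demand rates quoted above.
old_rate, new_rate = 0.384, 0.0385   # $/hr, from the worked example
savings_pct = (old_rate - new_rate) / old_rate * 100
# Fleet of 10 instances over a 30-day month (simplifying assumption).
monthly_fleet_savings = (old_rate - new_rate) * 24 * 30 * 10

print(round(savings_pct, 1))          # 90.0
print(round(monthly_fleet_savings))   # ~2488
```

The ~90% figure in the outcome above is the per-hour rate reduction; at fleet scale it translates to roughly $2,500/month under these assumptions.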
Checkpoint Questions
- Which AWS service would you use to get a consolidated list of over-provisioned EC2 instances across your entire organization using Machine Learning?
- You are migrating a legacy Java application from an m5.large to an m6g.large. What is the primary hardware difference you must account for?
- True or False: Decommissioning an EC2 instance automatically deletes the associated EBS root volume by default.
- What is the main financial advantage of moving from EBS gp2 to gp3 volumes?
Muddy Points & Cross-Refs
[!WARNING] The Graviton Trap: While Graviton (ARM) is cheaper, it is not a "click to upgrade" process. Applications must be recompiled for the ARM64 architecture. If your application relies on x86-specific binaries or libraries, the refactoring cost may outweigh the immediate savings.
Cross-References:
- For more on storage tiering, see Domain 3: Storage Design.
- For more on commit-based discounts, see Chapter 4: Savings Plans and RIs.
Comparison Tables
x86 vs. ARM (Graviton)
| Feature | x86 (Intel/AMD) | ARM (AWS Graviton) |
|---|---|---|
| Availability | All Regions / All Instance Sizes | Selected Regions / Newer Generations |
| Compatibility | Universal (Windows/Linux) | Mostly Linux / Needs ARM binaries |
| Price | Standard | ~20% Lower than x86 |
| Performance | High per-core performance | Better price-performance ratio |
Rightsizing Tools Comparison
| Tool | Primary Use Case | Key Benefit |
|---|---|---|
| Compute Optimizer | Instance/Lambda/EBS sizing | ML-driven; analyzes the last 14 days of metrics by default |
| Cost Explorer | High-level spend analysis | Identifies spend trends and RI/Savings Plan opportunities |
| Trusted Advisor | Security & Cost Hygiene | Checks for idle resources (EIPs, ELBs) |
| CloudWatch | Real-time monitoring | Granular data for custom scaling policies |