Mastering Storage Tiering in AWS: Performance vs. Cost Optimization
Storage tiering is the strategic process of moving data between different storage categories based on performance requirements, access frequency, and cost objectives. In the AWS ecosystem, this involves selecting the right volume types for EBS, storage classes for S3, and data pools for FSx.
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between block, file, and object storage access patterns.
- Select appropriate EBS volume types (SSD vs. HDD) based on IOPS and throughput needs.
- Automate data lifecycle management in Amazon S3 to minimize costs.
- Configure multi-tier storage in FSx for NetApp ONTAP using primary and capacity pools.
- Identify the trade-offs between storage latency and retrieval costs.
Key Terms & Glossary
- IOPS (Input/Output Operations Per Second): A measure of how many read/write operations a storage device can perform per second. Crucial for databases.
- Throughput: The amount of data transferred over a specific period (e.g., MiB/s). Crucial for big data and log processing.
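- WORM (Write Once, Read Many): A storage model in which data, once written, cannot be modified or deleted until its retention period ends (e.g., S3 Object Lock).
- Hot Data: Frequently accessed data that needs low latency, from sub-millisecond on high-end SSD volumes to single-digit milliseconds on general-purpose storage.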
- Cold Data: Infrequently accessed data where cost savings are prioritized over immediate retrieval speed.
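IOPS and throughput are linked through the size of each I/O operation: throughput is roughly IOPS multiplied by I/O size. The sketch below illustrates that relationship. The function name and the 16 KiB and 1 MiB I/O sizes are illustrative assumptions, not figures from this guide or limits of any particular volume type.

```python
def throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Approximate throughput (MiB/s) implied by an IOPS rate at a given I/O size."""
    return iops * io_size_kib / 1024  # KiB/s converted to MiB/s


# Small random I/O (database-like): many operations, modest bandwidth.
print(throughput_mib_s(16_000, 16))   # 250.0
# Large sequential I/O (log-processing-like): few operations, high bandwidth.
print(throughput_mib_s(500, 1024))    # 500.0
```

This is why databases are usually sized by IOPS while big data and log workloads are sized by throughput.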
The "Big Idea"
> [!IMPORTANT]
> The goal of storage tiering is Economic Efficiency without Performance Degradation.
Think of storage like a home: you keep your daily-use items on the kitchen counter (Hot Storage/SSD), seasonal decorations in the attic (Warm Storage/Infrequent Access), and old tax records in a remote storage unit (Cold Storage/Archival). AWS provides the automation tools to move these items between "rooms" as they age, ensuring you never pay for high-performance "counter space" for items you only need once a year.
Formula / Concept Box
| Component | Calculation Factor |
|---|---|
| Total Cost of Ownership (TCO) | Storage + Provisioned Performance + Requests + Data Retrieval + Data Transfer |
| EBS Pricing | Provisioned Capacity (GB) + IOPS/Throughput (if provisioned) + Snapshots |
| S3 Pricing | Storage Class Rate + Requests (PUT/GET) + Data Retrieval + Transfer Out |
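The pricing rows above can be read as simple sums of independent line items. The following is a minimal sketch of that arithmetic. The function names and every rate passed in are hypothetical placeholders, not actual AWS prices, so look up current rates for your region before using anything like this for an estimate.

```python
def ebs_monthly_cost(capacity_gb, gb_rate,
                     provisioned_iops=0, iops_rate=0.0,
                     provisioned_mibps=0, throughput_rate=0.0,
                     snapshot_gb=0, snapshot_rate=0.0):
    """Provisioned capacity + provisioned IOPS/throughput + snapshots."""
    return (capacity_gb * gb_rate
            + provisioned_iops * iops_rate
            + provisioned_mibps * throughput_rate
            + snapshot_gb * snapshot_rate)


def s3_monthly_cost(stored_gb, storage_rate,
                    requests=0, rate_per_1000_requests=0.0,
                    retrieved_gb=0, retrieval_rate=0.0,
                    transfer_out_gb=0, transfer_rate=0.0):
    """Storage class rate + requests + data retrieval + transfer out."""
    return (stored_gb * storage_rate
            + requests / 1000 * rate_per_1000_requests
            + retrieved_gb * retrieval_rate
            + transfer_out_gb * transfer_rate)


# Placeholder rates only, chosen to make the arithmetic easy to follow.
ebs = ebs_monthly_cost(500, 0.10, provisioned_iops=1000, iops_rate=0.005,
                       snapshot_gb=100, snapshot_rate=0.05)
s3 = s3_monthly_cost(1000, 0.02, requests=200_000, rate_per_1000_requests=0.005,
                     retrieved_gb=50, retrieval_rate=0.01)
print(f"EBS: ${ebs:.2f}/month, S3: ${s3:.2f}/month")
```

Keeping each factor as a separate argument makes it easy to see which line item dominates when you compare tiers.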
Hierarchical Outline
- Block Storage (Amazon EBS)
  - SSD-Backed: gp3 (General Purpose), io2 (Provisioned IOPS for databases).
  - HDD-Backed: st1 (Throughput Optimized), sc1 (Cold HDD for legacy/archival).
- Object Storage (Amazon S3)
  - Tiers: Standard → Standard-IA → One Zone-IA → Glacier Instant Retrieval → Glacier Deep Archive.
  - Automation: S3 Lifecycle Policies for transition and expiration.
- File Storage (Amazon EFS & FSx)
  - EFS: Elastic throughput; IA storage class for cost savings.
  - FSx for NetApp ONTAP: Primary (SSD) vs. Capacity Pool (HDD/Object) auto-tiering.
Visual Anchors
S3 Lifecycle Transition Flow
Performance vs. Cost Trade-off
\begin{tikzpicture}[scale=1.0]
  % Axes
  \draw[->] (0,0) -- (6,0) node[right] {Latency (ms)};
  \draw[->] (0,0) -- (0,5) node[above] {Cost per GB};
  % Data Points
  \filldraw[red] (0.5,4.5) circle (2pt) node[anchor=south west] {SSD (io2)};
  \filldraw[orange] (1.5,3.0) circle (2pt) node[anchor=south west] {SSD (gp3)};
  \filldraw[blue] (4.0,1.5) circle (2pt) node[anchor=south west] {HDD (st1)};
  \filldraw[green] (5.5,0.5) circle (2pt) node[anchor=south west] {Glacier};
  % Trendline
  \draw[dashed, gray] (0.5,4.5) .. controls (1,2) and (3,1) .. (5.5,0.5);
\end{tikzpicture}
Definition-Example Pairs
- Provisioned IOPS (io2): Storage where you specify exactly how many I/O operations per second the volume provides.
  - Example: A high-traffic Oracle Database requiring 16,000 consistent IOPS to prevent application lag.
- Throughput Optimized HDD (st1): Low-cost magnetic storage designed for large, sequential data sets.
  - Example: A Big Data/Hadoop cluster performing MapReduce jobs on terabytes of log files.
- S3 Glacier Deep Archive: The lowest-cost storage class in AWS for long-term retention.
  - Example: Storing Compliance Records (e.g., medical images) that must be kept for 10 years but are rarely accessed.
Worked Examples
Scenario: Optimizing an Image Sharing App
Problem: A social app stores millions of images. 90% are never viewed after the first 30 days. Current cost is $5,000/month on S3 Standard.
Step-by-Step Solution:
- Analyze Access Patterns: Identify that the "hot" period is the first 30 days after upload.
- Create Lifecycle Policy:
  - Transition objects to S3 Standard-IA after 30 days (saves ~40% on storage).
  - Transition objects to S3 Glacier after 1 year for long-term backup.
- Implement S3 Intelligent-Tiering: If access patterns are unpredictable, enable this to let AWS move data automatically between frequent and infrequent tiers without manual lifecycle rules.
- Result: Storage costs drop to ~$2,200/month while maintaining availability.
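The lifecycle policy from the steps above could be expressed roughly as follows with boto3. This is a sketch under stated assumptions: the rule ID, the empty prefix (meaning "all objects"), and the bucket name are hypothetical, and the `GLACIER` storage class value is assumed to be the tier the scenario means by "S3 Glacier".

```python
def build_lifecycle_configuration(prefix: str = "") -> dict:
    """Build the rule set: Standard-IA after 30 days, Glacier after 1 year."""
    return {
        "Rules": [
            {
                "ID": "tier-images-by-age",    # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str) -> None:
    """Attach the rules to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )


# Inspect the rules locally before applying them to a real bucket.
print(build_lifecycle_configuration()["Rules"][0]["Transitions"])
# apply_lifecycle("example-image-bucket")  # hypothetical bucket name
```

Separating the rule builder from the API call lets you review or unit-test the policy without touching an AWS account.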
Checkpoint Questions
- Which EBS volume type is most cost-effective for a streaming workload requiring high throughput but not low-latency random access?
- In FSx for NetApp ONTAP, what is the typical latency for the "Capacity Pool" tier?
- True/False: S3 Standard-IA has a minimum storage duration of 30 days for billing purposes.
- What is the main difference between gp3 and gp2 EBS volumes regarding throughput provisioning?
Answers
- st1 (Throughput Optimized HDD).
- Tens of milliseconds (compared to sub-millisecond for Primary SSD).
- True.
- gp3 allows you to provision throughput and IOPS independently of storage size; gp2 scales performance based on the size of the volume.
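To make the last answer concrete, the sketch below compares baseline performance as volume size changes. The figures used (gp2 at 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000; gp3 with a flat baseline of 3,000 IOPS and 125 MiB/s) reflect AWS documentation as I understand it and are not taken from this guide, so treat them as assumptions and confirm them against the current EBS documentation.

```python
GP3_BASELINE_IOPS = 3_000   # assumed flat baseline, independent of size
GP3_BASELINE_MIBPS = 125    # assumed flat baseline, independent of size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline scales with size: 3 IOPS/GiB, clamped to [100, 16000]."""
    return min(max(100, 3 * size_gib), 16_000)


for size in (20, 500, 2_000, 8_000):
    print(f"{size:>5} GiB  gp2={gp2_baseline_iops(size):>6}  gp3={GP3_BASELINE_IOPS}")
# gp2 values printed: 100, 1500, 6000, 16000
```

With gp2, the only way to get more baseline IOPS is to buy more capacity; with gp3, a small volume starts at the same baseline as a large one and extra performance is provisioned separately.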
Muddy Points & Cross-Refs
- S3 IA vs. Glacier Instant Retrieval: Both offer millisecond access. The difference is the cost structure: Glacier Instant Retrieval is cheaper for storage but more expensive for data retrieval, and it has a longer minimum storage duration (90 days vs. 30). Use Glacier IR for long-lived data accessed about once a quarter or less.
- st1 vs. sc1: Both are HDDs. Use st1 for active workloads (Big Data). Use sc1 for "Cold" data that just needs to be online (File servers for old projects).
- Cross-Ref: See the "Networking" chapter for Data Transfer Costs, as moving data between regions during tiering can incur significant fees.
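The Standard-IA vs. Glacier Instant Retrieval trade-off can be framed as a break-even calculation: the cheaper storage rate wins until retrieval fees eat up the savings. The sketch below assumes a simplified model with only per-GB storage and per-GB retrieval charges. It ignores request fees and minimum storage durations, and the rates are illustrative placeholders rather than quoted AWS prices.

```python
def monthly_cost_per_gb(storage_rate: float, retrieval_rate: float,
                        fraction_retrieved: float) -> float:
    """Cost of holding 1 GB for a month and retrieving a fraction of it."""
    return storage_rate + retrieval_rate * fraction_retrieved


def break_even_fraction(ia_storage: float, ia_retrieval: float,
                        gir_storage: float, gir_retrieval: float) -> float:
    """Fraction of data retrieved per month at which both classes cost the same."""
    return (ia_storage - gir_storage) / (gir_retrieval - ia_retrieval)


# Placeholder rates in $/GB; substitute current regional pricing.
IA = {"storage": 0.0125, "retrieval": 0.01}
GIR = {"storage": 0.004, "retrieval": 0.03}

f = break_even_fraction(IA["storage"], IA["retrieval"],
                        GIR["storage"], GIR["retrieval"])
print(f"Glacier IR is cheaper while less than {f:.1%} of the data is read per month")
```

Under these placeholder rates, data that is read only a few times a year sits far below the break-even point, which is consistent with reserving Glacier IR for rarely accessed objects.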
Comparison Tables
EBS Volume Type Comparison
| Volume Type | Use Case | Latency | Max Throughput |
|---|---|---|---|
| io2 Block Express | Critical Databases (SAP HANA) | Sub-ms | 4,000 MiB/s |
| gp3 | Virtual Desktops, Boot Volumes | Single-digit ms | 1,000 MiB/s |
| st1 | Data Warehousing, Log Processing | Single-digit ms | 500 MiB/s |
| sc1 | Cold Data, Archival | Tens of ms | 250 MiB/s |
S3 Storage Class Comparison
| Class | Retrieval Time | Durability | Min Storage Duration |
|---|---|---|---|
| Standard | Milliseconds | 99.999999999% | N/A |
| Standard-IA | Milliseconds | 99.999999999% | 30 Days |
| Glacier Flexible | 1 min - 12 hours | 99.999999999% | 90 Days |
| Glacier Deep Archive | 12 - 48 hours | 99.999999999% | 180 Days |
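The "Min Storage Duration" column matters for billing: an object deleted or transitioned before the minimum is still charged as if it had been stored for the full period. A minimal sketch of that rule, using the durations from the table above (the function and dictionary names are hypothetical):

```python
# Minimum storage duration in days, as listed in the table above.
MIN_STORAGE_DAYS = {
    "Standard": 0,
    "Standard-IA": 30,
    "Glacier Flexible": 90,
    "Glacier Deep Archive": 180,
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days charged for an object, accounting for the minimum duration."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


print(billable_days("Standard", 10))               # 10
print(billable_days("Standard-IA", 10))            # 30
print(billable_days("Glacier Deep Archive", 200))  # 200
```

This is why short-lived objects generally belong in Standard, even though the colder classes have a lower per-GB rate.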