AWS Block Storage: EBS and Instance Store Deep Dive

Block storage options (for example, hard disk drive [HDD] volume types, solid state drive [SSD] volume types)

This study guide explores the nuances of Amazon Elastic Block Store (EBS) and Instance Store volumes, focusing on performance characteristics, cost optimization, and selection criteria for various workloads.

Learning Objectives

By the end of this module, you should be able to:

  • Differentiate between SSD-backed and HDD-backed EBS volumes based on performance requirements.
  • Explain the performance metrics of IOPS versus Throughput.
  • Identify the use cases and risks associated with Instance Store (ephemeral) volumes.
  • Determine the most cost-effective storage type for specific application patterns.

Key Terms & Glossary

  • IOPS (Input/Output Operations Per Second): A measure of how many read/write operations a volume can handle per second. Critical for small, random I/O (e.g., databases).
  • Throughput: The rate at which data is transferred to or from the storage, measured in MB/s or MiB/s. Critical for large, sequential I/O (e.g., big data, log processing).
  • Ephemeral Storage: Temporary storage that exists only for the lifetime of its associated EC2 instance. If the instance is stopped or terminated, data is lost.
  • Block Storage: A type of data storage where data is stored in blocks of a fixed size, accessed directly by the operating system as if it were a physical hard drive.
  • NVMe (Non-Volatile Memory Express): A high-performance interface used for modern SSDs, providing lower latency and higher throughput compared to traditional SATA/SAS interfaces.

The "Big Idea"

Storage in AWS is a fundamental trade-off between performance (latency/speed) and cost. While SSDs provide the low-latency response needed for interactive applications and databases, HDDs offer massive scale and lower costs for data that is processed sequentially. Choosing the right volume is not just about capacity; it is about matching the storage architecture to the data access pattern of the application.

Formula / Concept Box

Performance Calculations

To determine the IOPS needed to sustain a specific throughput, use the following relationship:

$$\text{IOPS} = \frac{\text{Throughput (in KB/s)}}{\text{I/O Page Size (in KB)}}$$

| Metric | SSD Volume (gp3/io2) | HDD Volume (st1/sc1) |
| --- | --- | --- |
| Dominant Metric | IOPS (small, random I/O) | Throughput (large, sequential I/O) |
| Best Use Case | Databases, boot volumes | Big data, log processing |
| Performance Limit | Up to 256,000 IOPS (io2 Block Express) | Up to 500 MB/s (st1) |
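
The relationship in this box can be sketched in a few lines of Python. The function names are illustrative and not part of any AWS SDK, and the sketch assumes decimal units (1 MB = 1,000 KB), as the worked example in this guide does.

```python
def required_iops(throughput_kb_per_s: float, io_size_kb: float) -> float:
    """IOPS needed to sustain a throughput at a given I/O size."""
    if io_size_kb <= 0:
        raise ValueError("io_size_kb must be positive")
    return throughput_kb_per_s / io_size_kb


def achievable_throughput_kb_per_s(iops: float, io_size_kb: float) -> float:
    """Inverse of the same relationship: throughput delivered at a given IOPS rate."""
    return iops * io_size_kb


if __name__ == "__main__":
    # 250 MB/s expressed in KB/s (decimal units), issued as 16 KB I/Os.
    print(required_iops(250_000, 16))                 # 15625.0
    # 3,000 IOPS of 16 KB I/Os, expressed in KB/s.
    print(achievable_throughput_kb_per_s(3_000, 16))  # 48000
```

Reading the formula in both directions shows why small I/O sizes make IOPS the limiting metric, while large I/O sizes make throughput the limit.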

Hierarchical Outline

  • Amazon EBS (Elastic Block Store)
    • SSD-Backed Volumes
      • General Purpose (gp2/gp3): Balanced price/performance. Single-digit millisecond latency.
      • Provisioned IOPS (io1/io2): Highest performance for mission-critical apps. io2 Block Express provides sub-millisecond latency.
    • HDD-Backed Volumes
      • Throughput Optimized (st1): Low cost for frequently accessed, throughput-intensive workloads.
      • Cold HDD (sc1): Lowest cost for infrequently accessed workloads.
  • Instance Store (Ephemeral)
    • Physically attached to the host server.
    • Very high performance (NVMe).
    • Included in the instance price.
    • Warning: Data is lost when the instance is stopped, hibernated, or terminated, or when the underlying disk fails.
  • Volume Management
    • Snapshots: Incremental backups stored in S3.
    • Lifecycle Manager: Automates snapshot creation/retention.
    • Encryption: AES-256 encryption at rest and in transit (KMS integration).
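
The persistence difference between the two storage types in this outline can be summarized as a small lookup. This is a simplified sketch: the event names are labels chosen for illustration, and the `delete_on_termination` flag stands in for the EBS DeleteOnTermination attribute, which decides whether a volume is removed when its instance is terminated. Its default value here is arbitrary.

```python
# Events after which instance store data is gone. A reboot is the one
# common event in this list of labels that leaves the data intact.
INSTANCE_STORE_LOSS_EVENTS = {"stop", "hibernate", "terminate", "host_failure"}
KNOWN_EVENTS = INSTANCE_STORE_LOSS_EVENTS | {"reboot"}


def data_survives(storage: str, event: str, delete_on_termination: bool = False) -> bool:
    """Return True if data on the given storage type outlives the event."""
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    if storage == "instance_store":
        return event not in INSTANCE_STORE_LOSS_EVENTS
    if storage == "ebs":
        # An EBS volume persists independently of the instance unless it is
        # flagged for deletion and the instance is terminated.
        return not (event == "terminate" and delete_on_termination)
    raise ValueError(f"unknown storage type: {storage!r}")


if __name__ == "__main__":
    print(data_survives("instance_store", "stop"))    # False
    print(data_survives("instance_store", "reboot"))  # True
    print(data_survives("ebs", "host_failure"))       # True
```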

Visual Anchors

Volume Selection Logic

[Diagram not rendered: volume selection flowchart]
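
As a stand-in for the selection logic, the rules from the outline in this guide can be written as a short function. This is a sketch assembled from that outline, not a reconstruction of the original diagram. The `Workload` fields are hypothetical, and the 16,000 IOPS cut-over between general purpose and provisioned IOPS volumes is an assumed threshold that should be checked against current AWS limits.

```python
from dataclasses import dataclass

# Assumed cut-over point between general purpose and provisioned IOPS volumes.
# Illustrative only; verify against current AWS documentation.
GENERAL_PURPOSE_IOPS_CEILING = 16_000


@dataclass
class Workload:
    needs_persistence: bool            # must the data survive an instance stop?
    access_pattern: str                # "random" or "sequential"
    required_iops: int = 0             # only meaningful for random access
    frequently_accessed: bool = True   # only meaningful for sequential access


def choose_volume(w: Workload) -> str:
    """Map a workload description to a storage option named in this guide."""
    if not w.needs_persistence:
        # Simplification: assumes the chosen instance type offers instance store.
        return "instance store"
    if w.access_pattern == "random":
        if w.required_iops > GENERAL_PURPOSE_IOPS_CEILING:
            return "io2"
        return "gp3"
    if w.access_pattern == "sequential":
        return "st1" if w.frequently_accessed else "sc1"
    raise ValueError(f"unknown access pattern: {w.access_pattern!r}")


if __name__ == "__main__":
    print(choose_volume(Workload(True, "random", required_iops=200_000)))          # io2
    print(choose_volume(Workload(True, "sequential", frequently_accessed=False)))  # sc1
```

The order of the checks mirrors the guide's emphasis: durability first, then access pattern, then the size of the performance requirement.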

Instance Store vs. EBS Architecture

\begin{tikzpicture}
  % Host Server
  \draw[thick, fill=gray!10] (0,0) rectangle (6,4);
  \node at (3,3.7) {\textbf{Physical Host Server}};

  % EC2 Instance
  \draw[blue, thick, fill=white] (0.5,1) rectangle (2.5,3);
  \node[align=center] at (1.5,2) {EC2\\Instance};

  % Instance Store
  \draw[red, thick] (3.5,1.5) rectangle (5.5,2.5);
  \node[align=center] at (4.5,2) {Instance\\Store\\(Local)};
  \draw[<->, thick] (2.5,2) -- (3.5,2) node[midway, above] {NVMe};

  % Network
  \draw[dashed] (6,2) -- (8,2) node[midway, above] {Network};

  % EBS
  \draw[green!60!black, thick, fill=green!5] (8,1) rectangle (10,3);
  \node[align=center] at (9,2) {EBS\\Volume\\(Network)};
\end{tikzpicture}

Definition-Example Pairs

  • Cold HDD (sc1)
    • Definition: The lowest-cost HDD volume type designed for large, infrequently accessed datasets where throughput is more important than IOPS.
    • Example: A file server storing 10-year-old historical logs that are only audited once per quarter.
  • Provisioned IOPS (io2)
    • Definition: High-performance SSD volumes designed for I/O-intensive workloads, particularly those sensitive to storage performance and consistency.
    • Example: A high-traffic SAP HANA database or a Microsoft SQL Server requiring 80,000 IOPS to process transactions.
  • Snapshot
    • Definition: A point-in-time, incremental backup of an EBS volume that is stored in Amazon S3.
    • Example: Before performing a major OS upgrade on an EC2 instance, an administrator takes a snapshot to ensure they can revert to the previous state if the upgrade fails.
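
The word "incremental" in the snapshot definition can be illustrated with a toy model. This sketch is not how EBS implements snapshots internally; it only assumes that each snapshot records the blocks that changed since the previous one. The `Blocks` type and both function names are hypothetical, and deleted blocks are not modeled.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # block index -> block contents


def restore(snapshots: List[Blocks]) -> Blocks:
    """Rebuild a volume by replaying snapshots from oldest to newest."""
    volume: Blocks = {}
    for snap in snapshots:
        volume.update(snap)
    return volume


def take_snapshot(volume: Blocks, previous_snapshots: List[Blocks]) -> Blocks:
    """Capture only the blocks that differ from what earlier snapshots hold."""
    captured = restore(previous_snapshots)
    return {i: data for i, data in volume.items() if captured.get(i) != data}


if __name__ == "__main__":
    volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
    snap1 = take_snapshot(volume, [])       # first snapshot holds every block
    volume[1] = b"data-v2"                  # one block changes afterwards
    snap2 = take_snapshot(volume, [snap1])  # second snapshot holds one block
    print(len(snap1), len(snap2))             # 3 1
    print(restore([snap1, snap2]) == volume)  # True
```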

Worked Examples

Example 1: Calculating Required IOPS

You are running a MariaDB database on a db.m5 instance. The database uses a 16 KB page size. You require a throughput of 2,000 Mbps. How many IOPS must your storage support?

Step 1: Convert throughput from Mbps to MB/s

$$2{,}000 \text{ Mbps} \div 8 = 250 \text{ MB/s}$$

Step 2: Convert page size to MB

$$16 \text{ KB} = 0.016 \text{ MB}$$

Step 3: Calculate IOPS

$$\text{IOPS} = \frac{250 \text{ MB/s}}{0.016 \text{ MB/operation}} = 15{,}625 \text{ IOPS}$$

[!NOTE] At the gp2 baseline of 3 IOPS per GiB, a volume would need to be approximately 5,209 GiB (15,625 ÷ 3, rounded up) to reach this performance. Alternatively, use a gp3 or io2 volume, where IOPS are provisioned independently of capacity.
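
The three steps and the gp2 sizing in the note can be checked with a short script. It assumes decimal units for the KB to MB conversion, as the example does, and a gp2 baseline of 3 IOPS per GiB. The function names are illustrative.

```python
import math

GP2_BASELINE_IOPS_PER_GIB = 3  # gp2 baseline ratio assumed for the sizing step


def mbps_to_mb_per_s(megabits_per_s: float) -> float:
    """Step 1: 8 bits per byte."""
    return megabits_per_s / 8


def iops_for(throughput_mb_per_s: float, page_size_kb: float) -> float:
    """Steps 2 and 3: express both values in KB, then divide."""
    return throughput_mb_per_s * 1000 / page_size_kb


def gp2_size_gib_for_baseline(iops: float) -> int:
    """Smallest whole-GiB gp2 size whose baseline meets the IOPS target."""
    return math.ceil(iops / GP2_BASELINE_IOPS_PER_GIB)


if __name__ == "__main__":
    throughput = mbps_to_mb_per_s(2_000)
    iops = iops_for(throughput, 16)
    print(throughput)                       # 250.0
    print(iops)                             # 15625.0
    print(gp2_size_gib_for_baseline(iops))  # 5209
```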

Checkpoint Questions

  1. Which EBS volume type is most cost-effective for a large data warehouse workload that performs heavy sequential reads? (Answer: Throughput Optimized HDD - st1)
  2. True or False: If you stop an EC2 instance, the data on its attached Instance Store volume is preserved. (Answer: False. Instance Store data is lost when an instance is stopped or terminated.)
  3. What happens to EBS volume data if the underlying hardware of an EC2 instance fails? (Answer: Data is preserved. EBS is a network-attached service, and AWS automatically replicates EBS data within its Availability Zone for high durability.)
  4. You need 200,000 IOPS for a massive NoSQL cluster. Which specific EBS volume type should you choose? (Answer: io2 Block Express)
