AWS Storage Performance and Configuration Guide

Determining storage services and configurations that meet performance demands

This guide explores how to determine the optimal storage services and configurations to meet specific performance demands within the AWS ecosystem, focusing on IOPS, throughput, and service selection.

Learning Objectives

  • Differentiate between block, file, and object storage performance characteristics.
  • Calculate required IOPS based on application throughput and database page sizes.
  • Implement RAID configurations on EBS to bypass single-volume performance limits.
  • Select the appropriate storage tier (S3, EBS, EFS, FSx) based on workload access patterns.

Key Terms & Glossary

  • IOPS (Input/Output Operations Per Second): A measurement of the number of reads and writes a storage device can perform per second.
  • Throughput: The amount of data (usually in MB/s) transferred to or from a storage device in a given time.
  • Latency: The time it takes for a single I/O request to be completed (measured in milliseconds).
  • Striping (RAID 0): A method of spreading data across multiple disks to increase performance by allowing parallel I/O.
  • Object Storage: Data stored as distinct objects (S3), ideal for massive scaling but with higher latency than block storage.
  • Block Storage: Data stored in fixed-sized blocks (EBS), ideal for high-performance databases and low-latency needs.

The "Big Idea"

Storage performance in the cloud is not just about "size"; it is about matching the Access Pattern to the Media Type. A high-performance architecture treats storage as a tunable resource where IOPS and throughput can be scaled independently or through software-defined methods (like RAID) to eliminate bottlenecks.

Formula / Concept Box

| Concept | Formula / Rule | Notes |
| --- | --- | --- |
| Throughput Calculation | $\text{Throughput} = \text{IOPS} \times \text{I/O Size}$ | I/O size is often determined by the DB page size (8 KB or 16 KB). |
| RAID 0 (Striping) | $P_{total} = P_{disk} \times n$ | Performance ($P$) increases linearly with the number of disks ($n$). No redundancy. |
| MySQL/MariaDB Page | 16 KB | Larger pages require fewer IOPS for the same throughput. |
| PostgreSQL/Oracle Page | 8 KB | Smaller pages require more IOPS for the same throughput. |
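
The RAID 0 rule can also be read in reverse, to estimate how many volumes a stripe set needs for a target IOPS figure. The following is a minimal Python sketch that assumes the ideal linear scaling stated in the table. The function name and the 16,000 IOPS per-volume figure are illustrative assumptions rather than quoted AWS limits, and the instance's own EBS bandwidth cap is ignored.

```python
import math


def raid0_volumes_needed(target_iops: int, per_volume_iops: int) -> int:
    """Smallest n with per_volume_iops * n >= target_iops (P_total = P_disk * n)."""
    if target_iops <= 0 or per_volume_iops <= 0:
        raise ValueError("IOPS values must be positive")
    return math.ceil(target_iops / per_volume_iops)


# Hypothetical per-volume ceiling; check the current limits for your volume type.
ASSUMED_PER_VOLUME_IOPS = 16_000

n = raid0_volumes_needed(25_600, ASSUMED_PER_VOLUME_IOPS)
print(f"{n} volumes give up to {n * ASSUMED_PER_VOLUME_IOPS:,} IOPS in aggregate")
```

In practice the aggregate is also bounded by what the EC2 instance can push to EBS, so the result is best treated as a lower bound on the volume count.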

Hierarchical Outline

  • I. Understanding Performance Metrics
    • IOPS vs. Throughput: Choosing between high-frequency small transactions (IOPS) and large data transfers (Throughput).
    • Latency Requirements: Low-latency needs usually dictate EBS (Block) or Instance Store.
  • II. EBS Performance Optimization
    • EBS-Optimized Instances: Dedicated bandwidth for storage traffic to prevent network contention.
    • RAID Configurations: Using RAID 0 for maximum speed; avoiding RAID 1 on AWS due to unnecessary bandwidth consumption.
  • III. Scalable Storage Solutions
    • Amazon S3: Scalable object storage; use Transfer Acceleration or multipart uploads for performance.
    • Amazon EFS/FSx: Shared file systems for distributed workloads (EFS for Linux clients; FSx for Windows File Server or Lustre).
  • IV. Database Storage Tuning
    • Page Size Impact: How engine-specific page sizes (8 KB vs. 16 KB) affect the IOPS needed for a given throughput.
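
The service choices in the outline can be summarized as a small decision function. This is a deliberately coarse sketch, not an official AWS selection rule: the function name, the access-pattern labels, and the parameters are all hypothetical, and real decisions also weigh cost, durability, and throughput limits.

```python
def suggest_storage(access_pattern: str, client_os: str = "linux",
                    needs_persistence: bool = True) -> str:
    """Map a coarse access pattern to a storage service named in the outline."""
    if access_pattern == "object":
        return "Amazon S3"
    if access_pattern == "block":
        # Instance Store is ephemeral, so suggest it only when data loss is acceptable.
        return "Amazon EBS" if needs_persistence else "Instance Store"
    if access_pattern == "shared-file":
        if client_os == "linux":
            return "Amazon EFS"
        return "Amazon FSx for Windows File Server"
    if access_pattern == "hpc-file":
        return "Amazon FSx for Lustre"
    raise ValueError(f"unknown access pattern: {access_pattern!r}")


print(suggest_storage("block"))
print(suggest_storage("shared-file", client_os="windows"))
```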

Visual Anchors

Storage Selection Flowchart

[Diagram: storage selection flowchart]

RAID 0 Data Striping

\begin{tikzpicture}
  % Stripe units stored on each disk
  \draw[thick] (0,2) rectangle (2,3) node[midway] {Block A};
  \draw[thick] (3,2) rectangle (5,3) node[midway] {Block B};
  \draw[thick] (0,0.5) rectangle (2,1.5) node[midway] {Block C};
  \draw[thick] (3,0.5) rectangle (5,1.5) node[midway] {Block D};
  % Disk labels
  \node at (1,3.3) {Disk 1};
  \node at (4,3.3) {Disk 2};
  % Incoming data stream fanned out to both disks
  \draw[->, thick] (2.5,4.5) -- (1,3.5);
  \draw[->, thick] (2.5,4.5) -- (4,3.5);
  \node at (2.5,4.8) {Data Stream split into stripes};
\end{tikzpicture}
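
The round-robin placement in the diagram can be expressed in a few lines of Python. This sketch assumes one block per stripe unit, which is a simplification: real RAID 0 implementations stripe in configurable chunk sizes. The function name is hypothetical.

```python
def stripe_location(block_index: int, num_disks: int) -> tuple[int, int]:
    """Return (disk, row) for a logical block under round-robin RAID 0 striping."""
    if block_index < 0 or num_disks < 1:
        raise ValueError("block_index must be >= 0 and num_disks >= 1")
    return block_index % num_disks, block_index // num_disks


for index, name in enumerate("ABCD"):
    disk, row = stripe_location(index, num_disks=2)
    # Disks are numbered from 1 to match the diagram labels.
    print(f"Block {name} -> Disk {disk + 1}, stripe row {row}")
```

Blocks A and C land on Disk 1 and blocks B and D on Disk 2, so consecutive blocks can be read or written in parallel.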

Definition-Example Pairs

  • EBS-Optimized Instance: An EC2 instance type that provides dedicated throughput to EBS volumes.
    • Example: Running a high-traffic SQL Server on an m5.4xlarge to ensure the network traffic from users doesn't starve the disk I/O traffic.
  • Ephemeral Storage (Instance Store): Temporary block-level storage physically attached to the host server.
    • Example: Using Instance Store for a swap file or a NoSQL database cluster (like MongoDB) that handles its own data replication.
  • S3 Lifecycle Policy: Rules to automatically move or delete objects over time.
    • Example: Moving 90-day-old logs from S3 Standard to S3 Glacier Deep Archive to cut storage costs by more than 90% without deleting data.
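
The lifecycle example above could be written as a rule document like the one below. This is a sketch under stated assumptions: the bucket name, prefix, and rule ID are hypothetical. The dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, but the call itself is left as a comment so the snippet runs without AWS credentials.

```python
import json

# Hypothetical names, for illustration only.
BUCKET_NAME = "example-log-bucket"
LOG_PREFIX = "logs/"

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-logs-after-90-days",
            "Status": "Enabled",
            "Filter": {"Prefix": LOG_PREFIX},
            # Move objects to Glacier Deep Archive 90 days after creation.
            "Transitions": [
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))

# With boto3 installed and credentials configured, this could be applied with:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket=BUCKET_NAME, LifecycleConfiguration=lifecycle_configuration)
```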

Worked Examples

Problem: Calculating IOPS for a Throughput Target

A Solutions Architect is designing a PostgreSQL database that needs to support a read throughput of 200 MB/s. PostgreSQL uses an 8 KB page size. How many IOPS must the EBS volume support?

Step 1: Convert throughput to KB. $200 \text{ MB/s} \times 1024 \text{ KB/MB} = 204,800 \text{ KB/s}$.

Step 2: Divide by the page size. $204,800 \text{ KB/s} \div 8 \text{ KB/Page} = 25,600 \text{ Pages/second}$.

Step 3: Result. Since each page read is one I/O operation, the volume needs 25,600 IOPS.

[!TIP] If this were MySQL (16 KB pages), the required IOPS would only be 12,800 for the same 200 MB/s throughput!
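
The same arithmetic can be checked with a short Python sketch. The function name is hypothetical, and it keeps the worked example's assumptions: 1 MB = 1024 KB, and each page read counts as exactly one I/O operation.

```python
def required_iops(throughput_mb_s: float, page_size_kb: float) -> float:
    """IOPS = throughput / I/O size, with 1 MB = 1024 KB as in the worked example."""
    if page_size_kb <= 0:
        raise ValueError("page_size_kb must be positive")
    return throughput_mb_s * 1024 / page_size_kb


for engine, page_kb in (("PostgreSQL", 8), ("MySQL", 16)):
    print(f"{engine}: {required_iops(200, page_kb):,.0f} IOPS")
# PostgreSQL: 25,600 IOPS
# MySQL: 12,800 IOPS
```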

Checkpoint Questions

  1. Why does AWS generally recommend against using RAID 1 on EBS volumes?
    • Answer: EBS is already replicated within an Availability Zone for durability. RAID 1 mirrors data again, doubling the network traffic between the EC2 instance and EBS without providing significant benefit.
  2. Which RAID configuration should you use to increase performance by striping data across multiple volumes?
    • Answer: RAID 0.
  3. An application requires sub-millisecond latency and high random I/O. Should you use S3 or EBS?
    • Answer: EBS (specifically Provisioned IOPS SSD, such as io2). S3 is object storage and has much higher per-request latency (typically tens to hundreds of milliseconds).
  4. What is the primary difference between EFS and FSx for Lustre?
    • Answer: EFS is a general-purpose shared (NFS) file system for Linux clients. FSx for Lustre is a high-performance file system optimized for compute-heavy workloads like machine learning and video processing.
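
The answer to question 1 can be illustrated with a rough model of write bandwidth. This is a simplified sketch, not a benchmark: the function name and all figures are hypothetical, and it only captures the fact that RAID 1 sends every write to each mirror over the instance's EBS link, while RAID 0 spreads writes across volumes.

```python
def usable_write_mb_s(instance_ebs_mb_s: float, volume_mb_s: float,
                      volumes: int, raid_level: int) -> float:
    """Application-visible write throughput for a simple RAID 0 or RAID 1 set."""
    if volumes < 1:
        raise ValueError("volumes must be >= 1")
    if raid_level == 0:
        # Striping: volumes work in parallel, capped by the instance's EBS bandwidth.
        return min(instance_ebs_mb_s, volume_mb_s * volumes)
    if raid_level == 1:
        # Mirroring: each write goes to every volume, so the instance link
        # carries `volumes` copies while the application sees only one.
        return min(instance_ebs_mb_s / volumes, volume_mb_s)
    raise ValueError("only RAID 0 and RAID 1 are modeled")


# Hypothetical figures: 1,000 MB/s instance link, two volumes at 600 MB/s each.
print(usable_write_mb_s(1000, 600, volumes=2, raid_level=0))
print(usable_write_mb_s(1000, 600, volumes=2, raid_level=1))
```

Under these assumed figures, mirroring halves the write bandwidth available to the application compared with striping, which is the overhead the answer refers to.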
