
AWS Storage Services: S3, EBS, EFS, and FSx Study Guide

Storage services with appropriate use cases (for example, Amazon S3, Amazon EFS, Amazon EBS)

This guide covers the core storage options provided by AWS, focusing on their unique characteristics, performance metrics, and appropriate use cases for the SAA-C03 exam.

Learning Objectives

  • Differentiate between Object, Block, and File storage types.
  • Select the appropriate storage service based on access patterns (single vs. multi-instance).
  • Evaluate cost-optimization strategies using lifecycle policies and storage tiering.
  • Identify hybrid storage solutions for connecting on-premises data to the AWS Cloud.

Key Terms & Glossary

  • Object Storage: Data stored as distinct units (objects) with metadata and a unique identifier; highly scalable and accessed via API/HTTP.
  • Block Storage: Data broken into fixed-size blocks; acts like a physical hard drive directly attached to a server (EC2).
  • File Storage (NFS/SMB): Hierarchical data storage accessible by multiple clients simultaneously over a network.
  • IOPS (Input/Output Operations Per Second): A performance metric for block storage measuring how many read/write operations can occur per second.
  • Throughput: The amount of data transferred per second (e.g., MiB/s), critical for large, sequential data transfers.
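
IOPS and throughput are linked by the size of each I/O operation: throughput is roughly IOPS multiplied by I/O size. The following is a minimal sketch of that arithmetic; the function name and the 16 KiB I/O size are illustrative assumptions, not values from this guide.

```python
def throughput_mib_per_s(iops: int, io_size_kib: float) -> float:
    """Approximate throughput in MiB/s for a given IOPS rate and I/O size in KiB."""
    return iops * io_size_kib / 1024  # 1 MiB = 1024 KiB


# 3,000 IOPS at an assumed 16 KiB per operation
print(throughput_mib_per_s(3000, 16))  # 46.875
```

This is why a volume can be IOPS-bound for small random I/O and throughput-bound for large sequential I/O.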

The "Big Idea"

In AWS, storage selection is a balancing act between Access Pattern and Performance Requirements. If you need shared access across many Linux servers, you use EFS. If you need high-performance, low-latency access for a single database server, you use EBS. If you need to store virtually unlimited images or web assets for a global audience, you use S3. Choosing wrong leads to either performance bottlenecks or unnecessary costs.
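
The decision logic in this paragraph can be sketched as a small function. This is a deliberately simplified illustration: the function name and parameters are hypothetical, and real selections also weigh cost, durability, and throughput. The Windows branch reflects the FSx for Windows File Server guidance given later in this guide.

```python
def choose_storage(shared: bool, client_os: str = "linux",
                   http_objects: bool = False) -> str:
    """Map a simplified access pattern to an AWS storage service name."""
    if http_objects:
        # Virtually unlimited objects served over HTTP/HTTPS
        return "Amazon S3"
    if shared:
        # Shared file system: SMB for Windows clients, NFS for Linux clients
        if client_os == "windows":
            return "Amazon FSx for Windows File Server"
        return "Amazon EFS"
    # Single-instance, low-latency block device
    return "Amazon EBS"


print(choose_storage(shared=True))                      # Amazon EFS
print(choose_storage(shared=False))                     # Amazon EBS
print(choose_storage(shared=False, http_objects=True))  # Amazon S3
```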

Formula / Concept Box

| Feature | Amazon S3 | Amazon EBS | Amazon EFS |
| --- | --- | --- | --- |
| Storage Type | Object | Block | File (NFS) |
| Best Used For | Web assets, backups, data lakes | OS drives, DB volumes | Shared home dirs, CMS |
| Access | Web (HTTP/HTTPS) | Attached to 1 EC2 Instance* | Shared (1000s of EC2s) |
| Scope | Regional | Availability Zone | Regional |
| Durability | 99.999999999% (11 9s) | 99.8% - 99.9% | 99.999999999% (11 9s) |

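* Note: EBS Multi-Attach lets certain Provisioned IOPS (io1/io2) volumes attach to multiple instances within the same Availability Zone. All other EBS volumes attach to one instance at a time, and every EBS volume is confined to a single AZ.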

Hierarchical Outline

  1. Object Storage (Amazon S3)
    • Buckets & Objects: Global namespace for buckets; objects up to 5TB.
    • Storage Tiers: Standard, Intelligent-Tiering, Standard-IA, One Zone-IA, Glacier (Instant, Flexible, Deep Archive).
    • Lifecycle Management: Automated transitions between tiers to save costs.
  2. Block Storage (Amazon EBS)
    • SSD-backed: gp2/gp3 (General Purpose), io1/io2 (Provisioned IOPS).
    • HDD-backed: st1 (Throughput Optimized), sc1 (Cold HDD).
    • Snapshots: Point-in-time backups stored in S3.
  3. File Storage (EFS & FSx)
    • Amazon EFS: Managed NFS for Linux; scales automatically.
    • Amazon FSx for Windows: Fully managed native Windows file system (SMB).
    • Amazon FSx for Lustre: High-performance for HPC and machine learning.
  4. Hybrid & Migration
    • AWS Storage Gateway: Bridges on-prem to cloud (File, Volume, and Tape gateways).
    • AWS DataSync: High-speed data transfer service.

Visual Anchors

Storage Decision Logic

[Diagram not available: storage decision flowchart]

Block vs. File Connectivity

\begin{tikzpicture}[scale=0.8, every node/.style={transform shape}]
  % Draw Instances
  \draw[fill=blue!10] (0,0) rectangle (2,1.5) node[midway] {EC2 A};
  \draw[fill=blue!10] (4,0) rectangle (6,1.5) node[midway] {EC2 B};
  % EBS Volume (Attached to one)
  \draw[fill=gray!20] (0,-2) ellipse (1 and 0.5) node {EBS Vol};
  \draw[thick, ->] (1,0) -- (1,-1.5);
  \node at (1.5,-1) {\tiny 1:1};
  % EFS (Shared)
  \draw[fill=green!20] (3,-3) rectangle (5,-2) node[midway] {EFS};
  \draw[thick, ->] (1,0) -- (3,-2.5);
  \draw[thick, ->] (5,0) -- (5,-2.5);
  \node at (3.5,-1.2) {\tiny Shared};
\end{tikzpicture}

Definition-Example Pairs

  • Term: Amazon S3 Lifecycle Policy

    • Definition: A set of rules that automatically transitions objects to less expensive storage classes or deletes them after a set period.
    • Example: A company stores raw logs in S3 Standard for 30 days, moves them to S3 Glacier for 7 years for compliance, and then automatically deletes them.
  • Term: EBS Snapshot

    • Definition: An incremental backup of an EBS volume, capturing only the blocks that have changed since the last snapshot.
    • Example: Before performing a risky OS upgrade on an EC2 instance, an admin takes a snapshot so they can revert the disk to its exact previous state if the upgrade fails.
  • Term: Amazon FSx for Lustre

    • Definition: A high-performance file system optimized for fast processing of workloads like machine learning and high-performance computing (HPC).
    • Example: A research lab uses FSx for Lustre to feed thousands of images per second from S3 into a GPU-based training cluster for autonomous vehicle AI.
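
The S3 Lifecycle Policy example at the top of this list (30 days in S3 Standard, then S3 Glacier for 7 years, then deletion) might be expressed as a configuration like the one below. This is a sketch, not a definitive policy: the dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, but the rule ID and the `logs/` prefix are hypothetical, and the 7-year period is approximated as 7 × 365 days.

```python
import json

DAYS_IN_STANDARD = 30
RETENTION_YEARS = 7  # approximated as 365-day years (leap days ignored)

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-raw-logs",  # hypothetical rule name
            "Filter": {"Prefix": "logs/"},          # hypothetical key prefix
            "Status": "Enabled",
            # Move objects to S3 Glacier 30 days after creation
            "Transitions": [
                {"Days": DAYS_IN_STANDARD, "StorageClass": "GLACIER"}
            ],
            # Expiration is counted from object creation, so add both periods
            "Expiration": {"Days": DAYS_IN_STANDARD + RETENTION_YEARS * 365},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

The block only builds and prints the configuration; applying it would require passing the dictionary as `LifecycleConfiguration` to an S3 client together with a bucket name.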

Worked Examples

Example 1: Cost-Optimizing a Large Image Repository

Scenario: A social media app stores 50PB of user photos. Photos are accessed frequently for the first 30 days, then rarely accessed, but must be available instantly if requested.

  • Step 1: Store new uploads in S3 Standard.
  • Step 2: Configure a Lifecycle Policy to transition objects to S3 Standard-Infrequent Access (IA) after 30 days.
  • Step 3: (Alternative) Enable S3 Intelligent-Tiering to allow AWS to automatically move objects between tiers based on changing access patterns without manual management.
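
To see why the transition in Step 2 saves money, the storage bill can be modeled as a blend of a "hot" and a "cold" price. The sketch below is illustrative only: the function is hypothetical, the prices are arbitrary placeholder units and not AWS list prices, and it ignores retrieval and request charges that apply to infrequent-access tiers.

```python
def blended_monthly_cost(total_gb: float, hot_fraction: float,
                         hot_price_per_gb: float,
                         cold_price_per_gb: float) -> float:
    """Monthly storage cost when only hot_fraction of the data stays in the hot tier."""
    hot_gb = total_gb * hot_fraction
    cold_gb = total_gb - hot_gb
    return hot_gb * hot_price_per_gb + cold_gb * cold_price_per_gb


# Placeholder numbers: 1,000 GB, 25% uploaded in the last 30 days,
# hot tier costs 2.0 units/GB and cold tier 1.0 units/GB.
everything_hot = blended_monthly_cost(1000, 1.0, 2.0, 1.0)
with_lifecycle = blended_monthly_cost(1000, 0.25, 2.0, 1.0)
print(everything_hot, with_lifecycle)  # 2000.0 1250.0
```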

Example 2: Selecting Block Storage for a Database

Scenario: You are migrating a high-traffic MySQL database to EC2. The database requires a consistent 15,000 IOPS and sub-millisecond latency.

  • Analysis:
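    • gp2 provides 3 IOPS per GiB, so reaching 15,000 IOPS would require a volume of at least 5,000 GiB, even if the database needs far less capacity.
    • gp3 provides a baseline of 3,000 IOPS and can be provisioned higher independently of size, but it targets single-digit-millisecond latency rather than the sub-millisecond latency required here.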
    • io2 (Provisioned IOPS) is designed for this workload.
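  • Solution: Use an EBS io2 volume provisioned with 15,000 IOPS. Provisioned IOPS are set independently of capacity (subject to a maximum IOPS-to-GiB ratio), so performance does not depend on over-sizing the volume.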
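
The gp2 point in the analysis above comes down to simple arithmetic. A minimal sketch, using only the 3 IOPS per GiB ratio stated above (the helper name is hypothetical, and gp2's minimum and maximum IOPS limits are deliberately left out):

```python
import math

GP2_IOPS_PER_GIB = 3  # baseline ratio stated in the analysis above


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest gp2 volume size in GiB whose baseline IOPS meets target_iops."""
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


print(gp2_size_for_iops(15_000))  # 5000
```

A 5,000 GiB volume bought only for its IOPS is the kind of unnecessary cost that provisioned-IOPS volume types are meant to avoid.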

Checkpoint Questions

  1. Which storage service would you use to provide a shared file system for a fleet of Windows-based web servers?
  2. True or False: S3 is automatically encrypted at rest.
  3. Which EBS volume type is most cost-effective for large, sequential logging workloads that do not require high IOPS?
  4. What is the difference between a File Gateway and a Volume Gateway in AWS Storage Gateway?
  5. How can you ensure that data in an EBS volume is highly available across multiple Availability Zones?

Answers

  1. Amazon FSx for Windows File Server (EFS is Linux-only).
  2. True. Since January 2023, Amazon S3 encrypts all new objects at rest by default using server-side encryption with S3 managed keys (SSE-S3).
  3. st1 (Throughput Optimized HDD).
  4. File Gateway provides an NFS/SMB interface to S3; Volume Gateway provides iSCSI block storage backed by S3.
  5. EBS is AZ-specific. To achieve multi-AZ availability, you must take Snapshots and restore them in a different AZ, or use an application-level replication strategy.
