Mastering AWS Storage Types: Object, Block, and File
Storage types with associated characteristics (for example, object, file, block)
This study guide explores the fundamental differences between cloud storage architectures and their specific implementations within the AWS ecosystem, aligned with the SAA-C03 curriculum.
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between Object, Block, and File storage architectures.
- Map specific AWS services (S3, EBS, EFS, FSx) to their respective storage categories.
- Identify the correct storage type based on requirements for latency, scalability, and shared access.
- Understand performance optimization techniques like RAID and storage tiering.
Key Terms & Glossary
- Amazon EBS (Elastic Block Store): High-performance block storage designed for use with Amazon EC2.
- Amazon S3 (Simple Storage Service): Scalable object storage used for backups, analytics, and static web hosting.
- Amazon EFS (Elastic File System): Serverless, fully managed file storage for Linux-based workloads.
- IOPS (Input/Output Operations Per Second): A measure of performance for storage devices like SSDs and HDDs.
- POSIX: A family of standards for maintaining compatibility between operating systems; EFS is POSIX-compliant.
- Metadata: Data about data (e.g., file size, creator, permissions). S3 allows up to 2 KB of custom metadata.
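To make the custom metadata limit concrete, here is a minimal sketch that totals the UTF-8 byte length of every key and value and compares it to the limit. It assumes "2 KB" means 2,048 bytes; `user_metadata_size` and `fits_metadata_limit` are hypothetical helper names, not part of any AWS SDK.

```python
# Assumption: the 2 KB limit is interpreted as 2,048 bytes.
USER_METADATA_LIMIT_BYTES = 2 * 1024


def user_metadata_size(metadata: dict) -> int:
    """Return the combined UTF-8 byte length of all keys and values."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in metadata.items()
    )


def fits_metadata_limit(metadata: dict) -> bool:
    """True if the metadata stays within the assumed 2,048-byte limit."""
    return user_metadata_size(metadata) <= USER_METADATA_LIMIT_BYTES


# Hypothetical metadata sets used only for illustration.
small = {"department": "finance", "owner": "alice"}
large = {"notes": "x" * 3000}

print(user_metadata_size(small), fits_metadata_limit(small))
print(user_metadata_size(large), fits_metadata_limit(large))
```

Anything larger than the limit is usually better kept inside the object itself or in a separate index.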
The "Big Idea"
In traditional computing, storage was simply a hard drive attached to a server. In the cloud, storage is disaggregated and specialized. Choosing the right storage is not just about capacity; it is a trade-off between how the OS sees the data (Block), how many users share the data (File), and how much metadata you need to manage (Object).
Formula / Concept Box
| Characteristic | Object (S3) | Block (EBS) | File (EFS/FSx) |
|---|---|---|---|
| Unit of Data | Object (Key/Value) | Fixed-size Block | File (Hierarchical) |
| Access Method | HTTP/HTTPS (REST API) | Attached volume presented to the OS as a block device (Fibre Channel/iSCSI in on-premises SANs) | Network Protocol (NFS/SMB) |
| Scalability | Virtually Unlimited | Provisioned Volume Size (can be increased later) | Elastic/Auto-scaling |
| Best Use Case | Static Media, Backups | OS Boot, Databases | Shared Home Directories |
| AWS Service | Amazon S3 | Amazon EBS | Amazon EFS / FSx |
Hierarchical Outline
- Block Storage (The "Local" Disk)
  - Characteristics: Formatted and managed by the OS file system (NTFS, ext4); low latency; capacity provisioned per volume.
  - AWS Implementation: Amazon EBS (Persistent) and Instance Store (Ephemeral).
  - Performance: Measured in IOPS and Throughput.
- Object Storage (The "Web" Storage)
  - Characteristics: Flat namespace (key-value); metadata-rich; accessible via URL.
  - AWS Implementation: Amazon S3.
  - Features: Lifecycle rules, Versioning, and Storage Classes (Standard, Glacier).
- File Storage (The "Shared" Drive)
  - Characteristics: Hierarchical (folders/subfolders); simultaneous multi-instance access.
  - AWS Implementation:
    - EFS: Linux/NFS focus.
    - FSx: Specialized for Windows (SMB), Lustre (HPC), or NetApp ONTAP.
Visual Anchors
Storage Selection Logic
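One way to express the selection logic is as a small decision function. The sketch below is an illustrative simplification built only from the comparisons in this guide; the `Requirements` fields and `choose_storage` are hypothetical names, not an AWS API, and real decisions also weigh cost, throughput, and durability.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    shared_file_access: bool = False  # many instances mount the same directory tree
    low_latency_disk: bool = False    # OS boot volume or database files
    persistent: bool = True           # data must survive an instance stop
    protocol: str = "nfs"             # "nfs", "smb", or "lustre" (file storage only)


def choose_storage(req: Requirements) -> str:
    """Map a set of requirements to the storage service suggested by this guide."""
    if req.shared_file_access:
        # File storage: pick the service by network protocol.
        if req.protocol == "smb":
            return "Amazon FSx for Windows File Server"
        if req.protocol == "lustre":
            return "Amazon FSx for Lustre"
        return "Amazon EFS"
    if req.low_latency_disk:
        # Block storage: persistent volume versus ephemeral instance disk.
        return "Amazon EBS" if req.persistent else "EC2 Instance Store"
    # Everything else (backups, static media) defaults to object storage.
    return "Amazon S3"


print(choose_storage(Requirements(shared_file_access=True)))
print(choose_storage(Requirements(low_latency_disk=True)))
print(choose_storage(Requirements()))
```

Shared access is checked first because it rules out a standard single-instance EBS attachment, which mirrors the reasoning in the worked examples.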
Visualization of Data Structures
\begin{center}
\begin{tikzpicture}
  % Block Storage Box
  \draw[thick] (0,3) rectangle (3,5);
  \node at (1.5,5.3) {\textbf{Block Storage (EBS)}};
  \draw (0.2,3.2) rectangle (0.8,3.8);
  \draw (1.2,3.2) rectangle (1.8,3.8);
  \draw (2.2,3.2) rectangle (2.8,3.8);
  \draw (0.2,4.2) rectangle (0.8,4.8);
  \draw (1.2,4.2) rectangle (1.8,4.8);
  \draw (2.2,4.2) rectangle (2.8,4.8);
  \node[scale=0.7] at (1.5,2.8) {Fixed Segments (Blocks)};
  % Object Storage Box
  \draw[thick] (5,3) rectangle (8,5);
  \node at (6.5,5.3) {\textbf{Object Storage (S3)}};
  \draw[fill=gray!20] (5.5,3.5) circle (0.4cm) node[scale=0.6] {Data};
  \draw[fill=gray!20] (7.2,4.3) circle (0.5cm) node[scale=0.6] {Metadata};
  \node[scale=0.7] at (6.5,2.8) {Flat Namespace (ID/Key)};
\end{tikzpicture}
\end{center}
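The two boxes in the figure can be mimicked in a few lines of Python. This is only a toy model: `BLOCK_SIZE`, `to_blocks`, `object_store`, and `put_object` are hypothetical names chosen for illustration, and the 4-byte block size is deliberately tiny so the output is easy to read.

```python
BLOCK_SIZE = 4  # bytes; unrealistically small, chosen for readability


def to_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Block view: split data into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


# Object view: a flat mapping from key to data plus metadata.
object_store = {}


def put_object(key: str, data: bytes, metadata: dict) -> None:
    """Store the whole object under one key; there are no real folders."""
    object_store[key] = {"data": data, "metadata": dict(metadata)}


print(to_blocks(b"hello world"))
put_object("reports/report.pdf", b"hello world", {"owner": "finance"})
print(object_store["reports/report.pdf"]["metadata"])
```

The block view knows nothing about what the bytes mean, while the object view keeps the data whole and carries its description with it.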
Definition-Example Pairs
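- Ephemeral Storage: Temporary storage whose data is lost when the instance stops, hibernates, or terminates (it survives a reboot).
  - Example: EC2 Instance Store used for swap files or temporary caches.
- Write-Once/Read-Many (WORM): Data that cannot be altered once written.
  - Example: S3 Object Lock, used for regulatory compliance to prevent objects from being deleted or overwritten during a retention period.
- Data Striping (RAID 0): Spreading data across multiple disks to increase performance. RAID 0 adds no redundancy, so losing one disk loses the array.
  - Example: Striping two 500 GiB EBS volumes into a single 1,000 GiB RAID 0 array, which can roughly double the IOPS and throughput available to a database (subject to the instance's own EBS limits).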
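The striping idea and its arithmetic can be sketched as follows. The per-volume figure of 3,000 IOPS is an assumed value for illustration only, and `stripe` and `raid0_totals` are hypothetical helpers rather than a real RAID implementation.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int) -> list:
    """Distribute consecutive chunks of data across disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for index, start in enumerate(range(0, len(data), stripe_size)):
        disks[index % num_disks].append(data[start:start + stripe_size])
    return disks


def raid0_totals(volume_gib: int, volume_iops: int, num_volumes: int) -> tuple:
    """Upper-bound capacity and IOPS of a RAID 0 array of identical volumes."""
    return volume_gib * num_volumes, volume_iops * num_volumes


# Eight bytes striped over two disks in 2-byte chunks.
print(stripe(b"ABCDEFGH", num_disks=2, stripe_size=2))

# Two 500 GiB volumes, each assumed to deliver 3,000 IOPS.
capacity_gib, total_iops = raid0_totals(500, 3000, 2)
print(capacity_gib, total_iops)
```

The totals are a best case: the instance's own EBS bandwidth and IOPS limits can cap what the array actually delivers.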
Worked Examples
Example 1: Constructing an S3 URL
You have a file named report.pdf in a bucket named finance-dept in the us-east-1 Region. What is the virtual-hosted-style URL?
- Pattern: https://bucket-name.s3.region.amazonaws.com/key-name
- Result: https://finance-dept.s3.us-east-1.amazonaws.com/report.pdf
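The same pattern can be written as a short helper. This is a minimal sketch: `virtual_hosted_url` is a hypothetical function name, it only formats a string and does not check that the bucket or object exists, and percent-encoding the key with `urllib.parse.quote` is an added convenience for keys that contain spaces.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL; slashes in the key are left as-is."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(virtual_hosted_url("finance-dept", "us-east-1", "report.pdf"))
print(virtual_hosted_url("finance-dept", "us-east-1", "reports/2024 Q1.pdf"))
```

The bucket name becomes part of the hostname, which is what distinguishes this style from a path-style URL.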
Example 2: Shared Storage for a Web Fleet
A fleet of 10 Linux EC2 instances needs to share a common directory of user-uploaded images.
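- Solution: Mount one Amazon EFS file system on all 10 instances.
- Reasoning: S3 is object storage reached through an API rather than a mountable POSIX file system, so it is a poor fit for applications that expect a shared directory with frequent small writes. An EBS volume can normally be attached to only one instance at a time; the exception is Multi-Attach on Provisioned IOPS (io1/io2) volumes, which is limited to instances in the same Availability Zone and needs a cluster-aware file system to avoid data corruption.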
Checkpoint Questions
- Which storage type provides a "flat surface" for data and uses a Unique ID for retrieval?
- True or False: Amazon EBS volumes provide durability by automatically replicating data across multiple Availability Zones.
- What is the standard delimiter used in S3 to mimic a directory structure?
- If you need a high-performance filesystem for a Windows-based application using the SMB protocol, which service should you choose?
Answers
- Object Storage (Amazon S3).
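- False. An EBS volume is replicated automatically only within its own Availability Zone. S3, by contrast, stores objects redundantly across multiple AZs in most storage classes (the One Zone classes are the exception).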
- The forward slash ( / ).
- Amazon FSx for Windows File Server.
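The delimiter answer above can be illustrated with a small simulation of how a flat key list is grouped into folder-like prefixes. This is a simplified imitation of a prefix-and-delimiter listing, not the S3 API itself; `list_with_delimiter` and the sample keys are hypothetical.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) for keys directly under the prefix."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts like a subfolder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys in a single flat bucket.
keys = ["report.pdf", "2024/q1.csv", "2024/q2.csv", "2023/summary.txt"]

print(list_with_delimiter(keys))
print(list_with_delimiter(keys, prefix="2024/"))
```

No folder is ever created here; the hierarchy exists only in how the key strings are interpreted.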