AWS Storage Services and Replication Strategies: SAP-C02 Study Guide

AWS storage services and replication strategies (for example Amazon S3, Amazon RDS, Amazon ElastiCache)

This guide covers the architectural patterns for storage and data replication on AWS, focusing on high availability, disaster recovery, and performance optimization as required for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.

Learning Objectives

By the end of this module, you should be able to:

  • Distinguish between Multi-AZ (High Availability) and Multi-Region (Disaster Recovery) replication strategies.
  • Select the appropriate AWS storage service based on RTO/RPO requirements and access patterns.
  • Implement replication for Amazon S3, RDS, and DynamoDB to meet global availability goals.
  • Evaluate the role of Amazon ElastiCache and MemoryDB in accelerating data store performance.
  • Design hybrid storage solutions using AWS Storage Gateway and DataSync.

Key Terms & Glossary

  • RPO (Recovery Point Objective): The maximum acceptable amount of data loss measured in time (e.g., "We can afford to lose 5 minutes of data").
  • RTO (Recovery Time Objective): The maximum acceptable downtime to restore service after a failure (e.g., "The system must be back up in 30 minutes").
  • Synchronous Replication: A write is acknowledged only after it has been committed to both the primary and the secondary location, so an acknowledged write survives the loss of either copy (e.g., RDS Multi-AZ).
  • Asynchronous Replication: A write is acknowledged once it reaches the primary and is copied to the secondary after a short delay. This gives lower write latency, but the most recent writes can be lost if the primary fails before they are copied (e.g., RDS Read Replicas, S3 CRR).
  • Zonal Service: A service where resources are tied to a specific Availability Zone (e.g., EC2, EBS).
  • Regional Service: A service that inherently spans multiple AZs within a region (e.g., S3, DynamoDB).
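The difference between synchronous and asynchronous replication can be illustrated with a toy model. The sketch below is not an AWS API; `ReplicatedStore` and its methods are hypothetical names, and the "secondary" is just a Python list. It shows which acknowledged writes would be missing from the secondary if the primary failed at that moment.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedStore:
    """Toy model of a primary with one secondary copy (hypothetical, not an AWS API)."""
    synchronous: bool
    primary: list = field(default_factory=list)
    secondary: list = field(default_factory=list)

    def write(self, record: str) -> None:
        self.primary.append(record)
        if self.synchronous:
            # Synchronous: the secondary holds the record before the write is acknowledged.
            self.secondary.append(record)

    def replicate_pending(self) -> None:
        # Asynchronous: a background step copies whatever the secondary is missing.
        self.secondary.extend(self.primary[len(self.secondary):])

    def lost_on_primary_failure(self) -> list:
        # Records acknowledged to the client but not yet present on the secondary.
        return self.primary[len(self.secondary):]


sync_store = ReplicatedStore(synchronous=True)
async_store = ReplicatedStore(synchronous=False)

for store in (sync_store, async_store):
    store.write("order-1")
    store.write("order-2")

async_store.replicate_pending()   # the secondary catches up to order-2
async_store.write("order-3")      # acknowledged, but not yet replicated
sync_store.write("order-3")

print(sync_store.lost_on_primary_failure())   # []
print(async_store.lost_on_primary_failure())  # ['order-3']
```

The unreplicated tail of the asynchronous store is what the RPO has to tolerate: the longer the replication delay, the more acknowledged writes can be lost.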

The "Big Idea"

In AWS architecture, storage is not just about capacity; it is about geography. The Solutions Architect's primary challenge is balancing the Consistency of data against the Availability and Latency of that data. While Multi-AZ setups protect against data center failures, Multi-Region architectures are required for catastrophic regional events or to serve a global user base with minimal latency.

Formula / Concept Box

| Concept | Primary Goal | Replication Type | Typical Service |
| --- | --- | --- | --- |
| Multi-AZ | High Availability (HA) | Synchronous | RDS (Multi-AZ Deployment) |
| Read Replicas | Read Scaling / Performance | Asynchronous | RDS / Aurora |
| Cross-Region (CRR) | Disaster Recovery (DR) | Asynchronous | S3, RDS, DynamoDB |
| In-Memory Cache | Sub-millisecond Latency | N/A (Volatile) | ElastiCache (Memcached) |

Hierarchical Outline

  1. Amazon S3 Replication
    • CRR (Cross-Region Replication): Automatic, asynchronous copying of objects across buckets in different regions.
    • SRR (Same-Region Replication): Copying objects within the same region (e.g., for log aggregation).
  2. Database Replication (RDS & Aurora)
    • Multi-AZ: Synchronous standby in a different AZ for failover; no performance benefit for reads.
    • Read Replicas: Asynchronous; up to 15 replicas per Aurora cluster, and 5 to 15 per RDS instance depending on the database engine, for scaling reads.
    • Global Database (Aurora): Low-latency global reads and fast DR.
  3. In-Memory Storage
    • ElastiCache: Caching layer (Redis/Memcached) to reduce database load.
    • MemoryDB for Redis: Durable, Multi-AZ in-memory database for microsecond reads.
  4. Hybrid Storage & Migration
    • Storage Gateway: Connects on-premises to S3 (File, Volume, Tape modes).
    • DataSync: Online data transfer for migration or recurring syncs.
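Item 3 of the outline describes ElastiCache as a caching layer that reduces database load. One common way to use such a layer is the cache-aside pattern, sketched below with plain dictionaries standing in for both the cache and the database. The class name `CacheAside`, the TTL value, and the keys are illustrative assumptions; a real deployment would use a Redis or Memcached client instead of a dict.

```python
import time


class CacheAside:
    """Cache-aside reads in front of a slower data store (both are plain dicts here)."""

    def __init__(self, database: dict, ttl_seconds: float = 60.0):
        self.database = database
        self.ttl_seconds = ttl_seconds
        self.cache = {}          # key -> (value, expiry timestamp)
        self.database_reads = 0

    def get(self, key):
        entry = self.cache.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                      # cache hit: no database read
        self.database_reads += 1                 # cache miss: fall through to the database
        value = self.database[key]
        self.cache[key] = (value, time.monotonic() + self.ttl_seconds)
        return value

    def put(self, key, value):
        self.database[key] = value               # write to the system of record
        self.cache.pop(key, None)                # invalidate so the next read refetches


store = CacheAside({"user:1": "Ada"})
store.get("user:1")
store.get("user:1")
print(store.database_reads)   # 1
store.put("user:1", "Grace")
print(store.get("user:1"))    # Grace
```

Because the cache is volatile, the database stays the source of truth: losing the cache only costs extra database reads, not data.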

Visual Anchors

Database Replication Flow

[Diagram not available: database replication flow]

S3 Replication Logic

[Diagram not available: S3 replication logic]

Definition-Example Pairs

  • Cross-Region Replication (CRR):
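    • Definition: A bucket-level configuration that automatically and asynchronously replicates newly uploaded objects (all of them, or a subset selected by prefix or tag) to a destination bucket in a different AWS Region. Versioning must be enabled on both the source and destination buckets.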
    • Example: A financial firm stores transaction logs in us-east-1 and uses CRR to replicate them to us-west-2 to meet compliance requirements for geographic data redundancy.
  • Cached Volume Gateway:
    • Definition: A Storage Gateway mode that stores the full dataset in S3 while keeping frequently accessed data in a local cache.
    • Example: A corporate office with limited local storage uses Cached Volumes to provide users fast access to current project files while archiving terabytes of older data in S3.
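To make the CRR definition above concrete, here is a minimal sketch of what a single-rule replication configuration might look like when built in Python. The field names follow the S3 replication configuration schema; the function name `build_crr_configuration`, the rule ID, the account number, the role name, and the bucket names are all hypothetical placeholders.

```python
import json


def build_crr_configuration(role_arn: str, destination_bucket_arn: str, prefix: str = "") -> dict:
    """Build a replication configuration with a single rule.

    All ARNs passed in below are hypothetical placeholders.
    """
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-to-dr-region",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # an empty prefix matches every key
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


config = build_crr_configuration(
    role_arn="arn:aws:iam::111122223333:role/example-replication-role",
    destination_bucket_arn="arn:aws:s3:::example-logs-us-west-2",
    prefix="transactions/",
)
print(json.dumps(config, indent=2))
```

With boto3, a dictionary of this shape is what the `ReplicationConfiguration` argument of `put_bucket_replication` expects. The destination Region is implied by where the destination bucket lives, since S3 bucket ARNs carry no Region.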

Worked Examples

Problem: Multi-Region Disaster Recovery for RDS

Scenario: A company uses Amazon RDS for PostgreSQL. They require an RTO of 4 hours and an RPO of 15 minutes in the event of a regional failure.

Step-by-Step Breakdown:

  1. Identify the Replication Method: The requirement is to survive a regional failure, so Multi-AZ alone is insufficient (it only covers an AZ failure). Use a cross-Region Read Replica.
  2. Configure Replication: Create a Read Replica in the recovery region (us-west-2).
  3. Monitor Lag: Use the CloudWatch ReplicaLag metric (reported in seconds) and alarm well before the asynchronous delay approaches the 15-minute RPO (900 seconds).
  4. Failover Procedure:
    • In a disaster, manually promote the Read Replica to a standalone DB instance.
    • Update the Application configuration or Route 53 CNAME to point to the new DB endpoint in the recovery region.
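The arithmetic behind steps 3 and 4 can be sketched as follows. The lag samples and the per-step time estimates are hypothetical numbers chosen for illustration; in practice the samples would come from the CloudWatch ReplicaLag metric and the durations from rehearsed failover drills.

```python
RPO_SECONDS = 15 * 60        # 15-minute RPO from the scenario
RTO_SECONDS = 4 * 60 * 60    # 4-hour RTO from the scenario


def rpo_breaches(replica_lag_samples: list, rpo_seconds: int = RPO_SECONDS) -> list:
    """Return the ReplicaLag samples (in seconds) that exceed the RPO."""
    return [lag for lag in replica_lag_samples if lag > rpo_seconds]


def fits_rto(step_durations: dict, rto_seconds: int = RTO_SECONDS) -> bool:
    """Check that the summed failover steps fit inside the RTO."""
    return sum(step_durations.values()) <= rto_seconds


# Hypothetical lag samples, as if read from the CloudWatch ReplicaLag metric.
samples = [12, 45, 30, 1100, 20]
print(rpo_breaches(samples))    # [1100]

# Hypothetical time estimates (seconds) for each failover step.
failover_plan = {
    "detect_and_decide": 30 * 60,
    "promote_read_replica": 20 * 60,
    "update_dns_or_config": 10 * 60,
    "validate_application": 30 * 60,
}
print(fits_rto(failover_plan))  # True
```

A single sample above 900 seconds means that a regional failure at that moment would have lost more than 15 minutes of data, so the alarm threshold should sit comfortably below the RPO.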

Checkpoint Questions

  1. What is the main difference between RDS Multi-AZ and RDS Read Replicas regarding data consistency?
  2. Which Storage Gateway mode should you use if you want to replace physical tape libraries for long-term archiving?
  3. True or False: Amazon ElastiCache is primarily used as a persistent data store for critical records.
  4. How does Amazon Aurora Global Database improve RTO compared to standard RDS Read Replicas?

Muddy Points & Cross-Refs

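  • ElastiCache vs. MemoryDB: This is a common point of confusion. Remember: ElastiCache is a cache (volatile by default, high speed), while MemoryDB is a database (durable, backed by a Multi-AZ transactional log). MemoryDB reads are still served from memory with microsecond latency, but writes are slower (single-digit milliseconds) because they must be committed to the log before they are acknowledged.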
  • S3 Replication vs. S3 Sync: S3 Replication (CRR/SRR) is a continuous service feature. aws s3 sync is a CLI command that compares directories and copies differences; it is not an automated, server-side background process.
  • Cross-Ref: For routing traffic after a storage failover, refer to Route 53 Routing Policies (Failover/Latency).
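The "compare and copy differences" behavior described for aws s3 sync can be sketched as a one-off comparison of two listings. This is a simplified model under stated assumptions (a key is copied when it is missing, its size differs, or the source copy is newer), not a reimplementation of the CLI; `plan_sync` and the listings are hypothetical.

```python
def plan_sync(source: dict, destination: dict) -> list:
    """Return the keys a one-off sync would copy.

    Each listing maps key -> (size_in_bytes, last_modified_epoch). A key is
    copied when it is missing from the destination, its size differs, or the
    source copy is newer. This comparison rule is a simplifying assumption.
    """
    to_copy = []
    for key, (size, modified) in sorted(source.items()):
        existing = destination.get(key)
        if existing is None or existing[0] != size or existing[1] < modified:
            to_copy.append(key)
    return to_copy


source_listing = {
    "logs/a.txt": (100, 1_700_000_000),
    "logs/b.txt": (250, 1_700_000_500),
    "logs/c.txt": (80, 1_700_000_900),
}
destination_listing = {
    "logs/a.txt": (100, 1_700_000_000),   # identical: skipped
    "logs/b.txt": (200, 1_700_000_100),   # size differs: copied
}
print(plan_sync(source_listing, destination_listing))
# ['logs/b.txt', 'logs/c.txt']
```

The key contrast with S3 Replication is that nothing happens until someone runs the comparison again, whereas a replication rule copies new objects as they arrive without any client involvement.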

Comparison Tables

Purpose-Built Database Comparison

| Service | Type | Key Use Case | Replication Capability |
| --- | --- | --- | --- |
| RDS | Relational (SQL) | Traditional ERP/CRM | Multi-AZ (Sync), Read Replicas (Async) |
| DynamoDB | NoSQL (KV) | High-scale web apps | Global Tables (Multi-Active, Async) |
| ElastiCache | In-Memory | Session storage, caching | Redis Replication Groups |
| QLDB | Ledger | Immutable transaction logs | Regional (Journal-based) |
| Timestream | Time-series | IoT/Telemetry data | Multi-AZ managed by AWS |

[!IMPORTANT] For the SAP-C02 exam, always check if the requirement is for High Availability (within a region) or Disaster Recovery (across regions). Multi-AZ is the answer for HA; Cross-Region Replication or Global Tables is the answer for DR.
