Selecting the Appropriate AWS Backup and Archival Solution

This guide covers the essential strategies and services for ensuring data durability, availability, and long-term retention within the AWS ecosystem, specifically tailored for the SAA-C03 exam.

Learning Objectives

After studying this guide, you will be able to:

  • Differentiate between backup (recovery) and archival (long-term storage) use cases.
  • Evaluate retrieval tiers for S3 Glacier based on cost and time constraints.
  • Define and apply Recovery Time Objective (RTO) and Recovery Point Objective (RPO) to architecture designs.
  • Select the appropriate backup mechanism for EBS, RDS, EFS, and S3.
  • Implement automated data lifecycle policies and centralized backup management.

Key Terms & Glossary

  • RPO (Recovery Point Objective): The maximum acceptable period of data loss measured in time (e.g., "We can afford to lose 5 minutes of data").
  • RTO (Recovery Time Objective): The maximum acceptable time to recover and resume processing after a failure.
  • Snapshot: A point-in-time, incremental backup of a storage volume (EBS) or database (RDS) stored in S3.
  • WORM (Write Once Read Many): A data storage technology that prevents the erasure or modification of data once written; enforced in AWS via S3 Object Lock or Glacier Vault Lock.
  • Versioning: An S3 feature that preserves multiple variants of an object in the same bucket to protect against accidental deletes or overwrites.

The "Big Idea"

In AWS, backup is about operational resilience—getting your system back online quickly with minimal data loss. Archival is about cost-optimization and compliance—moving data you rarely access to the cheapest possible medium while meeting legal retention requirements. A successful Solutions Architect balances the "Three Cs": Cost, Compliance, and Continuity.

Formula / Concept Box

| Concept | Description | Metric/Rule |
| --- | --- | --- |
| S3 Versioning | Prevents accidental deletion | Required for CRR/SRR |
| RPO | Data loss tolerance | $\text{Time}_{\text{Failure}} - \text{Time}_{\text{Last Backup}}$ |
| RTO | Downtime tolerance | $\text{Time}_{\text{System Up}} - \text{Time}_{\text{Failure}}$ |
| Glacier Instant | Millisecond retrieval | Best for medical images/active archives |
| Glacier Flexible | 1 min - 12 hours retrieval | Best for backups needed in 1 day |
| Glacier Deep Archive | 12 - 48 hours retrieval | Cheapest; for 7-10 year compliance |
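
The RPO and RTO rows reduce to timestamp arithmetic. The following is a minimal sketch of that arithmetic, assuming a hypothetical incident timeline; the function names and sample timestamps are illustrative and not part of any AWS API.

```python
from datetime import datetime, timedelta


def data_loss(last_backup: datetime, failure: datetime) -> timedelta:
    """Measured data-loss window: from the last good backup to the failure."""
    return failure - last_backup


def downtime(failure: datetime, system_up: datetime) -> timedelta:
    """Measured downtime: from the failure to service restoration."""
    return system_up - failure


def meets_objectives(last_backup, failure, system_up, rpo, rto) -> bool:
    """True when both measured windows fit within the stated objectives."""
    return (data_loss(last_backup, failure) <= rpo
            and downtime(failure, system_up) <= rto)


# Hypothetical incident timeline (illustrative values only).
last_backup = datetime(2024, 1, 1, 4, 0)
failure = datetime(2024, 1, 1, 6, 0)
system_up = datetime(2024, 1, 1, 9, 0)

ok = meets_objectives(
    last_backup, failure, system_up,
    rpo=timedelta(hours=2), rto=timedelta(hours=4),
)
print(data_loss(last_backup, failure), downtime(failure, system_up), ok)
```

In this timeline the loss window is two hours and the outage is three hours, so an RPO of 2 hours and an RTO of 4 hours are both satisfied.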

Hierarchical Outline

  1. Object Storage Backup (Amazon S3)
    • Versioning: Protects against overwrites; keeps all historical states.
    • Replication:
      • SRR (Same-Region): For log aggregation or dev/test sync.
      • CRR (Cross-Region): For disaster recovery and geographic compliance.
  2. Block & File Storage Backup
    • EBS Snapshots: Incremental backups; managed via Data Lifecycle Manager (DLM).
    • EFS Backup: Managed via AWS Backup for file-level recovery.
  3. Database Backup (RDS)
    • Automated Backups: Daily snapshots + transaction logs (5-min RPO).
    • Manual Snapshots: User-initiated; persist even after DB instance deletion.
  4. Centralized Governance
    • AWS Backup: A fully managed service to centralize and automate backups across 12+ AWS services.
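
For the S3 part of this outline, retention is commonly automated with a lifecycle rule that moves aging objects into archival classes. The sketch below only builds such a rule as a Python dictionary, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the rule ID, prefix, and day thresholds are assumptions chosen for illustration, and nothing is sent to AWS.

```python
import json


def build_archive_rule(prefix: str, to_glacier_days: int,
                       to_deep_archive_days: int,
                       expire_noncurrent_days: int) -> dict:
    """Build an S3 lifecycle configuration with one tiering rule."""
    if to_deep_archive_days <= to_glacier_days:
        raise ValueError("Deep Archive transition must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-old-backups",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # "GLACIER" is the API name for S3 Glacier Flexible Retrieval.
                    {"Days": to_glacier_days, "StorageClass": "GLACIER"},
                    {"Days": to_deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # With versioning enabled, old versions accumulate; expire them eventually.
                "NoncurrentVersionExpiration": {"NoncurrentDays": expire_noncurrent_days},
            }
        ]
    }


# Illustrative thresholds: archive after 30 days, deep-archive after 180.
config = build_archive_rule("backups/", 30, 180, 365)
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, this dictionary would be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`.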

Visual Anchors

Backup Strategy Decision Flow

[Diagram: backup strategy decision flow; not rendered in this copy]

RTO vs RPO Visualization

\begin{tikzpicture}[scale=0.8]
  % Timeline
  \draw[thick, ->] (0,0) -- (12,0) node[right] {Time};
  % Disaster point
  \filldraw[red] (6,0) circle (3pt) node[above=5pt] {\textbf{Disaster}};
  \draw[red, dashed] (6,0) -- (6,-2);
  % RPO
  \draw[blue, thick, <->] (4,-0.5) -- (6,-0.5);
  \node[blue, below] at (5,-0.5) {RPO (Data Loss)};
  \draw[blue, dashed] (4,0) -- (4,-1) node[below] {Last Backup};
  % RTO
  \draw[orange, thick, <->] (6,-1.5) -- (9,-1.5);
  \node[orange, below] at (7.5,-1.5) {RTO (Downtime)};
  \draw[orange, dashed] (9,0) -- (9,-2) node[below] {Service Restored};
\end{tikzpicture}

Definition-Example Pairs

  • Point-in-Time Recovery (PITR): The ability to restore a database to any specific second within a retention period.
    • Example: A developer accidentally runs DROP TABLE at 10:05 AM; the admin uses the automated backups and transaction logs to restore the database to its 10:04 AM state on a new RDS instance.
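  • Vault Lock Policy: A WORM policy for S3 Glacier vaults that becomes immutable once the lock is completed. After initiating the lock, you have 24 hours to test the policy and abort it before it is made permanent.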
    • Example: A financial firm must store tax records for 7 years and ensure they cannot be deleted even by the Root user to satisfy SEC regulations.
  • Incremental Snapshot: A backup that only saves the blocks that have changed since the last snapshot.
    • Example: If you have a 100 GB EBS volume but only 2 GB changed today, the next snapshot stores only those 2 GB, so you are billed for about 2 GB of additional snapshot storage.
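
The incremental snapshot example can be modeled with a toy block map. This is a simplified sketch of the idea, not a description of how EBS works internally: the `Volume` class, the one-block-per-GB granularity, and the assumption that the volume is fully written before the first snapshot are all hypothetical.

```python
class Volume:
    """Toy volume that tracks which blocks changed since the last snapshot."""

    def __init__(self, size_gb: int):
        self.size_gb = size_gb
        # Assume a fully written volume: before the first snapshot,
        # every block counts as changed (one block per GB, for simplicity).
        self.dirty = set(range(size_gb))

    def write(self, block: int) -> None:
        """Mark a block as changed."""
        if not 0 <= block < self.size_gb:
            raise IndexError("block outside volume")
        self.dirty.add(block)

    def snapshot(self) -> int:
        """Return the GB newly stored by this snapshot, then reset tracking."""
        stored = len(self.dirty)
        self.dirty.clear()
        return stored


vol = Volume(100)
first = vol.snapshot()   # first snapshot copies every block
vol.write(3)
vol.write(42)
vol.write(3)             # rewriting the same block still counts once
second = vol.snapshot()  # only the changed blocks are stored
print(first, second)
```

The first snapshot stores all 100 blocks and the second stores only the 2 that changed, which mirrors the 100 GB / 2 GB example above.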

Worked Examples

Example 1: Choosing an Archival Tier

Scenario: A media company has 500 TB of raw footage. They rarely access it, but if a producer requests a clip, they need it within 5-10 minutes. Which storage class is best?

  • Analysis:
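    • S3 Glacier Deep Archive: Too slow (standard retrieval takes up to 12 hours).
    • S3 Glacier Flexible Retrieval: Standard retrieval takes 3-5 hours; expedited retrieval takes 1-5 minutes but costs more and is not guaranteed during peak demand unless provisioned capacity is purchased.
    • S3 Glacier Instant Retrieval: Provides millisecond access at a lower storage cost than S3 Standard-IA, in exchange for higher per-GB retrieval charges.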
  • Solution: S3 Glacier Instant Retrieval balances the need for low-cost "cold" storage with the requirement for rapid retrieval.
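
The elimination reasoning in this example can be sketched as a lookup over retrieval deadlines. The retrieval times below are the rough upper bounds quoted in this guide and deliberately ignore expedited retrievals; the cost ranks are an assumption for illustration, not AWS pricing.

```python
from datetime import timedelta

# (storage class, upper bound on retrieval time, storage cost rank: 1 = cheapest)
TIERS = [
    ("S3 Glacier Deep Archive", timedelta(hours=48), 1),
    ("S3 Glacier Flexible Retrieval", timedelta(hours=12), 2),
    ("S3 Glacier Instant Retrieval", timedelta(seconds=1), 3),
]


def cheapest_tier(max_wait: timedelta) -> str:
    """Pick the cheapest tier whose retrieval bound fits the deadline."""
    candidates = [(rank, name) for name, wait, rank in TIERS if wait <= max_wait]
    if not candidates:
        raise ValueError("no archival tier meets this retrieval deadline")
    return min(candidates)[1]


print(cheapest_tier(timedelta(minutes=10)))  # the media company's requirement
print(cheapest_tier(timedelta(days=3)))      # a compliance archive that can wait
```

For the 10-minute deadline only Instant Retrieval qualifies, while a three-day deadline allows the cheapest class, Deep Archive.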

Example 2: Disaster Recovery for RDS

Scenario: A company requires an RPO of 2 hours and an RTO of 4 hours for their production database across regions.

  • Analysis:
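    • By default, RDS automated backups and snapshots stay in the same region as the DB instance.
    • To achieve cross-region recovery, snapshots must be copied to a secondary region.
  • Solution: Enable Automated Backups, and use AWS Backup or a scheduled Lambda function to take a snapshot and copy it to a different region often enough that the newest copy in the secondary region is never more than 2 hours old, which satisfies the RPO. The 4-hour RTO is met only if restoring from the copied snapshot in the secondary region completes within 4 hours, which should be confirmed with a test restore.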
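
A snapshot protects against a regional outage only once its cross-region copy has finished, so the copy time counts against the RPO. The sketch below checks a schedule under the simplifying assumption of a fixed copy duration; the function names and the 30-minute figure are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta,
                         copy_duration: timedelta) -> timedelta:
    """Worst case: the region fails just before the newest copy completes."""
    return snapshot_interval + copy_duration


def max_interval_for_rpo(rpo: timedelta, copy_duration: timedelta) -> timedelta:
    """Longest snapshot interval that keeps the worst case within the RPO."""
    if copy_duration >= rpo:
        raise ValueError("the copy alone takes longer than the RPO allows")
    return rpo - copy_duration


rpo = timedelta(hours=2)
copy_duration = timedelta(minutes=30)  # assumed; depends on snapshot size and regions

fits = worst_case_data_loss(timedelta(hours=2), copy_duration) <= rpo
print(fits)
print(max_interval_for_rpo(rpo, copy_duration))
```

Under these assumptions, a strict 2-hour snapshot interval leaves no room for the copy, so the schedule would need to run at least every 90 minutes.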

Checkpoint Questions

  1. Which S3 feature is a prerequisite for enabling Cross-Region Replication (CRR)?
  2. You need to back up EFS, RDS, and EBS from a single console. What service do you use?
  3. What is the main difference between a "Manual Snapshot" and an "Automated Snapshot" in Amazon RDS when you delete the instance?
  4. True or False: S3 Glacier Flexible Retrieval offers an "Expedited" tier that provides data in 1-5 minutes.
  5. How many availability zones is an EBS Snapshot stored in by default?

Answers

  1. Versioning must be enabled on both the source and destination buckets.
  2. AWS Backup.
  3. Manual snapshots are persisted after deletion; automated snapshots are deleted unless otherwise configured.
  4. True.
  5. Snapshots are stored in Amazon S3, which automatically replicates data across at least three AZs for high durability.
