Mastering AWS Database Architectures: SAP-C02 Study Guide

Databases (for example, Amazon DynamoDB, Amazon OpenSearch Service, Amazon RDS, self-managed databases on Amazon EC2)

This guide covers the critical database selection, design, and migration skills required for the AWS Certified Solutions Architect - Professional (SAP-C02) exam. We focus on choosing the right tool for the right job, ensuring high availability, and optimizing performance.

Learning Objectives

  • Evaluate business requirements to select the appropriate AWS database service (RDS, DynamoDB, Aurora, etc.).
  • Design high-availability and disaster recovery architectures for relational and non-relational workloads.
  • Implement caching and performance optimization strategies using ElastiCache and DAX.
  • Differentiate between managed services and self-managed databases on Amazon EC2.
  • Strategize database migrations using AWS DMS and SCT.

Key Terms & Glossary

  • ACID Compliance: Atomicity, Consistency, Isolation, Durability. Standard for relational databases (RDS/Aurora) to ensure reliable transactions.
  • Multi-AZ Deployment: A high-availability feature that provides synchronous data replication to a standby instance in a different Availability Zone.
  • Read Replica: Asynchronous replication used to scale read-heavy workloads. Available in RDS and Aurora.
  • Partition Key: The primary attribute in DynamoDB used to distribute data across physical storage partitions.
  • GSI (Global Secondary Index): An index whose partition key and optional sort key can differ from those of the base DynamoDB table.
  • DAX (DynamoDB Accelerator): An in-memory cache for DynamoDB that reduces response times for cached reads from milliseconds to microseconds.
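
To make the Partition Key entry concrete, here is a minimal sketch of hash-based partitioning. It is an illustration only: DynamoDB's real hash function and partition count are internal to the service, so `NUM_PARTITIONS`, `partition_for`, and the MD5-based hash are assumptions chosen for the demo.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count itself


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key value to a partition index using a stable hash.

    Stand-in for DynamoDB's internal hash, which is not public.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


# High-cardinality key (one value per user): items spread across all partitions.
by_user = distribution(f"user-{i}" for i in range(10_000))

# Low-cardinality key (two status values): every item lands on at most two partitions.
by_status = distribution("ACTIVE" if i % 2 else "INACTIVE" for i in range(10_000))
```

Comparing `by_user` with `by_status` shows why a high-cardinality partition key is usually preferred: the low-cardinality key concentrates traffic on a few partitions, the pattern commonly called a hot partition.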

The "Big Idea"

The core philosophy of AWS databases is "Purpose-Built." Instead of forcing every workload into a traditional relational database, architects must decompose applications into components that use the storage engine best suited for their access patterns (e.g., Key-Value for sessions, Relational for ERP, Graph for social links).

Formula / Concept Box

| Feature | Amazon RDS / Aurora | Amazon DynamoDB | Self-Managed (EC2) |
|---|---|---|---|
| Scaling | Vertical (instance size) + Read Replicas | Horizontal (partitioning) | Manual / Vertical |
| HA/DR | Multi-AZ (automatic failover) | Global Tables (active-active) | Manual configuration |
| Schema | Fixed / Structured | Flexible / NoSQL | Full control |
| Management | Fully managed (patching/backups) | Serverless | Customer managed |

Hierarchical Outline

  1. Relational Databases (SQL)
    • Amazon RDS: Managed service for MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Db2.
    • Amazon Aurora: Cloud-native relational database compatible with MySQL and PostgreSQL; AWS cites up to 5x the throughput of standard MySQL; up to 15 read replicas; self-healing storage.
  2. Non-Relational Databases (NoSQL)
    • Amazon DynamoDB: Key-value and document store; single-digit millisecond latency at any scale; serverless.
    • Amazon DocumentDB: MongoDB-compatible JSON store; decoupled compute and storage.
  3. Specialized Data Stores
    • Amazon Neptune: Graph database for highly connected datasets (social, fraud detection).
    • Amazon OpenSearch Service: Search and log analytics; the successor to Amazon Elasticsearch Service.
    • Amazon ElastiCache: In-memory caching (Redis/Memcached) for sub-millisecond data retrieval.
  4. Self-Managed Databases
    • Amazon EC2: Required when a DB engine is not supported by RDS, or when OS-level access/specific plugins are mandatory.
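
The outline above can be read as a decision procedure. The following is a rough sketch of that reading, not an official AWS selection tool: the `Workload` fields and the `suggest_service` function are hypothetical names, and real decisions also weigh cost, latency, and operational constraints.

```python
from dataclasses import dataclass

# Data model -> service, taken from the outline and the definition-example pairs.
SERVICE_BY_MODEL = {
    "relational": "Amazon RDS or Amazon Aurora",
    "key-value": "Amazon DynamoDB",
    "document": "Amazon DocumentDB",
    "graph": "Amazon Neptune",
    "search": "Amazon OpenSearch Service",
    "time-series": "Amazon Timestream",
    "cache": "Amazon ElastiCache",
}


@dataclass
class Workload:
    """Hypothetical requirement flags; not an AWS data structure."""
    data_model: str = "relational"
    needs_os_access: bool = False
    engine_supported_by_rds: bool = True


def suggest_service(workload: Workload) -> str:
    """Return a starting-point recommendation for the given workload."""
    # OS-level access or an unsupported engine rules out the managed options first.
    if workload.needs_os_access or not workload.engine_supported_by_rds:
        return "Self-managed database on Amazon EC2"
    try:
        return SERVICE_BY_MODEL[workload.data_model]
    except KeyError:
        raise ValueError(f"unknown data model: {workload.data_model!r}") from None


social_graph = suggest_service(Workload(data_model="graph"))
legacy_sql = suggest_service(Workload(data_model="relational", needs_os_access=True))
```

Checking the hard constraints before the data model mirrors the way exam scenarios are usually worded: a single phrase such as "OS-level access" overrides an otherwise obvious managed choice.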

Visual Anchors

Database Selection Flowchart

[Flowchart not rendered in this copy.]

RDS Multi-AZ vs. Read Replica

[Diagram not rendered in this copy.]
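
The contrast named in the heading above, synchronous Multi-AZ replication versus asynchronous Read Replica replication, can be sketched with a toy in-memory model. It only illustrates the ordering of events; the `Primary` class and its attributes are hypothetical and say nothing about how RDS is implemented.

```python
from collections import deque


class Primary:
    """Toy primary with a synchronous standby and an asynchronous read replica."""

    def __init__(self):
        self.data = {}
        self.standby = {}        # Multi-AZ standby: updated before the write returns
        self.replica = {}        # read replica: updated some time later
        self._pending = deque()  # changes not yet applied to the replica

    def write(self, key, value):
        self.data[key] = value
        self.standby[key] = value           # synchronous: part of the commit path
        self._pending.append((key, value))  # asynchronous: queued for later delivery

    def apply_replication(self):
        """Drain the queue, modelling the replica catching up."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value

    @property
    def replica_lag(self) -> int:
        """Number of committed changes the replica has not seen yet."""
        return len(self._pending)


db = Primary()
db.write("order-1", "PAID")
standby_copy = db.standby.get("order-1")  # already present on the standby
stale_read = db.replica.get("order-1")    # replica has not caught up yet
db.apply_replication()
fresh_read = db.replica.get("order-1")    # visible once replication is applied
```

The standby never misses a committed write, which is what makes it a safe failover target; in a Multi-AZ DB instance deployment it does not serve reads. The replica can serve reads, but a read issued before replication catches up may return stale data.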

Definition-Example Pairs

  • Amazon Neptune: A graph database designed for complex relationships. Example: A social media platform tracking "friends of friends" or a recommendation engine suggesting products based on shared interests.
  • Amazon OpenSearch Service: A search and analytics engine. Example: A retail website that lets customers filter and search millions of product descriptions in real time, or an operations team that indexes application logs for near-real-time analysis.
  • Amazon Timestream: A time-series database. Example: An IoT sensor grid recording temperature readings every second for historical trend analysis.

Worked Examples

Example 1: Global E-Commerce Shopping Cart

Scenario: A retailer needs a globally distributed shopping cart system that handles millions of requests during Black Friday.

Solution: Use Amazon DynamoDB with Global Tables.

  1. Enable Global Tables to replicate data across Regions asynchronously, typically within about a second.
  2. Use On-Demand Capacity Mode to handle the unpredictable spike in traffic without manual scaling.
  3. Implement DAX so that repeated, eventually consistent reads of the same items are served from memory at microsecond latency, even during peak load.
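
Steps 1 and 2 can be sketched as the request parameters an SDK call would carry. The dictionaries below follow the shape of DynamoDB's CreateTable and UpdateTable requests as used for the current global tables version; the table name, key attribute, and Regions are assumptions for illustration, and the block only builds the parameters rather than calling AWS.

```python
def cart_table_request(table_name: str) -> dict:
    """Keyword arguments in the shape of a DynamoDB CreateTable request."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": "cart_id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "cart_id", "KeyType": "HASH"}],
        # Step 2: on-demand capacity, so no read/write capacity units are provisioned.
        "BillingMode": "PAY_PER_REQUEST",
        # Replicas rely on DynamoDB Streams carrying both old and new item images.
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }


def add_replica_request(table_name: str, region: str) -> dict:
    """Keyword arguments in the shape of an UpdateTable request that adds a replica."""
    return {
        "TableName": table_name,
        "ReplicaUpdates": [{"Create": {"RegionName": region}}],
    }


# Hypothetical names: "ShoppingCart" and the Regions are examples only.
create_req = cart_table_request("ShoppingCart")
replica_reqs = [add_replica_request("ShoppingCart", r) for r in ("eu-west-1", "ap-southeast-1")]

# With boto3 installed and credentials configured, these could be passed as:
#   client = boto3.client("dynamodb", region_name="us-east-1")
#   client.create_table(**create_req)
#   client.update_table(**replica_reqs[0])   # one replica per call
```

Keeping parameter construction separate from the client call makes the settings easy to review and unit test without touching an AWS account.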

Example 2: Migrating a Legacy SQL Server to AWS

Scenario: An enterprise wants to migrate a SQL Server database but requires OS-level access for a custom third-party auditing plugin.

Solution: Deploy SQL Server on Amazon EC2.

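  1. RDS is usually preferred, but standard RDS does not expose the operating system, so the "OS-level access" requirement points to a self-managed approach (Amazon RDS Custom for SQL Server is a possible middle ground worth evaluating).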
  2. Use Amazon EBS Provisioned IOPS (io2) for high-performance storage.
  3. Manually configure Always On availability groups across Availability Zones for high availability.

Checkpoint Questions

  1. Which database service is best for storing and querying complex JSON documents with MongoDB compatibility?
  2. In Amazon RDS, does a Read Replica provide automatic failover for high availability?
  3. When should an architect choose DynamoDB On-Demand capacity over Provisioned capacity?
  4. What is the main difference between synchronous and asynchronous replication in the context of RDS Multi-AZ?

Muddy Points & Cross-Refs

  • Aurora Serverless vs. DynamoDB: Both scale capacity without instance sizing, but Aurora Serverless is for relational (SQL) data while DynamoDB is for NoSQL. Choose Aurora if you need complex joins or a fixed schema; choose DynamoDB for key-based access patterns that need virtually unlimited horizontal scale.
  • DMS vs. SCT: The Schema Conversion Tool (SCT) is used before migration to convert a source schema (e.g., Oracle) to a target schema (e.g., PostgreSQL). Database Migration Service (DMS) is the engine that actually moves the data.
  • ElastiCache vs. DAX: Use ElastiCache for RDS or general application caching. Use DAX exclusively for DynamoDB.
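
One practical difference behind the ElastiCache vs. DAX point: with ElastiCache the application usually manages the cache itself (cache-aside, also called lazy loading), whereas DAX sits in front of DynamoDB as a read-through/write-through cache. Below is a minimal cache-aside sketch in which a plain dictionary stands in for the cache and `fake_db_lookup` stands in for a database query; both are assumptions for illustration.

```python
import time


class CacheAside:
    """Lazy-loading (cache-aside) reads in front of a slower backing store."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1
            return entry[0]
        # Miss or expired entry: read the database, then populate the cache.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._entries.pop(key, None)


db_calls = []


def fake_db_lookup(key):
    """Hypothetical stand-in for a query against RDS."""
    db_calls.append(key)
    return f"row-for-{key}"


cache = CacheAside(fake_db_lookup, ttl_seconds=60.0)
first = cache.get("sku-42")   # miss: goes to the database
second = cache.get("sku-42")  # hit: served from memory
```

The TTL and the explicit `invalidate` call are the two usual levers for limiting stale reads in this pattern; with DAX the caching layer handles that population step on the application's behalf.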

Comparison Tables

RDS vs. Aurora

| Feature | Amazon RDS | Amazon Aurora |
|---|---|---|
| Storage | Provisioned volume (grown manually or with storage autoscaling) | Grows automatically (up to 128 TiB) |
| Replication | Up to 5 or 15 replicas, depending on engine | Up to 15 replicas (low lag) |
| Self-Healing | No (requires snapshots/backups) | Yes (storage is distributed and self-healing) |
| Performance | Standard engine performance | Up to 3x (PostgreSQL) or 5x (MySQL) throughput, per AWS |
