AWS Database Selection: RDS, Aurora, and DynamoDB Study Guide
Determining an appropriate database type (for example, Amazon Aurora, Amazon DynamoDB)
Determining an Appropriate Database Type
Determining the right database involves balancing requirements for data structure, scalability, performance, and management overhead. In AWS, the decision usually comes down to relational (SQL) services such as Amazon RDS and Amazon Aurora versus non-relational (NoSQL) services such as Amazon DynamoDB.
Learning Objectives
- Distinguish between relational (SQL) and non-relational (NoSQL) data models.
- Identify the specific database engines supported by Amazon RDS and their licensing models.
- Compare Amazon Aurora architecture with standard RDS deployments (Multi-AZ vs. Read Replicas).
- Design DynamoDB tables using appropriate primary keys (Simple vs. Composite).
- Calculate DynamoDB throughput requirements using Read Capacity Units (RCUs) and Write Capacity Units (WCUs).
Key Terms & Glossary
- Relational Database (SQL): A database that requires a predefined schema (attributes) before data insertion. Examples: MySQL, PostgreSQL.
- Non-relational Database (NoSQL): A database that does not enforce a fixed schema; only the primary key must be defined when the table is created, and other attributes can vary from item to item. Example: DynamoDB.
- Multi-AZ Deployment: A high-availability feature that synchronously replicates data to a standby instance in a different Availability Zone.
- Read Replica: An asynchronous copy of a database used to offload read traffic and improve performance.
- Partition Key: The primary key attribute that DynamoDB's internal hash function uses to distribute data across physical partitions. Used on its own, it forms a simple primary key.
- Sort Key: A second attribute in a composite primary key used to store items with the same partition key in a sorted order.
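The partition key and sort key definitions above can be illustrated with a small in-memory sketch. It assumes MD5 plus a modulo as a stand-in hash and a hypothetical count of four partitions; DynamoDB's real hash function and partition layout are internal and not shown here.

```python
import hashlib


def partition_for(partition_key, num_partitions=4):
    """Map a partition key to a partition index with a stable hash.

    MD5 and the modulo step are illustrative stand-ins only;
    DynamoDB's actual hash function is internal to the service.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical items: (partition key, sort key)
items = [
    ("channel-1", "2024-01-01T10:05:00Z"),
    ("channel-2", "2024-01-01T10:01:00Z"),
    ("channel-1", "2024-01-01T10:00:00Z"),
]

partitions = {}
for channel_id, timestamp in items:
    # Items sharing a partition key always hash to the same partition.
    partitions.setdefault(partition_for(channel_id), []).append((channel_id, timestamp))

for stored in partitions.values():
    # Within a partition, order by (partition key, sort key).
    stored.sort()
```

The point of the sketch is that the partition key alone decides where an item lives, while the sort key only affects ordering among items that share that partition key.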
The "Big Idea"
In AWS, there is no "best" database, only the best for your access pattern. Relational databases are chosen when you need complex joins and strict data integrity (ACID compliance), while NoSQL databases like DynamoDB are chosen for massive scale, low-latency key-value access, and serverless architectures where you want to avoid managing instances.
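As a rough illustration, the rule of thumb above could be written as a small decision helper. The function name and its four boolean inputs are hypothetical simplifications; a real selection would also weigh cost, team skills, and existing tooling.

```python
def suggest_database(needs_joins, needs_strict_integrity,
                     key_value_access, wants_serverless):
    """Return a first-pass suggestion based on the access pattern.

    This is a simplified sketch of the rule of thumb, not a complete
    selection procedure.
    """
    if needs_joins or needs_strict_integrity:
        # Complex joins and strict integrity point to a relational engine.
        return "Amazon RDS / Aurora"
    if key_value_access or wants_serverless:
        # Simple key-value lookups at scale point to DynamoDB.
        return "Amazon DynamoDB"
    return "Either; compare cost and operational needs"


print(suggest_database(True, True, False, False))
print(suggest_database(False, False, True, True))
```

Relational requirements are checked first because joins and strict integrity are the hardest needs to retrofit onto a key-value design.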
Formula / Concept Box
| Feature | Amazon RDS / Aurora (Relational) | Amazon DynamoDB (NoSQL) |
|---|---|---|
| Schema | Fixed / Rigid | Dynamic / Flexible |
| Scaling | Vertical (Larger Instance) / Read Replicas | Horizontal (Partitioning) |
| Consistency | Strong Consistency | Eventual (default) or Strong |
| Primary Use | ERP, CRM, Traditional Apps | Mobile, Web, Gaming, IoT |
| Throughput | Managed by Instance Type | Managed by Capacity Units (RCU/WCU) |
DynamoDB Throughput Calculation
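For provisioned capacity, one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. Item sizes round up to the next 4 KB (reads) or 1 KB (writes). A minimal sketch of that arithmetic, with hypothetical function names and transactional operations left out:

```python
import math


def required_rcus(item_size_kb, reads_per_second, strongly_consistent=True):
    """Read Capacity Units needed for a steady read rate."""
    units_per_read = math.ceil(item_size_kb / 4)  # round up to 4 KB blocks
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def required_wcus(item_size_kb, writes_per_second):
    """Write Capacity Units needed for a steady write rate."""
    return math.ceil(item_size_kb) * writes_per_second  # round up to 1 KB blocks


print(required_rcus(6, 10))         # 20: each 6 KB read needs 2 RCUs
print(required_rcus(6, 10, False))  # 10: half the cost when eventually consistent
print(required_wcus(2.5, 10))       # 30: each 2.5 KB write needs 3 WCUs
```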
Hierarchical Outline
- I. Relational Database Service (RDS)
  - Supported Engines: MySQL, MariaDB, Oracle, PostgreSQL, Microsoft SQL Server, and Amazon Aurora.
  - Instance Classes: Standard, Memory Optimized (for memory-intensive database workloads), and Burstable.
  - Storage Types: General Purpose SSD (gp2), Provisioned IOPS (io1), and Magnetic.
- II. Amazon Aurora
  - Performance: According to AWS, up to 5x the throughput of standard MySQL and up to 3x the throughput of standard PostgreSQL.
  - Replication: Six copies of the data are kept across three Availability Zones; instances share a single storage volume, which enables fast failover.
  - Serverless v2: Scales compute capacity automatically based on application demand.
- III. Amazon DynamoDB
  - Primary Keys: Simple (Partition Key) vs. Composite (Partition + Sort Key).
  - Global Tables: Multi-region, multi-active replication for global applications.
  - Consistency: Eventually consistent reads cost half as many RCUs; strongly consistent reads return the most recent committed data.
Visual Anchors
Database Selection Flowchart
DynamoDB Composite Key Structure
Definition-Example Pairs
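- Multi-AZ: A synchronous failover target. Example: A banking application uses Multi-AZ so that if AZ-1 goes down, the database automatically fails over to the standby in AZ-2. Because replication is synchronous, committed transactions are not lost, although failover involves a brief interruption.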
- Read Replica: An asynchronous copy for scaling. Example: An e-commerce site uses read replicas to handle high traffic on the "Product Catalog" page without slowing down the "Checkout" database.
- Composite Key: A two-part identifier. Example: In a chat app, the `ChannelID` is the Partition Key and the `MessageTimestamp` is the Sort Key, allowing you to retrieve all messages for one channel sorted by time.
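The chat app example can be sketched as a table definition. The function below only builds the parameter dictionary in the shape the DynamoDB `CreateTable` API expects. The table name `ChatMessages`, the string-typed ISO 8601 timestamp, and on-demand billing are assumptions made for illustration.

```python
def chat_table_definition(table_name="ChatMessages"):
    """Build CreateTable parameters for a composite-key chat table.

    The table name, attribute types, and billing mode are illustrative
    assumptions, not requirements.
    """
    return {
        "TableName": table_name,
        # Only key attributes are declared; other attributes stay schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "ChannelID", "AttributeType": "S"},
            {"AttributeName": "MessageTimestamp", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "ChannelID", "KeyType": "HASH"},          # partition key
            {"AttributeName": "MessageTimestamp", "KeyType": "RANGE"},  # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


params = chat_table_definition()
# With boto3 installed and AWS credentials configured, this could be passed as:
#   boto3.client("dynamodb").create_table(**params)
```

`HASH` marks the partition key and `RANGE` marks the sort key, so a query on one `ChannelID` returns that channel's messages ordered by `MessageTimestamp`.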
Worked Examples
Scenario 1: High-Performance Social Media Feed
Requirement: A feed that needs to handle millions of users with single-digit millisecond latency for simple key-value lookups.
Selection: Amazon DynamoDB.
Reasoning: DynamoDB is designed for massive horizontal scale and provides consistent single-digit millisecond performance that traditional SQL databases struggle to maintain at that volume.
Scenario 2: Legacy Financial Reporting
Requirement: An application requires complex SQL queries with multiple table joins and must support an existing PostgreSQL engine.
Selection: Amazon Aurora (PostgreSQL Compatible).
Reasoning: Aurora offers higher throughput than standard PostgreSQL while remaining compatible with existing PostgreSQL drivers and supporting the complex relational logic required for reporting.
Checkpoint Questions
- Which RDS feature provides synchronous replication to a standby instance for high availability?
- If an item is 3 KB in size and you perform 10 strongly consistent reads per second, how many RCUs are required?
- True or False: A DynamoDB table with a simple primary key allows multiple items to have the same Partition Key.
- What is the main difference between an Aurora Replica and a standard RDS Read Replica regarding storage?
- Which RDS storage type is best for I/O-intensive workloads that need consistent, low-latency performance: gp2, io1, or Magnetic?
Answers
- Multi-AZ Deployment.
- 10 RCUs (each strongly consistent read of an item up to 4 KB costs 1 RCU; 10 reads × 1 RCU per read).
- False (with a simple primary key, each Partition Key value must be unique).
- Aurora Replicas share the same underlying storage volume as the primary instance, while RDS Read Replicas have their own separate storage copy.
- io1 (Provisioned IOPS SSD).