Determining an Appropriate Database Type

Determining the right database involves balancing requirements for data structure, scalability, performance, and management overhead. On AWS, the decision usually comes down to relational (SQL) services such as Amazon RDS and Aurora versus non-relational (NoSQL) services such as DynamoDB.

Learning Objectives

Distinguish between relational (SQL) and non-relational (NoSQL) data models.
Identify the specific database engines supported by Amazon RDS and their licensing models.
Compare Amazon Aurora architecture with standard RDS deployments (Multi-AZ vs. Read Replicas).
Design DynamoDB tables using appropriate primary keys (Simple vs. Composite).
Calculate DynamoDB throughput requirements using Read Capacity Units (RCUs) and Write Capacity Units (WCUs).

Key Terms & Glossary

Relational Database (SQL): A database that requires a predefined schema (attributes) before data insertion. Examples: MySQL, PostgreSQL.
Non-relational Database (NoSQL): A database that allows for unstructured data; only a primary key is required for table creation. Example: DynamoDB.
Multi-AZ Deployment: A high-availability feature that synchronously replicates data to a standby instance in a different Availability Zone.
Read Replica: An asynchronous copy of a database used to offload read traffic and improve performance.
Partition Key: A simple primary key attribute used by DynamoDB's internal hash function to distribute data across physical partitions.
Sort Key: A second attribute in a composite primary key used to store items with the same partition key in a sorted order.
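
The Partition Key entry above can be illustrated with a small sketch. DynamoDB's real hash function and partition count are internal and not exposed, so everything here is an assumption made for illustration: `NUM_PARTITIONS`, `partition_for`, and `distribute` are hypothetical names, and SHA-256 merely stands in for whatever hash the service uses. The point is only that the same partition key always lands on the same partition.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB decides the real partition count itself


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and fold it into the partition range.
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribute(partition_keys, num_partitions: int = NUM_PARTITIONS):
    """Group partition keys by the partition index they hash to."""
    placement = defaultdict(list)
    for key in partition_keys:
        placement[partition_for(key, num_partitions)].append(key)
    return dict(placement)


if __name__ == "__main__":
    for index, keys in sorted(distribute(["user-1", "user-2", "user-3", "user-1"]).items()):
        print(index, keys)
```

Because placement depends only on the key's hash, a partition key with many distinct values tends to spread load more evenly than one with only a few.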

The "Big Idea"

In AWS, there is no "best" database, only the best for your access pattern. Relational databases are chosen when you need complex joins and strict data integrity (ACID compliance), while NoSQL databases like DynamoDB are chosen for massive scale, low-latency key-value access, and serverless architectures where you want to avoid managing instances.
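
As a rough sketch, the rule of thumb in the paragraph above can be written as a small decision helper. The `Workload` fields and the `suggest_database` function are hypothetical names introduced for illustration, and the logic is a deliberate simplification of the paragraph, not an official AWS selection procedure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's access pattern."""
    needs_complex_joins: bool = False
    needs_strict_integrity: bool = False      # e.g. ACID guarantees across related tables
    key_value_access: bool = False            # lookups by a known key
    avoid_instance_management: bool = False   # prefers a serverless model


def suggest_database(workload: Workload) -> str:
    """Return a coarse suggestion based on the guide's rule of thumb."""
    # Relational needs take priority: joins and strict integrity point to SQL engines.
    if workload.needs_complex_joins or workload.needs_strict_integrity:
        return "Amazon RDS / Aurora"
    # Simple key-based access at scale, or no instances to manage, points to DynamoDB.
    if workload.key_value_access or workload.avoid_instance_management:
        return "Amazon DynamoDB"
    return "Undetermined: characterize the access pattern first"


if __name__ == "__main__":
    print(suggest_database(Workload(needs_complex_joins=True)))
    print(suggest_database(Workload(key_value_access=True)))
```

Real selection decisions weigh more factors (cost, team skills, existing engines), so the helper is best read as a summary of the paragraph rather than a tool.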

Formula / Concept Box

Feature	Amazon RDS / Aurora (Relational)	Amazon DynamoDB (NoSQL)
Schema	Fixed / Rigid	Dynamic / Flexible
Scaling	Vertical (Larger Instance) / Read Replicas	Horizontal (Partitioning)
Consistency	Strong Consistency	Eventual (default) or Strong
Primary Use	ERP, CRM, Traditional Apps	Mobile, Web, Gaming, IoT
Throughput	Managed by Instance Type	Managed by Capacity Units (RCU/WCU)

DynamoDB Throughput Calculation

$1 \text{ WCU} = 1 \text{ write per second for an item up to 1 KB}$
$1 \text{ RCU} = 1 \text{ strongly consistent read per second for an item up to 4 KB}$
$1 \text{ RCU} = 2 \text{ eventually consistent reads per second for an item up to 4 KB}$
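
The unit definitions above can be turned into a short calculator. This is a minimal sketch: the function names `read_capacity_units` and `write_capacity_units` are illustrative, and it assumes item sizes are rounded up to the next 4 KB boundary for reads and the next 1 KB boundary for writes before multiplying by the request rate.

```python
import math


def write_capacity_units(item_size_kb: float, writes_per_second: float) -> int:
    """WCUs needed: each write costs one unit per 1 KB of item size, rounded up."""
    units_per_write = math.ceil(item_size_kb / 1)
    return math.ceil(units_per_write * writes_per_second)


def read_capacity_units(item_size_kb: float, reads_per_second: float,
                        strongly_consistent: bool = True) -> int:
    """RCUs needed: each strongly consistent read costs one unit per 4 KB, rounded up.

    Eventually consistent reads cost half as much, so the total is halved.
    """
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2
    return math.ceil(total)


if __name__ == "__main__":
    # 3 KB item, 10 strongly consistent reads per second
    print(read_capacity_units(3, 10))
    # Same workload with eventually consistent reads
    print(read_capacity_units(3, 10, strongly_consistent=False))
    # 2.5 KB item, 4 writes per second
    print(write_capacity_units(2.5, 4))
```

For example, a 3 KB item fits inside one 4 KB read unit, so 10 strongly consistent reads per second need 10 RCUs, while a 6 KB item would need two units per read.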

Hierarchical Outline

I. Relational Database Service (RDS)
- Supported Engines: MySQL, MariaDB, Oracle, PostgreSQL, Microsoft SQL Server, and Amazon Aurora.
- Instance Classes: Standard, Memory Optimized (for heavy DB workloads), and Burstable.
- Storage Types: General Purpose SSD (gp2), Provisioned IOPS (io1), and Magnetic.
II. Amazon Aurora
- Performance: Per AWS, up to 5x the throughput of standard MySQL and up to 3x the throughput of standard PostgreSQL.
- Replication: Six copies of the data are kept across 3 AZs; instances share one storage volume, which allows fast failover.
- Serverless v2: Scales compute capacity automatically based on application demand.
III. Amazon DynamoDB
- Primary Keys: Simple (Partition Key) vs. Composite (Partition + Sort Key).
- Global Tables: Multi-region, multi-active replication for global applications.
- Consistency: Eventual consistency is cheaper; Strong consistency ensures the latest data.

Visual Anchors

[Figure: Database Selection Flowchart]

[Figure: DynamoDB Composite Key Structure]

Definition-Example Pairs

Multi-AZ: A synchronous failover target. Example: A banking application uses Multi-AZ so that if AZ-1 goes down, the database automatically fails over to the standby in AZ-2 (typically within a minute or two) without losing committed data.
Read Replica: An asynchronous copy for scaling. Example: An e-commerce site uses read replicas to handle high traffic on the "Product Catalog" page without slowing down the "Checkout" database.
Composite Key: A two-part identifier. Example: In a chat app, the ChannelID is the Partition Key and the MessageTimestamp is the Sort Key, allowing you to retrieve all messages for one channel sorted by time.
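
The chat example above can be modeled in memory to show what a composite key buys you. This is a toy sketch, not the DynamoDB API: `ChatTable`, `put_message`, and `query_channel` are hypothetical names, and a plain dictionary stands in for the table. It shows two properties: the (ChannelID, MessageTimestamp) pair identifies exactly one item, and items that share a partition key come back ordered by sort key.

```python
class ChatTable:
    """Toy model of a table keyed by ChannelID (partition) and MessageTimestamp (sort)."""

    def __init__(self):
        # ChannelID -> {MessageTimestamp -> message text}
        self._channels = {}

    def put_message(self, channel_id: str, timestamp: int, text: str) -> None:
        # Writing the same (ChannelID, MessageTimestamp) pair again replaces the item,
        # because the composite key must be unique.
        self._channels.setdefault(channel_id, {})[timestamp] = text

    def query_channel(self, channel_id: str, since: int = None):
        """Return one channel's messages in timestamp order, optionally from `since` onward."""
        messages = self._channels.get(channel_id, {})
        return [(ts, messages[ts]) for ts in sorted(messages)
                if since is None or ts >= since]


if __name__ == "__main__":
    table = ChatTable()
    table.put_message("general", 300, "third")
    table.put_message("general", 100, "first")
    table.put_message("general", 200, "second")
    table.put_message("random", 150, "elsewhere")
    print(table.query_channel("general"))
    print(table.query_channel("general", since=200))
```

Messages are inserted out of order but read back sorted, and the query never touches other channels, which mirrors why the sort key suits "all messages for one channel by time" lookups.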

Worked Examples

Scenario 1: High-Performance Social Media Feed

Requirement: A feed that needs to handle millions of users with single-digit millisecond latency for simple key-value lookups. Selection: Amazon DynamoDB. Reasoning: DynamoDB is designed for massive horizontal scale and provides consistent single-digit millisecond performance that traditional SQL databases struggle to maintain at that volume.

Scenario 2: Legacy Financial Reporting

Requirement: An application requires complex SQL queries with multiple table joins and must support an existing PostgreSQL engine. Selection: Amazon Aurora (PostgreSQL Compatible). Reasoning: Aurora offers high performance for SQL workloads while remaining compatible with existing PostgreSQL drivers and supporting the complex relational logic required for reporting.

Checkpoint Questions

Which RDS feature provides synchronous replication to a standby instance for high availability?
If an item is 3 KB in size and you perform 10 strongly consistent reads per second, how many RCUs are required?
True or False: A DynamoDB table with a simple primary key allows multiple items to have the same Partition Key.
What is the main difference between an Aurora Replica and a standard RDS Read Replica regarding storage?
Which EBS volume type is designed for large, sequential, infrequently accessed (cold) data: gp2, io1, or sc1?

Answers

Multi-AZ Deployment.
10 RCUs (A 3 KB item rounds up to one 4 KB read unit, so each strongly consistent read costs 1 RCU; 10 reads per second * 1 RCU per read = 10 RCUs).
False (With a simple primary key, the Partition Key must be unique).
Aurora Replicas share the same underlying storage volume as the primary instance, while RDS Read Replicas have their own separate storage copy.
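sc1 (Cold HDD). Note that sc1 is an EBS volume type; it is not among the storage types offered for RDS instances (General Purpose SSD, Provisioned IOPS, Magnetic).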

AWS Database Selection: RDS, Aurora, and DynamoDB Study Guide
