AWS Database Consistency Models: S3 and DynamoDB
Describe database consistency models (for example, strongly consistent, eventually consistent)
This guide covers the fundamental consistency models used in AWS data stores, focusing on the trade-offs between data accuracy, performance, and cost as required for the DVA-C02 exam.
Learning Objectives
By the end of this chapter, you should be able to:
- Distinguish between Strongly Consistent and Eventually Consistent read models.
- Identify the consistency behavior of Amazon S3 for different operations (PUT, DELETE, HEAD).
- Calculate Read Capacity Units (RCUs) for DynamoDB based on the chosen consistency model.
- Understand the cost and performance implications of consistency choices in distributed systems.
Key Terms & Glossary
- Eventual Consistency: A model where the system guarantees that, if no new updates are made to an object, eventually all accesses will return the last updated value. In the short term, a read might return stale data.
- Strong Consistency: A model ensuring that a read request always returns the most recent version of the data, reflecting all successful prior writes.
- Read-after-Write Consistency: A specific type of strong consistency where a newly created object is immediately available for retrieval.
- RCU (Read Capacity Unit): A unit of throughput in DynamoDB representing one strongly consistent read per second (up to 4 KB) or two eventually consistent reads per second.
The "Big Idea"
In distributed systems like AWS, data is replicated across multiple Availability Zones (AZs) to ensure high availability. The "Big Idea" is the trade-off: a read that must reflect the latest write (Strong Consistency) cannot be answered by just any replica. It has to be served by one that is known to be up to date (in DynamoDB, the leader replica for the partition), which costs more and can add latency. If you can tolerate a slight delay (Eventual Consistency), reads are cheaper and often faster because any replica, possibly a stale one, can respond.
Formula / Concept Box
| Feature | Eventually Consistent Read | Strongly Consistent Read | Transactional Read |
|---|---|---|---|
| Data Freshness | May be stale | Guaranteed latest | Atomic/Guaranteed |
| Cost (RCUs) | 0.5 RCU per 4 KB | 1 RCU per 4 KB | 2 RCUs per 4 KB |
| Performance | Lowest Latency | Higher Latency | Highest Latency |
| How to Request | DynamoDB default (no parameter needed) | Set `ConsistentRead` to `true` | Explicit `TransactGetItems` API call |
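
As a hedged summary of the Cost row, the RCU requirement can be written as a single formula. The symbols $r$ (reads per second), $s$ (item size in KB), and $m$ (consistency multiplier) are notation introduced here for illustration; they are not AWS terminology.

$$\text{RCUs} = r \times \left\lceil \frac{s}{4 \text{ KB}} \right\rceil \times m, \qquad m = 0.5 \text{ (eventual)},\quad 1 \text{ (strong)},\quad 2 \text{ (transactional)}$$

The ceiling reflects that item size is rounded up to the next 4 KB boundary, so a 5 KB item is billed like an 8 KB item.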
Hierarchical Outline
- Amazon S3 Consistency Models
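- New Objects (PUT): Provides Read-after-Write consistency. You can access the object immediately after the successful upload response.
- Overwrites and Deletes: Historically eventually consistent, meaning updates to existing keys or deletions could take time to propagate across all AZs. Since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations, including overwrites, so a read issued after a successful write returns the latest state.
- Special Case: Historically, issuing a HEAD or GET on a key before the object existed meant that the first read after the subsequent PUT was only eventually consistent (it could still report the object as missing). This caveat no longer applies under the current strong consistency model.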
- Amazon DynamoDB Consistency Models
- Eventually Consistent Reads: The default. Maximizes read throughput. Best for non-critical data (e.g., social media feeds).
- Strongly Consistent Reads: Returns the latest data. Required for financial or inventory-critical applications. Not supported on Global Secondary Indexes (GSIs).
- Adaptive Capacity: Automatically handles "Hot Partitions" by shifting throughput to partitions receiving high traffic.
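The choice between the two DynamoDB read types in the outline above comes down to one request parameter. The sketch below only builds the arguments for a `GetItem` call and does not contact AWS. The helper name `build_get_item_request`, the table name `Accounts`, and the key `AccountId` are hypothetical and exist only for illustration; `ConsistentRead` is the actual DynamoDB parameter.

```python
from typing import Any, Dict


def build_get_item_request(
    table_name: str, key: Dict[str, Any], strongly_consistent: bool = False
) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB GetItem call.

    ConsistentRead=False (DynamoDB's default) requests an eventually
    consistent read; ConsistentRead=True requests a strongly consistent one.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical table and key, for illustration only.
params = build_get_item_request(
    "Accounts", {"AccountId": {"S": "acct-123"}}, strongly_consistent=True
)

# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   response = boto3.client("dynamodb").get_item(**params)
```

Keeping the default at `False` mirrors DynamoDB's own default, so callers opt in to the more expensive read only where stale data is unacceptable.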
Visual Anchors
Data Propagation Logic
This flowchart illustrates how a read request is handled based on the consistency model.
Replication Across AZs
This diagram shows the physical delay that causes eventual consistency.
\begin{tikzpicture}[node distance=2cm] \draw[thick] (0,0) circle (0.5cm) node {Write}; \draw[->, thick] (0.5,0) -- (1.5,1) node[midway, above] {AZ 1}; \draw[->, thick] (0.5,0) -- (1.5,0) node[midway, above] {AZ 2}; \draw[->, thick] (0.5,0) -- (1.5,-1) node[midway, below] {AZ 3};
\node[draw, rectangle] at (2.2, 1) (d1) {Data v2};
\node[draw, rectangle] at (2.2, 0) (d2) {Data v1};
\node[draw, rectangle] at (2.2, -1) (d3) {Data v2};
\node[text width=4cm, align=center] at (5, 0) {\textbf{Eventual Consistency:}\\A read from AZ 2 returns the old version (v1) until sync completes.};\end{tikzpicture}
Definition-Example Pairs
- Strongly Consistent Read
- Definition: A read that reflects every write acknowledged before it was issued. In DynamoDB, such a read is served by the partition's leader replica rather than by an arbitrary replica.
- Example: A banking application checking an account balance before authorizing a withdrawal. You cannot risk seeing a stale (higher) balance.
- Eventually Consistent Read
- Definition: A read that returns data from a single replica immediately, regardless of whether that replica has received the latest write yet.
- Example: A YouTube view count. It doesn't matter if one user sees 1,000 views and another sees 1,002; they will eventually synchronize.
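The two definitions above can be contrasted with a toy simulation. This is a minimal sketch under a loud assumption: one leader replica always holds the latest write, and followers only catch up when `sync()` is called. The class `ToyReplicatedStore` and all of its methods are invented for illustration and say nothing about how DynamoDB or S3 are implemented internally.

```python
class ToyReplicatedStore:
    """Toy model: one leader replica plus followers that lag until synced.

    Illustration only; not how DynamoDB or S3 work internally.
    """

    def __init__(self, follower_count: int = 2) -> None:
        self.leader: dict = {}
        self.followers: list = [{} for _ in range(follower_count)]

    def write(self, key: str, value: str) -> None:
        # The write is acknowledged once the leader has it;
        # followers stay stale until sync() runs.
        self.leader[key] = value

    def sync(self) -> None:
        # Simulate replication catching up on every follower.
        for follower in self.followers:
            follower.update(self.leader)

    def read(self, key: str, strongly_consistent: bool = False, replica: int = 0):
        if strongly_consistent:
            return self.leader.get(key)  # always the latest acknowledged write
        return self.followers[replica].get(key)  # may be stale


store = ToyReplicatedStore()
store.write("views", "v1")
store.sync()
store.write("views", "v2")  # followers have not caught up yet

stale = store.read("views")                            # "v1"
fresh = store.read("views", strongly_consistent=True)  # "v2"
store.sync()
converged = store.read("views")                        # "v2"
```

The eventually consistent read returns the old value only in the window between the second write and the next sync, which is the same window the replication diagram describes.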
Worked Examples
Problem: DynamoDB RCU Calculation
Scenario: Your application needs to read 10 items per second. Each item is 8 KB in size.
1. If using Strongly Consistent Reads:
- Calculate units per item: $8 \text{ KB} / 4 \text{ KB} = 2$ units.
- Total RCUs = $10 \text{ items/sec} \times 2 \text{ units/item} = \mathbf{20 \text{ RCUs}}$.
2. If using Eventually Consistent Reads:
- Calculate units per item: $(8 \text{ KB} / 4 \text{ KB}) \times 0.5 = 1$ unit.
- Total RCUs = $10 \text{ items/sec} \times 1 \text{ unit/item} = \mathbf{10 \text{ RCUs}}$.
3. If using Transactional Reads:
- Calculate units per item: $(8 \text{ KB} / 4 \text{ KB}) \times 2 = 4$ units.
- Total RCUs = $10 \text{ items/sec} \times 4 \text{ units/item} = \mathbf{40 \text{ RCUs}}$.
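The arithmetic in this worked example can be reproduced with a small calculator. This is a sketch, not an AWS tool: the function `required_rcus` and the `MULTIPLIERS` table are names made up here, and the multipliers are taken from the concept table in this guide. It also assumes each item is read individually, so each item's size is rounded up to the next 4 KB on its own.

```python
import math

# RCU cost multipliers per 4 KB block, as listed in the concept table.
MULTIPLIERS = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}


def required_rcus(item_size_kb: float, reads_per_second: int, mode: str) -> float:
    """Estimate RCUs for a read workload in the given consistency mode."""
    blocks = math.ceil(item_size_kb / 4)  # item size rounds up to the next 4 KB
    return reads_per_second * blocks * MULTIPLIERS[mode]


strong = required_rcus(8, 10, "strong")                # 20.0
eventual = required_rcus(8, 10, "eventual")            # 10.0
transactional = required_rcus(8, 10, "transactional")  # 40.0
odd_size = required_rcus(5, 10, "strong")              # 20.0 (5 KB rounds up to 8 KB)
```

The last call shows the rounding rule that the 8 KB scenario hides: a 5 KB item costs the same as an 8 KB item because both occupy two 4 KB blocks.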
Checkpoint Questions
- Which DynamoDB read type is twice as expensive as a strongly consistent read?
- (Answer: Transactional Reads)
- True or False: Amazon S3 provides read-after-write consistency for replacing an existing object with the same name.
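- (Answer: False under the historical model described in this guide, where overwrites were eventually consistent. Since December 2020, S3 provides strong read-after-write consistency for overwrites as well, so the statement is true for current S3.)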
- If you perform a GET request on an S3 object immediately after a successful DELETE, what might happen?
- (Answer: Under the historical model, you might still receive the object data because the deletion had not yet fully propagated. With current S3 strong consistency, the GET returns 404 Not Found.)
- To enable strongly consistent reads in a DynamoDB API call, which parameter must be set to True?
- (Answer: ConsistentRead)
[!TIP] On the exam, if a question mentions "lowest cost" for DynamoDB reads, always look for Eventually Consistent options. If it mentions "most up-to-date," choose Strongly Consistent.