AWS Data Caching Services: ElastiCache and CloudFront
Use data caching services
This study guide covers the implementation and optimization of data caching services on AWS, focusing on Amazon ElastiCache and Amazon CloudFront. Caching is a critical skill for the AWS Certified Developer - Associate (DVA-C02) exam, particularly for improving application performance and reducing database load.
Learning Objectives
- Distinguish between Amazon ElastiCache and Amazon CloudFront use cases.
- Compare the features and limitations of Redis vs. Memcached engines.
- Implement caching strategies including Lazy Loading and Write-through.
- Understand the role of Edge Locations and Regional Edge Caches in content delivery.
Key Terms & Glossary
- Cache Hit: When the requested data is found in the cache, resulting in fast retrieval.
- Cache Miss: When data is not in the cache, forcing the application to fetch it from the origin/database.
- TTL (Time-to-Live): A setting that determines how long an item remains in the cache before it expires.
- Origin: The primary source of data (e.g., an S3 bucket or an EC2 instance) that the cache retrieves data from.
- In-Memory: A data storage method that keeps data in RAM for sub-millisecond latency.
The "Big Idea"
Caching is the art of trading memory for speed. By storing frequently accessed data or expensive computation results in a high-speed, in-memory data layer, you offload heavy read traffic from your primary databases (like RDS) and reduce latency for end-users. In AWS, this happens at the application level (ElastiCache) and the network level (CloudFront).
Formula / Concept Box
ElastiCache Engine Comparison
| Feature | Memcached | Redis |
|---|---|---|
| Data Types | Simple (Strings/Objects) | Complex (Lists, Sets, Sorted Sets, Hashes) |
| High Availability | No (multi-node is for partitioning only; no replication) | Yes (Multi-AZ with Auto-Failover) |
| Persistence | No | Yes (AOF and Snapshots) |
| Scaling | Scale out/in (Add/Remove nodes) | Scale up/down or Resharding |
| Backup/Restore | No | Yes |
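The "Data Types" row is easiest to see with an example. Redis sorted sets keep members ordered by score, which Memcached's simple string values cannot do. The sketch below mimics the semantics of the Redis ZADD and ZREVRANGE commands with a plain dict so it runs without a server; with redis-py the equivalent calls would be `r.zadd(...)` and `r.zrevrange(...)`. The leaderboard scenario and names are illustrative.

```python
# A leaderboard is a classic Redis sorted-set use case. This dict
# stands in for a Redis key so the example is self-contained.
leaderboard = {}  # member -> score

def zadd(member, score):
    """Mimics Redis ZADD: insert or update a member's score."""
    leaderboard[member] = score

def zrevrange(start, stop):
    """Mimics Redis ZREVRANGE: members by score, highest first
    (stop is inclusive, as in Redis)."""
    ranked = sorted(leaderboard, key=leaderboard.get, reverse=True)
    return ranked[start:stop + 1]

zadd("alice", 120)
zadd("bob", 95)
zadd("carol", 180)
```

With Memcached you would have to store the whole leaderboard as one serialized blob and re-sort it in application code on every update.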
Hierarchical Outline
- I. Amazon ElastiCache
- Purpose: In-memory key-value store to offload relational databases.
- Use Cases: Session stores, recommendation engines, frequent SQL query results.
- II. Caching Strategies
- Lazy Loading: Data loaded only on a miss; avoids cache churn.
- Write-Through: Data written to cache and DB simultaneously; ensures data freshness.
- III. Amazon CloudFront
- Edge Locations: Physical sites that cache content closer to users.
- Regional Edge Caches: Mid-tier caches with larger capacity than standard edge locations.
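The two caching strategies in section II can be sketched in a few lines. This is a minimal illustration: the dicts stand in for an ElastiCache node and an RDS table, and the function names are made up for the example; a real application would use a client library such as redis-py.

```python
cache = {}                               # stand-in for an ElastiCache node
database = {"user:1": {"name": "Ada"}}   # stand-in for an RDS table

def get_user_lazy(user_id):
    """Lazy Loading: data enters the cache only after the first miss."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]        # cache hit: fast in-memory path
    value = database[key]        # cache miss: expensive DB read
    cache[key] = value           # populate so later readers hit
    return value

def save_user_write_through(user_id, value):
    """Write-Through: the DB and the cache are updated together,
    so reads never see stale data -- at the cost of caching rows
    that may never be read."""
    key = f"user:{user_id}"
    database[key] = value
    cache[key] = value
```

Note the trade-off the exam tests: Lazy Loading caches only what is read (but can serve stale data), while Write-Through keeps the cache fresh (but wastes memory on unread writes).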
Visual Anchors
Cache Strategy: Lazy Loading Flow
Caching Architecture Overview
\begin{tikzpicture}[node distance=2cm,
    every node/.style={draw, fill=blue!10, rounded corners,
                       minimum width=2.5cm, minimum height=1cm, align=center}]
  \node (user)  [fill=green!10] {End User};
  \node (cf)    [right of=user, xshift=1.5cm, fill=orange!10] {CloudFront \\ (Edge)};
  \node (app)   [right of=cf, xshift=1.5cm] {App Server \\ (EC2/Lambda)};
  \node (cache) [below of=app, yshift=-0.5cm, fill=red!10] {ElastiCache \\ (In-Memory)};
  \node (db)    [right of=app, xshift=1.5cm, fill=yellow!10] {RDS \\ (Database)};
  \draw[->, thick]  (user) -- (cf);
  \draw[->, thick]  (cf) -- (app);
  \draw[<->, thick] (app) -- (cache);
  \draw[<->, thick] (app) -- (db);
\end{tikzpicture}
Definition-Example Pairs
- Lazy Loading: A strategy where data is only cached when it is actually requested.
- Example: A news site only caches an article once the first user clicks on it, preventing the cache from filling up with articles no one reads.
- Session State: Data about a user's active session on a website.
- Example: A shopping cart's contents are stored in a Redis cluster so that if the web server restarts, the user doesn't lose their items.
- Content Delivery Network (CDN): A system of distributed servers that deliver content based on the geographic location of the user.
- Example: A user in Tokyo downloads a video from a CloudFront edge location in Japan rather than fetching it from an S3 bucket in Virginia.
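The session-state pair above can be made concrete. In this sketch a plain dict stands in for the external session store; in production it would be a Redis hash accessed via redis-py (HSET / HGETALL), so the cart survives a web-server restart. The function names and the session ID are illustrative.

```python
# External session store: keyed by session ID, holding cart contents.
# In production this lives in a Redis cluster, not in web-server memory,
# so any app instance (or a restarted one) can read the same cart.
session_store = {}  # session_id -> {item: quantity}

def add_to_cart(session_id, item, quantity=1):
    """Record an item against the user's session, accumulating quantity."""
    cart = session_store.setdefault(session_id, {})
    cart[item] = cart.get(item, 0) + quantity

def get_cart(session_id):
    """Any server instance can rebuild the user's state from the store."""
    return session_store.get(session_id, {})
```

The key point for the exam: because the state lives outside the web tier, the tier becomes stateless and can scale out or recycle instances freely.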
Worked Examples
Scenario: Handling Stale Data in Lazy Loading
Problem: You use Lazy Loading for a product catalog. A product's price changes in the RDS database, but users still see the old price in the app.
Step-by-Step Resolution:
- Identify the Issue: The cache has no way of knowing the DB changed (Cache Invalidation problem).
- Apply TTL: Implement a Time-to-Live (e.g., 300 seconds). After 5 minutes, the record expires automatically.
- Manual Eviction: Update the application code so that when an UpdateProduct API call is made, it also calls cache.delete(product_id).
- Result: The next request results in a Cache Miss, fetching the new price from the DB and repopulating the cache.
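The resolution steps above can be sketched end to end. This is an illustrative stand-in, not AWS API code: the dicts play the roles of ElastiCache and RDS, and names like update_product are hypothetical.

```python
import time

cache = {}                                   # key -> (value, expires_at)
database = {"product:42": {"price": 10.00}}  # stand-in for the RDS table

def cache_set(key, value, ttl):
    """Step 2: every cached record carries a TTL (here in seconds)."""
    cache[key] = (value, time.time() + ttl)

def cache_get(key):
    """Return the value on a hit; evict and return None if expired."""
    entry = cache.get(key)
    if entry is None or time.time() >= entry[1]:
        cache.pop(key, None)
        return None
    return entry[0]

def get_product(product_id):
    """Lazy Loading read path from the scenario."""
    key = f"product:{product_id}"
    value = cache_get(key)
    if value is None:                    # miss: fetch fresh data
        value = database[key]
        cache_set(key, value, ttl=300)   # 5-minute TTL from Step 2
    return value

def update_product(product_id, new_price):
    """Step 3: write the DB, then evict the cached copy so the
    next read is a guaranteed miss that picks up the new price."""
    key = f"product:{product_id}"
    database[key]["price"] = new_price
    cache.pop(key, None)                 # manual eviction
```

TTL bounds how long stale data can survive; the delete-on-update in Step 3 removes the staleness window entirely for writes that go through your own API.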
Checkpoint Questions
- Which ElastiCache engine should you choose if you require multi-AZ replication and automatic failover?
- What is the primary disadvantage of the "Write-through" caching strategy regarding resource usage?
- How does a Regional Edge Cache differ from a standard Edge Location in CloudFront?
- True or False: Amazon ElastiCache is a persistent data store intended to replace RDS for long-term storage.
Answers
- Redis. Memcached does not support high availability or automatic failover.
- Resource Wastage. Most data written to the cache may never be read, consuming memory unnecessarily.
- Regional Edge Caches have larger storage capacity and longer TTLs than individual edge locations.
- False. ElastiCache is an in-memory, non-persistent store. Data is lost if the node fails unless using Redis persistence features, but even then, it is meant to supplement, not replace, a primary DB.