AWS Data Caching Services: ElastiCache and CloudFront
Use data caching services
This study guide covers the implementation and optimization of data caching services on AWS, focusing on Amazon ElastiCache and Amazon CloudFront. Caching is a critical skill for the AWS Certified Developer - Associate (DVA-C02) exam, particularly for improving application performance and reducing database load.
Learning Objectives
- Distinguish between Amazon ElastiCache and Amazon CloudFront use cases.
- Compare the features and limitations of Redis vs. Memcached engines.
- Implement caching strategies including Lazy Loading and Write-through.
- Understand the role of Edge Locations and Regional Edge Caches in content delivery.
Key Terms & Glossary
- Cache Hit: When the requested data is found in the cache, resulting in fast retrieval.
- Cache Miss: When data is not in the cache, forcing the application to fetch it from the origin/database.
- TTL (Time-to-Live): A setting that determines how long an item remains in the cache before it expires.
- Origin: The primary source of data (e.g., an S3 bucket or an EC2 instance) that the cache retrieves data from.
- In-Memory: A data storage method that keeps data in RAM for sub-millisecond latency.
The "Big Idea"
Caching is the art of trading memory for speed. By storing frequently accessed data or expensive computation results in a high-speed, in-memory data layer, you offload heavy read traffic from your primary databases (like RDS) and reduce latency for end-users. In AWS, this happens at the application level (ElastiCache) and the network level (CloudFront).
Formula / Concept Box
ElastiCache Engine Comparison
| Feature | Memcached | Redis |
|---|---|---|
| Data Types | Simple (Strings/Objects) | Complex (Lists, Sets, Sorted Sets, Hashes) |
| High Availability | No (multi-node is for partitioning only; no replication) | Yes (Multi-AZ with Auto-Failover) |
| Persistence | No | Yes (AOF and Snapshots) |
| Scaling | Scale out/in (Add/Remove nodes) | Scale up/down or Resharding |
| Backup/Restore | No | Yes |
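The "Data Types" row is easiest to see with an example. Redis sorted sets keep members ordered by score, which Memcached's simple string values cannot do. The sketch below mimics the semantics of the Redis ZADD and ZREVRANGE commands with a plain dict so it runs without a server; with redis-py the equivalent calls would be `r.zadd(...)` and `r.zrevrange(...)`. The leaderboard scenario and names are illustrative.

```python
# A leaderboard is a classic Redis sorted-set use case. This dict
# stands in for a Redis key so the example is self-contained.
leaderboard = {}  # member -> score

def zadd(member, score):
    """Mimics Redis ZADD: insert or update a member's score."""
    leaderboard[member] = score

def zrevrange(start, stop):
    """Mimics Redis ZREVRANGE: members by score, highest first
    (stop is inclusive, as in Redis)."""
    ranked = sorted(leaderboard, key=leaderboard.get, reverse=True)
    return ranked[start:stop + 1]

zadd("alice", 120)
zadd("bob", 95)
zadd("carol", 180)
```

With Memcached you would have to store the whole leaderboard as one serialized blob and re-sort it in application code on every update.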
Hierarchical Outline
- I. Amazon ElastiCache
- Purpose: In-memory key-value store to offload relational databases.
- Use Cases: Session stores, recommendation engines, frequent SQL query results.
- II. Caching Strategies
- Lazy Loading: Data loaded only on a miss; avoids cache churn.
- Write-Through: Data written to cache and DB simultaneously; ensures data freshness.
- III. Amazon CloudFront
- Edge Locations: Physical sites that cache content closer to users.
- Regional Edge Caches: Mid-tier caches with larger capacity than standard edge locations.
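The two caching strategies in section II can be sketched in a few lines. This is a minimal illustration: the dicts stand in for an ElastiCache node and an RDS table, and the function names are made up for the example; a real application would use a client library such as redis-py.

```python
cache = {}                               # stand-in for an ElastiCache node
database = {"user:1": {"name": "Ada"}}   # stand-in for an RDS table

def get_user_lazy(user_id):
    """Lazy Loading: data enters the cache only after the first miss."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]        # cache hit: fast in-memory path
    value = database[key]        # cache miss: expensive DB read
    cache[key] = value           # populate so later readers hit
    return value

def save_user_write_through(user_id, value):
    """Write-Through: the DB and the cache are updated together,
    so reads never see stale data -- at the cost of caching rows
    that may never be read."""
    key = f"user:{user_id}"
    database[key] = value
    cache[key] = value
```

Note the trade-off the exam tests: Lazy Loading caches only what is read (but can serve stale data), while Write-Through keeps the cache fresh (but wastes memory on unread writes).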
Visual Anchors
Cache Strategy: Lazy Loading Flow
Caching Architecture Overview
\begin{tikzpicture}[node distance=2cm,
    every node/.style={draw, fill=blue!10, rounded corners,
                       minimum width=2.5cm, minimum height=1cm, align=center}]
  \node (user)  [fill=green!10] {End User};
  \node (cf)    [right of=user, xshift=1.5cm, fill=orange!10] {CloudFront \\ (Edge)};
  \node (app)   [right of=cf, xshift=1.5cm] {App Server \\ (EC2/Lambda)};
  \node (cache) [below of=app, yshift=-0.5cm, fill=red!10] {ElastiCache \\ (In-Memory)};
  \node (db)    [right of=app, xshift=1.5cm, fill=yellow!10] {RDS \\ (Database)};
  \draw[->, thick]  (user) -- (cf);
  \draw[->, thick]  (cf) -- (app);
  \draw[<->, thick] (app) -- (cache);
  \draw[<->, thick] (app) -- (db);
\end{tikzpicture}
Definition-Example Pairs
- Lazy Loading: A strategy where data is only cached when it is actually requested.
- Example: A news site only caches an article once the first user clicks on it, preventing the cache from filling up with articles no one reads.
- Session State: Data about a user's active session on a website.
- Example: A shopping cart's contents are stored in a Redis cluster so that if the web server restarts, the user doesn't lose their items.
- Content Delivery Network (CDN): A system of distributed servers that deliver content based on the geographic location of the user.
- Example: A user in Tokyo downloads a video from a CloudFront edge location in Japan rather than fetching it from an S3 bucket in Virginia.
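The session-state pair above can be made concrete. In this sketch a plain dict stands in for the external session store; in production it would be a Redis hash accessed via redis-py (HSET / HGETALL), so the cart survives a web-server restart. The function names and the session ID are illustrative.

```python
# External session store: keyed by session ID, holding cart contents.
# In production this lives in a Redis cluster, not in web-server memory,
# so any app instance (or a restarted one) can read the same cart.
session_store = {}  # session_id -> {item: quantity}

def add_to_cart(session_id, item, quantity=1):
    """Record an item against the user's session, accumulating quantity."""
    cart = session_store.setdefault(session_id, {})
    cart[item] = cart.get(item, 0) + quantity

def get_cart(session_id):
    """Any server instance can rebuild the user's state from the store."""
    return session_store.get(session_id, {})
```

The key point for the exam: because the state lives outside the web tier, the tier becomes stateless and can scale out or recycle instances freely.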
Worked Examples
Scenario: Handling Stale Data in Lazy Loading
Problem: You use Lazy Loading for a product catalog. A product's price changes in the RDS database, but users still see the old price in the app.
Step-by-Step Resolution:
- Identify the Issue: The cache has no way of knowing the DB changed (Cache Invalidation problem).
- Apply TTL: Implement a Time-to-Live (e.g., 300 seconds). After 5 minutes, the record expires automatically.
- Manual Eviction: Update the application code so that when an UpdateProduct API call is made, it also calls cache.delete(product_id).
- Result: The next request results in a Cache Miss, fetching the new price from the DB and repopulating the cache.
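The resolution steps above can be sketched end to end. This is an illustrative stand-in, not AWS API code: the dicts play the roles of ElastiCache and RDS, and names like update_product are hypothetical.

```python
import time

cache = {}                                   # key -> (value, expires_at)
database = {"product:42": {"price": 10.00}}  # stand-in for the RDS table

def cache_set(key, value, ttl):
    """Step 2: every cached record carries a TTL (here in seconds)."""
    cache[key] = (value, time.time() + ttl)

def cache_get(key):
    """Return the value on a hit; evict and return None if expired."""
    entry = cache.get(key)
    if entry is None or time.time() >= entry[1]:
        cache.pop(key, None)
        return None
    return entry[0]

def get_product(product_id):
    """Lazy Loading read path from the scenario."""
    key = f"product:{product_id}"
    value = cache_get(key)
    if value is None:                    # miss: fetch fresh data
        value = database[key]
        cache_set(key, value, ttl=300)   # 5-minute TTL from Step 2
    return value

def update_product(product_id, new_price):
    """Step 3: write the DB, then evict the cached copy so the
    next read is a guaranteed miss that picks up the new price."""
    key = f"product:{product_id}"
    database[key]["price"] = new_price
    cache.pop(key, None)                 # manual eviction
```

TTL bounds how long stale data can survive; the delete-on-update in Step 3 removes the staleness window entirely for writes that go through your own API.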
Checkpoint Questions
- Which ElastiCache engine should you choose if you require multi-AZ replication and automatic failover?
- What is the primary disadvantage of the "Write-through" caching strategy regarding resource usage?
- How does a Regional Edge Cache differ from a standard Edge Location in CloudFront?
- True or False: Amazon ElastiCache is a persistent data store intended to replace RDS for long-term storage.
Answers
- Redis. Memcached does not support high availability or automatic failover.
- Resource Wastage. Most data written to the cache may never be read, consuming memory unnecessarily.
- Regional Edge Caches have larger storage capacity and longer TTLs than individual edge locations.
- False. ElastiCache is an in-memory, non-persistent store. Data is lost if the node fails unless using Redis persistence features, but even then, it is meant to supplement, not replace, a primary DB.