Amazon Route 53 Routing Policies: A Solutions Architect's Guide
Implementing DNS routing policies (for example, Route 53 latency-based routing, geolocation routing, simple routing)
This guide covers the implementation and optimization of DNS routing policies within Amazon Route 53, focusing on how to design highly available and performant global architectures as required for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.
Learning Objectives
- Differentiate between the various Route 53 routing policies (Simple, Weighted, Latency, Geolocation, Geoproximity, Failover, Multi-value).
- Implement latency-based routing to minimize network round-trip time (RTT) for global users.
- Design disaster recovery strategies using health checks and failover routing.
- Evaluate business requirements to select the optimal policy for data residency (Geolocation) and performance (Latency).
Key Terms & Glossary
- Anycast: A network addressing and routing method where requests are routed to the "nearest" or "best" destination as determined by the routing topology.
- DNS Resolver: A server that acts as an intermediary between a user's computer and the authoritative DNS server.
- TTL (Time to Live): The duration (in seconds) that a DNS record is cached by resolvers before it must be refreshed from the authoritative server.
- Alias Record: An AWS-specific record type that points to AWS resources (like an ELB or S3 bucket) and automatically updates if the resource's IP changes.
- Health Check: A mechanism where Route 53 monitors the health of a resource; if the resource is unhealthy, Route 53 stops directing traffic to it.
The "Big Idea"
Amazon Route 53 is not merely a static "phonebook" for the internet; it is a globally distributed traffic management engine. Its authoritative name servers are reached over Anycast, and its routing policies let architects move endpoint-selection logic out of the application and into DNS. Users are therefore directed to the healthiest, closest, or most relevant endpoint before a single application-level packet is sent.
Formula / Concept Box
| Routing Policy | Primary Driver | Best Use Case |
|---|---|---|
| Simple | Single Resource | Default for one-to-one IP mapping. |
| Weighted | Percentage (%) | A/B testing or gradual migration (Blue/Green). |
| Latency | Speed (ms) | Global apps where RTT is critical. |
| Failover | Health Status | Active-Passive disaster recovery. |
| Geolocation | User Location | Content localization or GDPR compliance. |
| Geoproximity | Proximity + Bias | Shifting traffic between regions by expanding or shrinking the area each resource serves. |
| Multi-value | Random selection among healthy records | Simple DNS-level load distribution that returns up to 8 healthy records per query. |
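The Multi-value row can be made concrete with a small simulation. This is a minimal sketch of the behavior described in the table, not Route 53's actual implementation: the function name, the `(ip, is_healthy)` tuple layout, and the example addresses (from the 192.0.2.0/24 documentation range) are all assumptions made for illustration.

```python
import random

MAX_MULTIVALUE_ANSWERS = 8  # the table's limit: up to 8 healthy records per response


def multivalue_answer(records, rng=random):
    """Return up to eight healthy IPs, chosen at random.

    records: list of (ip, is_healthy) tuples (hypothetical layout).
    """
    healthy = [ip for ip, is_healthy in records if is_healthy]
    count = min(len(healthy), MAX_MULTIVALUE_ANSWERS)
    return rng.sample(healthy, count)


# Twelve hypothetical records; every third one is marked unhealthy.
records = [(f"192.0.2.{i}", i % 3 != 0) for i in range(1, 13)]
answer = multivalue_answer(records, random.Random(53))
print(answer)
```

Because the client receives several addresses, it can retry against another one if the first connection fails, which is why this policy is often described as simple load balancing rather than a replacement for a load balancer.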
Hierarchical Outline
- Fundamental Routing
- Simple Routing: Single resource for a record name.
- Weighted Routing: Splitting traffic based on assigned weights.
- Performance-Based Routing
- Latency-Based Routing: Uses AWS internal latency measurements to route to the region with lowest RTT.
- Anycast Technology: Ensures requests reach the Route 53 edge location fastest.
- Location-Based Routing
- Geolocation: Based on Continent, Country, or US State.
- Geoproximity: Based on physical distance + optional "bias" for traffic shaping.
- Resiliency & Health
- Failover Routing: Primary/Secondary configuration.
- Health Checks: Integration with CloudWatch to trigger DNS shifts.
Visual Anchors
Logic Flow: Choosing a Policy
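One way to express the selection logic is a small decision function built from the "Best Use Case" column of the concept box. This is only a sketch: the parameter names are hypothetical, and the order in which requirements are checked (compliance first, then disaster recovery, then performance) is an assumption rather than an official AWS rule.

```python
def choose_routing_policy(
    *,
    needs_data_residency=False,
    needs_active_passive_dr=False,
    needs_bias_shift=False,
    needs_lowest_latency=False,
    needs_traffic_split=False,
    needs_health_checked_multi_ip=False,
):
    """Map a set of requirements to a Route 53 routing policy name."""
    if needs_data_residency:
        return "Geolocation"   # location decides, e.g. localization or GDPR
    if needs_active_passive_dr:
        return "Failover"      # primary/secondary driven by health status
    if needs_bias_shift:
        return "Geoproximity"  # proximity plus an adjustable bias
    if needs_lowest_latency:
        return "Latency"       # lowest RTT for global users
    if needs_traffic_split:
        return "Weighted"      # A/B tests, canary, gradual migration
    if needs_health_checked_multi_ip:
        return "Multi-value"   # up to 8 healthy records per response
    return "Simple"            # single resource, no special requirement


print(choose_routing_policy(needs_lowest_latency=True))
print(choose_routing_policy(needs_data_residency=True, needs_lowest_latency=True))
```

In real designs several policies are frequently combined, so treat the single return value as a starting point for discussion rather than a final answer.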
Spatial Representation: Latency vs. Geolocation
\begin{tikzpicture}[scale=0.8]
  % Draw Globe representation
  \draw (0,0) circle (3cm);
  \node at (0,3.5) {Global Infrastructure};

  % Regions
  \draw[fill=blue!20] (-1.5, 1) circle (0.5cm) node[below=0.4cm] {us-east-1};
  \draw[fill=blue!20] (1.5, -0.5) circle (0.5cm) node[below=0.4cm] {eu-west-1};

  % User in NYC
  \node (user) at (-3.5, 2) {\textbf{User (NYC)}};
  \draw[->, thick, red] (user) -- (-1.5, 1) node[midway, above left] {20ms (Latency)};

  % User in Paris
  \node (user2) at (3.5, 1) {\textbf{User (Paris)}};
  \draw[->, thick, green!60!black] (user2) -- (1.5, -0.5) node[midway, above right] {15ms (Latency)};

  % Legend
  \draw (-4,-3) rectangle (4,-4.5);
  \node at (0, -3.5) {Route 53 computes lowest RTT periodically};
  \node at (0, -4.1) {Red/Green arrows = Optimized Traffic Path};
\end{tikzpicture}
Definition-Example Pairs
- Weighted Routing: A policy where multiple resources are associated with one domain name, and traffic is distributed based on numerical weights.
- Example: Assigning a weight of 90 to an existing app and 10 to a new version to perform a Canary Release.
- Failover Routing: A policy used for active-passive failover where one resource is primary and the other is a backup.
- Example: Routing users to a static S3 website if the primary Application Load Balancer in the main region fails its health check.
- Geoproximity Routing: Routing based on the geographic location of users and resources, with an optional "bias" to expand/shrink the size of the region.
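- Example: An enterprise wants to send more traffic to a larger data center in Ireland, even though some UK users are physically closer to a smaller London data center. Setting a positive bias on the Ireland resource enlarges the area it serves; the bias is a relative adjustment, not an exact traffic percentage.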
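Two of the definitions above reduce to simple arithmetic. The sketch below assumes that a weighted record receives `weight / sum(weights)` of the traffic, and that geoproximity bias scales distance using the formula given in the Route 53 developer guide (worth verifying against the current documentation). The function names, record labels, and distances are hypothetical.

```python
def traffic_shares(weights):
    """Fraction of DNS responses each weighted record should receive."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("this sketch does not model the all-zero-weights case")
    return {name: weight / total for name, weight in weights.items()}


def biased_distance(actual_km, bias):
    """Effective distance to a resource after applying a geoproximity bias.

    A positive bias shrinks the distance (the resource attracts more users);
    a negative bias stretches it (the resource attracts fewer users).
    """
    if not -99 <= bias <= 99:
        raise ValueError("bias must be between -99 and 99")
    if bias >= 0:
        return actual_km * (1 - bias / 100)
    return actual_km / (1 + bias / 100)


# Canary release: 90 to the existing app, 10 to the new version.
shares = traffic_shares({"current": 90, "canary": 10})
print(shares)

# Hypothetical UK user: 50 km from London, 450 km from Ireland.
london = biased_distance(50, 0)
ireland = biased_distance(450, 90)
print("Ireland wins" if ireland < london else "London wins")
```

With no bias the London resource would win for this user; a large positive bias on Ireland makes the Irish resource appear closer, which is how the policy pulls UK traffic toward the larger site.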
Worked Examples
Scenario: Configuring Multi-Region Failover
Goal: Ensure api.example.com is highly available across us-east-1 (Primary) and us-west-2 (Secondary).
- Create Health Check: Set up a Route 53 health check that monitors the endpoint in us-east-1 (e.g., checking /health on the ALB).
- Primary Record: Create an Alias record for api.example.com pointing to the us-east-1 ALB. Set Routing Policy to Failover, type Primary. Associate it with the health check created in the first step.
- Secondary Record: Create an Alias record for api.example.com pointing to the us-west-2 ALB. Set Routing Policy to Failover, type Secondary.
- Validation: If the us-east-1 health check fails, Route 53 will automatically stop returning the Primary record and start returning the Secondary record to DNS resolvers.
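The record configuration in this scenario can be sketched as a change batch in the shape accepted by the Route 53 `ChangeResourceRecordSets` API. The block only builds and prints the payload; it makes no AWS calls. The ALB DNS names, hosted zone IDs, and health check ID are placeholders, and setting `EvaluateTargetHealth` to `True` is an assumption rather than something the scenario specifies.

```python
import json


def failover_record(name, role, alb_dns_name, alb_zone_id, health_check_id=None):
    """Build one UPSERT change for a failover alias record.

    role is "PRIMARY" or "SECONDARY".
    """
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",  # must be unique per record
        "Failover": role,
        "AliasTarget": {
            "HostedZoneId": alb_zone_id,   # the ALB's zone, not your own zone
            "DNSName": alb_dns_name,
            "EvaluateTargetHealth": True,  # assumption for this sketch
        },
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


change_batch = {
    "Comment": "Active-passive failover for api.example.com",
    "Changes": [
        failover_record(
            "api.example.com", "PRIMARY",
            "primary-alb.us-east-1.elb.amazonaws.com",   # placeholder
            "ZPLACEHOLDEREAST",                          # placeholder
            health_check_id="00000000-0000-0000-0000-000000000000",  # placeholder
        ),
        failover_record(
            "api.example.com", "SECONDARY",
            "secondary-alb.us-west-2.elb.amazonaws.com",  # placeholder
            "ZPLACEHOLDERWEST",                           # placeholder
        ),
    ],
}
print(json.dumps(change_batch, indent=2))
```

With boto3 installed and credentials configured, a payload like this could be submitted through `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=change_batch)`, where `HostedZoneId` is the ID of the hosted zone that contains api.example.com.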
Checkpoint Questions
- Which routing policy should be used if you want to route traffic to a resource in the region with the fastest response time?
- True or False: Simple routing supports health checks for individual IP addresses.
- How does Geolocation routing differ from Geoproximity routing?
- What is the maximum number of healthy records a Multi-value answer policy can return?
Muddy Points & Cross-Refs
- Latency vs. Geolocation: A user in France might have lower latency to a US-East server than a German server due to fiber paths. Latency follows physics (speed); Geolocation follows maps (borders). Use Geolocation for language/legal compliance; use Latency for speed.
- CNAME vs. Alias: Use Alias for the root domain (example.com) and for pointing to AWS resources to save on DNS query costs. CNAMEs cannot be used at the root level.
- Health Check Propagation: It can take up to several minutes for a failover to complete due to DNS caching (TTL) at the resolver level, even if Route 53 detects a failure in seconds.
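The propagation point can be estimated with a rough model: time to detect the failure plus time for cached answers to expire. This is a simplified sketch. It assumes detection takes roughly the health check request interval multiplied by the failure threshold, and it ignores browser or OS caches and resolvers that do not honor TTLs. The function name and the example values are illustrative.

```python
def worst_case_failover_seconds(request_interval_s, failure_threshold, record_ttl_s):
    """Rough upper bound on how long clients may keep hitting a failed endpoint."""
    detection_s = request_interval_s * failure_threshold  # consecutive failed checks
    return detection_s + record_ttl_s                     # plus resolver cache expiry


# Standard 30-second interval, threshold of 3, 60-second TTL.
standard = worst_case_failover_seconds(30, 3, 60)
# Fast 10-second interval with the same threshold and TTL.
fast = worst_case_failover_seconds(10, 3, 60)
print(standard, fast)
```

The comparison shows the two levers an architect controls: shortening the health check interval speeds up detection, while lowering the TTL shortens the caching tail at the cost of more DNS queries.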
Comparison Tables
Performance vs. Compliance
| Feature | Latency Routing | Geolocation Routing |
|---|---|---|
| Primary Goal | Performance (Low RTT) | Localization / Compliance |
| Mechanism | Measured AWS latency data | User's IP address database |
| Common Use | Gaming, High-freq apps | Localized pricing, GDPR |
| Consistency | May change as network conditions shift | Very consistent, based on the resolved IP location |
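The difference in mechanism can be shown with two small selection functions. This is a minimal sketch rather than Route 53's actual algorithm: the latency figures are invented, the country-to-region mapping is hypothetical, and real geolocation matching also considers continents and US states before falling back to a default record.

```python
def latency_route(latency_ms_by_region):
    """Latency routing: pick the region with the lowest measured latency."""
    return min(latency_ms_by_region, key=latency_ms_by_region.get)


def geolocation_route(country_code, country_to_region, default_region=None):
    """Geolocation routing: pick the region mapped to the user's country."""
    return country_to_region.get(country_code, default_region)


# Hypothetical user in France whose network path to us-east-1 happens to be faster.
measured = {"us-east-1": 70, "eu-central-1": 85}
mapping = {"FR": "eu-central-1", "DE": "eu-central-1", "US": "us-east-1"}

by_latency = latency_route(measured)
by_location = geolocation_route("FR", mapping, default_region="us-east-1")
print(by_latency, by_location)
```

The same user lands in different regions depending on the policy, which is why a data residency requirement should not be implemented with latency routing.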
[!IMPORTANT] For the Professional exam, remember that Global Accelerator (Anycast IP) and Route 53 (DNS) are often used together. Route 53 is the first entry point for DNS resolution, while Global Accelerator optimizes the actual data path over the AWS private network.