Mastering Hybrid Connectivity: Connecting On-Premises to AWS
Configuring existing on-premises networks to connect with the AWS Cloud
This guide covers the critical infrastructure and architectural decisions required to bridge existing on-premises data centers with the AWS Cloud. We explore physical requirements, logical routing with BGP, and the services that facilitate a seamless hybrid network.
Learning Objectives
After studying this guide, you should be able to:
- Design redundant hybrid connectivity models using AWS Direct Connect (DX) and Site-to-Site VPN.
- Configure physical and logical requirements for DX, including LOA/CFA and BGP peering.
- Implement hub-and-spoke architectures using AWS Transit Gateway and Direct Connect Gateway.
- Resolve DNS queries across hybrid environments using Route 53 Resolver endpoints and conditional forwarding.
- Optimize network performance through jumbo frames and BGP attribute manipulation.
Key Terms & Glossary
- ASN (Autonomous System Number): A unique identifier for a network on the internet, used in BGP routing. AWS accepts private 16-bit ASNs in the range 64512–65534.
- BGP (Border Gateway Protocol): The standardized exterior gateway protocol designed to exchange routing and reachability information between autonomous systems.
- LOA/CFA (Letter of Authorization / Connecting Facility Assignment): A document providing permission to connect to a provider's port in a colocation facility.
- LAG (Link Aggregation Group): A logical interface that uses the Link Aggregation Control Protocol (LACP) to aggregate multiple physical connections at a single AWS Direct Connect endpoint.
- VIF (Virtual Interface): A logical mapping on a Direct Connect link. Types include Private (to a VPC, through a virtual private gateway or a Direct Connect Gateway), Public (to AWS public services), and Transit (to a Transit Gateway, through a Direct Connect Gateway).
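As a quick illustration of the ASN entry, the sketch below checks whether a candidate ASN falls in a private-use range before it is used for BGP peering. The ranges are the ones reserved by RFC 6996; the function name `is_private_asn` is a hypothetical helper, not part of any AWS SDK.

```python
# Private-use ASN ranges reserved by RFC 6996 (upper bounds are inclusive).
PRIVATE_ASN_16BIT = range(64512, 65534 + 1)
PRIVATE_ASN_32BIT = range(4200000000, 4294967294 + 1)


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN lies in a 16-bit or 32-bit private-use range."""
    return asn in PRIVATE_ASN_16BIT or asn in PRIVATE_ASN_32BIT


if __name__ == "__main__":
    # 65000 is a common lab choice; 7224 is a public ASN and should be rejected.
    for candidate in (65000, 7224, 4200000001):
        print(candidate, is_private_asn(candidate))
```

A check like this is only a sanity test on the number itself; whether a given ASN is appropriate for a particular gateway still depends on what the rest of the network already uses.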
The "Big Idea"
> [!IMPORTANT]
> The goal of hybrid networking is to treat the AWS Cloud as a seamless extension of your local data center. By combining the low-latency, high-bandwidth nature of Direct Connect with the encryption and rapid deployment of VPNs, organizations can achieve a "single contiguous network" that supports legacy applications and modern cloud-native services simultaneously.
Formula / Concept Box
| Feature | Requirement / Rule |
|---|---|
| BGP ASN | Private Range: 64512 to 65534 (16-bit) or 4200000000 to 4294967294 (32-bit) |
| Maximum MTU | Standard: 1500 bytes; Jumbo frames: 9001 bytes on private VIFs, 8500 bytes on transit VIFs (Transit Gateway) |
| VPN Throughput | Up to 1.25 Gbps per tunnel; higher aggregate throughput requires multiple tunnels with ECMP on a Transit Gateway |
| Direct Connect Speed | Dedicated: 1 Gbps, 10 Gbps, or 100 Gbps; Hosted: 50 Mbps to 10 Gbps |
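The VPN throughput row implies some simple sizing arithmetic. The following sketch estimates how many ECMP tunnels are needed to cover a target bandwidth, assuming the 1.25 Gbps per-tunnel ceiling from the table and two tunnels per Site-to-Site VPN connection. The function names are hypothetical, and real throughput also depends on how flows hash across tunnels, so treat the result as a lower bound.

```python
import math

TUNNEL_GBPS = 1.25          # per-tunnel ceiling quoted in the table above
TUNNELS_PER_CONNECTION = 2  # each Site-to-Site VPN connection has two tunnels


def tunnels_needed(target_gbps: float, per_tunnel_gbps: float = TUNNEL_GBPS) -> int:
    """Minimum number of ECMP tunnels whose combined ceiling covers the target."""
    if target_gbps <= 0:
        raise ValueError("target_gbps must be positive")
    return math.ceil(target_gbps / per_tunnel_gbps)


def connections_needed(target_gbps: float) -> int:
    """Minimum number of VPN connections if both tunnels of each carry traffic."""
    return math.ceil(tunnels_needed(target_gbps) / TUNNELS_PER_CONNECTION)


if __name__ == "__main__":
    print(tunnels_needed(5.0), connections_needed(5.0))
```

For a 5 Gbps target this gives four tunnels, or two VPN connections, provided both tunnels of each connection are active and ECMP is enabled on the Transit Gateway.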
Hierarchical Outline
- Physical Connectivity (Layer 1 & 2)
  - Colocation Facilities: Selecting a Direct Connect location.
  - Hardware Requirements: Single-mode fiber with 1000BASE-LX (1 Gbps) or 10GBASE-LR (10 Gbps) optics.
  - 802.1Q VLANs: Separating traffic via VLAN tags on a single physical link.
- Logical Connectivity (Layer 3)
  - Direct Connect Gateway (DXGW): Global resource for connecting DX to multiple VPCs across Regions.
  - Transit Gateway (TGW): Central hub for multi-VPC and hybrid traffic management.
  - Static vs. Dynamic Routing: BGP (dynamic) is preferred because it provides automatic failover.
- Cross-Environment Name Resolution
  - Route 53 Resolver Inbound Endpoints: Allow on-premises DNS servers to resolve AWS-hosted DNS names.
  - Route 53 Resolver Outbound Endpoints: Allow AWS resources to resolve on-premises DNS names via conditional forwarding.
- Security & Monitoring
  - IPsec VPN: Encrypting traffic over DX (Public VIF) or the Internet.
  - Visibility Tools: Reachability Analyzer and CloudWatch metrics.
Visual Anchors
Hybrid Connectivity Decision Path
Physical to Logical Mapping
\begin{tikzpicture}[node distance=2cm, every node/.style={rectangle, draw, minimum width=3cm, minimum height=1cm, align=center}]
  \node (OnPrem) {On-Premises\\Router};
  \node (Colo) [right of=OnPrem, xshift=2cm] {Colocation\\Meet-Me Room};
  \node (AWSEdge) [right of=Colo, xshift=2cm] {AWS Direct Connect\\Endpoint};
  \node (VPC) [above of=AWSEdge] {Amazon VPC};

  \draw[thick] (OnPrem) -- node[above] {Customer Link} (Colo);
  \draw[thick] (Colo) -- node[above] {Cross-Connect} (AWSEdge);
  \draw[dashed, blue] (AWSEdge) -- node[right] {Private VIF / 802.1Q} (VPC);

  \node[draw=none, fill=none, below of=Colo, yshift=1cm] {\textit{Layer 1: Physical}};
  \node[draw=none, fill=none, below of=AWSEdge, yshift=1cm] {\textit{Layer 2/3: Logical}};
\end{tikzpicture}
Definition-Example Pairs
- Conditional Forwarding: A DNS configuration in which queries for specific domains are sent to a designated DNS server IP address.
  - Example: Configuring Route 53 Resolver to forward all queries for `*.corp.internal` to the on-premises DNS server at `10.0.0.50`.
- BGP Community Tags: Metadata attached to BGP routes to influence routing decisions.
  - Example: Tagging prefixes advertised over a public VIF with `7224:9100` to limit their propagation to the local AWS Region and prevent sub-optimal routing (tromboning). The tags `7224:7100`, `7224:7200`, and `7224:7300` serve a different purpose: they set low, medium, or high local preference on private and transit VIFs.
- Transit VIF: A type of Direct Connect virtual interface used specifically to connect to a Transit Gateway.
  - Example: Creating a single Transit VIF on a 10 Gbps DX connection to allow 50 different VPCs to communicate with an on-premises data center through a central hub.
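To make conditional forwarding concrete, here is a minimal sketch of how a resolver might pick a forwarding target for a query name. It assumes that the most specific matching domain wins, and the rule table is hypothetical apart from `corp.internal` and `10.0.0.50`, which come from the example above. It models the decision only and does not send any DNS traffic.

```python
from typing import Optional

# Hypothetical forwarding rules: domain suffix -> target DNS server IPs.
# "corp.internal" -> 10.0.0.50 mirrors the example above; the second rule is invented.
FORWARD_RULES = {
    "corp.internal": ["10.0.0.50"],
    "db.corp.internal": ["10.0.0.51"],
}


def forward_targets(qname: str, rules: Optional[dict] = None) -> Optional[list]:
    """Return the targets of the most specific matching rule, or None if no rule matches."""
    rules = FORWARD_RULES if rules is None else rules
    labels = qname.rstrip(".").lower().split(".")
    # Walk from the full name toward the TLD so the longest suffix is tried first.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in rules:
            return rules[suffix]
    return None  # fall through to the default (non-forwarded) resolution path


if __name__ == "__main__":
    for name in ("app.corp.internal", "pg1.db.corp.internal", "example.com"):
        print(name, forward_targets(name))
```

Matching on whole labels rather than raw string suffixes avoids accidentally forwarding a name such as `notcorp.internal` to the `corp.internal` target.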
Worked Examples
Scenario: Setting up High-Availability Hybrid DNS
Problem: A company needs its VPC-attached AWS Lambda functions to resolve the hostnames of databases hosted in a physical data center.
- Step 1: Create Route 53 Outbound Endpoint. Provision the endpoint in at least two Availability Zones within the VPC.
- Step 2: Security Group Configuration. Allow outbound UDP/TCP port 53 traffic from the endpoint to the on-premises DNS server IPs.
- Step 3: Create Resolver Rule. Select the "Forward" type for the domain `internal.company.com` and associate it with the VPC.
- Step 4: Target IPs. Enter the on-premises DNS server IP addresses.
- Verification: Run `nslookup db.internal.company.com` from an EC2 instance in the VPC and check if it returns the on-premises IP.
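The verification step can also be scripted. The sketch below resolves a hostname with the operating system's resolver and checks that every returned address falls inside the on-premises range. The CIDR `10.0.0.0/8` and the helper names are assumptions for illustration; the resolver is passed in as a parameter so the check can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, List


def system_resolve(hostname: str) -> List[str]:
    """Resolve a hostname to its IPv4 addresses using the OS resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def resolves_on_prem(
    hostname: str,
    on_prem_cidr: str,
    resolve: Callable[[str], List[str]] = system_resolve,
) -> bool:
    """True if the name resolves and every address lies inside on_prem_cidr."""
    network = ipaddress.ip_network(on_prem_cidr)
    addresses = resolve(hostname)
    return bool(addresses) and all(
        ipaddress.ip_address(addr) in network for addr in addresses
    )


if __name__ == "__main__":
    # Stubbed resolver standing in for a real lookup from inside the VPC.
    stub = lambda name: ["10.0.0.50"]
    print(resolves_on_prem("db.internal.company.com", "10.0.0.0/8", resolve=stub))
```

Run from an EC2 instance in the VPC with the default `system_resolve`, a `False` result or a resolution error would suggest that the forward rule, the security group, or the rule-to-VPC association needs another look.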
Checkpoint Questions
- What is the primary document required to request a physical cross-connect in a colocation facility?
- Which BGP attribute does AWS use to prefer one Direct Connect path over another when receiving identical prefixes?
- Why would an architect choose a Transit VIF over a Private VIF?
- How do you ensure encryption for data traveling over an AWS Direct Connect connection?
Muddy Points & Cross-Refs
- DXGW vs. TGW: A common point of confusion. DXGW is a globally redundant control plane that connects DX to multiple VPCs/VGWs. TGW is a regional network hub that handles routing between VPCs and on-premises. You often use them together (DX -> DXGW -> TGW -> VPCs).
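- BGP Path Selection: Longest-prefix match always applies first. If the same prefix is reachable over both a VPN and DX, AWS prefers the DX path. With two DX links, the mechanism depends on traffic direction: for traffic leaving AWS toward on-premises, AWS chooses between your advertisements using the local preference BGP community tags and then AS_PATH length (so prepending your ASN makes a link less preferred); for traffic leaving on-premises toward AWS, you set Local Preference on your own routers.
- Overlapping IPs: If your on-premises CIDR overlaps with your VPC CIDR, traffic between the two cannot be routed reliably because each side treats the overlapping range as local. Use PrivateLink or a private NAT gateway as a workaround.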
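Overlapping ranges are easy to detect before any connection is ordered. The sketch below uses Python's standard `ipaddress` module to list every on-premises/VPC CIDR pair that overlaps. The CIDR values in the usage example are hypothetical.

```python
import ipaddress
from typing import Iterable, List, Tuple


def find_overlaps(
    on_prem_cidrs: Iterable[str], vpc_cidrs: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (on_prem, vpc) CIDR pairs whose address ranges overlap."""
    vpc_networks = [(cidr, ipaddress.ip_network(cidr)) for cidr in vpc_cidrs]
    overlaps = []
    for on_prem in on_prem_cidrs:
        on_prem_net = ipaddress.ip_network(on_prem)
        for vpc, vpc_net in vpc_networks:
            if on_prem_net.overlaps(vpc_net):
                overlaps.append((on_prem, vpc))
    return overlaps


if __name__ == "__main__":
    # Hypothetical address plan: one data center range collides with one VPC.
    print(find_overlaps(["10.0.0.0/16", "192.168.0.0/24"],
                        ["10.0.1.0/24", "172.31.0.0/16"]))
```

An empty result means the address plan is safe to route directly; any pair in the output is a candidate for the PrivateLink or NAT workaround, or for renumbering.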
Comparison Tables
Direct Connect vs. Site-to-Site VPN
| Feature | Direct Connect | Site-to-Site VPN |
|---|---|---|
| Performance | Consistent, Low Latency | Variable (Internet-based) |
| Bandwidth | 50 Mbps (Hosted) to 100 Gbps (Dedicated) | ~1.25 Gbps per tunnel |
| Setup Time | Weeks (Physical install) | Minutes (Software config) |
| Cost | High (Port + Data Transfer) | Low (Hourly + Data Transfer) |
| Encryption | Optional (via MACsec or IPsec) | Mandatory (IPsec) |
Routing Types
| Type | Best For | Pros | Cons |
|---|---|---|---|
| Static | Simple, small networks | Low overhead, predictable | Manual updates required |
| Dynamic (BGP) | Enterprise, complex hybrid | Automatic failover, scalable | Complex to configure |
| CloudHub | Multiple branch offices | Hub-and-spoke over VPN | Higher latency than DX |