Mastering AWS Hub-and-Spoke Networking: Transit Gateway and Transit VPC
Configuring a hub-and-spoke network architecture (for example, Transit Gateway, transit VPC)
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between the managed AWS Transit Gateway service and the architectural Transit VPC pattern.
- Explain the scalability benefits of a hub-and-spoke model over a full-mesh VPC peering topology.
- Identify the specific use cases for Transit Gateway Connect and SD-WAN integration.
- Design a multi-account networking strategy using AWS Resource Access Manager (RAM).
- Analyze the performance trade-offs, specifically regarding the 1.25 Gbps VPN tunnel limit.
Key Terms & Glossary
- Hub-and-Spoke: A network topology where a central hub acts as a router for multiple peripheral "spoke" networks.
- Transit Gateway (TGW): A regional, fully managed AWS service that connects VPCs and on-premises networks through a central point.
- Transit VPC: A customer-managed architectural pattern using EC2 instances running third-party software to provide routing between VPCs.
- Transitive Routing: The ability of a network to pass traffic through a middle component (e.g., A can talk to C via B). AWS VPC peering does not support this natively.
- TGW Connect: A feature that enables integration of SD-WAN appliances into AWS via GRE (Generic Routing Encapsulation) and BGP.
- BGP (Border Gateway Protocol): The standard routing protocol used to exchange reachability information between different networks.
The "Big Idea"
In early AWS networking, connecting VPCs required VPC Peering. While effective, peering is a point-to-point connection that does not support transitive routing. If you have n VPCs, a full mesh needs n(n-1)/2 peering connections, so 10 VPCs already require 45. As organizations grow, this becomes an unmanageable "spaghetti" of routes. The Hub-and-Spoke model simplifies this by requiring only one connection per VPC to a central hub, reducing complexity and centralizing security and monitoring.
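The scaling argument can be checked with a few lines of arithmetic. The sketch below compares the number of links in a full mesh, n(n-1)/2, with the single attachment per VPC that a hub needs; the function names are illustrative and not part of any AWS SDK.

```python
def full_mesh_links(n: int) -> int:
    """Peering connections needed so each of n VPCs reaches every other one."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n: int) -> int:
    """Attachments needed when each of n VPCs connects once to a central hub."""
    return n


for n in (4, 10, 50):
    print(f"{n} VPCs: {full_mesh_links(n)} peerings vs "
          f"{hub_and_spoke_links(n)} attachments")
# 4 VPCs: 6 peerings vs 4 attachments
# 10 VPCs: 45 peerings vs 10 attachments
# 50 VPCs: 1225 peerings vs 50 attachments
```

The mesh grows quadratically while the hub grows linearly, which is why the gap widens so quickly as VPCs are added.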
Formula / Concept Box
| Feature | AWS Transit Gateway | Transit VPC (EC2-based) |
|---|---|---|
| Management | AWS Managed (High Availability built-in) | Customer Managed (User responsible for HA) |
| Throughput | Up to 50 Gbps per VPC attachment | Limited to 1.25 Gbps per VPN tunnel |
| Protocols | BGP for VPN/Connect; Static/Propagated | BGP is standard; supports complex overlays |
| Scale | Thousands of VPCs | Limited by EC2 instance performance |
| Scope | Regional (inter-region peering available) | Global (via VPN overlays) |
Hierarchical Outline
- Legacy Networking vs. Hub-and-Spoke
- Full Mesh Peering: High overhead, no transitive routing.
- Centralized Hub: Single management point, simplified routing tables.
- AWS Transit Gateway (TGW)
- Architecture: Regional service; uses attachments to link VPCs, VPNs, and DX Gateways.
- Route Tables: Supports multiple route tables for network isolation (segmentation).
- TGW Connect: Simplifies SD-WAN integration using GRE tunnels.
- Transit VPC Pattern
- Components: EC2 instances, VPN software (Cisco, Palo Alto), and VGWs.
- Advantages: Support for deep packet inspection (DPI), NAT, and multi-vendor consistency.
- Disadvantages: High management overhead, cost of EC2 instances, bandwidth caps.
- Advanced Multi-Account Implementation
- AWS RAM: Sharing a TGW across accounts.
- Monitoring: Using Transit Gateway Network Manager for global visibility.
Visual Anchors
Hub-and-Spoke Topology
Network Complexity Comparison
\begin{tikzpicture}[node distance=2cm]
  % Left: full mesh of four VPCs
  \node[circle,draw,fill=blue!20] (A) {VPC 1};
  \node[circle,draw,fill=blue!20,right of=A] (B) {VPC 2};
  \node[circle,draw,fill=blue!20,below of=A] (C) {VPC 3};
  \node[circle,draw,fill=blue!20,below of=B] (D) {VPC 4};

  % Peering lines
  \draw[<->] (A) -- (B);
  \draw[<->] (A) -- (C);
  \draw[<->] (A) -- (D);
  \draw[<->] (B) -- (C);
  \draw[<->] (B) -- (D);
  \draw[<->] (C) -- (D);

  \node[below=1cm of C] {Full Mesh (6 links)};

  % Right: the same four VPCs attached to a Transit Gateway hub
  \begin{scope}[xshift=6cm]
    \node[circle,draw,fill=blue!20] (A1) {VPC 1};
    \node[circle,draw,fill=blue!20,right of=A1] (B1) {VPC 2};
    \node[rectangle,draw,fill=orange!40,below right=0.7cm and 0.7cm of A1] (H) {TGW};
    \node[circle,draw,fill=blue!20,below of=A1, yshift=-1cm] (C1) {VPC 3};
    \node[circle,draw,fill=blue!20,below of=B1, yshift=-1cm] (D1) {VPC 4};

    \draw[-] (A1) -- (H);
    \draw[-] (B1) -- (H);
    \draw[-] (C1) -- (H);
    \draw[-] (D1) -- (H);

    \node[below=1.5cm of C1] {Hub-and-Spoke (4 links)};
  \end{scope}
\end{tikzpicture}
Definition-Example Pairs
- Transitive Routing: The ability for traffic to traverse an intermediate hop to reach a destination.
- Example: VPC A needs to talk to VPC C. VPC A sends traffic to the Transit Gateway, which then routes it to VPC C.
- TGW Attachment: A logical connection between the TGW and a resource.
- Example: Creating an attachment for a production VPC in Account A to allow it to access a shared services VPC in Account B.
- SD-WAN Overlay: A virtual network built on top of physical infrastructure using software control.
- Example: Using Transit Gateway Connect to link a Cisco SD-WAN appliance on-premises directly to the AWS cloud backbone.
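To make the transitive-routing definition concrete, here is a toy model rather than AWS code. It assumes three hypothetical VPCs (`vpc-a`, `vpc-b`, `vpc-c`) with made-up CIDRs, treats peering as strictly one hop, and treats the hub as a route table resolved by longest-prefix match.

```python
import ipaddress

# Hypothetical peerings: A-B and B-C exist, A-C does not.
PEERINGS = {frozenset({"vpc-a", "vpc-b"}), frozenset({"vpc-b", "vpc-c"})}

# Hypothetical hub route table: destination CIDR -> attachment.
TGW_ROUTE_TABLE = {
    "10.1.0.0/16": "vpc-a",
    "10.2.0.0/16": "vpc-b",
    "10.3.0.0/16": "vpc-c",
}


def peering_allows(src: str, dst: str) -> bool:
    """Peering is non-transitive: only directly peered VPCs can talk."""
    return frozenset({src, dst}) in PEERINGS


def tgw_next_hop(dst_ip: str):
    """Return the attachment with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst_ip)
    matches = []
    for cidr, attachment in TGW_ROUTE_TABLE.items():
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, attachment))
    return max(matches)[1] if matches else None


print(peering_allows("vpc-a", "vpc-c"))  # False: B cannot act as transit
print(tgw_next_hop("10.3.4.5"))          # vpc-c: the hub forwards A's traffic
```

In this model VPC A cannot reach VPC C through B over peering, yet the same destination resolves to an attachment once all three VPCs hang off one hub.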
Worked Examples
Scenario 1: Migrating from Peering to TGW
Problem: A company has 10 VPCs connected via a full mesh (45 peering connections). Routing tables are reaching their limits.
Solution Steps:
- Deploy TGW: Create a Transit Gateway in the central networking account.
- Share TGW: Use AWS RAM to share the TGW with the other 9 spoke accounts.
- Attach VPCs: Create a VPC attachment in each spoke VPC. Ensure subnets from different AZs are selected for high availability.
- Update Route Tables: Replace individual peering routes (e.g., 10.1.0.0/16 -> pcx-123) with a single summarized route (e.g., 10.0.0.0/8 -> tgw-456).
- Cleanup: Delete the old VPC peering connections once traffic is validated through TGW.
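Before swapping routes, it helps to confirm that the summarized route really covers every peering route it replaces. The sketch below does that check offline with Python's `ipaddress` module; it makes no AWS calls. The function name, `pcx-124`, `pcx-125`, and the 172.16.0.0/16 route are hypothetical additions for illustration.

```python
import ipaddress


def plan_route_migration(peering_routes, summary_cidr, tgw_id):
    """Split peering routes into those covered by the summary and those not."""
    summary = ipaddress.ip_network(summary_cidr)
    covered, uncovered = [], []
    for cidr, peering_id in peering_routes.items():
        network = ipaddress.ip_network(cidr)
        if network.subnet_of(summary):
            covered.append((cidr, peering_id))
        else:
            uncovered.append((cidr, peering_id))
    return {
        "add": (summary_cidr, tgw_id),  # one route toward the TGW
        "remove": covered,              # peering routes the summary replaces
        "keep": uncovered,              # still need their own route
    }


plan = plan_route_migration(
    {
        "10.1.0.0/16": "pcx-123",
        "10.2.0.0/16": "pcx-124",     # hypothetical
        "172.16.0.0/16": "pcx-125",   # hypothetical, outside 10.0.0.0/8
    },
    summary_cidr="10.0.0.0/8",
    tgw_id="tgw-456",
)
print(plan["remove"])
print(plan["keep"])
```

Any route that lands in `keep` falls outside the summary and would be black-holed if its peering connection were deleted during cleanup.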
Scenario 2: Overcoming the 1.25 Gbps VPN Limit
Problem: An on-premises data center needs to push 5 Gbps to AWS via a VPN.
Solution:
- Constraint: A single VPN tunnel cannot exceed 1.25 Gbps, so one tunnel can never carry 5 Gbps.
- Approach: Use Equal-Cost Multi-Path (ECMP) routing on the Transit Gateway. Enable VPN ECMP support on the TGW, use dynamic routing (BGP), and establish at least 4 tunnels (4 × 1.25 Gbps = 5 Gbps). The TGW distributes flows across these tunnels to reach the aggregate throughput; any single flow is still bound by the per-tunnel limit.
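The tunnel count in this scenario is a ceiling division, sketched below. The helper name is illustrative, and the result is a lower bound: it assumes every tunnel can be driven at the full 1.25 Gbps and that traffic consists of enough distinct flows for ECMP to spread evenly.

```python
import math

TUNNEL_LIMIT_GBPS = 1.25  # per-tunnel VPN limit used throughout this guide


def tunnels_needed(target_gbps: float,
                   per_tunnel_gbps: float = TUNNEL_LIMIT_GBPS) -> int:
    """Minimum number of ECMP tunnels whose combined limit meets the target."""
    return math.ceil(target_gbps / per_tunnel_gbps)


print(tunnels_needed(5))   # 4
print(tunnels_needed(6))   # 5
print(tunnels_needed(10))  # 8
```

In practice it is worth provisioning headroom above this minimum, since real flow distribution is rarely perfectly even.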
Checkpoint Questions
- Why is a Transit VPC considered a "customer-managed" architecture compared to Transit Gateway?
- What is the default bandwidth limit for a single VPC attachment to a Transit Gateway?
- You need to connect 500 VPCs. Which architecture is more appropriate: VPC Peering or Transit Gateway?
- How does AWS Resource Access Manager (RAM) facilitate multi-account TGW deployments?
- If two VPCs have overlapping IP ranges, can they be connected via a standard Transit Gateway attachment? (Hint: Think about NAT).
Muddy Points & Cross-Refs
- Overlapping IPs: Transit Gateway does not magically fix overlapping CIDRs. If Spoke A and Spoke B both use 10.0.0.0/16, the TGW route table cannot distinguish between them. You would need to use PrivateLink or a Transit VPC with NAT instances to resolve this.
- Cost: TGW has an hourly charge per attachment plus a data processing charge ($/GB). For very high traffic between two specific VPCs, VPC Peering may be cheaper as it has no data processing fees.
- Global Reach: While TGW is regional, you can peer TGWs across regions. Transit VPCs, by contrast, can build a global mesh using VPN tunnels over the internet or DX, though at the cost of higher latency and more management overhead.
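The overlapping-CIDR problem is easy to catch before any attachment is created. The following is a minimal sketch using the standard `ipaddress` module; the spoke names and the third CIDR are made up for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(spokes):
    """Return pairs of spoke names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in spokes.items()}
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if networks[a].overlaps(networks[b])
    ]


conflicts = find_overlaps({
    "spoke-a": "10.0.0.0/16",
    "spoke-b": "10.0.0.0/16",   # same range as spoke-a
    "spoke-c": "10.1.0.0/16",   # hypothetical non-overlapping spoke
})
print(conflicts)  # [('spoke-a', 'spoke-b')]
```

Any pair this reports would need NAT or PrivateLink in front of it, as described in the Overlapping IPs point, before both spokes can be reached through the same hub.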
Comparison Tables
Routing Strategy Comparison
| Metric | VPC Peering | Transit Gateway | Transit VPC |
|---|---|---|---|
| Transitive Routing | No | Yes | Yes |
| Edge Consolidation | No | Yes (VPN/DX) | Yes (VPN) |
| Security Inspection | Distributed (SG/NACL) | Centralized (via Appliance VPC) | Centralized (on EC2) |
| Complexity | High at scale (full mesh) | Low (AWS managed) | High (customer managed) |
| Third-Party Features | Native only | TGW Connect for SD-WAN | Full Vendor Software |
[!TIP] Use Transit Gateway Network Manager to visualize your entire global network, including on-premises branches connected via SD-WAN. It provides a "single pane of glass" for troubleshooting connectivity issues.