AWS Data Transfer and Migration Strategy Study Guide
Selecting the appropriate data transfer service and migration strategy
This guide covers critical components for the AWS Certified Solutions Architect - Professional (SAP-C02) exam, focusing on selecting the right tools and strategies for moving workloads and data to the cloud.
Learning Objectives
By the end of this study guide, you should be able to:
- Identify the three phases of the AWS migration process (Assess, Mobilize, Migrate and Modernize).
- Evaluate applications according to the "7Rs" migration strategies.
- Select the appropriate AWS Storage Gateway type based on protocol and latency requirements.
- Differentiate between AWS DataSync and other online transfer mechanisms.
- Determine when to use AWS DMS and SCT for database migrations.
Key Terms & Glossary
- SCT (Schema Conversion Tool): An AWS tool used to convert a source database schema to a format compatible with a different target engine (e.g., Oracle to Aurora).
- DMS (Database Migration Service): A service that helps migrate databases to AWS quickly and securely while the source database remains functional.
- VTL (Virtual Tape Library): A data storage virtualization technology used for backup and recovery, mimicking physical tape infrastructure.
- iSCSI: An IP-based storage networking standard for linking data storage facilities, used by AWS Volume and Tape Gateways.
- DataSync: An online data transfer service that simplifies and accelerates moving data between on-premises and AWS storage.
The "Big Idea"
Migration is not a single event but a structured architectural transition. It requires a shift from simple "lift-and-shift" thinking to a phased approach where tools like AWS DataSync, Storage Gateway, and Snow Family are chosen based on data volume, available bandwidth, and time constraints. Success is measured by how well the strategy minimizes downtime while maximizing cloud-native benefits.
Formula / Concept Box
| Feature | AWS DataSync | AWS Storage Gateway | AWS Snow Family |
|---|---|---|---|
| Primary Use | High-speed data migration/sync | Hybrid cloud storage/cached access | Massive offline data migration |
| Connectivity | Online (Internet/Direct Connect) | Online (Persistent connection) | Offline (Physical shipment) |
| Data Type | Files, Objects, HDFS | Files, Blocks, Tapes | Files, Blocks |
| Best For | Recurring sync/One-time large migrations | Extending on-prem storage to S3 | Terabyte to Petabyte migrations with low bandwidth |
Hierarchical Outline
- I. Migration Phases
  - Assess: Inventory and TCO analysis.
  - Mobilize: Detailed planning and building the "Landing Zone."
  - Migrate & Modernize: Execution and post-migration optimization.
- II. The 7Rs Strategies
  - Rehost: "Lift and Shift" (manual or with AWS Application Migration Service, MGN).
  - Replatform: "Lift and Reshape" (small optimizations such as moving to RDS).
  - Refactor/Re-architect: Full rewrite to cloud-native.
  - Relocate: Moving VMware workloads to VMware Cloud on AWS without changing them.
  - Repurchase: Moving to a SaaS model.
  - Retain: Keeping it on-premises for now.
  - Retire: Decommissioning the application.
- III. AWS Storage Gateway Family
  - S3 File Gateway: NFS/SMB interface to S3 objects.
  - FSx File Gateway: Low-latency SMB access to FSx for Windows File Server.
  - Volume Gateway: iSCSI block storage (Cached vs. Stored mode).
  - Tape Gateway: Virtual Tape Library for backup apps.
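The Storage Gateway part of the outline can be read as a small lookup from access protocol and storage target to gateway type. The sketch below is one way to encode that mapping; the function name `select_gateway` and its parameters (`protocol`, `backend`, `tape_backup`) are hypothetical names chosen for illustration, not part of any AWS API.

```python
from enum import Enum


class GatewayType(Enum):
    S3_FILE = "S3 File Gateway"
    FSX_FILE = "FSx File Gateway"
    VOLUME = "Volume Gateway"
    TAPE = "Tape Gateway"


def select_gateway(protocol: str, backend: str = "s3",
                   tape_backup: bool = False) -> GatewayType:
    """Map a protocol and storage target to a gateway type (illustrative only)."""
    protocol = protocol.lower()
    backend = backend.lower()
    if protocol == "iscsi":
        # iSCSI is used by both Volume Gateway and Tape Gateway;
        # the tape_backup flag (an assumption of this sketch) separates them.
        return GatewayType.TAPE if tape_backup else GatewayType.VOLUME
    if protocol == "smb" and backend == "fsx":
        return GatewayType.FSX_FILE
    if protocol in ("nfs", "smb") and backend == "s3":
        return GatewayType.S3_FILE
    raise ValueError(f"no gateway type for protocol={protocol!r}, backend={backend!r}")


print(select_gateway("NFS").value)
print(select_gateway("iSCSI", tape_backup=True).value)
```

A real selection would also weigh latency and cache sizing, which this lookup deliberately ignores.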
Visual Anchors
Migration Tool Decision Tree
Storage Gateway Architecture
\begin{tikzpicture}[node distance=2cm]
  \draw[thick] (0,0) rectangle (3,2) node[pos=.5, text width=2cm, align=center] {On-Prem App\\(NFS/SMB/iSCSI)};
  \draw[->, thick] (3,1) -- (5,1) node[midway, above] {Local Cache};
  \draw[thick] (5,0) rectangle (8,2) node[pos=.5, text width=2.5cm, align=center] {AWS Storage\\Gateway};
  \draw[->, thick] (8,1) -- (10,1) node[midway, above] {SSL/TLS};
  \draw[thick] (10,0) rectangle (13,2) node[pos=.5, text width=2cm, align=center] {Amazon S3 /\\Amazon FSx};
  \draw[dashed] (4,-0.5) -- (4,2.5) node[below] {Data Center Boundary};
\end{tikzpicture}
Definition-Example Pairs
- Cached Volume Gateway: The full dataset resides in S3, and only frequently accessed data is cached locally.
  - Example: A law firm stores 50TB of archive records in S3 but needs low-latency access to the current month's files on its local network.
- Stored Volume Gateway: The primary dataset is stored locally and backed up asynchronously to Amazon S3 as point-in-time snapshots (stored as Amazon EBS snapshots).
  - Example: A manufacturing plant requires 100% local availability for production logs but wants off-site disaster recovery in AWS.
- AWS SCT (Schema Conversion Tool): Automates the conversion of database schemas between different engines.
  - Example: Converting a proprietary SQL Server schema with complex stored procedures to a PostgreSQL-compatible Aurora schema.
Worked Examples
Scenario: Large-Scale Migration with Limited Bandwidth
Problem: A media company needs to migrate 400TB of video assets to Amazon S3. The company has a 100 Mbps internet connection that is currently 60% utilized by production traffic.
Solution Breakdown:
- Bandwidth Check: 100 Mbps * 40% availability = 40 Mbps. Moving 400TB (3.2 × 10^15 bits) at 40 Mbps takes about 8 × 10^7 seconds, roughly 926 days or ~2.5 years.
- Constraint: Online transfer over the existing link (for example, with DataSync) is not feasible on any practical timeline.
- Recommendation: Order at least 5 AWS Snowball Edge Storage Optimized devices (80TB usable each; 5 × 80TB = 400TB, so add a device if headroom is needed). Ship the data physically to AWS.
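The arithmetic in the solution breakdown can be checked with a few lines of Python. This is a minimal sketch that assumes decimal units (1 TB = 10^12 bytes), ignores protocol overhead, and uses the 80TB usable capacity quoted above as a default; the helper names `transfer_days` and `snowball_devices` are hypothetical.

```python
import math


def transfer_days(data_tb: float, link_mbps: float, utilization: float) -> float:
    """Days needed to move data_tb over the unused share of a link."""
    available_bps = link_mbps * 1e6 * (1.0 - utilization)  # spare bits per second
    total_bits = data_tb * 1e12 * 8                        # decimal TB to bits
    return total_bits / available_bps / 86_400             # seconds to days


def snowball_devices(data_tb: float, usable_tb_per_device: float = 80.0) -> int:
    """Minimum device count, with no headroom added."""
    return math.ceil(data_tb / usable_tb_per_device)


days = transfer_days(data_tb=400, link_mbps=100, utilization=0.60)
print(f"{days:.0f} days (~{days / 365:.1f} years)")  # 926 days (~2.5 years)
print(snowball_devices(400))                         # 5
```

Changing the inputs makes it easy to see where the crossover lies: the same 400TB over a fully available 10 Gbps link would take only a few days.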
Checkpoint Questions
- What is the primary difference between S3 File Gateway and Volume Gateway?
- Which migration strategy (7Rs) involves moving to a managed service like Amazon RDS without changing the application's core architecture?
- When should you use AWS SCT before AWS DMS?
- Which Storage Gateway type is best for replacing physical tape backups?
[!TIP] Answer Key: 1. File Gateway treats files as objects in S3; Volume Gateway treats data as block storage volumes (iSCSI). 2. Replatforming. 3. When the source and target database engines are different (heterogeneous migration). 4. Tape Gateway.
Muddy Points & Cross-Refs
- DataSync vs. S3 Transfer Acceleration: Use DataSync for large-scale file migrations or recurring syncs. Use S3 Transfer Acceleration for global end-users to upload individual files to a central bucket via CloudFront edge locations.
- DMS "Ongoing Replication": Remember that DMS can perform "Full Load + CDC" (Change Data Capture). This is the key to near-zero downtime migrations.
- Further Study: Review the "AWS Well-Architected Framework: Reliability Pillar" for disaster recovery patterns involving these tools.
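To make the "Full Load + CDC" point concrete, the sketch below builds the parameters for a DMS replication task without calling AWS. The `MigrationType` value `full-load-and-cdc` and the keyword names match the boto3 `create_replication_task` call; the helper `build_replication_task_params`, the task identifier, and the ARNs are placeholders invented for illustration.

```python
import json


def build_replication_task_params(task_id: str, source_arn: str, target_arn: str,
                                  instance_arn: str, schema: str = "%") -> dict:
    """Assemble keyword arguments for a full-load-plus-CDC DMS task."""
    # Selection rule that includes every table in the chosen schema.
    table_mappings = {
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        }]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Copy existing data first, then keep applying ongoing changes.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


params = build_replication_task_params(
    task_id="example-task",                # placeholder
    source_arn="arn:aws:dms:example:src",  # placeholder, not a real ARN
    target_arn="arn:aws:dms:example:tgt",  # placeholder, not a real ARN
    instance_arn="arn:aws:dms:example:ri", # placeholder, not a real ARN
)
print(params["MigrationType"])

# With credentials and real ARNs, the task could then be created with:
#   import boto3
#   boto3.client("dms").create_replication_task(**params)
```

Keeping the task in CDC mode until cutover is what lets the source database stay in service during the migration.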
Comparison Tables
Volume Gateway Modes
| Mode | Local Storage Requirement | S3 Role | Use Case |
|---|---|---|---|
| Cached | Small (just for hot data) | Primary storage | Scalable storage for low-latency access |
| Stored | Large (full dataset) | Backup/Snapshot storage | Low-latency local access with DR in cloud |
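The local storage column in the table above translates into a simple sizing rule. The sketch below is a rough illustration only: the function name `local_storage_tb` and the 10% hot-data fraction are assumptions, and it ignores the upload buffer a real gateway also needs.

```python
def local_storage_tb(mode: str, dataset_tb: float, hot_fraction: float = 0.1) -> float:
    """Rough on-premises capacity needed for a Volume Gateway mode."""
    mode = mode.lower()
    if mode == "cached":
        # Only the frequently accessed share is kept locally; S3 holds the rest.
        return dataset_tb * hot_fraction
    if mode == "stored":
        # The full dataset stays local; S3 holds snapshots for recovery.
        return dataset_tb
    raise ValueError(f"unknown Volume Gateway mode: {mode!r}")


# A 50TB archive, assuming 10% of it is hot data.
print(local_storage_tb("cached", 50))
print(local_storage_tb("stored", 50))
```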