AWS Data Migration: Online and Offline Strategies

Data migration options and tools (for example, AWS DataSync, AWS Transfer Family, AWS Snow Family, Amazon S3 Transfer Acceleration)

This guide explores the mechanisms provided by AWS to migrate data efficiently, securely, and cost-effectively, covering both online network-based tools and offline physical transport devices.

Learning Objectives

After studying this guide, you should be able to:

  • Differentiate between online and offline data migration tools.
  • Select the appropriate AWS service based on data volume and connectivity constraints.
  • Explain how Amazon S3 Transfer Acceleration optimizes global data uploads.
  • Identify the security methods (encryption) used by different migration tools.
  • Choose between Snowball Edge and Snowmobile for large-scale data movements.

Key Terms & Glossary

  • Edge Location: A site that CloudFront uses to cache copies of your content for faster delivery to users at any location. Used by S3 Transfer Acceleration.
  • Exabyte-scale: Data volumes on the order of an exabyte (1 EB = 1,000 petabytes, PB). AWS Snowmobile targets this scale, at up to 100 PB per container.
  • Encryption at Rest: Protection of data while it is stored on a disk or device. AWS Snow devices use 256-bit keys for this.
  • TLS (Transport Layer Security): A cryptographic protocol designed to provide communications security over a computer network.
  • ETL (Extract, Transform, Load): A three-phase process where data is extracted, transformed into a proper format, and loaded into a final target.

The "Big Idea"

Data migration is not a "one size fits all" task. It requires a balance between Available Bandwidth, Data Volume, and Time. When your internet connection is too slow to move petabytes of data in a reasonable timeframe, you shift from the "Online" lane to the "Offline" lane (Snow Family). If your data is distributed globally, you leverage the AWS global network (Transfer Acceleration) to bypass the congested public internet.

Formula / Concept Box

| Migration Variable | Decision Driver |
| --- | --- |
| Data Volume < 10 TB | Online tools (DataSync, S3 TA) are usually more cost-effective. |
| Data Volume > 10 TB | Start considering the AWS Snow Family. |
| Frequent / Ongoing | Use AWS DataSync or AWS Storage Gateway. |
| Streaming / Real-time | Use Amazon Kinesis Data Firehose. |
| Legacy Protocols | Use AWS Transfer Family (SFTP, FTPS, FTP, AS2). |
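
The decision drivers above can be read as a small lookup. The sketch below is one possible encoding, not an AWS-provided rule set: the function name, the `pattern` labels, and the choice to treat exactly 10 TB as "online" are all assumptions made for illustration.

```python
def suggest_migration_tool(volume_tb: float, pattern: str = "one-time") -> str:
    """Map the guide's decision drivers to a suggested service family.

    `pattern` is a hypothetical label: "one-time", "ongoing",
    "streaming", or "legacy-protocol".
    """
    if volume_tb < 0:
        raise ValueError("volume_tb must be non-negative")

    # Access pattern takes priority over raw volume in the table.
    by_pattern = {
        "streaming": "Amazon Kinesis Data Firehose",
        "legacy-protocol": "AWS Transfer Family",
        "ongoing": "AWS DataSync or AWS Storage Gateway",
    }
    if pattern in by_pattern:
        return by_pattern[pattern]
    if pattern != "one-time":
        raise ValueError(f"unknown pattern: {pattern!r}")

    # Assumption: the table leaves exactly 10 TB undefined; treat it as online.
    if volume_tb > 10:
        return "AWS Snow Family"
    return "Online tools (AWS DataSync, S3 Transfer Acceleration)"


if __name__ == "__main__":
    print(suggest_migration_tool(5))
    print(suggest_migration_tool(50))
    print(suggest_migration_tool(2, pattern="streaming"))
```

In practice the 10 TB threshold is only a starting point; available bandwidth and the deadline can move the break-even in either direction.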

Hierarchical Outline

  • Online Migration Tools
    • AWS DataSync: High-speed, automated data transfer for ongoing or one-time migrations between on-premises and AWS.
    • AWS Transfer Family: Managed service for SFTP/FTPS/FTP/AS2 protocols directly into S3 or EFS.
    • S3 Transfer Acceleration: Optimizes uploads via CloudFront Edge Locations for geographically dispersed users.
    • Kinesis Data Firehose: Managed streaming delivery for IoT, logs, and social media data.
  • Offline Migration Tools (Snow Family)
    • AWS Snowcone: Small, portable (8 TB usable); for edge computing and small migrations.
    • AWS Snowball Edge: Petabyte-scale when several devices are combined (up to 80 TB usable per device); includes compute capabilities (S3 and EC2 compatibility).
    • AWS Snowmobile: Exabyte-scale (up to 100 PB per container); 40-foot shipping container for massive data center moves.
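
To get a feel for the Snow Family capacities listed above, the following sketch estimates how many units a given data set would need. It uses only the capacities quoted in this guide and assumes decimal units (1 PB = 1,000 TB); the constant and function names are hypothetical, and real orders would also depend on device availability and parallel loading plans.

```python
import math

# Usable capacities as quoted in this guide (decimal TB).
SNOWCONE_TB = 8
SNOWBALL_EDGE_TB = 80
SNOWMOBILE_TB = 100 * 1000  # 100 PB per container


def units_needed(volume_tb: float, usable_tb_per_unit: float) -> int:
    """Return how many devices/containers are needed to hold volume_tb."""
    if volume_tb <= 0 or usable_tb_per_unit <= 0:
        raise ValueError("volume and capacity must be positive")
    return math.ceil(volume_tb / usable_tb_per_unit)


if __name__ == "__main__":
    print(units_needed(50, SNOWBALL_EDGE_TB))       # 50 TB on Snowball Edge
    print(units_needed(500, SNOWBALL_EDGE_TB))      # 500 TB on Snowball Edge
    print(units_needed(150_000, SNOWMOBILE_TB))     # 150 PB on Snowmobile
```

For example, 500 TB divided by 80 TB per device rounds up to seven Snowball Edge devices, while 150 PB needs two Snowmobile containers.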

Visual Anchors

Migration Decision Flow

[Diagram not rendered: migration decision flowchart]

S3 Transfer Acceleration vs. Standard Internet

\begin{tikzpicture}[node distance=2cm, auto]
  % Upper path: direct upload over the public internet
  \draw[thick,->] (0,3) -- (8,3)
    node[midway, above] {\textbf{Standard Internet (High Latency/Jitter)}};
  % Lower path: user to the nearest edge location
  \draw[thick,blue,->] (0,1) -- (2,1)
    node[midway, below] {\textbf{User}};
  \draw[fill=blue!20] (2,0.5) rectangle (4,1.5)
    node[midway] {\textbf{Edge Location}};
  % Edge location to the bucket over the AWS backbone
  \draw[thick,blue,->] (4,1) -- (8,1)
    node[midway, below] {\textbf{AWS Backbone (Fast/Private)}};
  % Destination bucket shared by both paths
  \draw[fill=green!20] (8,0.5) rectangle (10,3.5)
    node[midway, rotate=90] {\textbf{S3 Bucket}};
\end{tikzpicture}

Definition-Example Pairs

  • AWS DataSync: A service that automates and accelerates moving data between on-premises storage and AWS over the network.
    • Example: A media company synchronizing their local NAS with Amazon S3 every night for backup.
  • AWS Snowball Edge: A physical ruggedized device used to transport large amounts of data to AWS.
    • Example: A research station in Antarctica with no internet connectivity shipping 50 TB of climate data back to the US.
  • S3 Transfer Acceleration: A bucket-level feature that enables fast data transfers over long distances.
    • Example: A mobile app in Tokyo uploading high-res photos to an S3 bucket located in Northern Virginia.

Worked Examples

Scenario 1: The Exabyte Migration

Problem: A global financial institution needs to move 150 PB of historical transaction records from an on-premises data center to Amazon S3 Glacier. Their outbound internet capacity is 10 Gbps.

Solution:

  1. Calculation: 150 PB is 1.2 × 10^18 bits, so at 10 Gbps (theoretical max) the upload would take about 1.2 × 10^8 seconds, roughly 3.8 years, before any protocol overhead or link sharing.
  2. Tool Selection: AWS Snowmobile. Two Snowmobile containers (100 PB each) would be deployed.
  3. Process: Plug the Snowmobile into the local network, transfer data, and ship it back to an AWS Region for secure upload.
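
The calculation in step 1 can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal units (1 PB = 10^15 bytes, 1 Gbps = 10^9 bits per second); the `utilization` parameter is a hypothetical knob for modelling a link that cannot be fully dedicated to the transfer.

```python
PB = 10**15            # bytes, decimal petabyte
GBPS = 10**9           # bits per second
SECONDS_PER_DAY = 86_400


def transfer_days(volume_bytes: float, link_bps: float, utilization: float = 1.0) -> float:
    """Days needed to push volume_bytes over a link at the given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds = volume_bytes * 8 / (link_bps * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    days = transfer_days(150 * PB, 10 * GBPS)
    print(f"{days:.0f} days, about {days / 365:.1f} years at full line rate")
    # A link that is only 80% available stretches the timeline further.
    days_80 = transfer_days(150 * PB, 10 * GBPS, utilization=0.8)
    print(f"{days_80:.0f} days at 80% utilization")
```

At full line rate the result lands inside the 3 to 4 year range quoted above, which is why an offline transfer is the practical choice here.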

Scenario 2: Geographically Dispersed Uploads

Problem: A video production firm has editors in London, Mumbai, and Sydney. All files must be stored in a central S3 bucket in the us-east-1 region.

Solution:

  1. Tool Selection: Amazon S3 Transfer Acceleration.
  2. Optimization: Editors upload to the nearest CloudFront Edge location. The data travels over the optimized AWS internal network to the US East region, bypassing internet congestion points.
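
Once Transfer Acceleration is enabled on the bucket, clients opt in by sending requests to the bucket's accelerate endpoint rather than the regional one. The sketch below only builds that endpoint URL; the helper name and the bucket name `video-masters-example` are hypothetical, and the validation is a simplified reading of the bucket naming rules (3 to 63 characters, lowercase letters, digits, and hyphens, with no periods for accelerated buckets).

```python
import re

# Simplified check: DNS-compliant name without periods, 3-63 characters.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def accelerate_endpoint(bucket: str, dual_stack: bool = False) -> str:
    """Return the S3 Transfer Acceleration endpoint URL for a bucket."""
    if not _BUCKET_NAME.fullmatch(bucket):
        raise ValueError(
            f"{bucket!r} is not usable with Transfer Acceleration "
            "(must be DNS-compliant and contain no periods)"
        )
    host = "s3-accelerate.dualstack.amazonaws.com" if dual_stack else "s3-accelerate.amazonaws.com"
    return f"https://{bucket}.{host}"


if __name__ == "__main__":
    # Editors in London, Mumbai, and Sydney would all use the same URL;
    # routing to the nearest edge location happens on the AWS side.
    print(accelerate_endpoint("video-masters-example"))
```

Because the endpoint is the same for every editor, no per-office configuration is needed beyond switching the upload URL.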

Checkpoint Questions

  1. At what data volume does AWS suggest Snowball Edge becomes more cost-effective than online methods?
  2. Which service is best for continuous ETL of streaming IoT data into S3?
  3. True/False: AWS DataSync stores and encrypts data at rest within its own service.
  4. How does S3 Transfer Acceleration achieve higher speeds for distant users?

Muddy Points & Cross-Refs

  • Snowball vs. Snowmobile: Snowmobile is built for exabyte-scale moves, at up to 100 PB per container. If you have less than 10 PB, AWS recommends a cluster of Snowball Edge devices instead of one Snowmobile for better cost-efficiency.
  • DataSync vs. Transfer Family: DataSync is for high-speed data movement (syncing file systems). Transfer Family is a protocol interface (SFTP, FTPS, FTP, AS2) for users who need to interact with S3/EFS using legacy tools.
  • Security: All Snow devices use 256-bit encryption. Online tools like DataSync and S3 TA use TLS/SSL for in-transit protection.

Comparison Tables

| Feature | AWS DataSync | AWS Snowball Edge | S3 Transfer Acceleration |
| --- | --- | --- | --- |
| Transfer Type | Online (Network) | Offline (Physical) | Online (Network) |
| Primary Use Case | Recurring synchronization | One-time massive migration | Fast global uploads to S3 |
| Connectivity | Requires stable bandwidth | No internet required | Public Internet + AWS Backbone |
| Scale | GB to PB (Continuous) | PB (Discrete chunks) | GB to TB (Global) |
| Security | TLS in transit | KMS (256-bit) at rest | SSL/TLS in transit |
