Mastering Database Migration: Strategies for Homogeneous and Heterogeneous Environments

Database migration is the process of moving data and schemas from a source location to a target environment, often involving changes in the underlying database engine. This guide explores the tools and methodologies for performing these transitions with minimal downtime and high reliability.

Learning Objectives

Identify the core components and architecture of AWS Database Migration Service (DMS).
Differentiate between homogeneous and heterogeneous migrations.
Select appropriate tools (SCT vs. DMS) based on the migration scenario.
Determine when network limitations necessitate physical transfer methods like Snowball Edge.
Assess connectivity requirements including VPN and Direct Connect for migration traffic.

Key Terms & Glossary

AWS DMS (Database Migration Service): A managed service used to migrate databases to AWS quickly and securely while the source database remains operational.
Replication Instance: An EC2-based engine that performs the heavy lifting of a DMS task, connecting to the source and target.
Endpoints: The connection information (host, port, credentials) for the source and target databases in a DMS task.
Homogeneous Migration: Moving data between the same database engines (e.g., MySQL to MySQL).
Heterogeneous Migration: Moving data between different database engines (e.g., Oracle to Amazon Aurora).
SCT (Schema Conversion Tool): A standalone tool used to convert source database schemas (tables, views, stored procedures) to a format compatible with a different target engine.

The "Big Idea"

Think of database migration as moving a beating heart. You cannot simply stop the heart (the database) for a long time while the rest of the body (the application) is alive. AWS DMS acts as a "bypass machine" that synchronizes data in real-time, allowing you to switch the "blood flow" to the new database with only a few seconds or minutes of downtime.

Formula / Concept Box

Migration Feature	Homogeneous	Heterogeneous
Engine Change	No (Same Engine)	Yes (Different Engine)
Primary Tool	AWS DMS	AWS DMS + SCT
Schema Change	Minimal/None	Significant Transformation
Complexity	Lower	Higher
Example	PostgreSQL to Amazon RDS PostgreSQL	Microsoft SQL Server to Amazon Aurora

Hierarchical Outline

Phase 1: Discovery and Planning
- AWS Application Discovery Service: Mapping dependencies and server utilization.
- AWS Migration Hub: Centralized tracking of the migration progress.
Phase 2: Schema Conversion (If Heterogeneous)
- SCT Application: Used to convert code, stored procedures, and complex schemas.
- Manual Refactoring: Addressing incompatible database features.
Phase 3: Data Transfer (DMS)
- Full Load: Moving the initial bulk of data.
- Change Data Capture (CDC): Replicating ongoing changes from the source to keep the target current.
Phase 4: Network and Logistics
- Connectivity: Choosing between Site-to-Site VPN (1.25 Gbps) or Direct Connect (1-100 Gbps).
- Large Scale Moves: Utilizing Snowball Edge for multi-terabyte datasets where network bandwidth is insufficient.

Visual Anchors

DMS Migration Workflow

Loading Diagram...

Migration Connectivity Decision

Compiling TikZ diagram…

⏳

Running TeX engine…

This may take a few seconds

Definition-Example Pairs

Change Data Capture (CDC): The process of tracking and replicating changes (inserts, updates, deletes) in real-time.
- Example: A customer updates their profile on the legacy database; CDC ensures that change is reflected in the new AWS database immediately before the final cutover.
Source/Target Endpoints: The specific connection parameters used by DMS to communicate with the databases.
- Example: Setting a source endpoint to an on-premises Oracle SID and a target endpoint to an Amazon Aurora cluster endpoint.
Lift and Shift (Rehosting): Moving an application/database with zero or minimal changes.
- Example: Moving a SQL Server database running on a local virtual machine directly to SQL Server on Amazon EC2.

Worked Examples

Scenario: Migrating Oracle on-prem to Amazon Aurora PostgreSQL

Analyze Compatibility: Run the AWS SCT against the Oracle source to see which stored procedures and functions can be automatically converted.
Schema Conversion: Use SCT to create the equivalent PostgreSQL schema in the Amazon Aurora instance.
DMS Configuration:
- Provision a Replication Instance in the VPC where Aurora resides.
- Configure the Oracle DB as the Source Endpoint.
- Configure Aurora as the Target Endpoint.
Task Execution: Start a DMS task for "Full load plus CDC". This migrates the data and stays in sync.
Cutover: Once the "Latency" metric in DMS hits zero, stop the application, perform a final check, and point the application to the Aurora endpoint.

Checkpoint Questions

Which tool is strictly required when migrating from a relational Oracle database to a non-relational DynamoDB table?
- Answer: AWS SCT (for schema/structure mapping) and AWS DMS (for data movement).
If you need to query data stored in S3 files using SQL without actually importing that data into a Redshift cluster, what service should you use?
- Answer: Redshift Spectrum.
True or False: The AWS Replication Agent must be installed on the source database server for AWS DMS to work.
- Answer: False. DMS only requires network access and endpoints; Replication Agents are for Application Migration Service (server-level migrations).
What is the recommended solution if you need to migrate a 50TB database but only have a 100Mbps upload speed?
- Answer: AWS Snowball Edge (Physical transfer device).

[!IMPORTANT] Ensure the Replication Instance has enough CPU and Memory to handle the data volume, especially if performing transformations during a heterogeneous migration.

[!TIP] Always verify compliance and encryption requirements (at-rest and in-transit) via AWS Artifact before starting a database migration involving sensitive PII data.

Learning Objectives

Identify the core components and architecture of AWS Database Migration Service (DMS).

Differentiate between homogeneous and heterogeneous migrations.

Select appropriate tools (SCT vs. DMS) based on the migration scenario.

Determine when network limitations necessitate physical transfer methods like Snowball Edge.

Assess connectivity requirements including VPN and Direct Connect for migration traffic.

Key Terms & Glossary

AWS DMS (Database Migration Service): A managed service used to migrate databases to AWS quickly and securely while the source database remains operational.

Replication Instance: An EC2-based engine that performs the heavy lifting of a DMS task, connecting to the source and target.

Endpoints: The connection information (host, port, credentials) for the source and target databases in a DMS task.

Homogeneous Migration: Moving data between the same database engines (e.g., MySQL to MySQL).

Heterogeneous Migration: Moving data between different database engines (e.g., Oracle to Amazon Aurora).

SCT (Schema Conversion Tool): A standalone tool used to convert source database schemas (tables, views, stored procedures) to a format compatible with a different target engine.

The "Big Idea"

Migration Feature

Homogeneous

Heterogeneous

Engine Change

No (Same Engine)

Yes (Different Engine)

Primary Tool

AWS DMS

AWS DMS + SCT

Schema Change

Minimal/None

Significant Transformation

Complexity

Lower

Higher

Example

PostgreSQL to Amazon RDS PostgreSQL

Microsoft SQL Server to Amazon Aurora

Hierarchical Outline

Phase 1: Discovery and Planning

AWS Application Discovery Service: Mapping dependencies and server utilization.
AWS Migration Hub: Centralized tracking of the migration progress.

Phase 2: Schema Conversion (If Heterogeneous)

SCT Application: Used to convert code, stored procedures, and complex schemas.
Manual Refactoring: Addressing incompatible database features.

Phase 3: Data Transfer (DMS)

Full Load: Moving the initial bulk of data.
Change Data Capture (CDC): Replicating ongoing changes from the source to keep the target current.

Phase 4: Network and Logistics

Connectivity: Choosing between Site-to-Site VPN (1.25 Gbps) or Direct Connect (1-100 Gbps).
Large Scale Moves: Utilizing Snowball Edge for multi-terabyte datasets where network bandwidth is insufficient.

Definition-Example Pairs

Change Data Capture (CDC): The process of tracking and replicating changes (inserts, updates, deletes) in real-time.

Example: A customer updates their profile on the legacy database; CDC ensures that change is reflected in the new AWS database immediately before the final cutover.

Source/Target Endpoints: The specific connection parameters used by DMS to communicate with the databases.

Example: Setting a source endpoint to an on-premises Oracle SID and a target endpoint to an Amazon Aurora cluster endpoint.

Lift and Shift (Rehosting): Moving an application/database with zero or minimal changes.

Example: Moving a SQL Server database running on a local virtual machine directly to SQL Server on Amazon EC2.

Worked Examples

Scenario: Migrating Oracle on-prem to Amazon Aurora PostgreSQL

Analyze Compatibility: Run the AWS SCT against the Oracle source to see which stored procedures and functions can be automatically converted.

Schema Conversion: Use SCT to create the equivalent PostgreSQL schema in the Amazon Aurora instance.

DMS Configuration:

Provision a Replication Instance in the VPC where Aurora resides.
Configure the Oracle DB as the Source Endpoint.
Configure Aurora as the Target Endpoint.

Task Execution: Start a DMS task for "Full load plus CDC". This migrates the data and stays in sync.

Cutover: Once the "Latency" metric in DMS hits zero, stop the application, perform a final check, and point the application to the Aurora endpoint.

Checkpoint Questions

Which tool is strictly required when migrating from a relational Oracle database to a non-relational DynamoDB table?

Answer: AWS SCT (for schema/structure mapping) and AWS DMS (for data movement).

If you need to query data stored in S3 files using SQL without actually importing that data into a Redshift cluster, what service should you use?

Answer: Redshift Spectrum.

True or False: The AWS Replication Agent must be installed on the source database server for AWS DMS to work.

Answer: False. DMS only requires network access and endpoints; Replication Agents are for Application Migration Service (server-level migrations).

What is the recommended solution if you need to migrate a 50TB database but only have a 100Mbps upload speed?

Answer: AWS Snowball Edge (Physical transfer device).

[!IMPORTANT] Ensure the Replication Instance has enough CPU and Memory to handle the data volume, especially if performing transformations during a heterogeneous migration.

[!TIP] Always verify compliance and encryption requirements (at-rest and in-transit) via AWS Artifact before starting a database migration involving sensitive PII data.