AWS Database Replication: Mastery Guide for DMS and SCT

Configuring data and database replication

Learning Objectives

By the end of this guide, you should be able to:

  • Explain the primary functions of the AWS Schema Conversion Tool (SCT) and AWS Database Migration Service (DMS).
  • Configure a DMS Replication Instance with appropriate scaling and availability settings.
  • Distinguish between the three types of DMS Migration Tasks (Full Load, Full Load + CDC, and CDC-only).
  • Select the correct target table preparation mode (Do nothing, Drop, or Truncate) based on existing data requirements.
  • Design a high-availability migration strategy using Multi-AZ deployments.

Key Terms & Glossary

  • AWS DMS: A managed service used to migrate databases to AWS quickly and securely while the source database remains operational.
  • SCT (Schema Conversion Tool): A tool used for heterogeneous migrations to convert the source database schema to a format compatible with the target database.
  • Replication Instance: A managed EC2 instance that hosts the replication software and performs the actual data movement.
  • Endpoint: The connection information (host, port, credentials) for the source and target datastores.
  • CDC (Change Data Capture): A process that captures ongoing changes in the source database and applies them to the target in near real time.
  • Homogeneous Migration: Migrating between the same database engines (e.g., Oracle to Oracle).
  • Heterogeneous Migration: Migrating between different database engines (e.g., SQL Server to Amazon Aurora).

The "Big Idea"

Database replication in AWS is not just about moving data; it is about minimizing business disruption. In a modern architecture, downtime is costly. AWS DMS acts as a bridge, allowing you to synchronize data between disparate systems—on-premises to cloud, or cloud-to-cloud—while keeping your applications running. By combining SCT (for structure) and DMS (for data), AWS provides a path to modernize legacy systems into managed, scalable cloud databases with near-zero downtime.

Formula / Concept Box

Feature | Configuration Rule / Formula
Instance Sizing | Large DBs with high concurrency → Use High-Memory Instance Types.
Availability | Production/Critical Migrations → Enable Multi-AZ for the Replication Instance.
Connectivity | Replication Instance → Must have network access to both Source and Target Endpoints.
SCT Usage | Heterogeneous Migration (Different Engines) → Use SCT first; Homogeneous → Native tools or DMS.
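
The sizing and availability rules above can be sketched as a small helper that assembles the arguments for a replication instance. The parameter names follow the boto3 DMS `create_replication_instance` call; the helper name, the identifier, the storage size, and the two instance classes are illustrative assumptions, not sizing recommendations.

```python
def replication_instance_params(identifier, production, high_concurrency,
                                storage_gb=100):
    """Build keyword arguments for DMS create_replication_instance.

    Hypothetical helper: the class choices below are illustrative only.
    """
    # Memory-optimized (R-family) class for large, highly concurrent sources;
    # a smaller burstable class otherwise.
    instance_class = "dms.r5.xlarge" if high_concurrency else "dms.t3.medium"
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,
        "AllocatedStorage": storage_gb,
        # Multi-AZ keeps a standby instance in a second Availability Zone.
        "MultiAZ": production,
        # Assumption: source and target are reachable over private networking.
        "PubliclyAccessible": False,
    }


params = replication_instance_params("orders-migration", production=True,
                                     high_concurrency=True)
print(params)
# With credentials configured, these could be passed to boto3:
#   boto3.client("dms").create_replication_instance(**params)
```

Keeping the rules in one function makes it easy to see that a test instance and a production instance differ only in class and Multi-AZ.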

Hierarchical Outline

  1. Pre-Migration Phase (The Assess Phase)
    • SCT (Schema Conversion Tool): Generates a migration assessment report.
    • Manual vs. Auto: Apply schema changes automatically or manually via script.
  2. DMS Infrastructure Components
    • Replication Instance: The engine of the migration; uses memory for transaction buffering.
    • Endpoints: Store the connection details (host, port, credentials) for the source and target.
  3. The Migration Task
    • Migration Types: Full Load, Full Load + CDC, CDC only.
    • Target Preparation: Handling existing tables via Drop, Truncate, or Do Nothing.
    • Monitoring: Using CloudWatch logs for task health.
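
As a rough illustration of the endpoint component in the outline, the sketch below builds the arguments for one source and one target endpoint. The parameter names follow the boto3 DMS `create_endpoint` call; the helper name, identifiers, host names, user name, and the `DMS_SOURCE_PASSWORD` / `DMS_TARGET_PASSWORD` environment variables are all hypothetical placeholders.

```python
import os


def endpoint_params(identifier, role, engine, host, port, username, password,
                    database=None):
    """Build keyword arguments for DMS create_endpoint (hypothetical helper)."""
    if role not in ("source", "target"):
        raise ValueError("role must be 'source' or 'target'")
    params = {
        "EndpointIdentifier": identifier,
        "EndpointType": role,
        "EngineName": engine,
        "ServerName": host,
        "Port": port,
        "Username": username,
        "Password": password,
    }
    if database is not None:
        params["DatabaseName"] = database
    return params


# Placeholder connection details; passwords are read from the environment
# rather than written into the script.
source = endpoint_params("onprem-mysql", "source", "mysql", "10.0.0.12", 3306,
                         "dms_user", os.environ.get("DMS_SOURCE_PASSWORD", ""))
target = endpoint_params("aurora-mysql", "target", "aurora",
                         "aurora.example.internal", 3306,
                         "dms_user", os.environ.get("DMS_TARGET_PASSWORD", ""))
print(source["EndpointType"], target["EndpointType"])
```

Each dictionary could then be passed to `boto3.client("dms").create_endpoint(**params)` once the replication instance can reach both hosts.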

Visual Anchors

DMS Migration Architecture

[Figure: DMS migration architecture diagram]

High Availability Replication Instance

\begin{tikzpicture}[
  box/.style={rectangle, draw, rounded corners, minimum width=3cm, minimum height=1cm, align=center}
]
  % VPC boundary, split into two Availability Zones
  \draw[dashed, thick] (0,0) rectangle (10,5);
  \node at (0.6, 4.7) {VPC};
  \draw[dotted] (5,0) -- (5,5) node[pos=0.9, left] {AZ-1} node[pos=0.9, right] {AZ-2};
  % Primary and standby replication instances
  \node[box] (RI1) at (2.5, 2.5) {Primary\\Replication Instance};
  \node[box] (RI2) at (7.5, 2.5) {Standby\\Replication Instance};
  \draw[<->, thick] (RI1) -- (RI2) node[midway, above] {Synchronous Replication};
  % Source database sits outside the VPC and feeds the primary instance
  \node[box] (SRC) at (5, -1) {Source Database};
  \draw[->, thick] (SRC) -- (RI1);
\end{tikzpicture}

Definition-Example Pairs

  • CDC (Change Data Capture):
    • Definition: Capturing every insert, update, or delete on the source and applying it to the target.
    • Example: A retail bank migrating its SQL Server to Amazon Aurora; the bank stays open, and new transactions made during the 48-hour migration window are synced to the new DB continuously.
  • Truncate (Target Prep Mode):
    • Definition: Deleting the rows in the target table but keeping the table structure/metadata intact before starting the load.
    • Example: You are restarting a failed migration task; you want the target tables to be empty so you don't get duplicate primary key errors, but you already manually tuned the table indexes and don't want to re-create them.
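
In the DMS API, the target table preparation mode is set through the task settings JSON, under `FullLoadSettings.TargetTablePrepMode`. The sketch below maps the three modes from this guide onto the corresponding setting values; the helper name and the short keys (`do_nothing`, `drop`, `truncate`) are assumptions made for illustration.

```python
import json

# Console option -> TargetTablePrepMode value in the task settings JSON.
PREP_MODES = {
    "do_nothing": "DO_NOTHING",
    "drop": "DROP_AND_CREATE",
    "truncate": "TRUNCATE_BEFORE_LOAD",
}


def task_settings(prep_mode):
    """Return a ReplicationTaskSettings JSON string for the given prep mode."""
    settings = {
        "FullLoadSettings": {"TargetTablePrepMode": PREP_MODES[prep_mode]}
    }
    return json.dumps(settings)


# Restarting a failed task while keeping the hand-tuned indexes on the target.
settings_json = task_settings("truncate")
print(settings_json)
# {"FullLoadSettings": {"TargetTablePrepMode": "TRUNCATE_BEFORE_LOAD"}}
```

The resulting string is what would be supplied as `ReplicationTaskSettings` when creating or modifying a task.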

Worked Examples

Example 1: Minimal Downtime Migration Strategy

Scenario: You have a 5TB MySQL database on-premises and want to migrate to Amazon Aurora MySQL with less than 30 minutes of downtime.

Step-by-Step Solution:

  1. Assessment: Run SCT to ensure compatibility (though homogeneous, SCT provides a useful assessment).
  2. Infrastructure: Create a High-Memory Multi-AZ Replication Instance in AWS.
  3. Endpoints: Configure the source (on-prem IP) and target (Aurora endpoint).
  4. Task Creation: Select "Migrate existing data and replicate ongoing changes" (Full Load + CDC).
  5. Execution: The task performs a Full Load first. Once done, it enters the CDC phase, where it keeps Aurora in sync with on-prem.
  6. Cutover: When replication lag (the CDCLatencySource and CDCLatencyTarget metrics in CloudWatch) is near zero, stop application traffic, let the last few changes sync, and point the app to Aurora.
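
Steps 4 and 6 can be sketched in code. The first helper builds arguments for the boto3 DMS `create_replication_task` call with the `full-load-and-cdc` migration type; the second decides whether lag is low enough to cut over. The helper names, the `sales` schema, the placeholder ARNs, and the 5-second threshold are assumptions for illustration, and the latency samples would come from CloudWatch in a real run.

```python
import json


def full_load_and_cdc_task(identifier, source_arn, target_arn, instance_arn,
                           schema_name):
    """Build keyword arguments for DMS create_replication_task."""
    # Selection rule: include every table in the given schema.
    table_mappings = {"rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-schema",
        "object-locator": {"schema-name": schema_name, "table-name": "%"},
        "rule-action": "include",
    }]}
    return {
        "ReplicationTaskIdentifier": identifier,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


def ready_for_cutover(source_latency_s, target_latency_s, threshold_s=5):
    """True when every recent latency sample is at or below the threshold."""
    samples = list(source_latency_s) + list(target_latency_s)
    # No samples means no evidence of low lag, so do not cut over.
    return bool(samples) and max(samples) <= threshold_s


task = full_load_and_cdc_task("mysql-to-aurora", "arn:placeholder:source",
                              "arn:placeholder:target",
                              "arn:placeholder:instance", "sales")
print(task["MigrationType"])
print(ready_for_cutover([2, 1], [3, 0]))
```

Requiring every recent sample to be under the threshold, rather than only the latest one, avoids cutting over during a brief dip in lag.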

Checkpoint Questions

  1. Which AWS tool is used to generate a report showing potential incompatibilities between an Oracle source and a PostgreSQL target?
  2. What replication instance setting is required to ensure that a migration task can automatically recover if an Availability Zone fails?
  3. If you want to keep your source and target databases in sync indefinitely without moving existing historical data, which migration type should you use?
  4. What happens if you select the "Drop tables on target" option and the tables do not exist yet?

Muddy Points & Cross-Refs

  • Memory Pressure: A common "muddy point" is why DMS tasks fail. If the replication instance is too small, large transactions or tables with large objects (LOBs) can exhaust its memory. Always monitor FreeableMemory in CloudWatch.
  • SCT vs. DMS: Remember: SCT is for the "Container" (Schema/Tables/Procedures), DMS is for the "Content" (the actual data rows).
  • Network Performance: If migrations are slow, check the bandwidth between the source and the Replication Instance. For large migrations, consider using AWS Direct Connect or AWS Snowball Edge (which now integrates with DMS).
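
The memory-pressure check can be sketched as a filter over FreeableMemory datapoints. The datapoints below are invented sample values shaped like the entries returned by CloudWatch `get_metric_statistics` (a `Timestamp` and an `Average` in bytes); the helper name, the 16 GiB instance size, and the 20% floor are assumptions, not AWS guidance.

```python
GIB = 1024 ** 3


def low_memory_datapoints(datapoints, instance_memory_bytes,
                          min_free_fraction=0.2):
    """Return the datapoints whose average FreeableMemory is below the floor."""
    floor = instance_memory_bytes * min_free_fraction
    return [dp for dp in datapoints if dp["Average"] < floor]


# Invented sample data for illustration only.
samples = [
    {"Timestamp": "2024-01-01T10:00:00Z", "Average": 6 * GIB},
    {"Timestamp": "2024-01-01T10:05:00Z", "Average": 2 * GIB},
    {"Timestamp": "2024-01-01T10:10:00Z", "Average": 1 * GIB},
]
flagged = low_memory_datapoints(samples, instance_memory_bytes=16 * GIB)
for dp in flagged:
    print(dp["Timestamp"], dp["Average"] // GIB, "GiB free")
```

A sustained run of flagged datapoints suggests moving to a larger, memory-optimized instance class before the task fails.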

Comparison Tables

Target Table Preparation Modes

Mode | Action on Target | Best Use Case
Do Nothing | Leaves existing tables alone. | Target tables are already manually created and pre-filled with some data.
Drop | Drops and re-creates tables. | Initial testing phases where you want a completely fresh start.
Truncate | Deletes rows, keeps table structure. | Retesting a migration where indexes/metadata are already optimized on target.

Migration Task Types

Task Type | Description | Downtime Window
Full Load | Copies all data once. | High (writes to the source must stop during the load, or changes made meanwhile are missed).
Full Load + CDC | Copies data, then captures ongoing changes. | Minimal (only during the final cutover).
CDC Only | Copies only the changes made since a specific point in time. | Low (used for syncing or DR scenarios).
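
The choice among the three task types reduces to two questions: does existing data need to be copied, and do ongoing changes need to be replicated? A minimal sketch of that decision follows; the function name is hypothetical, while the returned strings are the `MigrationType` values accepted by the DMS API.

```python
def choose_migration_type(copy_existing_data, replicate_ongoing_changes):
    """Map the two requirements onto a DMS MigrationType value."""
    if copy_existing_data and replicate_ongoing_changes:
        return "full-load-and-cdc"
    if copy_existing_data:
        return "full-load"
    if replicate_ongoing_changes:
        return "cdc"
    raise ValueError("nothing to migrate: both requirements are False")


# Keep two databases in sync without moving historical data.
print(choose_migration_type(copy_existing_data=False,
                            replicate_ongoing_changes=True))
# cdc
```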
