
Identifying Opportunities for Purpose-Built Databases: A Modernization Guide


Modern application architecture favors "Polyglot Persistence"—choosing the right database for the specific access patterns of a workload rather than forcing all data into a single relational engine. This guide explores the transition from monolithic legacy databases to purpose-built AWS solutions.

Learning Objectives

  • Distinguish between refactoring (homogeneous/minor changes) and rearchitecting (heterogeneous/major changes) database migrations.
  • Identify the specific use cases for Amazon Aurora, DynamoDB, DocumentDB, and Redshift.
  • Evaluate the level of effort and code changes required for different migration paths.
  • Apply AWS migration tools like DMS, SCT, and Babelfish to the appropriate modernization scenarios.

Key Terms & Glossary

  • Purpose-Built Database: A database designed for a specific data model or workload (e.g., key-value, document, graph) rather than a general-purpose relational model.
  • Refactoring: A migration strategy where the application is moved to a cloud-native or open-source version of its current database (e.g., SQL Server to Aurora PostgreSQL).
  • Rearchitecting: A migration strategy where the data model is changed significantly (e.g., Relational to NoSQL) to gain scalability or performance.
  • Heterogeneous Migration: A migration between different database engines (e.g., Oracle to Amazon Aurora).
  • Schema Conversion Tool (SCT): An AWS tool that automates the conversion of database schemas and application code between different engines.
  • Database Migration Service (DMS): A managed service that replicates data to AWS with minimal downtime, keeping the source database operational during migration.

The "Big Idea"

The transition to purpose-built databases is the cornerstone of database modernization. Moving away from proprietary, expensive, and "one-size-fits-all" relational databases allows organizations to save on licensing costs and unlock massive scalability. The core philosophy is to match the data access pattern (how the app reads/writes) to the database engine, rather than the other way around.

Formula / Concept Box

| Migration Strategy | Database Target | Data Model Change | Code Effort | Primary Benefit |
|---|---|---|---|---|
| Refactor | Amazon Aurora | Minimal | Low | License Savings / Cloud Native |
| Rearchitect | Amazon DynamoDB | SQL to Key-Value | High | Extreme Scale / Serverless |
| Rearchitect | Amazon DocumentDB | SQL to JSON | High | Flexible Schema / Dev Velocity |
| Rearchitect | Amazon Redshift | Row to Columnar | Moderate | OLAP / Analytics Performance |
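The strategy table above can be encoded as a small lookup, which makes the mapping from access pattern to target database easy to check programmatically. This is an illustrative sketch: the pattern names and dictionary layout below are hypothetical, not AWS terminology or an AWS API.

```python
# Hypothetical encoding of the migration-strategy table above.
STRATEGY_TABLE = {
    "relational-oltp": {"strategy": "Refactor",    "target": "Amazon Aurora",     "code_effort": "Low"},
    "key-value":       {"strategy": "Rearchitect", "target": "Amazon DynamoDB",   "code_effort": "High"},
    "json-document":   {"strategy": "Rearchitect", "target": "Amazon DocumentDB", "code_effort": "High"},
    "analytics-olap":  {"strategy": "Rearchitect", "target": "Amazon Redshift",   "code_effort": "Moderate"},
}

def recommend(access_pattern: str) -> dict:
    """Return the migration row for a workload's dominant access pattern."""
    try:
        return STRATEGY_TABLE[access_pattern]
    except KeyError:
        raise ValueError(f"Unknown access pattern: {access_pattern!r}")

print(recommend("key-value")["target"])  # Amazon DynamoDB
```

This mirrors the guide's core philosophy: the input is the access pattern, and the engine falls out of it, not the other way around.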

Hierarchical Outline

  1. Introduction to Database Modernization
    • The shift from Monolithic SQL to Polyglot Persistence.
    • Drivers: Licensing costs, performance bottlenecks, and operational overhead.
  2. Opportunity 1: Refactoring to Open Source / Cloud Native
    • Targeting Amazon Aurora (MySQL/PostgreSQL compatible).
    • Using Babelfish for Aurora PostgreSQL to minimize SQL Server code changes.
    • Benefits of Aurora Serverless for unpredictable workloads.
  3. Opportunity 2: Rearchitecting for Scale (NoSQL)
    • Amazon DynamoDB: Mapping tables to key-value collections.
    • Amazon DocumentDB: Mapping tables to JSON documents.
    • The cost of modernization: Requires overhauling the data access layer.
  4. Opportunity 3: Modernizing Analytics
    • Moving from transactional SQL to Amazon Redshift.
    • Utilizing SCT Data Extractors for massive data movement.
  5. Migration Tooling
    • AWS DMS: For data replication.
    • AWS SCT: For schema and code conversion.

Visual Anchors

Migration Decision Flow


Polyglot Architecture


Definition-Example Pairs

  • Refactor to Cloud Native: Moving a proprietary database to a managed cloud-native version with higher performance.
    • Example: Migrating a commercial SQL Server instance to Amazon Aurora PostgreSQL to utilize serverless scaling and eliminate license fees.
  • Key-Value Rearchitecting: Converting normalized tables into a flat, key-value structure for sub-millisecond response times.
    • Example: Moving a high-traffic shopping cart system from a SQL table to Amazon DynamoDB to handle seasonal traffic spikes without manual scaling.
  • Columnar Transformation: Converting row-based data storage into column-based storage to optimize for large-scale analytical queries.
    • Example: Migrating historical sales records to Amazon Redshift to run complex year-over-year aggregate reports in seconds rather than hours.
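The key-value rearchitecting pair above can be made concrete by flattening one normalized shopping-cart row into a DynamoDB-style item. The `PK`/`SK` attribute names and the `USER#`/`ITEM#` prefixes below follow a common single-table convention but are shown here as an assumption, not a prescribed schema.

```python
def cart_row_to_item(row: dict) -> dict:
    """Map one relational cart_items row to a DynamoDB-style item dict.

    The partition key groups all cart lines for one user; the sort key
    makes each SKU a distinct item, so "fetch my cart" becomes a single
    Query on the partition key instead of a SQL join.
    """
    return {
        "PK": f"USER#{row['user_id']}",   # partition key (hypothetical convention)
        "SK": f"ITEM#{row['sku']}",       # sort key (hypothetical convention)
        "qty": row["qty"],
        "price": row["price"],
    }

row = {"user_id": 42, "sku": "B-100", "qty": 2, "price": 19.99}
item = cart_row_to_item(row)
# With boto3, this dict would be passed to table.put_item(Item=item).
```

Note that the transformation itself is trivial; the real modernization cost is redesigning keys around access patterns, which is why the table above rates this path as high effort.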

Worked Examples

Scenario 1: The Fast Follower

Problem: A company wants to move its .NET application from SQL Server to AWS to save costs but has a limited development budget and cannot rewrite the application.

Solution:

  1. Use AWS SCT to assess the schema.
  2. Migrate to Amazon Aurora PostgreSQL using Babelfish.
  3. Result: Babelfish allows the Aurora database to understand T-SQL (the SQL Server dialect), resulting in minimal code changes to the .NET application.

Scenario 2: The Scalability Wall

Problem: A gaming leaderboard application is failing because the relational database cannot handle the write-heavy load of millions of concurrent players.

Solution:


  1. Map the SQL leaderboard table to an Amazon DynamoDB table.
  2. Rearchitect the application's data access layer to use the AWS SDK (CRUD operations) instead of SQL queries.
  3. Result: The application now has virtually infinite horizontal write scalability and consistent low latency.
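Step 2 above (rewriting the data access layer) can be sketched as follows. An in-memory dict stands in for the DynamoDB table so the shape of the new CRUD calls is visible without AWS credentials; real code would use boto3 (e.g. `table.put_item(...)` and `table.query(...)`), and the class and method names here are hypothetical.

```python
class LeaderboardStore:
    """Stand-in for a DynamoDB table keyed on game_id (PK) and player_id (SK)."""

    def __init__(self):
        # (partition key, sort key) -> item, mimicking DynamoDB's key model.
        self._items = {}

    def put_score(self, game_id: str, player_id: str, score: int) -> None:
        # Replaces: UPDATE leaderboard SET score = %s WHERE game = %s AND player = %s
        self._items[(game_id, player_id)] = {"player_id": player_id, "score": score}

    def top_n(self, game_id: str, n: int) -> list:
        # Replaces: SELECT ... ORDER BY score DESC LIMIT n. In DynamoDB this
        # would typically be a Query against a secondary index sorted by score.
        rows = [v for (g, _), v in self._items.items() if g == game_id]
        return sorted(rows, key=lambda r: r["score"], reverse=True)[:n]

store = LeaderboardStore()
store.put_score("g1", "alice", 120)
store.put_score("g1", "bob", 200)
print([r["player_id"] for r in store.top_n("g1", 1)])  # ['bob']
```

The design choice to surface here is that ranking is no longer the database's job by default: in DynamoDB, "top N" must be modeled explicitly (for example, via a secondary index), which is exactly the kind of access-layer rethinking the scenario calls out.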

Checkpoint Questions

  1. What is the primary difference in "Effort Required" between migrating to Aurora vs. migrating to DynamoDB?
  2. Which AWS tool is specifically used to extract data from a legacy database for loading into Amazon Redshift?
  3. When should a customer choose DocumentDB over DynamoDB?
  4. How does Amazon Aurora Serverless differ from traditional Amazon RDS?

[!NOTE] Answers to Checkpoints:

  1. Aurora typically requires minimal code changes (refactoring), while DynamoDB requires a complete rewrite of the data access layer (rearchitecting).
  2. SCT Data Extractors.
  3. When the workload involves JSON document structures and requires compatibility with MongoDB APIs.
  4. Aurora Serverless scales capacity automatically based on demand, whereas traditional RDS requires manual instance sizing.

Muddy Points & Cross-Refs

  • Babelfish vs. SCT: Students often confuse these. SCT converts the actual source code files on your disk. Babelfish is a layer on the database itself that intercepts SQL Server commands at runtime.
  • DMS vs. SCT: SCT is for the "Brain" (schema/logic), DMS is for the "Body" (the actual data movement).
  • Deep Dive: For more on the specific NoSQL modeling required for DynamoDB, refer to the Advanced Data Modeling chapter.

Comparison Tables

AWS Purpose-Built Database Comparison

| Feature | Aurora | DynamoDB | DocumentDB | Redshift |
|---|---|---|---|---|
| Model | Relational (SQL) | Key-Value | Document (JSON) | Columnar (OLAP) |
| Scaling | Vertical / Horizontal Read | Seamless Horizontal | Horizontal Read | Cluster Resizing |
| Typical Use Case | ERP, CRM, Finance | Mobile, AdTech, IoT | Content Mgmt, Catalogs | BI, Data Warehousing |
| Migration Tool | DMS + SCT / Babelfish | DMS | DMS | DMS + SCT Extractors |
| Schema Flexibility | Rigid | Schema-less | Flexible | Rigid (Columnar) |
