Study Guide: CI/CD Pipelines and ML Orchestration (MLA-C01)

Use automated orchestration tools to set up continuous integration and continuous delivery (CI/CD) pipelines

This guide covers Task 3.3 of the AWS Certified Machine Learning Engineer – Associate (MLA-C01) exam. It focuses on automating the lifecycle of machine learning models through robust CI/CD practices and orchestration tools.


Learning Objectives

After studying this guide, you should be able to:

  • Differentiate between AWS CodePipeline, CodeBuild, and CodeDeploy capabilities.
  • Select the appropriate orchestration tool (SageMaker Pipelines, Step Functions, or MWAA) for a specific ML workflow.
  • Implement deployment strategies such as Blue/Green and Canary for ML endpoints.
  • Configure automated retraining triggers using Amazon EventBridge and SageMaker.
  • Apply Infrastructure as Code (IaC) principles using AWS CDK and CloudFormation to ML environments.

Key Terms & Glossary

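  • CI/CD: Continuous Integration (automatically building and testing every change), Continuous Delivery (automatically preparing each passing build for release, typically with a manual approval before production), and Continuous Deployment (automatically releasing every passing change to production).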
  • MLOps: The integration of DevOps principles into ML to ensure consistency, reproducibility, and reliability.
  • DAG (Directed Acyclic Graph): A collection of all tasks you want to run, organized in a way that reflects their relationships and dependencies (used heavily in Apache Airflow).
  • Blue/Green Deployment: A strategy where you have two identical production environments; only one (Blue) serves traffic while the other (Green) is updated and tested.
  • Infrastructure as Code (IaC): Managing and provisioning infrastructure through machine-readable definition files (e.g., YAML, JSON, or Python via CDK) rather than manual console configuration.
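
To make the DAG entry concrete, here is a minimal sketch using only Python's standard library (`graphlib`, Python 3.9+). The step names and the `ml_dag` dictionary are hypothetical; a real Airflow DAG is declared with Airflow's own classes, which are not shown here.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical ML workflow: each step maps to the set of steps it depends on.
ml_dag = {
    "ingest": set(),
    "preprocess": {"ingest"},
    "train": {"preprocess"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}


def execution_order(dag):
    """Return one valid run order; raises CycleError if the graph is not acyclic."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag):
    """True when the dependency graph contains no cycle, i.e. it is a valid DAG."""
    try:
        execution_order(dag)
        return True
    except CycleError:
        return False


order = execution_order(ml_dag)
```

Because every step has exactly one predecessor, only one order is valid here. An orchestrator uses the same dependency information to decide which tasks may run next or in parallel.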

The "Big Idea"

In traditional software, CI/CD focuses on code. In MLOps, CI/CD must account for a "triad" of changes: Code, Data, and Models. A change in any one of these three must trigger the pipeline to ensure the deployed model remains accurate and secure. Orchestration is the "glue" that ensures these complex, multi-step processes (data ingestion → preprocessing → training → evaluation → deployment) happen automatically and reliably.
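
One way to picture the triad is a check that compares fingerprints of the code, data, and model against the last pipeline run. The sketch below is illustrative only: the function names and the sample byte strings are assumptions, and in practice these signals usually come from Git commits, S3 events, and the Model Registry rather than from hashing.

```python
import hashlib

TRIAD = ("code", "data", "model")


def fingerprint(payload: bytes) -> str:
    """Content hash used to detect whether an artifact changed."""
    return hashlib.sha256(payload).hexdigest()


def changed_components(previous: dict, current: dict) -> list:
    """Return the triad members whose fingerprint differs from the previous run."""
    return [name for name in TRIAD if previous.get(name) != current.get(name)]


def should_trigger_pipeline(previous: dict, current: dict) -> bool:
    """A change in any one of code, data, or model is enough to rerun the pipeline."""
    return bool(changed_components(previous, current))


# Hypothetical state: only the data snapshot changed since the last run.
last_run = {
    "code": fingerprint(b"train.py v1"),
    "data": fingerprint(b"snapshot A"),
    "model": fingerprint(b"model v1"),
}
this_run = dict(last_run, data=fingerprint(b"snapshot B"))
```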


Formula / Concept Box

| Pipeline Phase | Primary AWS Tool | Core Responsibility |
| --- | --- | --- |
| Source | AWS CodeCommit / GitHub | Version control for code and configuration (IaC). |
| Build | AWS CodeBuild | Compiling code, running unit tests, and building Docker images. |
| Orchestrate | SageMaker Pipelines | Managing the ML-specific workflow steps (Training, Tuning). |
| Deploy | AWS CodeDeploy | Executing deployment strategies (Canary, Blue/Green) to endpoints. |
| Trigger | Amazon EventBridge | Scheduling or event-based execution (e.g., on S3 data upload). |

Hierarchical Outline

  1. Version Control & Repository Management
    • Gitflow/GitHub Flow: Branching strategies for managing feature development and production releases.
    • AWS CodeCommit: Managed source control service.
  2. AWS Developer Tools for CI/CD
    • CodePipeline: The visual workflow manager that connects source, build, and deploy stages.
    • CodeBuild: Serverless build service that scales to handle multiple builds concurrently.
    • CodeDeploy: Automates code deployments to EC2, Lambda, or ECS/Fargate.
  3. ML Workflow Orchestrators
    • SageMaker Pipelines: Purpose-built for ML; includes a Model Registry and lineage tracking.
    • AWS Step Functions: Serverless state machines for general-purpose orchestration; great for cross-service coordination.
    • Amazon MWAA: Managed Airflow for teams requiring Python-based DAG flexibility.
  4. Deployment & Rollback Strategies
    • Blue/Green: Full swap of traffic between environments.
    • Canary: Small percentage of traffic is shifted to the new model first to monitor for errors.
    • Linear: Traffic is shifted in equal increments over a set period.
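
The three traffic-shifting styles in item 4 differ only in how the share of traffic on the new version grows over time. The following sketch computes that schedule as a list of cumulative percentages; the function names are hypothetical and no AWS API is called.

```python
def all_at_once() -> list:
    """Blue/Green with a full swap: the new version takes 100% in one step."""
    return [100]


def canary(canary_percent: int) -> list:
    """Shift a small slice first, then the remainder once it looks healthy."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 1 and 99")
    return [canary_percent, 100]


def linear(step_percent: int) -> list:
    """Shift traffic in equal increments until the new version has 100%."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be between 1 and 100")
    schedule = list(range(step_percent, 100, step_percent))
    schedule.append(100)  # the final step always completes the shift
    return schedule
```

For example, `canary(10)` yields `[10, 100]` and `linear(25)` yields `[25, 50, 75, 100]`. Between steps, a real deployment waits and watches alarms before continuing.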

Visual Anchors

ML CI/CD Pipeline Flow

[Diagram: ML CI/CD pipeline flow]

Blue/Green Traffic Shifting

\begin{tikzpicture}
  % Blue environment (current version)
  \draw[blue, thick] (0,0) rectangle (2,1) node[midway] {Blue (V1)};
  % Green environment (new version)
  \draw[green!60!black, thick] (4,0) rectangle (6,1) node[midway] {Green (V2)};
  % Traffic splitter
  \draw (3,2.5) circle (0.3cm) node {\small Router};
  % Traffic arrows: the canary slice goes to the new (Green) environment
  \draw[->, thick] (3,2.2) -- (1,1) node[midway, left] {90\%};
  \draw[->, thick] (3,2.2) -- (5,1) node[midway, right] {10\%};
  % Annotation
  \node at (3,-1) {\small Canary Deployment: Shifting traffic from Blue to Green};
\end{tikzpicture}


Definition-Example Pairs

  • Continuous Deployment: Automatically deploying every change that passes the pipeline directly to production.
    • Example: A retail site automatically updating its "Recommended for You" model every time a new training dataset is uploaded to S3 and performance exceeds the threshold.
  • Model Registry: A central repository for managing model versions and their metadata.
    • Example: A data scientist marks Version 4 of a "Churn Prediction" model as "Approved" in the SageMaker Model Registry, which automatically triggers a deployment to the staging environment.
  • Automated Retraining: Using events to trigger a new training job.
    • Example: SageMaker Model Monitor publishes model-quality metrics to Amazon CloudWatch. When a CloudWatch alarm detects a large drop in accuracy, an EventBridge rule matching the alarm's state change triggers a SageMaker Pipeline to retrain the model on the most recent 30 days of data.
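
The Model Registry example can be expressed as an EventBridge event pattern. The sketch below builds such a pattern for approved model packages and checks it against sample events with a deliberately simplified local matcher. The group name `churn-prediction` is hypothetical, and `matches` handles only exact-value lists, not the full EventBridge pattern language.

```python
def approved_model_pattern(model_package_group: str) -> dict:
    """Pattern for a model package in the given group being marked Approved."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {
            "ModelPackageGroupName": [model_package_group],
            "ModelApprovalStatus": ["Approved"],
        },
    }


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must exist and hold a listed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical events, trimmed to the fields the pattern inspects.
approved_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {
        "ModelPackageGroupName": "churn-prediction",
        "ModelApprovalStatus": "Approved",
    },
}
pending_event = {
    **approved_event,
    "detail": {**approved_event["detail"], "ModelApprovalStatus": "PendingManualApproval"},
}
```

A rule with this pattern would typically target a deployment pipeline, so that approval in the registry is the only manual step.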

Worked Examples

Scenario: Triggering an ML Pipeline on Data Arrival

Goal: Set up an automated workflow that starts a SageMaker Pipeline whenever a new CSV file is uploaded to an S3 bucket.

  1. Configure S3 Event Notifications: Enable Amazon EventBridge notifications on the S3 bucket so that S3 sends an event to EventBridge whenever an object is created (the Object Created event type).
  2. Define EventBridge Rule: Create a rule that filters for the specific bucket and file suffix (e.g., .csv).
  3. Set Target: Set the target of the EventBridge rule to be the SageMaker Pipeline ARN.
  4. Verification: Upload a test file. Check the SageMaker console under "Pipelines" to confirm a new execution has started.
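
Steps 1 to 3 above can be sketched with boto3 as follows. This is an illustration under assumptions, not a definitive setup: the bucket, rule, and target names are hypothetical, the caller supplies the boto3 clients, and error handling is omitted.

```python
import json


def s3_csv_event_pattern(bucket_name: str) -> dict:
    """EventBridge pattern: 'Object Created' events for .csv keys in one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket_name]},
            "object": {"key": [{"suffix": ".csv"}]},
        },
    }


def wire_up_trigger(s3_client, events_client, bucket_name, rule_name,
                    pipeline_arn, role_arn):
    """Apply steps 1-3 using boto3 clients supplied by the caller."""
    # Step 1: turn on EventBridge delivery for the bucket.
    # Caution: this call replaces the bucket's existing notification configuration.
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration={"EventBridgeConfiguration": {}},
    )
    # Step 2: create (or update) the rule with the bucket and suffix filter.
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_csv_event_pattern(bucket_name)),
        State="ENABLED",
    )
    # Step 3: point the rule at the pipeline; role_arn is the role EventBridge assumes.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "start-ml-pipeline", "Arn": pipeline_arn, "RoleArn": role_arn}],
    )
```

In a real account the function would be called with `boto3.client("s3")` and `boto3.client("events")`. Passing the clients in keeps the function easy to exercise with stand-ins before touching any AWS resources.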

[!TIP] Follow the principle of least privilege for IAM roles. The role that EventBridge assumes to invoke the target needs the sagemaker:StartPipelineExecution permission, scoped to the specific pipeline's ARN.
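
A least-privilege role for this trigger needs two documents: a trust policy that lets EventBridge assume the role, and a permissions policy limited to one action on one pipeline. The sketch below builds both as plain dictionaries; the account ID and pipeline name in the example ARN are placeholders.

```python
import json


def eventbridge_trust_policy() -> dict:
    """Trust policy: only the EventBridge service may assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "events.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def start_pipeline_policy(pipeline_arn: str) -> dict:
    """Permissions policy: start executions of one specific pipeline, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sagemaker:StartPipelineExecution",
            "Resource": pipeline_arn,
        }],
    }


# Placeholder ARN for illustration only.
example_arn = "arn:aws:sagemaker:us-east-1:111122223333:pipeline/example-pipeline"
policy_json = json.dumps(start_pipeline_policy(example_arn), indent=2)
```

Scoping `Resource` to a single ARN, rather than `*`, is what keeps the role from starting any other pipeline in the account.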


Checkpoint Questions

  1. Which service is best for orchestrating a workflow that involves AWS Lambda, AWS Glue, and SageMaker in a serverless state machine?
  2. What is the difference between a Canary deployment and a Blue/Green deployment?
  3. In AWS CodePipeline, what is the role of the "Artifact Store" (usually an S3 bucket)?
  4. How does AWS CDK differ from AWS CloudFormation for defining infrastructure?

Answers

  1. AWS Step Functions.
  2. Blue/Green swaps all traffic (or most) at once after the green environment is ready. Canary shifts a tiny fraction (e.g., 5%) first to test "in the wild" before proceeding.
  3. It stores the output files (code, build results) from one stage so they can be used by the next stage.
  4. CloudFormation uses static YAML/JSON templates. CDK allows you to define infrastructure using familiar programming languages like Python or TypeScript, which then synthesizes into CloudFormation templates.

Muddy Points & Cross-Refs

  • Step Functions vs. SageMaker Pipelines: Use SageMaker Pipelines if your workflow is 100% focused on ML steps and you want built-in model lineage. Use Step Functions if you need to coordinate non-ML services (like calling an external API or running complex Lambda logic) as part of the flow.
  • CodeBuild vs. SageMaker Training Jobs: CodeBuild is for building software/images and running tests. SageMaker Training Jobs are for intensive mathematical model training on specialized GPU/CPU instances.
  • Rollbacks: Plan how a bad release will be reverted, for example by redeploying the previously approved model version. If you aren't fully confident in your automated integration tests, include a "Manual Approval" gate before production.
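
To illustrate the Step Functions side of the first point, the sketch below assembles an Amazon States Language definition that chains a Glue job, a SageMaker training job, and a Lambda call. All job names, ARNs, image URIs, and instance settings are hypothetical parameters, and the training input is kept to a minimal set of fields.

```python
import json


def build_state_machine(glue_job, image_uri, role_arn, output_s3, lambda_name) -> dict:
    """Return an ASL definition: Glue preprocessing -> SageMaker training -> Lambda."""
    return {
        "StartAt": "PreprocessWithGlue",
        "States": {
            "PreprocessWithGlue": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job},
                "ResultPath": "$.glue",  # keep the original input for later states
                "Next": "TrainModel",
            },
            "TrainModel": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    "TrainingJobName.$": "$.TrainingJobName",  # read from execution input
                    "RoleArn": role_arn,
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,
                        "TrainingInputMode": "File",
                    },
                    "OutputDataConfig": {"S3OutputPath": output_s3},
                    "ResourceConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.xlarge",
                        "VolumeSizeInGB": 30,
                    },
                    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
                },
                "ResultPath": "$.training",
                "Next": "NotifyWithLambda",
            },
            "NotifyWithLambda": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": lambda_name, "Payload.$": "$"},
                "End": True,
            },
        },
    }


definition = build_state_machine(
    "example-preprocess-job",
    "111122223333.dkr.ecr.us-east-1.amazonaws.com/example-train:latest",
    "arn:aws:iam::111122223333:role/example-sagemaker-role",
    "s3://example-ml-artifacts/output/",
    "example-notify-function",
)
definition_json = json.dumps(definition)
```

The `.sync` suffix asks Step Functions to wait for the Glue job and the training job to finish before moving on, which is what makes the sequence behave like a pipeline rather than a set of fire-and-forget calls.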

Comparison Tables

Orchestration Tools Comparison

| Feature | SageMaker Pipelines | AWS Step Functions | Amazon MWAA (Airflow) |
| --- | --- | --- | --- |
| Focus | Native ML Workflows | General App Integration | Complex Data Engineering |
| Language | Python (SageMaker SDK) | Amazon States Language (JSON) | Python (DAGs) |
| Visualizer | SageMaker Studio | Step Functions Console | Airflow UI |
| Best For | Standardized ML Lifecycle | Event-driven microservices | Multi-cloud or complex dependencies |

Deployment Strategies

| Strategy | Downtime | Risk | Implementation Complexity |
| --- | --- | --- | --- |
| All-at-once | High | High | Low |
| Blue/Green | Zero | Low | Medium |
| Canary | Zero | Lowest | High |
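
For SageMaker real-time endpoints, a canary rollout with automatic rollback can be requested through the `DeploymentConfig` argument of `update_endpoint`. The sketch below is one possible configuration under assumptions: the endpoint, config, and alarm names are hypothetical, the timing values are arbitrary, and the 50 percent cap on canary size reflects our understanding of the service limit and should be checked against the API reference.

```python
def canary_deployment_config(alarm_name: str, canary_percent: int = 10,
                             wait_seconds: int = 600) -> dict:
    """Build a blue/green update policy that shifts a canary slice first."""
    if not 1 <= canary_percent <= 50:
        raise ValueError("canary_percent is assumed to be limited to 1-50")
    return {
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "CANARY",
                "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": canary_percent},
                # Baking period before the remaining traffic is shifted.
                "WaitIntervalInSeconds": wait_seconds,
            },
            # Keep the old (blue) fleet briefly after the shift completes.
            "TerminationWaitInSeconds": 300,
            "MaximumExecutionTimeoutInSeconds": 3600,
        },
        # If this CloudWatch alarm fires during the rollout, traffic returns to blue.
        "AutoRollbackConfiguration": {"Alarms": [{"AlarmName": alarm_name}]},
    }


def deploy_with_canary(sagemaker_client, endpoint_name, new_endpoint_config, alarm_name):
    """Start the update; sagemaker_client is a boto3 SageMaker client from the caller."""
    return sagemaker_client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_endpoint_config,
        DeploymentConfig=canary_deployment_config(alarm_name),
    )
```

Swapping `"Type": "CANARY"` and `CanarySize` for `"LINEAR"` with a `LinearStepSize`, or for `"ALL_AT_ONCE"`, selects the other rows of the table while keeping the same rollback alarm.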
