AWS Developer Tools: Mastering CodeBuild, CodeDeploy, and CodePipeline for ML
Configuring and troubleshooting CodeBuild, CodeDeploy, and CodePipeline, including stages
This guide covers the core AWS CI/CD services—CodeBuild, CodeDeploy, and CodePipeline—specifically tailored for the AWS Certified Machine Learning Engineer Associate (MLA-C01) exam. We will focus on configuration, orchestration, and troubleshooting across the pipeline stages.
Learning Objectives
After studying this guide, you should be able to:
- Configure AWS CodePipeline to automate ML model training and deployment workflows.
- Troubleshoot common failures in CodeBuild environments and CodeDeploy deployment groups.
- Differentiate between deployment strategies such as Blue/Green, Canary, and Linear.
- Apply build and deployment specifications (`buildspec.yml` and `appspec.yml`) correctly.
Key Terms & Glossary
- Artifact: A file or collection of files (like a Docker image or a serialized ML model) produced by one pipeline stage to be used by another.
- Buildspec: A YAML file used by CodeBuild to define commands and settings for the build phase.
- Appspec: A YAML or JSON file used by CodeDeploy to define what is deployed and which lifecycle hooks run during a deployment.
- CI/CD: Continuous Integration (automating code merges and builds) and Continuous Delivery/Deployment (automating the release process).
- Stage: A logical grouping of actions in CodePipeline (e.g., Source, Build, Production-Deploy).
The "Big Idea"
In Machine Learning, CI/CD is more than just shipping code; it is about MLOps. The "Big Idea" is to treat your model training and infrastructure as code. By orchestrating CodeBuild (to train/package) and CodeDeploy (to update endpoints) within CodePipeline, you ensure that every model update is tested, validated, and deployed with minimal manual intervention and maximum reliability.
Formula / Concept Box
| Service | Core Responsibility | Primary Config File | Key Outcome |
|---|---|---|---|
| AWS CodePipeline | Orchestration & Workflow | Pipeline JSON / CDK | Connects stages together |
| AWS CodeBuild | Compile, Test, Package | buildspec.yml | Produces an Artifact (e.g., .tar.gz) |
| AWS CodeDeploy | Deployment & Rollbacks | appspec.yml | Updates the live environment |
Hierarchical Outline
- I. AWS CodePipeline (The Orchestrator)
- Source Stage: Connects to CodeCommit, GitHub, or S3.
- Build Stage: Invokes CodeBuild to run tests or containerize models.
- Deploy Stage: Invokes CodeDeploy or SageMaker to release the model.
- Transitions: Can be disabled to stop the flow between stages for troubleshooting.
- II. AWS CodeBuild (The Worker)
- Phases: `install`, `pre_build`, `build`, `post_build`.
- Environment: Managed Docker containers; requires an IAM Service Role to access S3/ECR.
- Logs: Always check CloudWatch Logs for build failures.
- III. AWS CodeDeploy (The Releaser)
- Deployment Groups: Defines where and how to deploy (EC2, Lambda, ECS).
- Strategies: Blue/Green (entirely new fleet) vs. In-place.
- Traffic Shifting: Canary (small % first) vs. Linear (equal increments).
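The CodeBuild phases in the outline can be made concrete with a small generator. The following is a minimal sketch, not an official tool: it renders a buildspec as YAML text with the phases in the order CodeBuild runs them. The function name `render_buildspec`, the environment variables `$ECR_REGISTRY` and `$IMAGE_URI`, and the artifact name `model.tar.gz` are assumptions made for illustration.

```python
import json

# Phase names from the outline, in the order CodeBuild runs them.
PHASE_ORDER = ["install", "pre_build", "build", "post_build"]


def render_buildspec(phases, artifact_files):
    """Render a minimal buildspec (version 0.2) as YAML text.

    Commands are written as double-quoted strings (via json.dumps) so that
    characters such as ':' or '|' are not misread as YAML syntax.
    """
    unknown = set(phases) - set(PHASE_ORDER)
    if unknown:
        raise ValueError(f"unknown phase(s): {sorted(unknown)}")
    lines = ["version: 0.2", "", "phases:"]
    for name in PHASE_ORDER:
        if name not in phases:
            continue
        lines.append(f"  {name}:")
        lines.append("    commands:")
        for command in phases[name]:
            lines.append(f"      - {json.dumps(command)}")
    lines += ["", "artifacts:", "  files:"]
    lines += [f"    - {json.dumps(path)}" for path in artifact_files]
    return "\n".join(lines) + "\n"


# Hypothetical ML packaging build: log in to ECR, build the image, push it.
EXAMPLE_PHASES = {
    "pre_build": [
        "aws ecr get-login-password --region $AWS_DEFAULT_REGION"
        " | docker login --username AWS --password-stdin $ECR_REGISTRY",
    ],
    "build": ["docker build -t $IMAGE_URI ."],
    "post_build": ["docker push $IMAGE_URI"],
}

if __name__ == "__main__":
    print(render_buildspec(EXAMPLE_PHASES, ["model.tar.gz"]))
```

Running the script prints text that could be saved as `buildspec.yml` at the root of the source repository. Rejecting unknown phase names up front catches a misspelled phase before the file reaches CodeBuild.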
Visual Anchors
Pipeline Flowchart
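The Source, Build, Deploy flow can also be read off the pipeline definition itself. The sketch below mirrors the shape of a CodePipeline pipeline structure as a Python dictionary and checks that every input artifact was produced by an earlier stage. All resource names (repository, project, application, bucket, role ARN) are hypothetical, and the check is simplified: it ignores artifacts passed between actions inside the same stage.

```python
import json

# Hypothetical names throughout; only the overall structure is the point.
PIPELINE = {
    "name": "ml-model-pipeline",
    "roleArn": "arn:aws:iam::123456789012:role/ExamplePipelineRole",
    "artifactStore": {"type": "S3", "location": "example-artifact-bucket"},
    "stages": [
        {
            "name": "Source",
            "actions": [{
                "name": "FetchCode",
                "actionTypeId": {"category": "Source", "owner": "AWS",
                                 "provider": "CodeCommit", "version": "1"},
                "configuration": {"RepositoryName": "example-ml-repo",
                                  "BranchName": "main"},
                "outputArtifacts": [{"name": "SourceOutput"}],
            }],
        },
        {
            "name": "Build",
            "actions": [{
                "name": "TrainAndPackage",
                "actionTypeId": {"category": "Build", "owner": "AWS",
                                 "provider": "CodeBuild", "version": "1"},
                "configuration": {"ProjectName": "example-train-project"},
                "inputArtifacts": [{"name": "SourceOutput"}],
                "outputArtifacts": [{"name": "BuildOutput"}],
            }],
        },
        {
            "name": "Deploy",
            "actions": [{
                "name": "ReleaseModel",
                "actionTypeId": {"category": "Deploy", "owner": "AWS",
                                 "provider": "CodeDeploy", "version": "1"},
                "configuration": {"ApplicationName": "example-app",
                                  "DeploymentGroupName": "example-group"},
                "inputArtifacts": [{"name": "BuildOutput"}],
            }],
        },
    ],
}


def missing_inputs(pipeline):
    """Return (stage, action, artifact) triples whose input artifact
    was not produced by any earlier stage."""
    produced = set()
    problems = []
    for stage in pipeline["stages"]:
        stage_outputs = set()
        for action in stage["actions"]:
            for art in action.get("inputArtifacts", []):
                if art["name"] not in produced:
                    problems.append((stage["name"], action["name"], art["name"]))
            for art in action.get("outputArtifacts", []):
                stage_outputs.add(art["name"])
        produced |= stage_outputs
    return problems


if __name__ == "__main__":
    print(" -> ".join(stage["name"] for stage in PIPELINE["stages"]))
    print("missing inputs:", missing_inputs(PIPELINE))
```

A mismatched artifact name between stages is a common reason a pipeline definition is rejected or a downstream action finds nothing to consume, so a check like this is cheap insurance before updating the pipeline.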
Deployment Traffic Shifting (Canary vs. Linear)
\begin{tikzpicture}[scale=0.8]
  % Linear Shifting
  \draw[->, thick] (0,0) -- (5,0) node[right] {Time};
  \draw[->, thick] (0,0) -- (0,3) node[above] {Traffic \%};
  \draw[blue, thick] (0,0) -- (1,0.5) -- (2,1) -- (3,1.5) -- (4,2) -- (5,2.5);
  \node at (2.5,-0.5) {Linear (Steady Increments)};

  % Canary Shifting
  \draw[->, thick] (7,0) -- (12,0) node[right] {Time};
  \draw[->, thick] (7,0) -- (7,3) node[above] {Traffic \%};
  \draw[red, thick] (7,0) -- (8,0.5) -- (10,0.5) -- (11,2.5);
  \node at (9.5,-0.5) {Canary (Small test, then jump)};
\end{tikzpicture}
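The two curves can be expressed as simple schedules of (minute, percent of traffic on the new version). This is an illustrative sketch only; the function names are made up here, and the parameters loosely correspond to what predefined CodeDeploy configurations such as `CodeDeployDefault.LambdaCanary10Percent5Minutes` and `CodeDeployDefault.LambdaLinear10PercentEvery1Minute` encode in their names.

```python
def canary_schedule(first_percent, wait_minutes):
    """Canary: shift first_percent at once, wait, then shift the remainder."""
    if not 0 < first_percent < 100:
        raise ValueError("first_percent must be between 0 and 100")
    return [(0, first_percent), (wait_minutes, 100)]


def linear_schedule(step_percent, interval_minutes):
    """Linear: shift step_percent every interval until 100% is reached."""
    if step_percent <= 0:
        raise ValueError("step_percent must be positive")
    schedule = []
    percent, minute = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


if __name__ == "__main__":
    print(canary_schedule(10, 5))   # two steps: small slice, then everything
    print(linear_schedule(10, 1))   # ten equal steps, one per minute
```

Canary always has exactly two steps regardless of the wait time, while the number of linear steps depends on the increment size.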
Definition-Example Pairs
- Build Phase Hooks: Specific points in the CodeBuild lifecycle where commands run.
  - Example: Using the `pre_build` phase to log into Amazon ECR before pushing a training image in the `build` phase.
- Deployment Lifecycle Hooks: Scripts that run at specific stages of a deployment to verify health.
  - Example: A `BeforeAllowTraffic` hook in Lambda that runs a test function to ensure the new model version responds correctly before shifting users to it.
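A `BeforeAllowTraffic` hook for a Lambda deployment is itself a Lambda function that runs a check and then reports the result back to CodeDeploy. The following is a hedged sketch of that pattern: `model_responds_correctly` is a hypothetical placeholder for a real smoke test, and the `codedeploy` and `check` parameters exist only so the handler can be exercised without AWS access.

```python
def model_responds_correctly():
    """Hypothetical smoke test; replace with a real call to the new version."""
    return True


def handler(event, context, codedeploy=None, check=model_responds_correctly):
    """BeforeAllowTraffic hook: validate, then report the result to CodeDeploy."""
    if codedeploy is None:
        import boto3  # imported lazily so the sketch can run without AWS

        codedeploy = boto3.client("codedeploy")
    try:
        status = "Succeeded" if check() else "Failed"
    except Exception:
        # Treat any error in the check as a failed validation.
        status = "Failed"
    # CodeDeploy passes both identifiers in the hook invocation event.
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status=status,
    )
    return status
```

Reporting `Failed` stops the deployment before any traffic reaches the new version, which is the point of running the check in this hook rather than after the shift.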
Worked Examples
Scenario: Troubleshooting a "Build Failed" Error
Problem: A CodePipeline fails at the CodeBuild stage. The error message in the console says "Execution failed in pre_build phase."
Step-by-Step Breakdown:
- Locate Logs: Navigate to the CodeBuild console, select the failed build, and click "View Logs" to open CloudWatch.
- Identify Syntax Error: The log shows `buildspec.yml: line 12: unexpected EOF`.
- Fix Config: Correct the indentation in the `buildspec.yml` file in the source repository.
- Check Permissions: If the log says `AccessDenied`, ensure the CodeBuild Service Role has `s3:GetObject` permissions for the input artifact bucket.
- Re-run: Commit the fix to trigger the pipeline again.
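For the permissions step, it can help to see what the missing policy might look like. This sketch builds an IAM policy document for the CodeBuild Service Role as a Python dictionary. The bucket name and key ARN are hypothetical, and the exact action list is an assumption that should be trimmed to what a given project needs; the KMS statement applies only when the artifact bucket is encrypted with a customer managed key.

```python
import json


def codebuild_artifact_policy(bucket, kms_key_arn=None):
    """Build an IAM policy document granting artifact access to a build role."""
    statements = [{
        "Sid": "ReadWriteArtifacts",
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }]
    if kms_key_arn:
        # Needed only when the bucket uses a customer managed KMS key.
        statements.append({
            "Sid": "UseArtifactKey",
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
            "Resource": kms_key_arn,
        })
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    policy = codebuild_artifact_policy(
        "example-artifact-bucket",
        "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to the role as an inline policy. The KMS key policy must also allow the role to use the key; the identity policy alone is not sufficient in that case.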
Checkpoint Questions
- Which file is required by CodeDeploy to manage the deployment of an application to AWS Lambda?
- In CodePipeline, what is the purpose of disabling a "Transition" between two stages?
- If a CodeBuild project needs to access an encrypted S3 bucket, what two things must be configured?
- What is the main difference between a "Canary" deployment and a "Linear" deployment?
Answers
- `appspec.yml`
- To prevent the pipeline from automatically progressing to the next stage, allowing for manual inspection or emergency stopping of deployments.
- The CodeBuild IAM Service Role must have S3 permissions, and the KMS Key Policy must allow the role to decrypt the data.
- Canary shifts a small percentage and waits a set time before shifting the rest; Linear shifts traffic in equal increments over several intervals (e.g., 10% every 10 minutes).
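To make the first answer concrete, here is a rough sketch of the content an AppSpec file for a Lambda deployment carries, built as a Python dictionary and printed as JSON (CodeDeploy accepts AppSpec content in YAML or JSON). The function, alias, version, and hook names are hypothetical placeholders.

```python
import json


def lambda_appspec(function_name, alias, current_version, target_version,
                   before_hook=None, after_hook=None):
    """Build AppSpec content for shifting a Lambda alias between versions."""
    spec = {
        "version": 0.0,
        "Resources": [{
            "ModelFunction": {  # logical name chosen for this sketch
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Name": function_name,
                    "Alias": alias,
                    "CurrentVersion": str(current_version),
                    "TargetVersion": str(target_version),
                },
            },
        }],
    }
    hooks = []
    if before_hook:
        hooks.append({"BeforeAllowTraffic": before_hook})
    if after_hook:
        hooks.append({"AfterAllowTraffic": after_hook})
    if hooks:
        spec["Hooks"] = hooks
    return spec


if __name__ == "__main__":
    spec = lambda_appspec("example-inference-fn", "live", 1, 2,
                          before_hook="example-pre-traffic-check")
    print(json.dumps(spec, indent=2))
```

The alias is what callers invoke; the deployment shifts it from `CurrentVersion` to `TargetVersion` according to the chosen Canary or Linear configuration.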
Muddy Points & Cross-Refs
- Buildspec vs. Appspec: It is easy to flip these. Remember: Buildspec is for building (CodeBuild), Appspec is for applying/deploying (CodeDeploy).
- Service Roles: Each service (Build, Deploy, Pipeline) has its own IAM role. If a pipeline can't trigger a build, check the Pipeline role. If a build can't upload to S3, check the Build role.
- SageMaker Integration: For the exam, know that CodePipeline can trigger SageMaker Pipelines directly using a dedicated action type.
Comparison Tables
| Feature | CodeBuild | CodeDeploy |
|---|---|---|
| Primary Goal | Compilation / Testing | Distribution / Updating |
| Artifacts | Outputs artifacts to S3 | Consumes artifacts from S3 |
| Compute | Ephemeral Docker containers | EC2, Lambda, ECS, On-prem |
| Strategy Focus | Build speed / environment | Availability / Rollback safety |
| File Name | buildspec.yml | appspec.yml |