# Hands-On Lab: Navigating the ML Development Lifecycle and Governance on AWS
## Prerequisites
Before starting this lab, ensure you have the following in place to successfully navigate the Machine Learning (ML) development lifecycle and governance steps:
- AWS Account: Access to an AWS account with administrator or PowerUser access.
- AWS CLI: Installed and configured (`aws configure`) with your access keys.
- Permissions: IAM permissions to create S3 buckets and manage Amazon SageMaker resources.
- Prior Knowledge: Basic understanding of ML terminology (e.g., training, inference, classification) and JSON file structures.
## Learning Objectives
By completing this lab, you will be able to:
- Map Business Goals to ML Services: Translate a business objective (e.g., churn reduction) into an AWS ML architecture.
- Establish Data Processing Foundations: Provision secure cloud storage for ML datasets using Amazon S3.
- Implement Model Governance: Create an ML Project tracking structure using the Amazon SageMaker Model Registry.
- Execute Lifecycle Approvals: Transition a model from a "Pending" state to an "Approved" state for production readiness.
## Architecture Overview
The following diagram illustrates the lifecycle workflow you will build. We will simulate the data preparation phase, register a model, and execute a governance approval step.
Here is how this maps to the broader ML Development Lifecycle covered in the AWS Certified AI Practitioner framework:
## Step-by-Step Instructions
> [!NOTE]
> Scenario: Your company wants to reduce customer churn by 15% (Business Goal). You have framed this as a binary classification problem (ML Problem Framing). Now, you need to set up the data processing layer and govern the model lifecycle.
### Step 1: Set Up the Data Collection Environment
The first technical step in data processing is collecting and integrating data. We will create an S3 bucket to store our raw and preprocessed historical customer records.
```bash
# Define a unique bucket name using your account ID to ensure global uniqueness
export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export BUCKET_NAME="brainybee-ml-data-${ACCOUNT_ID}"

# Create the S3 bucket
aws s3 mb s3://${BUCKET_NAME} --region us-east-1
```

**Console alternative:**
- Navigate to the S3 Console in AWS.
- Click Create bucket.
- Enter a globally unique bucket name (e.g., `brainybee-ml-data-12345`).
- Leave default settings (Block all public access enabled) and click Create bucket.
📸 Screenshot: A successfully created S3 bucket in the AWS Management Console.
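Bucket creation fails fast on invalid or already-taken names. As a minimal local pre-check (a sketch of the core S3 naming rules only — it omits edge cases such as IP-address-like names), you can validate a candidate name before calling `aws s3 mb`:

```shell
# Sketch: check the core S3 bucket-naming rules locally.
# Rules covered: 3-63 characters; lowercase letters, digits, dots,
# and hyphens; must start and end with a letter or digit.
check_bucket_name() {
  local name="$1"
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || return 1
  printf '%s' "$name" | grep -Eq '^[a-z0-9][a-z0-9.-]*[a-z0-9]$'
}

check_bucket_name "brainybee-ml-data-123456789012" && echo "name looks valid"
check_bucket_name "BrainyBee_ML" || echo "name is invalid"
```

Passing this check does not guarantee the name is available globally — `BucketAlreadyExists` (see Troubleshooting) can still occur.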
### Step 2: Create a SageMaker Model Package Group
To manage the model lifecycle, we use the SageMaker Model Registry. This tracks model versions, documentation, and risk categories for audit readiness.
```bash
# Create a logical group to hold versions of our churn prediction model
aws sagemaker create-model-package-group \
    --model-package-group-name "brainybee-churn-prediction" \
    --model-package-group-description "Customer Churn Classification Models"
```

**Console alternative:**
- Navigate to the Amazon SageMaker Console.
- Under the left navigation pane, go to Models > Model registry.
- Click Create model package group.
- Name it `brainybee-churn-prediction` and provide a brief description.
- Click Create.
> [!TIP]
> In a real-world MLOps workflow, this step is often automated via SageMaker Pipelines upon the approval of an ML use case.
### Step 3: Register a Model Version (Development Phase)
Once a model is trained, it must be registered for governance. We will simulate registering a trained model by pointing to a dummy container image.
First, create a JSON configuration file for the model inference specifications:
```bash
cat <<EOF > inference_spec.json
{
  "InferenceSpecification": {
    "Containers": [
      {
        "Image": "683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.2-1"
      }
    ],
    "SupportedContentTypes": ["text/csv"],
    "SupportedResponseMIMETypes": ["text/csv"]
  }
}
EOF
```

Next, register the model with a "Pending" status to trigger the governance review process:
```bash
aws sagemaker create-model-package \
    --model-package-group-name "brainybee-churn-prediction" \
    --model-package-description "XGBoost Churn Model V1" \
    --model-approval-status "PendingManualApproval" \
    --cli-input-json file://inference_spec.json
```

> [!IMPORTANT]
> Take note of the `ModelPackageArn` output in your terminal. You will need it for the next step.
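If the registration call fails with a JSON parsing error, validate the spec file first. The self-contained sketch below recreates the same file and checks that it parses, using Python's standard-library `json.tool` module (available wherever `python3` is installed):

```shell
# Re-create the spec file (identical content to the heredoc above) and
# sanity-check that it parses as JSON before registering the model.
cat <<'EOF' > inference_spec.json
{
  "InferenceSpecification": {
    "Containers": [
      {
        "Image": "683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.2-1"
      }
    ],
    "SupportedContentTypes": ["text/csv"],
    "SupportedResponseMIMETypes": ["text/csv"]
  }
}
EOF

# python3 -m json.tool exits non-zero on malformed JSON
if python3 -m json.tool inference_spec.json >/dev/null 2>&1; then
  echo "inference_spec.json is valid JSON"
else
  echo "inference_spec.json is malformed"
fi
```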
### Step 4: Execute Governance Approval
Governance teams review model documentation, performance metrics, and risk impact. Upon approval, the model package version is updated to reflect its readiness for production.
```bash
# Replace <YOUR_MODEL_PACKAGE_ARN> with the ARN output from Step 3
# Example ARN format: arn:aws:sagemaker:us-east-1:123456789012:model-package/brainybee-churn-prediction/1
aws sagemaker update-model-package \
    --model-package-arn "<YOUR_MODEL_PACKAGE_ARN>" \
    --model-approval-status "Approved"
```

**Console alternative:**
- In the SageMaker Console, navigate to Models > Model registry.
- Click on the `brainybee-churn-prediction` group.
- You will see Version 1 listed with a status of PendingManualApproval.
- Click on the version number.
- Select Update status, choose Approved, and add an optional comment (e.g., "Compliance checks passed").
- Click Save and update.
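The governance gate you just executed can be sketched as a small function. This is a toy model of this lab's one-way approval flow, not SageMaker's internal logic (the real API, for instance, also lets you move an approved model back to Rejected):

```shell
# Toy sketch of the lab's governance gate: only a pending package may
# transition, and only to one of the two terminal statuses.
approve() {
  local current="$1" new="$2"
  if [ "$current" != "PendingManualApproval" ]; then
    echo "error: status already finalized" >&2
    return 1
  fi
  case "$new" in
    Approved|Rejected) echo "$new" ;;
    *) echo "error: unknown status" >&2; return 1 ;;
  esac
}

approve "PendingManualApproval" "Approved"   # prints: Approved
```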
## Checkpoints
Run these verifications to ensure your lab steps were completed successfully:
- Verify Data Storage:

  ```bash
  aws s3 ls | grep brainybee-ml-data
  ```

  Expected output: Your bucket name should appear in the list.

- Verify Model Registry Setup:

  ```bash
  aws sagemaker list-model-package-groups --name-contains "brainybee"
  ```

  Expected output: JSON detailing the `brainybee-churn-prediction` package group.

- Verify Governance Approval:

  ```bash
  aws sagemaker describe-model-package --model-package-name "<YOUR_MODEL_PACKAGE_ARN>" --query "ModelApprovalStatus"
  ```

  Expected output: `"Approved"`
## Clean-Up / Teardown
> [!WARNING]
> Remember to run the teardown commands to avoid ongoing charges and to keep your AWS environment clean. While the Model Registry itself doesn't incur significant hourly costs, it's a best practice to remove unused resources.
Execute the following commands to delete all provisioned resources:
```bash
# 1. Delete the model version
aws sagemaker delete-model-package --model-package-name "<YOUR_MODEL_PACKAGE_ARN>"

# 2. Delete the model package group
aws sagemaker delete-model-package-group --model-package-group-name "brainybee-churn-prediction"

# 3. Delete the local JSON file
rm inference_spec.json

# 4. Delete the S3 bucket (--force removes all objects first)
aws s3 rb s3://${BUCKET_NAME} --force
```

## Troubleshooting
| Common Error | Cause | Fix |
|---|---|---|
| `BucketAlreadyExists` | S3 bucket names must be globally unique. | Ensure you appended your Account ID or random numbers to the bucket name. |
| `AccessDeniedException` | IAM user lacks permissions for SageMaker or S3. | Attach the AmazonSageMakerFullAccess and AmazonS3FullAccess policies to your IAM user/role. |
| `ValidationException: Could not find model package` | Incorrect or malformed ARN used in Step 4. | Copy the exact ARN string from the output of Step 3, ensuring no trailing spaces. |
| `ResourceNotFound` | Region mismatch between CLI config and requested resource. | Append `--region us-east-1` (or your chosen region) to the AWS CLI commands. |
## Concept Review
This lab walked you through bridging the gap between theoretical ML planning and technical AWS execution.
| ML Lifecycle Phase | AWS Service Used | Purpose in this Lab |
|---|---|---|
| Data Processing | Amazon S3 | Providing a secure, scalable landing zone for raw data and features before training begins. |
| Model Development | SageMaker Model Registry | Organizing iterations of built models. Keeping track of metadata, intended audience, and risk categories. |
| Deployment / Governance | SageMaker Model Registry (Status Update) | Enforcing a human-in-the-loop review process to ensure models meet regulatory and ethical compliance before production deployment. |