Hands-On Lab: Navigating the AWS ML Development Lifecycle

Welcome to the ML Development Lifecycle lab. In this guided exercise, we will bridge the gap between machine learning theory and practical AWS execution. You will simulate the core phases of the ML lifecycle—Data Processing, Model Development, and Model Governance—using Amazon S3 and Amazon SageMaker.

Prerequisites

Before starting this lab, ensure you have the following:

  • An active AWS Account with administrator or power-user privileges.
  • The AWS CLI (aws) installed and configured with your credentials.
  • Basic familiarity with terminal/command-line operations.
  • An existing IAM role that SageMaker can assume (its trust policy allows sagemaker.amazonaws.com), with the AmazonSageMakerFullAccess and AmazonS3FullAccess policies attached. Have the role ARN ready: <YOUR_SAGEMAKER_ROLE_ARN>.
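
Before moving on, it can help to confirm that the CLI is authenticated and to resolve the values that the lab's placeholders stand for. The following is a minimal sketch, not part of the original lab: the function names are hypothetical helpers, and SageMakerExecutionRole is only an example role name that you would replace with your own.

```bash
# Print the 12-digit account ID of the credentials the CLI is currently using.
lab_account_id() {
  aws sts get-caller-identity --query Account --output text
}

# Print the ARN of an IAM role, given the role's name as the first argument.
lab_role_arn() {
  aws iam get-role --role-name "$1" --query 'Role.Arn' --output text
}

# Example usage (uncomment to run against your own account):
# ACCOUNT_ID="$(lab_account_id)"
# REGION="$(aws configure get region)"
# ROLE_ARN="$(lab_role_arn SageMakerExecutionRole)"
# echo "$ACCOUNT_ID $REGION $ROLE_ARN"
```

If lab_account_id prints an error instead of an account ID, the credentials are likely missing or expired, and the later steps will fail for the same reason.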

Learning Objectives

By completing this lab, you will be able to:

  1. Provision secure storage for the Data Processing phase of the ML lifecycle.
  2. Deploy a managed Jupyter environment for Model Development and Exploratory Data Analysis (EDA).
  3. Establish governance and tracking by registering a model group in the SageMaker Model Registry.

Architecture Overview

The following diagrams illustrate both the conceptual ML lifecycle and the specific AWS architecture we are building today.

Conceptual ML Lifecycle

[Diagram: Conceptual ML Lifecycle]

Lab AWS Architecture

[Diagram: Lab AWS Architecture]

Step-by-Step Instructions

Step 1: Create an S3 Bucket for Data Processing

Every ML project requires robust data collection and integration. In this step, you will create an S3 bucket to store your raw and transformed datasets (e.g., historical patient records or customer churn data).

bash
aws s3 mb s3://brainybee-ml-data-<YOUR_ACCOUNT_ID>-<YOUR_REGION> --region <YOUR_REGION>

[!TIP] S3 bucket names must be globally unique. Including your Account ID and Region in the name makes a collision with someone else's bucket unlikely.
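
Since the first learning objective is secure storage, one possible follow-up is to set the bucket's public access block explicitly and stage a file under a prefix for raw data. This is a sketch under assumptions: the helper names, the raw/ prefix, and the local file path are illustrative and not part of the original lab. Newly created buckets generally block public access by default, so the explicit call mainly documents the intent.

```bash
# Build the lab's bucket name from an account ID ($1) and a Region ($2).
lab_bucket_name() {
  printf 'brainybee-ml-data-%s-%s\n' "$1" "$2"
}

# Block all public access on the bucket ($1), then copy a local file ($2)
# to the raw/ prefix. The raw/ prefix is an assumed layout, not a requirement.
lab_stage_raw_file() {
  local bucket="$1" local_file="$2"
  aws s3api put-public-access-block \
    --bucket "$bucket" \
    --public-access-block-configuration \
      BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true \
  && aws s3 cp "$local_file" "s3://${bucket}/raw/$(basename "$local_file")"
}

# Example usage (uncomment and adjust; churn.csv is a hypothetical file):
# BUCKET="$(lab_bucket_name <YOUR_ACCOUNT_ID> <YOUR_REGION>)"
# lab_stage_raw_file "$BUCKET" ./churn.csv
```

Keeping raw and transformed data under separate prefixes of the same bucket is one common layout; separate buckets would work equally well for this lab.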

Console alternative
  1. Navigate to the S3 Console.
  2. Click Create bucket.
  3. Enter the bucket name brainybee-ml-data-<YOUR_ACCOUNT_ID>-<YOUR_REGION>.
  4. Leave all other settings as default and click Create bucket.

📸 Screenshot: S3 Create Bucket Form

Step 2: Provision a Model Development Environment

Once data is staged, data scientists need an environment for feature engineering and model training. We will launch a managed SageMaker Notebook Instance.

bash
aws sagemaker create-notebook-instance \
  --notebook-instance-name brainybee-ml-dev-env \
  --instance-type ml.t3.medium \
  --role-arn <YOUR_SAGEMAKER_ROLE_ARN>

[!IMPORTANT] Ensure you replace <YOUR_SAGEMAKER_ROLE_ARN> with the actual ARN of your IAM role (e.g., arn:aws:iam::123456789012:role/SageMakerExecutionRole).
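
The create call returns before the instance is usable. Rather than polling by hand, the AWS CLI's built-in waiter can block until the instance reaches InService. The sketch below wraps that waiter in a hypothetical helper (the function name is an assumption) and then prints the instance's URL from describe-notebook-instance.

```bash
# Wait until the notebook instance is InService, then print its URL.
# The instance name defaults to the one used in this lab.
lab_wait_for_notebook() {
  local name="${1:-brainybee-ml-dev-env}"
  # The waiter polls the instance status and exits non-zero on failure or timeout.
  aws sagemaker wait notebook-instance-in-service \
    --notebook-instance-name "$name" || return 1
  aws sagemaker describe-notebook-instance \
    --notebook-instance-name "$name" \
    --query 'Url' --output text
}

# Example usage (uncomment to run):
# lab_wait_for_notebook brainybee-ml-dev-env
```

If the waiter fails, describe-notebook-instance usually reports a FailureReason field that points at the cause, such as a role that SageMaker cannot assume.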

Console alternative
  1. Navigate to the Amazon SageMaker Console.
  2. On the left sidebar, under Applications and IDEs, select Notebook instances.
  3. Click Create notebook instance.
  4. Name it brainybee-ml-dev-env and select ml.t3.medium for the instance type.
  5. Under Permissions and encryption, select your existing SageMaker execution role.
  6. Click Create notebook instance.

📸 Screenshot: SageMaker Notebook Instance Configuration

Step 3: Establish Model Governance via Registry

Before a model hits production, it must be approved for compliance, ethics, and performance. SageMaker Model Registry helps manage model versioning, capture lineage, and document purpose.

bash
aws sagemaker create-model-package-group \
  --model-package-group-name "brainybee-churn-prediction" \
  --model-package-group-description "Lifecycle Lab: Model package group for customer churn ML project"

[!TIP] Governance is critical for audit readiness. The Model Registry records the approval status of every model version, so a deployment process can refuse any model that has not been approved.
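
To see what the registry tracks, you could query the group's status and list any registered versions along with their approval status. This is a minimal sketch with a hypothetical helper name; in this lab no model versions are registered, so the second command is expected to print nothing.

```bash
# Print the group's status, then one line per registered model version
# showing its version number and approval status.
lab_registry_summary() {
  local group="${1:-brainybee-churn-prediction}"
  aws sagemaker describe-model-package-group \
    --model-package-group-name "$group" \
    --query 'ModelPackageGroupStatus' --output text
  aws sagemaker list-model-packages \
    --model-package-group-name "$group" \
    --query 'ModelPackageSummaryList[].[ModelPackageVersion,ModelApprovalStatus]' \
    --output text
}

# Example usage (uncomment to run):
# lab_registry_summary brainybee-churn-prediction
```

In a fuller workflow, each trained model would be added to the group as a new version, and its approval status would be the field reviewers change before deployment.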

Console alternative
  1. Navigate to the Amazon SageMaker Console.
  2. On the left sidebar, under Models, select Model registry.
  3. Click Create model group.
  4. Enter the name brainybee-churn-prediction and a brief description.
  5. Click Create model group.

📸 Screenshot: SageMaker Model Registry Group Creation

Checkpoints

Verify that your resources have been provisioned correctly.

Checkpoint 1: Verify S3 Data Storage

Run the following command to ensure your bucket exists:

bash
aws s3 ls | grep brainybee-ml-data

Expected Output: You should see your bucket name listed with a timestamp.

Checkpoint 2: Verify SageMaker Notebook Status

Run the following command to check if your notebook is ready for use:

bash
aws sagemaker describe-notebook-instance --notebook-instance-name brainybee-ml-dev-env --query "NotebookInstanceStatus"

Expected Output: "InService" (If it says "Pending", wait 2-3 minutes and try again).

Checkpoint 3: Verify Model Registry Group

List your model package groups to ensure the governance structure is in place:

bash
aws sagemaker list-model-package-groups --name-contains "brainybee"

Expected Output: A JSON object whose ModelPackageGroupSummaryList array contains an entry with the ModelPackageGroupArn of your newly created group.
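
The three checkpoints can also be run as a single pass that prints PASS or FAIL per resource. The sketch below is one way to do that; the helper names are hypothetical, and it uses s3api head-bucket for an exact-name check instead of filtering the bucket list with grep.

```bash
# Run a command and print PASS or FAIL for the given label,
# based only on the command's exit status.
lab_check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
  fi
}

# Succeed only when the notebook instance ($1) reports InService.
lab_notebook_in_service() {
  [ "$(aws sagemaker describe-notebook-instance \
        --notebook-instance-name "$1" \
        --query 'NotebookInstanceStatus' --output text)" = "InService" ]
}

# Example usage (BUCKET is the bucket name you chose in Step 1):
# lab_check "S3 bucket"      aws s3api head-bucket --bucket "$BUCKET"
# lab_check "Notebook"       lab_notebook_in_service brainybee-ml-dev-env
# lab_check "Model registry" aws sagemaker describe-model-package-group \
#   --model-package-group-name brainybee-churn-prediction
```

A FAIL on the notebook line shortly after creation usually just means the instance is still in the Pending state.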

Teardown

[!WARNING] Cost Warning: Remember to run the following teardown commands to avoid ongoing charges. SageMaker Notebook instances bill per hour while they are in the InService state.

Execute these commands in order to clean up your AWS environment:

  1. Delete the SageMaker Notebook Instance
bash
aws sagemaker stop-notebook-instance --notebook-instance-name brainybee-ml-dev-env
# Wait ~2 minutes for the status to change to Stopped before running the delete command
aws sagemaker delete-notebook-instance --notebook-instance-name brainybee-ml-dev-env

  2. Delete the Model Package Group
bash
aws sagemaker delete-model-package-group --model-package-group-name brainybee-churn-prediction

  3. Empty and Delete the S3 Bucket
bash
aws s3 rm s3://brainybee-ml-data-<YOUR_ACCOUNT_ID>-<YOUR_REGION> --recursive
aws s3 rb s3://brainybee-ml-data-<YOUR_ACCOUNT_ID>-<YOUR_REGION>
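
The notebook teardown depends on timing, because the delete call is rejected until the instance has stopped. Instead of waiting a fixed interval, a script could chain the CLI's waiters so each call runs only after the previous state is reached. A minimal sketch, with a hypothetical helper name:

```bash
# Stop the notebook instance, wait until it is Stopped, delete it,
# and wait until the deletion has completed. Stops at the first failure.
lab_delete_notebook() {
  local name="${1:-brainybee-ml-dev-env}"
  aws sagemaker stop-notebook-instance --notebook-instance-name "$name" &&
  aws sagemaker wait notebook-instance-stopped --notebook-instance-name "$name" &&
  aws sagemaker delete-notebook-instance --notebook-instance-name "$name" &&
  aws sagemaker wait notebook-instance-deleted --notebook-instance-name "$name"
}

# Example usage (uncomment to run):
# lab_delete_notebook brainybee-ml-dev-env
```

For the bucket, aws s3 rb with the --force flag removes the objects and then the bucket in one command, which is an alternative to running rm and rb separately.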

Troubleshooting

| Common Error / Issue | Probable Cause | Fix / Solution |
| --- | --- | --- |
| BucketAlreadyExists | S3 bucket names must be globally unique across all AWS accounts. | Append your unique AWS Account ID and region to the bucket name. |
| ValidationException: RoleArn is invalid | The IAM role ARN provided does not exist or has a typo. | Verify your role ARN in the IAM Console and ensure it includes arn:aws:iam::... |
| InvalidParameterException on Delete | Trying to delete a notebook instance while it is still running. | Run stop-notebook-instance first, wait for the status to say Stopped, then run delete-notebook-instance. |
| Command hangs or times out | AWS CLI is not configured correctly or lacks internet access. | Run aws configure to verify your credentials and default region. |
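
A malformed role ARN can be caught locally before calling SageMaker. The sketch below checks only the shape of the string and assumes the standard aws partition; the function name is hypothetical, and a well-formed ARN can still refer to a role that does not exist.

```bash
# Succeed when the argument looks like an IAM role ARN in the "aws" partition:
# arn:aws:iam::<12-digit account ID>:role/<role name or path>
lab_is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws:iam::[0-9]{12}:role/.+$'
}

# Example usage (uncomment to run):
# if lab_is_role_arn "$ROLE_ARN"; then echo "ARN format looks valid"; fi
```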
