AWS Lab: Choosing the Optimal ML Modeling Approach
In this lab, you will navigate the decision-making process for selecting the right machine learning approach on AWS. You will explore the three tiers of AWS ML services: AI Services (pre-built), ML Services (SageMaker built-in algorithms), and ML Frameworks (custom scripts), determining which fits a specific business use case based on data type and required customization.
[!WARNING] Remember to run the teardown commands at the end of this lab to avoid ongoing charges for SageMaker resources.
Prerequisites
- An active AWS Account.
- IAM user with AmazonSageMakerFullAccess, AmazonRekognitionFullAccess, and CloudWatchLogsFullAccess.
- AWS CLI installed and configured with your credentials.
- Basic familiarity with Python and the AWS Management Console.
- A target region (e.g., us-east-1).
Learning Objectives
- Categorize AWS ML offerings into the Three Tiers of Customization.
- Invoke an AI Service (Amazon Rekognition) for rapid prototyping without model training.
- Configure a SageMaker Built-in Algorithm (XGBoost) for structured data classification.
- Evaluate which modeling approach satisfies business requirements for latency, cost, and accuracy.
Architecture Overview
This lab demonstrates two paths: the "No-Code/Pre-trained" path via AI Services and the "Managed Algorithm" path via SageMaker.
Step-by-Step Instructions
Step 1: Analyze the Scenario and Select Tier
We have two tasks:
- Task A: Identify objects in images for a security app (Rapid deployment needed).
- Task B: Predict customer churn from tabular CSV data (requires a model trained on your own data).
Decision: We will use Amazon Rekognition for Task A and SageMaker XGBoost for Task B.
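The decision above can be written down as a small rule of thumb. The sketch below is only an illustration: the `UseCase` fields and the `choose_tier` function are hypothetical names introduced here, not part of any AWS SDK, and a real selection would also weigh latency, cost, and accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of a business task (illustrative fields only)."""
    name: str
    pretrained_api_exists: bool      # e.g. image labels, OCR, sentiment
    needs_own_training_data: bool    # the model must learn from your own records
    needs_custom_architecture: bool = False


def choose_tier(case: UseCase) -> str:
    """Return the least-effort tier that still meets the stated requirements."""
    if case.needs_custom_architecture:
        return "ML Frameworks (custom script)"
    if case.needs_own_training_data:
        return "ML Services (SageMaker built-in algorithm)"
    if case.pretrained_api_exists:
        return "AI Services (pre-built)"
    # No pre-built API and nothing to train on: revisit the requirements.
    return "Undetermined"


task_a = UseCase("Task A: objects in images", pretrained_api_exists=True,
                 needs_own_training_data=False)
task_b = UseCase("Task B: churn from CSV", pretrained_api_exists=False,
                 needs_own_training_data=True)

for task in (task_a, task_b):
    print(f"{task.name} -> {choose_tier(task)}")
```

The checks run from most to least effort, so a task only lands in a cheaper tier when nothing forces it upward.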
Step 2: Use an AI Service for Task A (Image Analysis)
For Task A, we utilize a pre-trained model. No training code is required.
# Identify labels in a sample image (replace with your bucket/key or use a local file)
aws rekognition detect-labels \
--image '{"S3Object":{"Bucket":"brainybee-public-assets","Name":"security-sample.jpg"}}' \
--region <YOUR_REGION>
Console alternative
- Search for Rekognition in the AWS Console.
- In the left sidebar, click Label Detection.
- Upload an image of a street or office.
- View the "Results" panel for detected objects and confidence scores.
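The same label detection can be called from Python. The sketch below assumes `client` behaves like `boto3.client("rekognition")`; the `detect_labels` helper and the `StubRekognition` class are names made up for this example, and the stub returns a hand-written response so the snippet runs without credentials.

```python
from __future__ import annotations

from typing import Any


def detect_labels(client: Any, bucket: str, key: str,
                  min_confidence: float = 80.0) -> list[tuple[str, float]]:
    """Call DetectLabels on an S3 image and return (name, confidence) pairs."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=min_confidence,
    )
    labels = [(label["Name"], label["Confidence"]) for label in response["Labels"]]
    # Highest-confidence labels first.
    return sorted(labels, key=lambda pair: pair[1], reverse=True)


class StubRekognition:
    """Offline stand-in returning a hand-written, DetectLabels-shaped response."""

    def detect_labels(self, **kwargs: Any) -> dict:
        return {"Labels": [
            {"Name": "Building", "Confidence": 91.2},
            {"Name": "Person", "Confidence": 98.7},
        ]}


# With credentials configured, pass boto3.client("rekognition") instead of the stub.
results = detect_labels(StubRekognition(), "brainybee-public-assets", "security-sample.jpg")
for name, confidence in results:
    print(f"{name}: {confidence:.1f}%")
```

Passing the client in as an argument keeps the parsing logic testable offline and leaves region and credential handling to the caller.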
Step 3: Prepare SageMaker Studio for Task B
For tabular data prediction, we need a managed environment.
- Navigate to Amazon SageMaker > Domains.
- Click Create Domain > Quick setup.
- Once the status is "InService", launch Studio.
Step 4: Configure a SageMaker Built-in Algorithm
We will use the SageMaker Python SDK to point to the XGBoost container image. This avoids writing custom model architecture code.
import sagemaker
from sagemaker import image_uris
region = sagemaker.Session().boto_region_name
# Get the URI for the built-in XGBoost container
container = image_uris.retrieve("xgboost", region, version="1.5-1")
print(f"Target Container: {container}")
[!TIP] Using image_uris.retrieve resolves the correct regional Amazon ECR image for the chosen algorithm and version, so you do not have to hard-code registry account IDs.
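Once the container URI is known, the next step would typically be to wrap it in an estimator and set hyperparameters. The sketch below assumes the SageMaker Python SDK v2 (the same style as the `image_uris` import above). The helper names, the instance type, and the hyperparameter values are illustrative assumptions, not settings prescribed by this lab.

```python
def churn_hyperparameters(num_round: int = 100, max_depth: int = 5,
                          eta: float = 0.2) -> dict:
    """Illustrative starting hyperparameters for binary churn classification."""
    if num_round < 1:
        raise ValueError("num_round must be at least 1")
    return {
        "objective": "binary:logistic",  # predict churn / no churn
        "eval_metric": "auc",
        "num_round": num_round,          # required by the built-in XGBoost algorithm
        "max_depth": max_depth,
        "eta": eta,
    }


def build_estimator(container: str, role_arn: str, output_path: str):
    """Wrap the built-in XGBoost image in a generic Estimator (SDK v2 style)."""
    # Imported lazily so the helper above stays usable without the SageMaker SDK.
    from sagemaker.estimator import Estimator

    estimator = Estimator(
        image_uri=container,
        role=role_arn,                   # an execution role ARN you supply
        instance_count=1,
        instance_type="ml.m5.xlarge",    # assumed instance type for this sketch
        output_path=output_path,         # e.g. an s3:// prefix in your own bucket
    )
    estimator.set_hyperparameters(**churn_hyperparameters())
    return estimator


print(churn_hyperparameters())
```

Calling `build_estimator(container, role_arn, output_path)` does not start training by itself; that would require a later `fit` call with CSV data in S3, which the built-in XGBoost algorithm expects to have the label in the first column and no header row.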
Step 5: Define the Modeling Strategy
Use the following TikZ diagram to visualize where these choices sit on the customization vs. development effort scale:
\begin{tikzpicture}[scale=0.8]
  \draw[->, thick] (0,0) -- (8,0) node[right] {Customization};
  \draw[->, thick] (0,0) -- (0,6) node[above] {Development Effort};
  % AI Services
  \filldraw[blue!20, draw=blue!50] (0.5,0.5) rectangle (2.5,1.5) node[midway, black] {AI Services};
  % Built-in Algos
  \filldraw[green!20, draw=green!50] (3,2.5) rectangle (5,3.5) node[midway, black] {Built-in};
  % Custom Frameworks
  \filldraw[orange!20, draw=orange!50] (5.5,4.5) rectangle (7.5,5.5) node[midway, black] {Frameworks};
  \node at (1.5, -0.5) {\scriptsize Low};
  \node at (6.5, -0.5) {\scriptsize High};
\end{tikzpicture}
Checkpoints
| Step | Action | Expected Result |
|---|---|---|
| 2 | Rekognition CLI | A JSON response containing labels like "Person" or "Building" with confidence scores above 80%. |
| 3 | SageMaker Studio | The domain status shows a green "InService" checkmark. |
| 4 | Image URI | The notebook prints a string like 683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.5-1. |
Concept Review
| Feature | AI Services (Rekognition/Comprehend) | SageMaker Built-in Algorithms | Custom Frameworks (Script Mode) |
|---|---|---|---|
| Expertise | Low (App Developer) | Medium (Data Scientist) | High (ML Engineer) |
| Model Control | None (Black box) | High (Hyperparameters) | Total (Full Architecture) |
| Training Data | Not required | Required | Required |
| Use Case | Common (OCR, Sentiment) | Standard (Regression, XGBoost) | Research/Specialized |
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| AccessDeniedException | Missing IAM permissions | Add AmazonRekognitionFullAccess to your IAM user/role. |
| ResourceLimitExceeded | SageMaker quota reached | Check Service Quotas in the console; ensure no old Studio apps are running. |
| InvalidImageUri | Incorrect region or version | Ensure the region in your Python script matches your CLI configuration. |
Challenge
Scenario: A client wants to detect whether customers appear happy or sad in the security photos from Task A.
Question: Should you use a custom SageMaker model or an AI Service?
Solution
Use Amazon Rekognition (Face Analysis). Its DetectFaces operation returns emotion estimates such as HAPPY or SAD out of the box, saving the cost and time of labeling a custom dataset of emotional faces.
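As a rough illustration of that solution, the sketch below reads the highest-confidence emotion for each detected face. It assumes `client` behaves like `boto3.client("rekognition")`; the `top_emotions` helper is a name introduced only for this example.

```python
from __future__ import annotations

from typing import Any


def top_emotions(client: Any, bucket: str, key: str) -> list[tuple[str, float]]:
    """Return the highest-confidence (emotion, confidence) pair per detected face."""
    response = client.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Attributes=["ALL"],  # the default attribute set does not include Emotions
    )
    results = []
    for face in response["FaceDetails"]:
        emotions = face.get("Emotions", [])
        if not emotions:
            continue  # nothing to report for this face
        best = max(emotions, key=lambda emotion: emotion["Confidence"])
        results.append((best["Type"], best["Confidence"]))
    return results
```

The returned emotion reflects how a face appears in the image, not a person's actual internal state, so treat it as a signal rather than a verdict.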
Cost Estimate
- Amazon Rekognition: First 5,000 images per month are Free Tier eligible. Otherwise, ~$0.001 per image.
- SageMaker Studio: ~$0.05/hour for ml.t3.medium (Free Tier covers 250 hours/month for first 2 months).
- S3 Storage: Negligible for small lab datasets (<$0.01).
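To turn these figures into a quick estimate, the sketch below applies the approximate prices and Free Tier allowances quoted above. The function names are made up for this example, and the rates are defaults you should replace with current pricing for your region.

```python
def rekognition_cost(images: int, free_images: int = 5_000,
                     price_per_image: float = 0.001) -> float:
    """Estimated monthly Rekognition cost in USD after the free allowance."""
    billable = max(0, images - free_images)
    return billable * price_per_image


def studio_cost(hours: float, free_hours: float = 250.0,
                hourly_rate: float = 0.05) -> float:
    """Estimated monthly ml.t3.medium Studio cost in USD after free hours."""
    billable = max(0.0, hours - free_hours)
    return billable * hourly_rate


# Example: 12,000 images and 10 Studio hours with no Free Tier hours left.
total = rekognition_cost(12_000) + studio_cost(10, free_hours=0)
print(f"Estimated monthly cost: ${total:.2f}")
```

Setting `free_images=0` or `free_hours=0` models an account that is no longer Free Tier eligible.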
Teardown
[!IMPORTANT] Failure to delete the SageMaker Domain App can result in unexpected hourly charges.
# 1. List any running SageMaker Studio apps in the domain
aws sagemaker list-apps --domain-id-equals <YOUR_DOMAIN_ID>
# 2. Delete the app (example for a JupyterServer app)
aws sagemaker delete-app \
--domain-id <YOUR_DOMAIN_ID> \
--app-type JupyterServer \
--app-name default \
--user-profile-name <YOUR_USER_NAME>
# 3. Delete any S3 buckets created during the lab
aws s3 rb s3://brainybee-lab-data --force
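If the domain has several apps, the list-then-delete steps can be scripted. The sketch below assumes `client` behaves like `boto3.client("sagemaker")`; the `delete_running_apps` helper is a hypothetical name, and it reads only the first page of results, so a domain with many apps would need pagination via `NextToken`.

```python
from __future__ import annotations

from typing import Any


def delete_running_apps(client: Any, domain_id: str) -> list[str]:
    """Request deletion of every InService app in a domain; return their names."""
    deleted = []
    response = client.list_apps(DomainIdEquals=domain_id)
    for app in response["Apps"]:
        if app["Status"] != "InService":
            continue  # skip apps that are already deleted, deleting, or pending
        # An app belongs either to a user profile or to a shared space.
        owner = {field: app[field]
                 for field in ("UserProfileName", "SpaceName") if app.get(field)}
        client.delete_app(
            DomainId=domain_id,
            AppType=app["AppType"],
            AppName=app["AppName"],
            **owner,
        )
        deleted.append(app["AppName"])
    return deleted
```

Deletion is asynchronous, so re-run the listing afterwards to confirm that no app remains InService before you consider the teardown complete.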