Lab: Analyzing Model Performance with Amazon SageMaker Clarify

This lab provides hands-on experience in evaluating machine learning model performance using Amazon SageMaker. You will focus on interpreting key metrics, detecting model bias, and understanding model behavior using SageMaker Clarify.

Prerequisites

  • An active AWS Account.
  • IAM Permissions: Administrator access or AmazonSageMakerFullAccess and AmazonS3FullAccess policies.
  • AWS CLI configured with your credentials.
  • Familiarity with Python and basic Machine Learning concepts (Precision, Recall, F1 Score).

Learning Objectives

  • Configure and run a SageMaker Clarify processing job to analyze model performance.
  • Interpret classification metrics including Confusion Matrices, F1 Score, and AUC-ROC.
  • Identify post-training bias across different data slices.
  • Evaluate model explainability using SHAP (Lundberg and Lee) values.


Step-by-Step Instructions

Step 1: Prepare the S3 Environment

You need an S3 bucket to store the training data and the output from SageMaker Clarify.

```bash
# Create a unique bucket name
export BUCKET_NAME=brainybee-lab-ml-eval-<YOUR_ACCOUNT_ID>
aws s3 mb s3://$BUCKET_NAME --region <YOUR_REGION>
```
Console alternative: Navigate to S3 → Create bucket. Name the bucket brainybee-lab-ml-eval-[your-id] and keep the default settings.

Step 2: Configure the Model Performance Analysis

We will define the model and analysis configuration for SageMaker Clarify (the JSON equivalent of the Python SDK's ModelConfig and AnalysisConfig classes). This configuration tells SageMaker which model to evaluate and which metrics to calculate.

[!NOTE] In a production scenario, you would point this to an existing Model Name in the SageMaker Model Registry.

```bash
# Create the analysis configuration file (analysis_config.json)
cat <<EOF > analysis_config.json
{
  "methods": {
    "report": {"name": "report", "title": "Model Performance Report"},
    "shap": {"num_samples": 100},
    "post_training_bias": {"methods": "all"}
  },
  "predictor": {
    "model_name": "your-xgboost-model",
    "instance_type": "ml.m5.xlarge",
    "initial_instance_count": 1
  }
}
EOF
```

Step 3: Launch the Clarify Processing Job

Run the processing job to generate the evaluation metrics. This step calculates the Confusion Matrix and Precision-Recall curves.

```bash
aws sagemaker create-processing-job \
  --processing-job-name "clarify-perf-analysis-$(date +%s)" \
  --role-arn "<YOUR_SAGEMAKER_EXECUTION_ROLE_ARN>" \
  --processing-resources '{"ClusterConfig": {"InstanceCount": 1, "InstanceType": "ml.m5.xlarge", "VolumeSizeInGB": 20}}' \
  --app-specification '{"ImageUri": "<CLARIFY_IMAGE_URI>"}'
```

[!TIP] The <CLARIFY_IMAGE_URI> varies by region. Check the AWS documentation for the specific URI for SageMaker Clarify in your region.

Checkpoints

  1. Job Status Check: Run aws sagemaker describe-processing-job --processing-job-name [your-job-name] and ensure ProcessingJobStatus is Completed.
  2. Artifact Verification: Navigate to your S3 bucket. You should see a folder named analysis_results containing report.pdf and analysis.json.
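Clarify jobs typically run for several minutes, so a small polling helper is handy for the status check. The sketch below is a convenience wrapper of our own (the function name and defaults are assumptions, not AWS tooling); it repeatedly evaluates whatever status command you pass until it reports Completed:

```bash
# Poll a status command until it reports Completed (success) or
# Failed/Stopped (failure), with a bounded number of attempts.
wait_for_status() {
  local check_cmd="$1" max_tries="${2:-30}" delay="${3:-10}" status i
  for ((i = 1; i <= max_tries; i++)); do
    status=$(eval "$check_cmd")
    echo "Attempt $i: $status"
    case "$status" in
      Completed) return 0 ;;
      Failed|Stopped) return 1 ;;
    esac
    sleep "$delay"
  done
  return 1
}

# Usage (substitute the job name you created in Step 3):
# wait_for_status 'aws sagemaker describe-processing-job \
#   --processing-job-name <YOUR_JOB_NAME> \
#   --query ProcessingJobStatus --output text'
```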

Concept Review

Key Metrics for Model Evaluation

| Metric | Definition | Best Used For... |
| --- | --- | --- |
| Accuracy | $(TP + TN) / Total$ | Balanced datasets. |
| Precision | $TP / (TP + FP)$ | Minimizing False Positives (e.g., spam detection). |
| Recall | $TP / (TP + FN)$ | Minimizing False Negatives (e.g., cancer diagnosis). |
| F1 Score | $2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$ | Imbalanced datasets; harmonic mean of P & R. |
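As a worked example with hypothetical counts (TP = 80, FP = 20, FN = 10):

$$Precision = \frac{80}{80 + 20} = 0.80, \qquad Recall = \frac{80}{80 + 10} \approx 0.889$$

$$F1 = 2 \cdot \frac{0.80 \cdot 0.889}{0.80 + 0.889} \approx 0.842$$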

Visualizing the ROC Curve

The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR).
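To make the axes concrete, a single point on the curve can be computed directly from confusion-matrix counts. The one-liner below is a sketch using hypothetical counts (TP = 80, FN = 10, FP = 20, TN = 90):

```bash
# TPR = TP / (TP + FN); FPR = FP / (FP + TN).
echo "80 10 20 90" | awk '{ printf "TPR=%.3f FPR=%.3f\n", $1/($1+$2), $3/($3+$4) }'
# Prints: TPR=0.889 FPR=0.182
```

Sweeping the classification threshold and recomputing this pair at each step traces out the full ROC curve.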


Troubleshooting

| Error | Likely Cause | Fix |
| --- | --- | --- |
| AccessDenied | IAM role lacks S3 permissions. | Attach AmazonS3FullAccess to the execution role. |
| ResourceLimitExceeded | Too many active instances. | Check Service Quotas for ml.m5.xlarge processing jobs. |
| InvalidConfig | Syntax error in JSON config. | Use a JSON validator to ensure analysis_config.json is well-formed. |

Stretch Challenge

Scenario: Your model is performing well on average, but you suspect it is underperforming for a specific demographic (e.g., users in a specific postal_code).

Task: Modify your analysis_config.json to include a group_variable under post_training_bias to calculate the Difference in Proportions of Labels (DPL) for that specific feature.
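As a starting hint, the modified section of analysis_config.json might look like the fragment below. The exact key names are assumptions here; verify them against the SageMaker Clarify analysis configuration reference before running the job:

```json
"post_training_bias": {
  "methods": ["DPL"],
  "group_variable": "postal_code"
}
```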

Cost Estimate

  • SageMaker Processing: $0.23 per hour (for ml.m5.xlarge in us-east-1).
  • S3 Storage: Negligible for this lab (< $0.01).
  • Total Estimated Cost: < $0.50 (if teardown is completed).

Clean-Up / Teardown

[!WARNING] Failure to delete S3 objects and processing configurations can lead to small recurring storage costs.

```bash
# Delete the analysis results from S3
aws s3 rm s3://$BUCKET_NAME/analysis_results --recursive

# Delete the bucket (only if empty)
aws s3 rb s3://$BUCKET_NAME
```

Ensure you stop any SageMaker Studio kernels or Notebook Instances used to trigger these jobs.
