Lab: Analyzing Model Performance with Amazon SageMaker Clarify

Analyze model performance

This lab provides hands-on experience in evaluating machine learning model performance using Amazon SageMaker. You will focus on interpreting key metrics, detecting model bias, and understanding model behavior using SageMaker Clarify.

Prerequisites

  • An active AWS Account.
  • IAM Permissions: Administrator access, or both the AmazonSageMakerFullAccess and AmazonS3FullAccess managed policies.
  • AWS CLI configured with your credentials.
  • Familiarity with Python and basic Machine Learning concepts (Precision, Recall, F1 Score).

Learning Objectives

  • Configure and run a SageMaker Clarify processing job to analyze model performance.
  • Interpret classification metrics including Confusion Matrices, F1 Score, and AUC-ROC.
  • Identify post-training bias across different data slices.
  • Evaluate model explainability using SHAP (Lundberg and Lee) values.

Step-by-Step Instructions

Step 1: Prepare the S3 Environment

You need an S3 bucket to store the training data and the output from SageMaker Clarify.

bash
# Create a bucket with a globally unique name
export BUCKET_NAME=brainybee-lab-ml-eval-<YOUR_ACCOUNT_ID>
aws s3 mb "s3://$BUCKET_NAME" --region <YOUR_REGION>
Console alternative: Navigate to S3 and choose Create bucket. Name it brainybee-lab-ml-eval-[your-id] and keep the default settings.

Step 2: Configure the Model Performance Analysis

We will define a ModelConfig and AnalysisConfig for SageMaker Clarify. This configuration tells SageMaker which model to evaluate and which metrics to calculate.

[!NOTE] In a production scenario, you would point this to an existing Model Name in the SageMaker Model Registry.

bash
# Create the analysis configuration file (analysis_config.json)
cat <<EOF > analysis_config.json
{
  "methods": {
    "report": {"name": "report", "title": "Model Performance Report"},
    "shap": {"num_samples": 100},
    "post_training_bias": {"methods": "all"}
  },
  "predictor": {
    "model_name": "your-xgboost-model",
    "instance_type": "ml.m5.xlarge",
    "initial_instance_count": 1
  }
}
EOF
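If you would rather generate the file from Python, the following is a minimal sketch that builds the same configuration as a dictionary and serializes it. The function names build_analysis_config and render_config are illustrative and not part of any AWS SDK. The lab's snippet lists only methods and predictor; a real Clarify job typically also needs dataset-level fields (for example dataset_type and label), which are left out here because the lab does not specify them.

```python
import json


def build_analysis_config(model_name="your-xgboost-model",
                          instance_type="ml.m5.xlarge",
                          shap_samples=100):
    """Return the lab's analysis configuration as a plain dictionary."""
    return {
        "methods": {
            "report": {"name": "report", "title": "Model Performance Report"},
            "shap": {"num_samples": shap_samples},
            "post_training_bias": {"methods": "all"},
        },
        "predictor": {
            "model_name": model_name,
            "instance_type": instance_type,
            "initial_instance_count": 1,
        },
    }


def render_config(config):
    """Serialize the config; json.dumps always emits well-formed JSON."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Print the JSON so it can be redirected into analysis_config.json.
    print(render_config(build_analysis_config()))
```

Building the file from a dictionary avoids hand-typed brace and comma mistakes, which are the usual source of malformed configuration files.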

Step 3: Launch the Clarify Processing Job

Run the processing job to generate the evaluation metrics. This step calculates the Confusion Matrix and Precision-Recall curves.

bash
# Note: a working Clarify job also needs --processing-inputs (the analysis
# config and dataset) and --processing-output-config pointing at your bucket.
aws sagemaker create-processing-job \
  --processing-job-name "clarify-perf-analysis-$(date +%s)" \
  --role-arn "<YOUR_SAGEMAKER_EXECUTION_ROLE_ARN>" \
  --processing-resources '{"ClusterConfig": {"InstanceCount": 1, "InstanceType": "ml.m5.xlarge", "VolumeSizeInGB": 20}}' \
  --app-specification '{"ImageUri": "<CLARIFY_IMAGE_URI>"}'

[!TIP] The <CLARIFY_IMAGE_URI> varies by region. Check the AWS documentation for the specific URI for SageMaker Clarify in your region.
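The CLI call above does not say where the job reads analysis_config.json or where it writes results. As a rough sketch, the request could be assembled in Python as shown here. Several things are assumptions rather than facts from this lab: the S3 prefixes config/, data/, and analysis_results are a hypothetical bucket layout, the /opt/ml/processing/... paths follow the convention the Clarify container is generally documented to use and should be verified, and build_processing_request is an illustrative helper name.

```python
import time

# Container paths the Clarify image is assumed to read from and write to.
CONFIG_DIR = "/opt/ml/processing/input/config"
DATA_DIR = "/opt/ml/processing/input/data"
OUTPUT_DIR = "/opt/ml/processing/output"


def _s3_input(name, s3_uri, local_path):
    """Describe one S3 prefix to download into the container before the job runs."""
    return {
        "InputName": name,
        "S3Input": {
            "S3Uri": s3_uri,
            "LocalPath": local_path,
            "S3DataType": "S3Prefix",
            "S3InputMode": "File",
        },
    }


def build_processing_request(bucket, role_arn, image_uri, job_name=None):
    """Assemble keyword arguments for SageMaker's CreateProcessingJob API."""
    # The config/, data/ and analysis_results prefixes are hypothetical layout choices.
    return {
        "ProcessingJobName": job_name or f"clarify-perf-analysis-{int(time.time())}",
        "RoleArn": role_arn,
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 20,
            }
        },
        "ProcessingInputs": [
            _s3_input("analysis_config", f"s3://{bucket}/config/", CONFIG_DIR),
            _s3_input("dataset", f"s3://{bucket}/data/", DATA_DIR),
        ],
        "ProcessingOutputConfig": {
            "Outputs": [{
                "OutputName": "analysis_result",
                "S3Output": {
                    "S3Uri": f"s3://{bucket}/analysis_results",
                    "LocalPath": OUTPUT_DIR,
                    "S3UploadMode": "EndOfJob",
                },
            }]
        },
    }
```

With boto3 installed and credentials configured, the resulting dictionary can be passed as `boto3.client("sagemaker").create_processing_job(**request)`. Keeping the request as plain data makes it easy to inspect before anything is launched.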

Checkpoints

  1. Job Status Check: Run aws sagemaker describe-processing-job --processing-job-name [your-job-name] and ensure ProcessingJobStatus is Completed.
  2. Artifact Verification: Navigate to your S3 bucket. You should see a folder named analysis_results containing report.pdf and analysis.json.
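Rather than rerunning the status command by hand, the first checkpoint can be automated with a small polling loop. This is a sketch under assumptions: wait_for_job is an illustrative helper, and the describe argument is expected to behave like boto3's `describe_processing_job`, returning a dictionary with a ProcessingJobStatus key. Passing the function in keeps the loop testable without AWS access.

```python
import time

# Statuses after which a processing job will not change state again.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(describe, job_name, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll describe(ProcessingJobName=...) until the job reaches a terminal status."""
    for _ in range(max_polls):
        status = describe(ProcessingJobName=job_name)["ProcessingJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")
```

In a real session you might call `wait_for_job(boto3.client("sagemaker").describe_processing_job, "<your-job-name>")` and proceed to the artifact check only when it returns Completed.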

Concept Review

Key Metrics for Model Evaluation

| Metric | Definition | Best Used For |
| --- | --- | --- |
| Accuracy | $(TP + TN) / Total$ | Balanced datasets. |
| Precision | $TP / (TP + FP)$ | Minimizing False Positives (e.g., Spam detection). |
| Recall | $TP / (TP + FN)$ | Minimizing False Negatives (e.g., Cancer diagnosis). |
| F1 Score | $2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$ | Imbalanced datasets; harmonic mean of Precision and Recall. |
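To make the formulas in the table concrete, here is a small sketch that computes all four metrics from confusion-matrix counts. The function name classification_metrics is illustrative, and returning 0.0 when a denominator is zero is a convention chosen for this example, not something the lab prescribes.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    # Guard each ratio so an empty denominator yields 0.0 instead of an exception.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # An imbalanced example: 10 positives among 1,000 samples.
    print(classification_metrics(tp=1, fp=0, fn=9, tn=990))
```

In the imbalanced example, accuracy is 0.991 while recall is only 0.1 and F1 is about 0.18, which illustrates why accuracy alone is recommended only for balanced datasets.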

Visualizing the ROC Curve

The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR).
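The curve can be traced by hand: sort predictions by score, lower the decision threshold one score at a time, and record the (FPR, TPR) pair after each step. The sketch below does this in plain Python and integrates the curve with the trapezoidal rule to obtain AUC. The function names roc_points and auc are illustrative, and the inputs are made-up scores rather than output from this lab's model.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the threshold from high to low."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A perfect ranking gives an area of 1.0, and a ranking no better than chance gives about 0.5, so AUC summarizes the whole curve in one threshold-independent number.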

Troubleshooting

| Error | Likely Cause | Fix |
| --- | --- | --- |
| AccessDenied | IAM role lacks S3 permissions. | Attach AmazonS3FullAccess to the execution role. |
| ResourceLimitExceeded | Too many active instances. | Check Service Quotas for ml.m5.xlarge processing jobs. |
| InvalidConfig | Syntax error in JSON config. | Use a JSON validator to ensure analysis_config.json is well-formed. |
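For the last row, Python's standard json module can serve as the validator. The sketch below is one way to do it; check_json is an illustrative helper name. It reports where parsing failed, which is usually enough to spot a missing quote or stray comma.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, otherwise a message locating the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


if __name__ == "__main__":
    with open("analysis_config.json", encoding="utf-8") as handle:
        problem = check_json(handle.read())
    print(problem or "analysis_config.json is well-formed")
```

Note that this only checks JSON syntax. A file can be well-formed and still be rejected by Clarify if required fields are missing.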

Stretch Challenge

Scenario: Your model is performing well on average, but you suspect it is underperforming for a specific demographic (e.g., users in a specific postal_code).

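Task: Modify your analysis_config.json to name postal_code as a facet so that Clarify computes bias metrics for that feature. In Clarify, Difference in Proportions of Labels (DPL) is a pre-training metric based on observed labels; its post-training counterpart, based on model predictions, is Difference in Positive Proportions in Predicted Labels (DPPL). A group_variable is only needed for the conditional demographic disparity metrics.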
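To build intuition for what the metric reports, the following sketch computes a difference in proportions by hand: the share of positive outcomes in one facet minus the share in the other. The function names and the toy data are illustrative. Feeding in observed labels gives the DPL-style value, and feeding in model predictions gives the post-training variant.

```python
def positive_proportion(labels):
    """Fraction of entries equal to 1."""
    if not labels:
        raise ValueError("facet has no samples")
    return sum(1 for y in labels if y == 1) / len(labels)


def difference_in_proportions(labels, in_facet_d):
    """q_a - q_d: positive rate outside facet d minus positive rate inside it."""
    facet_a = [y for y, is_d in zip(labels, in_facet_d) if not is_d]
    facet_d = [y for y, is_d in zip(labels, in_facet_d) if is_d]
    return positive_proportion(facet_a) - positive_proportion(facet_d)


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    # True marks rows that belong to the postal_code under scrutiny (facet d).
    in_facet_d = [False, False, False, False, True, True, True, True]
    print(difference_in_proportions(labels, in_facet_d))
```

A value near zero suggests both groups receive positive outcomes at similar rates; a large positive value means the scrutinized group receives them less often.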

Cost Estimate

  • SageMaker Processing: $0.23 per hour (for ml.m5.xlarge in us-east-1).
  • S3 Storage: Negligible for this lab (< $0.01).
  • Total Estimated Cost: < $0.50 (if teardown is completed).
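The estimate is simple arithmetic on the quoted hourly rate, sketched here so you can plug in your own run time. The helper names are illustrative, the $0.23 rate is the figure quoted above for ml.m5.xlarge in us-east-1, and actual billing granularity and current prices should be checked against the AWS pricing page.

```python
def processing_cost(hours, instance_count=1, hourly_rate=0.23):
    """Estimated processing charge in USD for the given run time."""
    return hours * instance_count * hourly_rate


def hours_within_budget(budget, instance_count=1, hourly_rate=0.23):
    """How many hours of processing fit under a USD budget."""
    return budget / (instance_count * hourly_rate)
```

At this rate a 30-minute job comes to about $0.12, and the $0.50 ceiling corresponds to roughly 2.2 instance-hours, so the total holds as long as the job is not left running.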

Clean-Up / Teardown

[!WARNING] Failure to delete S3 objects and processing configurations can lead to small recurring storage costs.

bash
# Delete the analysis results from S3
aws s3 rm "s3://$BUCKET_NAME/analysis_results" --recursive
# Delete the bucket (succeeds only if the bucket is empty)
aws s3 rb "s3://$BUCKET_NAME"

Ensure you stop any SageMaker Studio kernels or Notebook Instances used to trigger these jobs.