Integrating External Models with Amazon SageMaker AI
Methods to integrate models that were built outside SageMaker AI into SageMaker AI
This guide explores the methodologies for bringing machine learning models developed outside the SageMaker ecosystem into the managed AWS environment. It covers the "Bring Your Own Model" (BYOM) workflow, containerization strategies, and deployment options.
Learning Objectives
After studying this guide, you should be able to:
- Identify the core components required to package an external model for SageMaker.
- Differentiate between Pre-built Containers and Custom Containers (BYOC).
- Explain the role of the `model.tar.gz` artifact in the deployment process.
- Utilize the SageMaker Model Registry to manage versions of externally trained models.
- Select the appropriate inference type (Real-time, Batch, Asynchronous, Serverless) for integrated models.
Key Terms & Glossary
- Model Artifact: The serialized state of a trained model (e.g., a `.pth` or `.pkl` file) packaged as a compressed archive.
- Inference Code: The script (often named `inference.py`) that contains the logic to load the model and handle prediction requests.
- BYOC (Bring Your Own Container): The process of building a Docker image with specific dependencies not available in standard SageMaker frameworks.
- SageMaker Model Registry: A central repository to version, track, and manage the approval workflow of models.
- SageMaker Neo: An optimization service that compiles models for specific hardware to reduce latency.
The "Big Idea"
Amazon SageMaker AI is designed as an open platform. While it provides high-performance built-in algorithms, its true power lies in its ability to act as a managed orchestration layer for any model. Whether you trained a model on your local laptop, an on-premises cluster, or in another cloud, SageMaker allows you to wrap that model in a standardized container, assign it managed compute resources, and benefit from enterprise features like autoscaling, monitoring, and security without re-architecting the model itself.
Formula / Concept Box
| Component | Requirement | Description |
|---|---|---|
| Model Artifacts | model.tar.gz | Must contain the trained weights/parameters. |
| Docker Image | Registry Path | A URI for an ECR image (Pre-built or Custom). |
| Inference Script | Entry Point | Python script defining model_fn, input_fn, and predict_fn. |
| Environment | IAM Role | Permissions to access S3 buckets and ECR images. |
Hierarchical Outline
- I. Model Preparation
- Serialization: Saving the model in a format compatible with the target framework (e.g., Pickle, Joblib, TensorFlow SavedModel).
  - Artifact Packaging: Compressing the model files into a single `model.tar.gz` uploaded to Amazon S3.
- II. Containerization Strategies
- Pre-built Containers: SageMaker-maintained images for PyTorch, TensorFlow, Scikit-Learn, and Hugging Face.
- Custom Containers (BYOC): Building Dockerfiles to support specialized libraries or non-standard languages (e.g., R, Julia, C++).
- III. Integration Mechanisms
- Script Mode: Passing custom Python code to a pre-built container at runtime.
- AWS Marketplace: Purchasing and deploying third-party pre-trained model packages.
- Model Registry: Formalizing the external model as a versioned asset within SageMaker.
- IV. Deployment Modes
- Real-time: Persistent endpoints for low-latency needs.
  - Serverless: On-demand scaling with no instance management (cold starts are possible).
- Batch Transform: High-throughput processing for large datasets offline.
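The pre-built framework containers also accept inference code packaged inside the artifact itself. A common layout (illustrative; exact support varies by framework container) places the script and its dependencies in a `code/` subdirectory alongside the weights:

```
model.tar.gz
├── model.joblib          # serialized model weights
└── code/
    ├── inference.py      # entry-point script
    └── requirements.txt  # extra pip dependencies (optional)
```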
Visual Anchors
The BYOM Workflow
Model Package Components
```latex
\begin{tikzpicture}[node distance=2cm]
  \draw[thick, fill=blue!10] (0,0) rectangle (4,3) node[midway, yshift=1.2cm] {\textbf{SageMaker Model Package}};
  \draw[thick, fill=green!10] (0.5,0.5) rectangle (3.5,1.2) node[midway] {Model Artifacts (S3)};
  \draw[thick, fill=orange!10] (0.5,1.4) rectangle (3.5,2.1) node[midway] {Inference Image (ECR)};
  \draw[thick, fill=purple!10] (0.5,2.3) rectangle (3.5,2.8) node[midway] {IAM Execution Role};
  \draw[->, thick] (4.2, 1.5) -- (6, 1.5) node[right] {SageMaker Endpoint};
\end{tikzpicture}
```
Definition-Example Pairs
- Script Mode: A method to use SageMaker's pre-built containers while supplying your own training or inference logic.
  - Example: You have a PyTorch model trained on a local GPU. You use the SageMaker PyTorch container but provide an `inference.py` script to handle a specific JSON input format.
- AWS Marketplace for ML: A digital catalog where third-party vendors sell pre-trained models.
  - Example: Purchasing a specialized OCR model for legal documents from a vendor and deploying it directly to a SageMaker endpoint without writing training code.
- Model Serialization: Converting a data structure or object state into a format that can be stored and reconstructed later.
  - Example: Using `joblib.dump(model, 'model.joblib')` to save a Scikit-Learn classifier before zipping it for S3.
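The serialize-then-package flow above can be sketched with the standard library alone. This is a minimal illustration: `pickle` and a toy dict stand in for `joblib` and a trained estimator, and all file names are made up.

```python
import os
import pickle
import tarfile
import tempfile

# A trivial stand-in "model"; a real workflow would serialize a trained estimator.
model = {"weights": [0.2, 0.5, 0.3]}

workdir = tempfile.mkdtemp()
model_path = os.path.join(workdir, "model.pkl")

# 1. Serialize the model object to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# 2. Package it as model.tar.gz, the archive format SageMaker expects.
archive_path = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(model_path, arcname="model.pkl")

# 3. Verify the round trip: extract and deserialize.
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(os.path.join(workdir, "out"))
with open(os.path.join(workdir, "out", "model.pkl"), "rb") as f:
    restored = pickle.load(f)

print(restored["weights"])
```

In practice the archive would then be uploaded with `aws s3 cp`, and the deserialization step would happen inside the container's `model_fn`.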
Worked Example: Integrating a Scikit-Learn Model
Scenario: You have a random_forest.joblib file trained on your local machine and want to host it on SageMaker for real-time predictions.
- Prepare the Artifact: Create an archive containing the model and upload it to S3.

  ```bash
  tar -cvzf model.tar.gz random_forest.joblib
  aws s3 cp model.tar.gz s3://my-bucket/models/model.tar.gz
  ```

- Write the Inference Script (`inference.py`):

  ```python
  import joblib
  import os

  def model_fn(model_dir):
      # SageMaker extracts model.tar.gz into /opt/ml/model/
      return joblib.load(os.path.join(model_dir, "random_forest.joblib"))

  def predict_fn(input_data, model):
      return model.predict(input_data)
  ```

- Define the SageMaker Model: Use the Python SDK to link the S3 path, the pre-built Scikit-Learn container, and your script.

  ```python
  from sagemaker.sklearn.model import SKLearnModel

  model = SKLearnModel(
      model_data="s3://my-bucket/models/model.tar.gz",
      role="MySageMakerRole",
      entry_point="inference.py",
      framework_version="1.2-1",
  )
  ```

- Deploy:

  ```python
  predictor = model.deploy(instance_type="ml.m5.large", initial_instance_count=1)
  ```
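Because `model_fn` and `predict_fn` are plain Python functions, the inference contract can be smoke-tested locally before paying for an endpoint. The sketch below substitutes `pickle` and a toy model for `joblib` and the random forest; the class and file names are illustrative.

```python
import os
import pickle
import tempfile

class ToyModel:
    """Stand-in for a trained estimator exposing sklearn's predict() interface."""
    def predict(self, rows):
        return [sum(r) for r in rows]

# Mimic what SageMaker does: place the artifact in a "model dir".
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "model.pkl"), "wb") as f:
    pickle.dump(ToyModel(), f)

def model_fn(model_dir):
    # Mirrors the inference.py contract: load the artifact from model_dir.
    with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
        return pickle.load(f)

def predict_fn(input_data, model):
    return model.predict(input_data)

loaded = model_fn(model_dir)
print(predict_fn([[1, 2], [3, 4]], loaded))  # → [3, 7]
```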
Checkpoint Questions
- What is the mandatory file name/format for the model artifacts uploaded to S3?
- In Script Mode, which function in your Python script is responsible for loading the model from disk into memory?
- When should you choose a Custom Container (BYOC) over a SageMaker Pre-built container?
- How does the SageMaker Model Registry help when integrating models from different teams?
> [!TIP]
> Answer to Q2: The `model_fn(model_dir)` function is the entry point used by the SageMaker inference toolkit to load your model.
Muddy Points & Cross-Refs
- Artifact Extraction: Users often get confused about where files go. SageMaker automatically decompresses `model.tar.gz` into `/opt/ml/model/` inside the container. Your code must look for files relative to that path.
- Environment Variables: To pass custom settings to your integrated model, use the `env` parameter in the `Model` object; these become standard Linux environment variables inside the container.
- Dependency Management: If your script needs extra libraries (e.g., `pandas`), include a `requirements.txt` file in the same source directory as your `inference.py`. SageMaker will `pip install` them automatically.
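On the container side, the environment-variable mechanism reduces to ordinary `os.environ` lookups. A small sketch, where the variable name `MODEL_THRESHOLD` is invented for illustration:

```python
import os

# In the SDK, env={"MODEL_THRESHOLD": "0.75"} on the Model object would
# surface inside the container as a normal Linux environment variable.
os.environ["MODEL_THRESHOLD"] = "0.75"  # simulated here for the sketch

# Inference code reads it with a safe default.
threshold = float(os.environ.get("MODEL_THRESHOLD", "0.5"))
print(threshold)  # → 0.75
```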
Comparison Tables
Deployment Strategy Comparison
| Feature | Real-time Endpoint | Asynchronous Inference | Serverless Inference | Batch Transform |
|---|---|---|---|---|
| Latency | Milliseconds | Seconds/Minutes | Milliseconds (Cold start possible) | N/A (Offline) |
| Payload Size | Up to 6 MB | Up to 1 GB | Up to 4 MB | Large files/S3 |
| Best For | User-facing apps | Large images/Large LLM outputs | Spiky/Infrequent traffic | Bulk data processing |
| Cost | Hourly per instance | Hourly per instance | Per request | Per job duration |
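The table's decision logic can be condensed into a small helper. This is an illustrative heuristic based only on the rows above, not an official AWS selection rule:

```python
def choose_deployment(payload_mb: float, needs_low_latency: bool, traffic: str) -> str:
    """Pick a SageMaker deployment mode from the comparison table.

    traffic: "steady", "spiky", or "offline" (illustrative categories).
    """
    if traffic == "offline":
        return "Batch Transform"         # bulk S3 data, no latency requirement
    if payload_mb > 6:
        return "Asynchronous Inference"  # payloads up to 1 GB, queued
    if traffic == "spiky" and payload_mb <= 4:
        return "Serverless Inference"    # per-request billing, cold starts possible
    return "Real-time Endpoint"          # persistent instances, millisecond latency

print(choose_deployment(0.5, True, "steady"))   # → Real-time Endpoint
print(choose_deployment(200, False, "steady"))  # → Asynchronous Inference
```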
Pre-built vs. Custom Containers
| | Pre-built Containers | Custom Containers (BYOC) |
|---|---|---|
| Effort | Low (Managed by AWS) | High (Developer managed) |
| Control | Standard frameworks only | Total control over OS and libraries |
| Updates | Automatic security patches | Manual maintenance required |
| Use Case | TensorFlow, PyTorch, SKLearn | R, C++, custom proprietary libs |