
Configuring SageMaker AI Endpoints within VPC Networks

This study guide covers the critical infrastructure and networking requirements for deploying Amazon SageMaker AI endpoints securely within an Amazon Virtual Private Cloud (VPC). Integrating SageMaker with a VPC ensures that inference traffic and data movement remain within the AWS private network, minimizing exposure to the public internet.


Learning Objectives

By the end of this module, you should be able to:

  • Differentiate between VPC Interface Endpoints and Gateway Endpoints.
  • Configure SageMaker endpoints to reside within private subnets.
  • Implement security measures including Security Groups and Network ACLs for ML infrastructure.
  • Define Autoscaling policies based on specific metrics like SageMakerVariantInvocationsPerInstance.
  • Ensure data protection through encryption at rest and in transit using KMS and ACM.

Key Terms & Glossary

  • VPC Interface Endpoint: Powered by AWS PrivateLink, it provides private connectivity to AWS services using private IP addresses from your VPC subnets.
  • VPC Gateway Endpoint: A gateway specified in your route table that provides private connectivity to S3 and DynamoDB.
  • AWS PrivateLink: A technology that provides highly available, scalable private connectivity between VPCs and AWS services without using public IPs.
  • Security Group: A stateful virtual firewall that controls inbound and outbound traffic at the instance/resource level.
  • Network ACL (NACL): An optional, stateless layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets.

The "Big Idea"

By default, clients reach a SageMaker endpoint through the service's public API endpoint, so requests can travel over the public internet. For enterprise-grade security and compliance (e.g., HIPAA or PCI), machine learning models should be isolated. By configuring SageMaker AI endpoints within the VPC network, you turn the endpoint into a private resource that is reachable only from within your network or over secure VPN/Direct Connect links. This architecture removes the public-internet leg of the journey and significantly reduces the attack surface.
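As a minimal sketch of what a private invocation looks like from the client side: when an interface endpoint for SageMaker Runtime exists in the VPC with private DNS enabled, the regional hostname resolves to private IPs, so the client code itself needs no special settings. The endpoint name, the region, and the `{"instances": [...]}` payload shape below are assumptions for illustration; the payload format depends on the serving container.

```python
import json


def invoke_request(endpoint_name, features):
    """Build keyword arguments for sagemaker-runtime invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        # Payload shape is an assumption; it depends on the serving container.
        "Body": json.dumps({"instances": [features]}).encode("utf-8"),
    }


def invoke(endpoint_name, features, region):
    """Call the endpoint from inside the VPC (requires AWS credentials)."""
    import boto3  # imported here so invoke_request works without the SDK

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(**invoke_request(endpoint_name, features))
    return json.loads(response["Body"].read())
```

From an EC2 instance or Lambda function in the VPC, a call such as `invoke("my-model-endpoint", [1.0, 2.0], "us-east-1")` would then stay on the private network path, provided the interface endpoint and its security group are in place.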


Formula / Concept Box

| Component | Purpose | Core Requirement |
|---|---|---|
| Scaling Target | Defines the boundaries | Min/Max instance count |
| Scaling Policy | Defines the logic | Metric (CPU/Invocations) + Target Value |
| Endpoint Config | Resource Definition | Instance Type + Model Artifact Location |
| KMS Key | Data Protection | Enable encryption for at-rest storage |

[!IMPORTANT] For SageMaker to access S3 buckets privately, you must have either a VPC Gateway Endpoint for S3 or an S3 Interface Endpoint configured in your VPC.
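A gateway endpoint for S3 can be created through the EC2 API. The sketch below builds the arguments for `create_vpc_endpoint`; the VPC ID, route table IDs, and region are placeholders you would replace with your own values.

```python
def s3_gateway_endpoint_params(region, vpc_id, route_table_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (Gateway type)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Gateway endpoints work by adding routes to these route tables.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(region, vpc_id, route_table_ids):
    """Create the endpoint (requires AWS credentials)."""
    import boto3  # imported here so the builder above works without the SDK

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(region, vpc_id, route_table_ids)
    return ec2.create_vpc_endpoint(**params)
```

The route tables passed in should be the ones associated with the private subnets that host the model, so that S3 traffic from those subnets is routed through the gateway.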


Hierarchical Outline

  • VPC Connectivity Patterns
    • Interface Endpoints (PrivateLink): Uses Elastic Network Interfaces (ENIs) with private IPs.
    • Gateway Endpoints: Specific to S3 and DynamoDB; requires route table updates.
  • SageMaker Endpoint Architecture
    • Model Artifacts: Stored in S3, retrieved during deployment.
    • Serving Container: Hosted on ECR; loaded into the inference instance.
    • Endpoint Configuration: Defines instance type (e.g., ml.m5.large) and count.
  • Security and Isolation
    • Subnet Placement: Deploying endpoints in private subnets with no IGW (Internet Gateway) access.
    • Security Groups: Restricting port 443 (HTTPS) to specific CIDR blocks or application security groups.
  • Elasticity and Scaling
    • Metrics: Monitoring CPUUtilization or InvocationsPerInstance.
    • Actions: Scaling out (adding replicas) or scaling in (removing replicas).
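The architecture and isolation items in the outline map onto two SageMaker API calls: `create_model` (which accepts a `VpcConfig` with subnets and security groups) and `create_endpoint_config` (which defines the instance type and count and optionally a KMS key for the attached storage volume). The following is a sketch only; every name, ARN, URI, and ID passed in is a placeholder, and the default instance count of 2 is an illustrative choice.

```python
def model_params(name, image_uri, model_data_url, role_arn,
                 subnet_ids, security_group_ids):
    """Build keyword arguments for sagemaker.create_model with a VpcConfig."""
    return {
        "ModelName": name,
        "PrimaryContainer": {
            "Image": image_uri,              # serving container image in ECR
            "ModelDataUrl": model_data_url,  # model artifact (.tar.gz) in S3
        },
        "ExecutionRoleArn": role_arn,
        "VpcConfig": {
            "Subnets": list(subnet_ids),     # private subnets
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def endpoint_config_params(name, model_name, instance_type="ml.m5.large",
                           instance_count=2, kms_key_id=None):
    """Build keyword arguments for sagemaker.create_endpoint_config."""
    params = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }
    if kms_key_id:
        params["KmsKeyId"] = kms_key_id  # encrypts the instance storage volume
    return params


def deploy(endpoint_name, model_kwargs, config_kwargs, region):
    """Create model, endpoint config, and endpoint (requires AWS credentials)."""
    import boto3  # imported here so the builders above work without the SDK

    sm = boto3.client("sagemaker", region_name=region)
    sm.create_model(**model_kwargs)
    sm.create_endpoint_config(**config_kwargs)
    return sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_kwargs["EndpointConfigName"],
    )
```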

Visual Anchors

[Figure: Inference Traffic Flow in VPC]

Network Isolation Architecture

\begin{tikzpicture}[node distance=2cm, every node/.style={rectangle, draw, rounded corners, minimum width=2.5cm, minimum height=1cm, align=center}]
  % VPC boundary
  \draw[dashed, blue, thick] (-1,-3) rectangle (8,3);
  \node at (3.5, 2.7) {\textbf{Virtual Private Cloud (VPC)}};
  % Nodes
  \node (App) at (1, 1) {Client Application\\(EC2/Lambda)};
  \node (SG) at (4, 1) {Security Group\\(Firewall)};
  \node (EP) at (7, 1) {SageMaker\\Endpoint ENI};
  \node (S3) at (3.5, -2) {S3 Gateway\\Endpoint};
  % Flows
  \draw[->, thick] (App) -- (SG);
  \draw[->, thick] (SG) -- (EP);
  \draw[->, thick] (EP) |- (S3);
  \node[draw=none, fill=none, text width=3cm] at (7, -1) {\small Private traffic only};
\end{tikzpicture}


Definition-Example Pairs

  • Scaling Target: The definition of the resource and its capacity limits.
    • Example: Setting a SageMaker real-time endpoint to have a minimum of 2 instances for high availability and a maximum of 10 for peak loads.
  • Stateless Firewall (NACL): A security layer that does not remember the state of requests.
    • Example: Explicitly blocking a specific range of malicious IP addresses from reaching the entire subnet where your ML models are hosted.
  • Model Artifact: The output files from a training job used for inference.
    • Example: A .tar.gz file in S3 containing the weights and parameters of an XGBoost model.
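To make the model artifact example concrete, the sketch below packages a trained model file into a gzip-compressed tar archive using only the Python standard library. The file name is hypothetical, and the exact layout a container expects inside the archive varies by framework.

```python
import tarfile
from pathlib import Path


def package_model(model_file, archive_path="model.tar.gz"):
    """Pack a single model file into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store the file at the archive root, without its local directory path.
        tar.add(model_file, arcname=Path(model_file).name)
    return Path(archive_path)
```

The resulting archive would then be uploaded to S3, and its S3 URI supplied as the model artifact location when the model is created.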

Worked Examples

Example 1: Configuring an Interface Endpoint for SageMaker Runtime

To allow an EC2 instance in a private subnet to call InvokeEndpoint without internet access:

  1. Navigate to VPC Dashboard > Endpoints.
  2. Create Endpoint: Select "AWS Services".
  3. Service Name: Search for com.amazonaws.<region>.sagemaker.runtime.
  4. VPC & Subnets: Select the VPC and the private subnets where your app resides.
  5. Security Group: Select an SG that allows inbound traffic on port 443 from your application server.
  6. Policy: Use Full Access or attach a custom VPC endpoint policy to restrict which principals and actions can use this endpoint.
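The console steps above can also be expressed with the EC2 API. This sketch builds the arguments for an Interface-type `create_vpc_endpoint` call and attaches a custom endpoint policy that only permits `sagemaker:InvokeEndpoint` on one endpoint. All IDs and the ARN are placeholders, and the policy is one possible restriction rather than a recommended baseline.

```python
import json


def invoke_only_policy(endpoint_arn):
    """Endpoint policy allowing only InvokeEndpoint on a single endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": endpoint_arn,
        }],
    }


def runtime_interface_endpoint_params(region, vpc_id, subnet_ids,
                                      security_group_ids, endpoint_arn):
    """Build keyword arguments for ec2.create_vpc_endpoint (Interface type)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sagemaker.runtime",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Lets the regional SageMaker Runtime hostname resolve to private IPs.
        "PrivateDnsEnabled": True,
        # The API expects the policy as a JSON string.
        "PolicyDocument": json.dumps(invoke_only_policy(endpoint_arn)),
    }
```

Passing the result to `boto3.client("ec2").create_vpc_endpoint(**params)` would create the endpoint; that call needs AWS credentials and is left out of the sketch.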

Example 2: Setting up Boto3 Autoscaling

python
import boto3

client = boto3.client('application-autoscaling')

# Register the endpoint as a scalable target
client.register_scalable_target(
    ServiceNamespace='sagemaker',
    ResourceId='endpoint/my-model-endpoint/variant/AllTraffic',
    ScalableDimension='sagemaker:variant:DesiredInstanceCount',
    MinCapacity=2,
    MaxCapacity=10
)
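Registering a scalable target only sets the boundaries; a scaling policy supplies the logic. The sketch below builds a target-tracking policy on the predefined `SageMakerVariantInvocationsPerInstance` metric for the same endpoint and variant. The target value of 70 invocations per instance and the cooldown periods are illustrative assumptions, not recommended settings.

```python
def target_tracking_policy_params(endpoint_name, variant_name="AllTraffic",
                                  target_invocations=70.0,
                                  scale_in_cooldown=300, scale_out_cooldown=60):
    """Build keyword arguments for application-autoscaling put_scaling_policy."""
    return {
        "PolicyName": f"{endpoint_name}-invocations-target",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Average invocations per instance to aim for (assumed value).
            "TargetValue": target_invocations,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": scale_in_cooldown,    # seconds
            "ScaleOutCooldown": scale_out_cooldown,  # seconds
        },
    }


def attach_policy(endpoint_name, region):
    """Attach the policy (requires AWS credentials and a registered target)."""
    import boto3  # imported here so the builder above works without the SDK

    client = boto3.client("application-autoscaling", region_name=region)
    return client.put_scaling_policy(**target_tracking_policy_params(endpoint_name))
```

The `ResourceId` and `ScalableDimension` must match the values used when the scalable target was registered, otherwise the policy cannot be attached.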

Checkpoint Questions

  1. Which AWS service is used to create private connections to SageMaker AI without traversing the public internet?
  2. What is the difference between a scaling target and a scaling policy in SageMaker autoscaling?
  3. Why is it recommended to use a VPC Gateway Endpoint for S3 when working with SageMaker endpoints?
  4. Is a Security Group stateful or stateless?
Answers
  1. AWS PrivateLink (via VPC Interface Endpoints).
  2. A scaling target defines the min/max boundaries; a scaling policy defines the metric and logic for when to trigger a change.
  3. It routes traffic to S3 (where model artifacts are stored) privately and at no additional charge, without requiring a NAT gateway or public IP addresses.
  4. Stateful.

Muddy Points & Cross-Refs

  • Interface vs. Gateway: Interface endpoints (PrivateLink) cost money per hour + per GB. Gateway endpoints (S3/DynamoDB) are free. Always check if a Gateway option exists first.
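  • Inbound vs. Outbound: When a SageMaker endpoint is in a VPC, the Security Group must allow inbound HTTPS (port 443) traffic from your clients. Separately, the model's IAM execution role needs permissions to read model artifacts from S3 and pull the container image from ECR. IAM permissions are not directional; the outbound network path to S3 and ECR is governed by the Security Group's egress rules and the VPC endpoints.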
  • Cross-Ref: See "Chapter 7: Monitoring" for details on using CloudWatch to track the metrics used in these scaling policies.
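The inbound rule from the "Inbound vs. Outbound" point can be sketched with the EC2 API. The helper below builds the arguments for `authorize_security_group_ingress`, allowing HTTPS only from a client security group instead of a CIDR block. Both security group IDs are placeholders.

```python
def https_ingress_from_group(endpoint_sg_id, client_sg_id):
    """Build keyword arguments for ec2.authorize_security_group_ingress."""
    return {
        "GroupId": endpoint_sg_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            # Reference the clients' security group rather than an IP range.
            "UserIdGroupPairs": [{
                "GroupId": client_sg_id,
                "Description": "HTTPS from application servers",
            }],
        }],
    }


def allow_clients(endpoint_sg_id, client_sg_id, region):
    """Apply the rule (requires AWS credentials)."""
    import boto3  # imported here so the builder above works without the SDK

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.authorize_security_group_ingress(
        **https_ingress_from_group(endpoint_sg_id, client_sg_id)
    )
```

Because security groups are stateful, no matching outbound rule is needed for the responses to these requests.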

Comparison Tables

Security Groups vs. Network ACLs

| Feature | Security Group (SG) | Network ACL (NACL) |
|---|---|---|
| Level | Instance/Resource Level | Subnet Level |
| State | Stateful (Returns allowed) | Stateless (Returns must be explicit) |
| Rules | Allow rules only | Allow and Deny rules |
| Evaluation | All rules evaluated | Rules evaluated in order (lowest # first) |
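The "Rules" and "Evaluation" rows can be illustrated with a small local simulation. This is a simplified model, not AWS code: it matches on source IP and a single port only, and it does not model statefulness. The rule numbers and IP ranges are made up for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class NaclRule:
    number: int   # lower numbers are evaluated first
    cidr: str
    port: int
    allow: bool


def nacl_allows(rules, source_ip, port):
    """First matching rule wins; anything unmatched is denied."""
    for rule in sorted(rules, key=lambda r: r.number):
        if ip_address(source_ip) in ip_network(rule.cidr) and port == rule.port:
            return rule.allow
    return False  # implicit deny when no rule matches


def security_group_allows(allow_rules, source_ip, port):
    """Allow-only rules; traffic passes if any rule matches."""
    return any(
        ip_address(source_ip) in ip_network(cidr) and port == rule_port
        for cidr, rule_port in allow_rules
    )


# A NACL can block one range and allow the rest, which an SG cannot express.
nacl = [
    NaclRule(100, "203.0.113.0/24", 443, allow=False),
    NaclRule(200, "0.0.0.0/0", 443, allow=True),
]
sg = [("10.0.0.0/16", 443)]
```

With these rules, `nacl_allows(nacl, "203.0.113.7", 443)` is denied by rule 100 before rule 200 is reached, while the security group simply has no way to write a deny and can only omit an allow.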

Interface vs. Gateway Endpoints

| Feature | Interface Endpoint | Gateway Endpoint |
|---|---|---|
| Mechanism | Elastic Network Interface (ENI) | Route Table Entry |
| Cost | Hourly charge + Data processed | Free |
| Services | Most AWS Services (including SageMaker) | S3 and DynamoDB only |
| Access | Private IP Address | Public IP prefix list via Gateway |
