
Secure ML Infrastructure: VPCs, Subnets, and Security Groups

Building VPCs, subnets, and security groups to securely isolate ML systems

Learning Objectives

After studying this guide, you should be able to:

  • Define the role of a Virtual Private Cloud (VPC) in isolating Machine Learning (ML) workflows.
  • Differentiate between Security Groups (stateful) and Network ACLs (stateless).
  • Explain how VPC Endpoints (Interface and Gateway) enable private access to AWS services like S3 and SageMaker.
  • Design a multi-tier network architecture to protect sensitive training data and model endpoints.

Key Terms & Glossary

  • VPC (Virtual Private Cloud): A logically isolated section of the AWS Cloud where you launch resources in a virtual network you define.
  • Subnet: A range of IP addresses in your VPC. ML resources are typically placed in private subnets to prevent direct internet access.
  • Security Group (SG): A virtual firewall for your instance (e.g., SageMaker Notebook) that controls inbound and outbound traffic.
  • Network Access Control List (NACL): An optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets.
  • AWS PrivateLink: Technology that provides private connectivity between VPCs, AWS services, and on-premises applications, securely on the Amazon network.

The "Big Idea"

Think of your ML infrastructure as a high-security research facility. The VPC is the outer perimeter fence. Subnets are different rooms or wings (some public for visitors, some private for researchers). NACLs are the security guards at the wing doors, checking everyone against a list. Security Groups are the electronic locks on individual office doors. To move data (the research) out safely without going through the public street (the Internet), you use a dedicated underground tunnel (VPC Endpoints).

Formula / Concept Box

Feature | Security Group (SG) | Network ACL (NACL)
Scope | Instance/resource level | Subnet level
State | Stateful: return traffic is automatically allowed. | Stateless: return traffic must be explicitly allowed.
Rules | Supports "Allow" rules only. | Supports "Allow" and "Deny" rules.
Evaluation | All rules are evaluated before traffic is allowed. | Rules are processed in number order (lowest first).
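
The "Evaluation" row is easier to remember with a small simulation. The following is a minimal sketch in plain Python, not AWS code: the `Rule` class, the two functions, and the sample IP ranges are hypothetical names chosen for illustration, and each rule matches a single port for simplicity.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    number: int   # NACL rule number; unused for security groups
    cidr: str     # source range the rule applies to
    port: int     # single destination port, for simplicity
    allow: bool   # security groups only ever use allow=True


def _matches(rule: Rule, source_ip: str, port: int) -> bool:
    return rule.port == port and ip_address(source_ip) in ip_network(rule.cidr)


def nacl_allows(rules: list[Rule], source_ip: str, port: int) -> bool:
    """NACL style: the lowest-numbered matching rule decides; no match is a deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if _matches(rule, source_ip, port):
            return rule.allow
    return False


def security_group_allows(rules: list[Rule], source_ip: str, port: int) -> bool:
    """SG style: any matching allow rule permits the traffic; there are no denies."""
    return any(r.allow and _matches(r, source_ip, port) for r in rules)


# Hypothetical rule sets (documentation IP ranges).
nacl_rules = [
    Rule(100, "203.0.113.0/24", 443, allow=False),  # deny one range first
    Rule(200, "0.0.0.0/0", 443, allow=True),        # then allow everyone else
]
sg_rules = [Rule(0, "0.0.0.0/0", 443, allow=True)]

print(nacl_allows(nacl_rules, "203.0.113.7", 443))           # denied by rule 100
print(nacl_allows(nacl_rules, "198.51.100.7", 443))          # allowed by rule 200
print(security_group_allows(sg_rules, "203.0.113.7", 443))   # allowed: SGs cannot deny
```

Because a Security Group has no "Deny" rules, blocking a specific IP range while allowing everything else is a job for the NACL, which is why the rule number order matters there.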

Hierarchical Outline

  • Network Isolation (The Foundation)
    • VPC CIDR Blocks: Defining the IP address range (e.g., 10.0.0.0/16).
    • Subnet Segmentation: Dividing the VPC into Public (route table has a route to an Internet Gateway, IGW) and Private (no route to an IGW) subnets.
  • Layered Defense (Firewalls)
    • NACLs: First line of defense at the subnet boundary; used for broad IP blocking.
    • Security Groups: Granular control; can reference other SGs (e.g., "Allow SageMaker to talk to RDS").
  • Private Connectivity (The Tunnels)
    • Interface Endpoints: Powered by PrivateLink; places a network interface with a private IP in your subnet as the entry point to the service (e.g., SageMaker API).
    • Gateway Endpoints: Specific to S3 and DynamoDB; uses route table entries instead of private IPs.
  • Shared Responsibility
    • AWS: Security of the cloud (physical, global infrastructure).
    • Customer: Security in the cloud (VPC config, IAM, data encryption).
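
To make the "VPC CIDR Blocks" and "Subnet Segmentation" items in the outline concrete, here is a small sketch using Python's standard `ipaddress` module. The choice of /24 subnets and the variable names are assumptions for illustration; the subtraction of five addresses reflects the addresses AWS reserves in every subnet (the first four and the last).

```python
from ipaddress import ip_network

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one

vpc = ip_network("10.0.0.0/16")

# Carve the VPC range into /24 blocks and pick two of them.
blocks = list(vpc.subnets(new_prefix=24))
public_subnet = blocks[0]    # would get a route to an Internet Gateway
private_subnet = blocks[1]   # would have no such route; ML resources go here


def usable_addresses(subnet) -> int:
    """Addresses left for your resources after AWS's reserved ones."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


print(public_subnet, private_subnet)      # 10.0.0.0/24 10.0.1.0/24
print(usable_addresses(private_subnet))   # 251
```

Whether a subnet is public or private is decided by its route table, not by its address range, so the arithmetic above only covers the sizing part of the design.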

Visual Anchors

Traffic Flow Architecture

[Diagram: traffic flow architecture; not rendered in this text version.]

Security Group vs. NACL Scope

\begin{tikzpicture}[node distance=2cm, every node/.style={rectangle, draw, rounded corners, minimum width=3cm, minimum height=1cm, align=center}]
  % Draw the layers
  \draw[dashed, blue, thick] (-4,-3) rectangle (4,3);
  \node[text=blue] at (0, 2.7) {VPC};
  \draw[thick, black] (-3.5,-2.5) rectangle (3.5,2);
  \node at (0, 1.7) {Subnet};
  \node (nacl) [fill=orange!20] at (0, 0.8) {NACL (Stateless Filter)};
  \draw[thick, red] (-2,-2) rectangle (2,0);
  \node at (0, -0.3) {EC2 / SageMaker Instance};
  \node (sg) [fill=green!20] at (0, -1.2) {Security Group (Stateful)};
  % Connectors
  \draw[<->] (nacl) -- (sg);
\end{tikzpicture}

Definition-Example Pairs

  • Stateful Firewall: A firewall that remembers the state of active connections.
    • Example: If you send a request from a SageMaker Notebook (outbound) to an API, the response (inbound) is automatically allowed back through the Security Group without a specific inbound rule.
  • Stateless Firewall: A firewall that treats every packet in isolation.
    • Example: In a NACL, if you allow Port 80 inbound, you must also create an outbound rule for the ephemeral port range (usually 1024-65535) for the response to reach the user.
  • VPC Interface Endpoint: An Elastic Network Interface (ENI) with a private IP address from your subnet's IP range.
    • Example: Accessing the SageMaker Runtime API from within a private VPC without the traffic ever touching the public internet.
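
The stateful/stateless pair above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not how AWS implements either firewall: both classes and their method names are hypothetical, and the scenario follows the notebook example (an outbound request on port 443 whose reply arrives on an ephemeral local port).

```python
EPHEMERAL_PORTS = range(1024, 65536)  # the range quoted in the example above


class StatefulFirewall:
    """Security-group-like: replies to connections it saw leave are allowed back."""

    def __init__(self, allowed_outbound_ports):
        self.allowed_outbound_ports = set(allowed_outbound_ports)
        self._connections = set()  # remembered (local_port, remote_ip, remote_port)

    def send(self, local_port, remote_ip, remote_port) -> bool:
        if remote_port not in self.allowed_outbound_ports:
            return False
        self._connections.add((local_port, remote_ip, remote_port))
        return True

    def receive(self, local_port, remote_ip, remote_port) -> bool:
        # No inbound rules at all: only tracked replies get in.
        return (local_port, remote_ip, remote_port) in self._connections


class StatelessFirewall:
    """NACL-like: every packet is judged on its own against the rule lists."""

    def __init__(self, allowed_outbound_ports, allowed_inbound_ports):
        self.allowed_outbound_ports = allowed_outbound_ports
        self.allowed_inbound_ports = allowed_inbound_ports

    def send(self, local_port, remote_ip, remote_port) -> bool:
        return remote_port in self.allowed_outbound_ports

    def receive(self, local_port, remote_ip, remote_port) -> bool:
        return local_port in self.allowed_inbound_ports


api = ("192.0.2.10", 443)  # hypothetical API address

sg = StatefulFirewall({443})
sg.send(50000, *api)
print(sg.receive(50000, *api))           # True: reply to a tracked connection

strict_nacl = StatelessFirewall({443}, set())
strict_nacl.send(50000, *api)
print(strict_nacl.receive(50000, *api))  # False: no inbound rule covers the reply

open_nacl = StatelessFirewall({443}, EPHEMERAL_PORTS)
print(open_nacl.receive(50000, *api))    # True: ephemeral range explicitly allowed
```

The NACL example in the text describes the opposite direction (inbound port 80, outbound ephemeral ports), but the principle is the same: a stateless filter needs a rule for each direction of the conversation.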

Worked Examples

Scenario: Securing a SageMaker Training Job

Problem: You need to run a SageMaker Training job that pulls data from S3. The company policy forbids any traffic from traversing the public internet.

Solution Steps:

  1. VPC Setup: Create a VPC with a private subnet (one whose route table has no route to an Internet Gateway).
  2. S3 Access: Create a VPC Gateway Endpoint for S3 and associate it with the private subnet's route table, so the table contains a route sending S3 traffic (the S3 prefix list) to the endpoint ID (vpce-xxxx).
  3. SageMaker Config: When launching the training job, specify SecurityGroupIds and Subnets (the private subnet IDs) in the VpcConfig parameter.
  4. Security Group Rules:
    • Inbound: None (unless specific debugging is needed).
    • Outbound: Allow HTTPS (Port 443) to the S3 Gateway Endpoint prefix list.
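
Steps 3 and 4 can be sketched as the request fragments you would hand to the AWS SDK. The code below only builds dictionaries and makes no AWS calls; the helper names and all resource IDs are hypothetical placeholders. The key names follow the `VpcConfig` structure of SageMaker's `CreateTrainingJob` and the `IpPermissions` structure of EC2's `AuthorizeSecurityGroupEgress`, as far as this guide's scenario needs them.

```python
def training_vpc_config(subnet_ids, security_group_ids):
    """VpcConfig fragment for a SageMaker training job request."""
    return {
        "Subnets": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def s3_https_egress(s3_prefix_list_id):
    """One outbound permission: HTTPS to the S3 prefix list only."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "PrefixListIds": [
            {
                "PrefixListId": s3_prefix_list_id,
                "Description": "HTTPS to S3 via the gateway endpoint",
            }
        ],
    }


# Placeholder IDs; substitute the real ones from your account.
vpc_config = training_vpc_config(
    subnet_ids=["subnet-aaaa1111"],
    security_group_ids=["sg-bbbb2222"],
)
egress_rule = s3_https_egress("pl-cccc3333")

print(vpc_config)
print(egress_rule)
```

With boto3, `vpc_config` would be passed as the `VpcConfig` argument of `create_training_job`, and `egress_rule` as an element of `IpPermissions` in `authorize_security_group_egress`. Note that a newly created security group typically starts with an allow-all outbound rule, which would need to be removed for the restriction above to have any effect.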

Checkpoint Questions

  1. Which AWS resource acts as a stateful firewall at the instance level?
  2. True or False: A single subnet can be associated with multiple NACLs simultaneously.
  3. Which two AWS services use Gateway Endpoints instead of Interface Endpoints?
  4. Why is a private subnet preferred for training ML models containing PII (Personally Identifiable Information)?

Muddy Points & Cross-Refs

  • Stateful vs. Stateless: This is the most common point of confusion. Remember: Security Groups are Smart (Stateful) — they remember your connection. NACLs are Not (Stateless) — they forget immediately.
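  • Endpoint Types: If you see "S3" or "DynamoDB," think Gateway. For almost everything else (SageMaker, EC2 API, Kinesis), think Interface/PrivateLink. (S3 can also be reached through an Interface Endpoint, but the Gateway Endpoint is the usual, no-cost choice from inside a VPC.)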
  • Cross-Ref: For more on managing the keys used to encrypt this data, see the AWS KMS (Key Management Service) study guide.

Comparison Tables

Feature | Interface Endpoint | Gateway Endpoint
Services Supported | Most AWS services (SageMaker, etc.) | Only S3 and DynamoDB
Cost | Hourly charge + data processing charge | Free
Implementation | Uses an ENI with a private IP | Uses a route table entry
Technology | Powered by AWS PrivateLink | VPC routing mechanism
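
The "Implementation" row shows up directly in what each endpoint type asks for at creation time. The sketch below builds the parameter dictionary for EC2's `CreateVpcEndpoint` without calling AWS; the function name, the `GATEWAY_SERVICES` set, and all IDs are hypothetical, and the rule "S3 and DynamoDB get a gateway" simply mirrors the table rather than covering every option AWS offers.

```python
GATEWAY_SERVICES = {"s3", "dynamodb"}  # per the table above


def endpoint_request(service, region, vpc_id, *,
                     route_table_ids=(), subnet_ids=(), security_group_ids=()):
    """Build CreateVpcEndpoint parameters for a service such as 's3' or 'sagemaker.api'."""
    params = {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
    }
    if service in GATEWAY_SERVICES:
        # Gateway endpoints are wired in through route tables.
        params["VpcEndpointType"] = "Gateway"
        params["RouteTableIds"] = list(route_table_ids)
    else:
        # Interface endpoints place ENIs in subnets, guarded by security groups.
        params["VpcEndpointType"] = "Interface"
        params["SubnetIds"] = list(subnet_ids)
        params["SecurityGroupIds"] = list(security_group_ids)
        params["PrivateDnsEnabled"] = True
    return params


s3_endpoint = endpoint_request(
    "s3", "us-east-1", "vpc-aaaa1111", route_table_ids=["rtb-bbbb2222"]
)
runtime_endpoint = endpoint_request(
    "sagemaker.runtime", "us-east-1", "vpc-aaaa1111",
    subnet_ids=["subnet-cccc3333"], security_group_ids=["sg-dddd4444"],
)

print(s3_endpoint["VpcEndpointType"], s3_endpoint["ServiceName"])
print(runtime_endpoint["VpcEndpointType"], runtime_endpoint["ServiceName"])
```

With boto3, either dictionary could be unpacked into `create_vpc_endpoint(**params)` on an EC2 client.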
