Automating Compute Provisioning: AWS CloudFormation and AWS CDK
Automating the provisioning of compute resources, including communication between stacks (for example, by using CloudFormation, AWS CDK)
This guide covers the automation of cloud infrastructure, a critical skill for the AWS Certified Machine Learning Engineer Associate (MLA-C01) exam. It focuses on using Infrastructure as Code (IaC) to provision compute resources and managing the communication between disparate resource stacks.
Learning Objectives
After studying this guide, you should be able to:
- Define Infrastructure as Code (IaC) and its benefits for ML workflows.
- Compare and contrast AWS CloudFormation and the AWS Cloud Development Kit (CDK).
- Explain the hierarchy of CDK Constructs (L1, L2, L3).
- Describe how to implement inter-stack communication using cross-stack references.
- Identify the steps in the CDK deployment lifecycle (Synthesis, Diff, Deployment).
Key Terms & Glossary
- Infrastructure as Code (IaC): The practice of managing and provisioning computing infrastructure through machine-readable definition files rather than physical hardware configuration or interactive configuration tools.
- Stack: A unit of deployment in CloudFormation; a collection of AWS resources that can be managed as a single unit.
- Construct: The basic building block of AWS CDK apps, representing one or more AWS resources.
- Synthesis (Synth): The process of executing CDK code to produce a CloudFormation template.
- Cross-Stack Reference: A CloudFormation mechanism that exports an output value from one stack so that another stack in the same account and Region can import it.
- Change Set: A preview of changes CloudFormation will make to your stack before you apply them.
The "Big Idea"
In modern Machine Learning, reproducibility isn't just about your code or data—it's about the environment. By treating infrastructure as code, you ensure that the complex clusters, GPU instances, and networking required for training models are identical across development, staging, and production. This eliminates the "it worked on my machine" problem and allows for automated scaling and disaster recovery.
Formula / Concept Box
| Process / Action | Tool/Command | Description |
|---|---|---|
| Preview Changes | cdk diff / CFN Change Sets | Compares the proposed code against the currently deployed state. |
| Generate Template | cdk synth | Converts high-level code (Python/TS) into a CloudFormation YAML/JSON template. |
| Deploy Resources | cdk deploy | Provisions the resources into your AWS account. |
| Inter-stack Linking | Fn::ImportValue | The CloudFormation function used to consume an exported value from another stack. |
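As a rough illustration of the "Preview Changes" row, the sketch below compares the `Resources` sections of two templates and reports which logical IDs would be added, removed, or modified. It is a toy model of the idea, not how `cdk diff` or Change Sets are implemented; the function name `preview_changes` and both sample templates are hypothetical.

```python
def preview_changes(deployed: dict, proposed: dict) -> dict:
    """Classify logical IDs by comparing the Resources sections of two templates."""
    old = deployed.get("Resources", {})
    new = proposed.get("Resources", {})
    return {
        "add": sorted(new.keys() - old.keys()),
        "remove": sorted(old.keys() - new.keys()),
        "modify": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical "currently deployed" template.
deployed = {
    "Resources": {
        "ModelArtifactBucket": {"Type": "AWS::S3::Bucket"},
        "LegacyTopic": {"Type": "AWS::SNS::Topic"},
    }
}

# Hypothetical "proposed" template: versioning added, one resource swapped out.
proposed = {
    "Resources": {
        "ModelArtifactBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        },
        "TrainingLogs": {"Type": "AWS::Logs::LogGroup"},
    }
}

changes = preview_changes(deployed, proposed)
print(changes)
```

A real Change Set reports more than this sketch does, notably whether a modification requires the resource to be replaced, which is the detail that matters most when the resource holds data.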
Hierarchical Outline
- Infrastructure as Code (IaC) Fundamentals
  - Declarative (CloudFormation): Defining what the end state should look like.
  - Imperative/Programmatic (CDK): Defining how to build it using logic (loops, conditions).
- AWS CloudFormation
  - Templates: Written in YAML or JSON.
  - Management: Handles rollbacks if a deployment fails.
  - Portability: Templates can be reused across regions and accounts.
- AWS Cloud Development Kit (CDK)
  - Supported Languages: Python, TypeScript, Java, C#, Go.
  - Construct Levels:
    - L1 (Cfn Resources): Direct mapping to CloudFormation resources.
    - L2 (Curated): Includes sensible defaults and best-practice security settings.
    - L3 (Patterns): High-level abstractions for common architectures (e.g., Load Balanced Fargate Service).
- Inter-Stack Communication
  - Exports: Defining an output in a template with an `Export` name.
  - Imports: Using the `ImportValue` function in a separate stack to link resources (e.g., using a VPC defined in a Network Stack for a SageMaker endpoint in an ML Stack).
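Because an importing stack can only be created once the stack that exports the value exists, inter-stack links imply a deployment order. The sketch below derives that order with the standard-library `graphlib` module and shows that a circular pair of stacks has no valid order. The helper name `deployment_order` and the stack names are hypothetical; this is only an illustration of the ordering constraint, not something CloudFormation or CDK exposes.

```python
from graphlib import CycleError, TopologicalSorter


def deployment_order(imports: dict) -> list:
    """Return an order in which stacks can be deployed.

    `imports` maps each stack name to the set of stacks it imports values from.
    """
    return list(TopologicalSorter(imports).static_order())


# The ML stack imports from the network stack, so the network stack goes first.
stacks = {
    "NetworkStack": set(),
    "MlStack": {"NetworkStack"},
}
order = deployment_order(stacks)
print(order)

# Two stacks that import from each other cannot be ordered at all.
circular = {"StackA": {"StackB"}, "StackB": {"StackA"}}
try:
    deployment_order(circular)
    has_cycle = False
except CycleError:
    has_cycle = True
print("circular dependency detected:", has_cycle)
```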
Visual Anchors
CDK Development Workflow
Cross-Stack Reference Architecture
\begin{tikzpicture}[node distance=2cm, every node/.style={rectangle, draw, minimum width=3cm, minimum height=1cm, align=center}]
  \node (StackA) [fill=blue!10] {\textbf{Network Stack} \\ (VPC, Subnets)};
  \node (Export) [below of=StackA, node distance=1.5cm, draw=none] {\textit{Export: VPC-ID}};
  \node (StackB) [right of=StackA, xshift=4cm, fill=green!10] {\textbf{ML Stack} \\ (SageMaker Endpoint)};
  \draw[->, thick] (StackA) -- (StackB) node[midway, above] {ImportValue};
\end{tikzpicture}
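The architecture in the diagram can be sketched as two CloudFormation templates, built here as Python dictionaries and printed as JSON. The network template exports the VPC ID and the ML template consumes it with `Fn::ImportValue`. The export name `NetworkStack-VpcId`, the CIDR block, and the security group standing in for the ML stack's networking are assumptions chosen for illustration.

```python
import json

EXPORT_NAME = "NetworkStack-VpcId"  # hypothetical export name

# Network stack: creates the VPC and exports its ID.
network_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        }
    },
    "Outputs": {
        "VpcId": {
            "Value": {"Ref": "Vpc"},
            "Export": {"Name": EXPORT_NAME},
        }
    },
}

# ML stack: imports the VPC ID by export name instead of hard-coding it.
ml_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "EndpointSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Security group for the ML endpoint",
                "VpcId": {"Fn::ImportValue": EXPORT_NAME},
            },
        }
    },
}

print(json.dumps(network_template, indent=2))
print(json.dumps(ml_template, indent=2))
```

While the ML stack imports the value, CloudFormation will not let the network stack delete or change that export, which is worth keeping in mind when deciding what to export.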
Definition-Example Pairs
- Concept: Change Set
- Definition: A summary of proposed changes to a CloudFormation stack.
- Example: Before updating a production SageMaker endpoint, you generate a Change Set to ensure the update won't accidentally delete and recreate the underlying S3 bucket containing model artifacts.
- Concept: L2 Construct
- Definition: A higher-level abstraction that supplies sensible defaults and generates the boilerplate for you.
- Example: Instead of defining an S3 bucket, a bucket policy, and encryption settings individually, you use the CDK `s3.Bucket` construct, which exposes encryption, versioning, and public-access blocking as single properties and fills in defaults for the rest.
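To make the L1 versus L2 distinction concrete, here is a toy model in plain Python. It is not the CDK API: `raw_bucket` and `secure_bucket` are hypothetical helpers, and the defaults chosen (SSE-S3 encryption, all public access blocked) are this sketch's own assumptions rather than a statement of what `s3.Bucket` applies.

```python
def raw_bucket(properties: dict) -> dict:
    """L1-style: a one-to-one wrapper; the caller supplies every property."""
    return {"Type": "AWS::S3::Bucket", "Properties": properties}


def secure_bucket(versioned: bool = False) -> dict:
    """L2-style: a small intent-based interface that fills in secure defaults."""
    properties = {
        "BucketEncryption": {
            "ServerSideEncryptionConfiguration": [
                {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "BlockPublicPolicy": True,
            "IgnorePublicAcls": True,
            "RestrictPublicBuckets": True,
        },
    }
    if versioned:
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return raw_bucket(properties)


# One argument expresses the intent; the helper expands it into full properties.
bucket = secure_bucket(versioned=True)
print(sorted(bucket["Properties"]))
```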
Worked Examples
Example 1: Declarative CloudFormation (YAML)
This snippet creates a simple S3 bucket for model storage.

    Resources:
      ModelArtifactBucket:
        Type: AWS::S3::Bucket
        Properties:
          BucketName: !Sub "ml-models-${AWS::AccountId}"
          VersioningConfiguration:
            Status: Enabled

Example 2: Programmatic CDK (Python)
The same bucket defined in CDK allows for easier integration with application logic.
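
    # CDK v2 import style.
    from aws_cdk import RemovalPolicy, Stack, aws_s3 as s3
    from constructs import Construct

    class MlStack(Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            # Versioned bucket for model artifacts.
            s3.Bucket(self, "ModelArtifactBucket",
                versioned=True,
                # DESTROY removes the bucket with the stack; prefer RETAIN in production.
                removal_policy=RemovalPolicy.DESTROY,
            )

Checkpoint Questions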
- What is the primary difference between a declarative and an imperative approach to IaC?
- Which CDK command is used to generate the CloudFormation template from your code?
- What happens if one resource in a CloudFormation stack fails to provision during an update?
- Why might you use Cross-Stack References instead of putting all resources in one giant stack?
Muddy Points & Cross-Refs
- CDK vs. CloudFormation: New users often think CDK replaces CloudFormation. It does not; CDK uses CloudFormation as its engine. You still need to understand CloudFormation error messages to debug failed CDK deployments.
- Circular Dependencies: A common "muddy point" in cross-stack communication. If Stack A depends on Stack B, and Stack B depends on Stack A, CloudFormation will fail. Use a shared common stack or parameters to resolve this.
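- Resource Retention: Deleting a stack does not always delete every resource. For example, CloudFormation cannot delete an S3 bucket that still contains objects, so the stack deletion fails until the bucket is emptied or retained. Use `RemovalPolicy` in CDK or `DeletionPolicy` in CloudFormation to control what happens to a resource when its stack is deleted.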
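For the resource-retention point, a minimal sketch of where the attribute lives in a template: `DeletionPolicy` sits next to `Type`, not inside `Properties`. The template is built as a Python dictionary for illustration; the logical ID reuses the bucket from the worked examples, and adding `UpdateReplacePolicy` is an extra choice made in this sketch.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ModelArtifactBucket": {
            "Type": "AWS::S3::Bucket",
            # Keep the bucket and its objects when the stack is deleted.
            "DeletionPolicy": "Retain",
            # Also keep the old bucket if an update replaces the resource.
            "UpdateReplacePolicy": "Retain",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        }
    },
}

print(json.dumps(template, indent=2))
```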
Comparison Tables
CloudFormation vs. AWS CDK
| Feature | AWS CloudFormation | AWS CDK |
|---|---|---|
| Language | JSON / YAML | Python, TS, Java, etc. |
| Abstractions | Low (Mapping 1:1 to resources) | High (L1, L2, L3 Constructs) |
| Logic | Limited (If/Else, Mappings) | Full programming logic (Loops, Classes) |
| Maintainability | Can become very long (thousands of lines) | Modular, reusable libraries |
| Target Audience | SysAdmins, DevOps Engineers | Developers, ML Engineers |
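The "Logic" row is easiest to see with a loop. The sketch below is a plain-Python stand-in for synthesis, not the CDK API: a hypothetical `synthesize` function loops over environment names and emits one CloudFormation template with a versioned bucket per environment, something that would need copy-pasted blocks in hand-written YAML.

```python
import json


def synthesize(environments: list) -> dict:
    """Build one template containing a versioned bucket per environment."""
    resources = {}
    for env in environments:
        logical_id = f"ModelArtifactBucket{env.capitalize()}"
        resources[logical_id] = {
            "Type": "AWS::S3::Bucket",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Environment names are illustrative.
template = synthesize(["dev", "staging", "prod"])
print(json.dumps(template, indent=2))
```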