Mastering Infrastructure as Code (IaC): AWS CloudFormation vs. AWS CDK
Tradeoffs and use cases of infrastructure as code (IaC) options (for example, AWS CloudFormation, AWS Cloud Development Kit [AWS CDK])
Infrastructure as Code (IaC) allows developers and ML engineers to treat infrastructure with the same rigor as application code—version-controlled, repeatable, and automated. This guide explores the two primary AWS native tools for IaC: CloudFormation and the Cloud Development Kit (CDK).
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between declarative (CloudFormation) and imperative (CDK) IaC approaches.
- Select the appropriate tool based on team expertise and project requirements.
- Explain the three levels of CDK constructs (L1, L2, L3).
- Identify key AWS CDK CLI commands for the deployment lifecycle.
Key Terms & Glossary
- Infrastructure as Code (IaC): Managing and provisioning computing resources through machine-readable configuration files.
- Declarative: Defining the what (the final state) without specifying the how (the steps).
- Imperative: Defining the how using logic, loops, and conditions to generate the final state.
- Constructs: The basic building blocks of AWS CDK apps; they can represent a single resource or multiple resources.
- Synthesis (Synth): The process of executing CDK code to produce a CloudFormation template.
The "Big Idea"
IaC solves the "manual console problem." In a complex Machine Learning environment, manually clicking through the AWS Console to create SageMaker endpoints, S3 buckets, and IAM roles is error-prone and unscalable. IaC makes infrastructure idempotent—running the same code twice results in the exact same infrastructure, ensuring consistency between Development, Staging, and Production environments.
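To make idempotency concrete, here is a minimal plain-Python sketch. It does not call AWS; `apply`, `desired_state`, and the resource names are hypothetical stand-ins for a provisioning engine that reconciles a desired state against what already exists.

```python
from __future__ import annotations


def apply(desired: dict, deployed: dict) -> list[str]:
    """Reconcile `deployed` toward `desired` in place; return the actions taken."""
    actions = []
    for name, config in desired.items():
        if name not in deployed:
            actions.append(f"create {name}")
        elif deployed[name] != config:
            actions.append(f"update {name}")
        deployed[name] = dict(config)
    for name in list(deployed):
        if name not in desired:
            actions.append(f"delete {name}")
            del deployed[name]
    return actions


# Hypothetical desired state for a small ML environment.
desired_state = {
    "training-data-bucket": {"type": "s3-bucket", "versioned": True},
    "inference-role": {"type": "iam-role", "service": "sagemaker"},
}

environment = {}  # nothing deployed yet
first_run = apply(desired_state, environment)
second_run = apply(desired_state, environment)

print(first_run)   # two create actions
print(second_run)  # empty list: nothing left to change
```

The second run performs no actions because the environment already matches the description, which is the property that keeps Development, Staging, and Production consistent.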
Formula / Concept Box
The CDK Toolkit (CLI) Workflow
| Command | Action Description |
|---|---|
| `cdk synth` | Translates programmatic code into a CloudFormation YAML/JSON template. |
| `cdk diff` | Compares the local code against the currently deployed stack. |
| `cdk deploy` | Provisions the resources in the AWS account. |
| `cdk destroy` | Deletes the stack and its associated resources to stop costs. Resources with a retain removal policy are left in place. |
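The comparison step in this workflow can be pictured as a diff between two templates. The sketch below is only an illustration in plain Python, not how the CDK Toolkit is implemented; `diff_templates` and both sample templates are hypothetical.

```python
def diff_templates(deployed: dict, synthesized: dict) -> dict:
    """Compare the Resources sections of two CloudFormation-style templates."""
    old = deployed.get("Resources", {})
    new = synthesized.get("Resources", {})
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# What is currently deployed (hypothetical).
deployed_template = {
    "Resources": {
        "DataBucket": {"Type": "AWS::S3::Bucket", "Properties": {}},
        "OldQueue": {"Type": "AWS::SQS::Queue", "Properties": {}},
    }
}
# What the local code would synthesize (hypothetical).
synthesized_template = {
    "Resources": {
        "DataBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        },
        "ModelBucket": {"Type": "AWS::S3::Bucket", "Properties": {}},
    }
}

changes = diff_templates(deployed_template, synthesized_template)
print(changes)
```

Reviewing a summary like this before deploying is what makes removals and replacements visible early.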
Hierarchical Outline
- Core IaC Options
  - AWS CloudFormation: The foundational engine (JSON/YAML based).
  - AWS CDK: An abstraction layer (Programming language based).
  - Terraform: Multi-cloud alternative (HCL based).
- The CloudFormation Approach
  - Declarative State: You describe the destination, AWS handles the journey.
  - Maturity: Highly established with massive community template libraries.
- The AWS CDK Approach
  - Programming Languages: Python, TypeScript, Java, C#, Go.
  - Logic & Modularity: Supports loops, `if` statements, and Object-Oriented patterns.
  - Construct Levels: L1 (Cfn resources), L2 (Sensible defaults), L3 (Architectural patterns).
Visual Anchors
The CDK Workflow
Resource Abstraction Layers
\begin{tikzpicture}[node distance=1.5cm] \draw[thick, fill=blue!10] (0,0) rectangle (6,1) node[midway] {L1: Low-Level (Direct CFN Mapping)}; \draw[thick, fill=blue!20] (0,1.2) rectangle (6,2.2) node[midway] {L2: Mid-Level (Sensible Defaults)}; \draw[thick, fill=blue!30] (0,2.4) rectangle (6,3.4) node[midway] {L3: Patterns (Full Architectures)}; \node at (3,-0.5) {CDK Construct Levels}; \end{tikzpicture}
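The three layers can be modeled in plain Python to show how each one builds on the layer below. This is a conceptual sketch, not the CDK API: `l1_bucket`, `l2_bucket`, and `l3_data_lake` are hypothetical helpers, and the default chosen in the L2 layer is an assumption made for illustration.

```python
def l1_bucket(logical_id: str, properties: dict) -> dict:
    """L1 style: a direct mapping; the caller spells out every property."""
    return {logical_id: {"Type": "AWS::S3::Bucket", "Properties": properties}}


def l2_bucket(logical_id: str, versioned: bool = False) -> dict:
    """L2 style: a friendlier signature that fills in defaults."""
    properties = {
        # Assumed default for this sketch: block all public access.
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "BlockPublicPolicy": True,
            "IgnorePublicAcls": True,
            "RestrictPublicBuckets": True,
        }
    }
    if versioned:
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return l1_bucket(logical_id, properties)


def l3_data_lake(prefix: str) -> dict:
    """L3 style: one call that assembles several resources into a pattern."""
    resources = {}
    resources.update(l2_bucket(f"{prefix}RawBucket", versioned=True))
    resources.update(l2_bucket(f"{prefix}CuratedBucket", versioned=True))
    return resources


pattern = l3_data_lake("Ml")
print(sorted(pattern))
```

Each level trades control for convenience: the L1 helper exposes every property, while the L3 helper hides them behind a single call.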
Definition-Example Pairs
- Declarative Approach: Defining infrastructure as a static document.
  - Example: A YAML file stating "I want one S3 bucket named 'my-data-bucket' with versioning enabled."
- Imperative/Programmatic Approach: Using logic to generate infrastructure.
  - Example: A Python loop that creates 10 different S3 buckets, each named based on an item in a list, with specific permissions assigned dynamically.
- L3 Construct (Pattern): High-level abstractions for common architectures.
  - Example: Using `ApplicationLoadBalancedFargateService` to deploy a container, load balancer, and cluster with a single construct.
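The imperative example above (a loop that produces several buckets) can be sketched without any AWS libraries. This is a simplified illustration of the idea behind synthesis, not the CDK itself; the dataset names and the tag key are hypothetical.

```python
import json

# Hypothetical dataset names; each one becomes its own bucket resource.
datasets = ["images", "labels", "metrics"]

template = {"AWSTemplateFormatVersion": "2010-09-09", "Resources": {}}

for name in datasets:
    logical_id = f"{name.capitalize()}Bucket"
    template["Resources"][logical_id] = {
        "Type": "AWS::S3::Bucket",
        "Properties": {
            "VersioningConfiguration": {"Status": "Enabled"},
            "Tags": [{"Key": "dataset", "Value": name}],
        },
    }

# The loop is imperative, but the result is a static, declarative document.
print(json.dumps(template, indent=2))
```

Adding a fourth dataset means adding one list item, whereas a hand-written template would need another full resource block.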
Worked Examples
Comparison: Provisioning an S3 Bucket
Option A: AWS CloudFormation (YAML)

    Resources:
      MyBucket:
        Type: 'AWS::S3::Bucket'
        Properties:
          BucketName: my-standard-bucket
          VersioningConfiguration:
            Status: Enabled

Option B: AWS CDK (Python)

    from aws_cdk import aws_s3 as s3

    # Creating the same bucket with CDK (L2 Construct).
    # `self` is the enclosing Stack, so this line belongs in its __init__.
    bucket = s3.Bucket(self, "MyBucket",
        bucket_name="my-standard-bucket",
        versioned=True
    )

> [!TIP]
> Notice how the CDK code looks like standard application code. You can use methods such as `.grant_read(user)` in CDK, whereas in CloudFormation, you would need to write a separate complex IAM Policy block.
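To show the kind of policy document a grant-style helper spares you from writing by hand, here is a plain-Python approximation. `read_only_statement` is a hypothetical helper, and the action list is an assumption chosen for illustration; it is not the exact set of actions that the CDK method emits.

```python
import json


def read_only_statement(bucket_arn: str) -> dict:
    """Build an IAM policy statement allowing read access to one bucket."""
    return {
        "Effect": "Allow",
        # Assumed minimal read actions for this sketch.
        "Action": ["s3:GetObject", "s3:ListBucket"],
        # ListBucket applies to the bucket, GetObject to the objects in it.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [read_only_statement("arn:aws:s3:::my-standard-bucket")],
}
print(json.dumps(policy, indent=2))
```

In a template, a document like this has to be written and attached to the user explicitly; a single method call on the bucket object expresses the same intent.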
Checkpoint Questions
- Which tool is preferred if your team wants to use standard IDE debugging tools and unit testing for infrastructure?
- What is the result of the `cdk synth` command?
- If you need to see exactly what will change in your environment before applying it, which CLI command should you run?
- Why is a "Declarative" approach sometimes considered easier for non-developers?
Muddy Points & Cross-Refs
- State Management: While CDK feels imperative, it is actually generating a declarative template. The "state" is still managed by the CloudFormation engine on the backend.
- Circular Dependencies: Both tools can run into issues where Resource A depends on B, and B depends on A. In CDK, these are often caught during `cdk synth`.
- Deep Dive: To learn more about how this fits into ML specifically, see documentation on SageMaker Pipelines for model-specific orchestration.
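The circular dependency problem described above is, at its core, cycle detection in a dependency graph. The following sketch uses Python's standard `graphlib` module (Python 3.9+) to illustrate the idea; it is not how CloudFormation or the CDK perform the check, and `deployment_order` and the resource names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def deployment_order(dependencies: dict) -> list:
    """Return a creation order, or raise CycleError on a circular dependency."""
    return list(TopologicalSorter(dependencies).static_order())


# Each key depends on the resources in its set (hypothetical names).
valid = {
    "Endpoint": {"Model"},
    "Model": {"Role", "Bucket"},
    "Role": set(),
    "Bucket": set(),
}
order = deployment_order(valid)
print(order)  # dependencies appear before the resources that need them

# Resource A depends on B, and B depends on A.
circular = {"ResourceA": {"ResourceB"}, "ResourceB": {"ResourceA"}}
try:
    deployment_order(circular)
    cycle_found = False
except CycleError:
    cycle_found = True
print(cycle_found)
```

Because no valid creation order exists for the second graph, the check fails before anything is provisioned, which is why such errors tend to surface before deployment.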
Comparison Tables
| Feature | AWS CloudFormation | AWS CDK |
|---|---|---|
| Language | JSON / YAML | Python, TS, Java, Go, C# |
| Approach | Declarative | Imperative (Logic-driven) |
| Learning Curve | High for complex logic | Low for developers |
| Reusability | Nested Stacks (Complex) | Object-Oriented Classes (Easy) |
| Debugging | CloudFormation Error Logs | Language-native debuggers |
| Abstraction | None (1:1 with AWS) | Construct levels (L1, L2, L3) |