AWS Auto Scaling Policies and Events: Master Study Guide

This guide explores the mechanisms provided by AWS to ensure workloads adapt to changing demand through automated scaling policies, covering EC2, ECS, serverless services, and Kubernetes environments.

Learning Objectives

After studying this guide, you should be able to:

  • Distinguish between dynamic scaling and predictive scaling.
  • Identify which AWS services support native Auto Scaling versus built-in serverless scaling.
  • Configure effective scaling thresholds based on resource utilization metrics.
  • Understand the specific scaling mechanisms for Kubernetes (EKS) including Karpenter and HPA/VPA.
  • Apply 14-day historical data analysis for predictive capacity planning.

Key Terms & Glossary

  • ASG (Auto Scaling Group): A logical grouping of EC2 instances for purposes of management and scaling.
  • Scale-Out: The process of adding resources (e.g., instances) to handle increased demand.
  • Scale-In: The process of removing resources to save costs when demand decreases.
  • Target Tracking Policy: A scaling policy that increases or decreases capacity to maintain a specific metric at a target value (e.g., maintain average CPU at 50%).
  • Step Scaling: A policy that scales capacity based on the size of the alarm breach (e.g., add 2 instances if CPU reaches 70%, but add 4 if it reaches 90%).
  • Predictive Scaling: A mechanism that uses machine learning to forecast future traffic and schedule capacity changes in advance.
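
The step scaling entry above can be sketched as a lookup from the observed metric to an adjustment. This is a minimal illustration, not the AWS implementation: the thresholds and adjustments are taken from the glossary example, and the names `STEP_LOWER_BOUNDS`, `STEP_ADJUSTMENTS`, and `step_adjustment` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical step table mirroring the glossary example:
# CPU at or above 70% adds 2 instances, CPU at or above 90% adds 4.
STEP_LOWER_BOUNDS = [70.0, 90.0]
STEP_ADJUSTMENTS = [2, 4]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for the observed CPU level."""
    # bisect_right counts how many lower bounds are <= cpu_percent.
    index = bisect_right(STEP_LOWER_BOUNDS, cpu_percent)
    if index == 0:
        return 0  # below the first threshold: the alarm is not breached
    return STEP_ADJUSTMENTS[index - 1]


if __name__ == "__main__":
    for cpu in (55.0, 75.0, 95.0):
        print(f"CPU {cpu:.0f}% -> add {step_adjustment(cpu)} instance(s)")
```

A larger breach selects a later step, which is what separates step scaling from a single fixed adjustment.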

The "Big Idea"

In cloud architecture, Elasticity is the goal. Scaling should not just be about "growing big," but about "matching the curve." By automating scaling events, organizations avoid over-provisioning (wasted money) and under-provisioning (dropped traffic), so that the infrastructure footprint closely tracks real-time user demand.

Formula / Concept Box

| Concept | Metric Requirement | Predictive Window |
| --- | --- | --- |
| Dynamic Scaling | Real-time (CloudWatch) | Instantaneous response |
| Predictive Scaling | 14 days of history | Forecasts next 48 hours |
| Lambda Scaling | Concurrency Quotas | Automatic / Internal |

[!IMPORTANT] Predictive scaling is currently only available for Amazon EC2 Auto Scaling Groups.

Hierarchical Outline

  1. Scaling Methodologies
    • Manual Scaling: Human intervention (rarely recommended for production).
    • Dynamic Scaling: Reactive; responds to CloudWatch alarms.
    • Predictive Scaling: Proactive; uses ML for cyclic patterns.
  2. Service-Specific Scaling
    • EC2/ECS: EC2 uses Auto Scaling Groups; ECS uses Service Auto Scaling.
    • DynamoDB: Supports both On-Demand (serverless) and Provisioned (with Auto Scaling).
    • EKS (Kubernetes):
      • Cluster Level: Cluster Autoscaler or Karpenter.
      • Pod Level: Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA).
  3. Serverless Scaling (Built-in)
    • S3 & Lambda: Scale automatically without manual policy configuration.
    • Quotas: Monitor Service Quotas so that account limits do not cap scaling.

Visual Anchors

Scaling Logic Flow

[Diagram: scaling logic flow; not recoverable from the source]

Demand vs. Capacity (TikZ)

\begin{tikzpicture}[scale=0.8]
  % Axes
  \draw[->] (0,0) -- (6,0) node[right] {Time};
  \draw[->] (0,0) -- (0,4) node[above] {Capacity/Demand};

  % Demand curve (sine-like)
  \draw[thick, blue] plot [domain=0.4:5.6, samples=100] (\x, {2 + sin(\x*120)});
  \node[blue] at (5.5, 3.2) {Demand};

  % Capacity steps (dynamic)
  \draw[thick, red, dashed] (0.4, 2.5) -- (2, 2.5) -- (2, 3.5) -- (3.5, 3.5) -- (3.5, 2.2) -- (5, 2.2) -- (5, 1.5);
  \node[red] at (1.5, 3.8) {Provisioned Capacity};
\end{tikzpicture}

Definition-Example Pairs

  • Cooldown Period: A configurable period during which the ASG waits for a previous scaling action to take effect before starting another.
    • Example: After launching an EC2 instance, waiting 300 seconds for the application to boot before checking if CPU is still high.
  • Scale-In Protection: A setting that prevents specific instances from being terminated during a scale-in event.
    • Example: Preventing the termination of an instance currently processing a long-running batch job even if the average CPU is low.
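
The cooldown behavior described above can be modeled as a simple time gate. This is a toy sketch under stated assumptions: `CooldownGate` and `try_scale` are hypothetical names, time is passed in as plain seconds, and the 300-second default comes from the example above rather than from any particular ASG configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    """Toy model of a cooldown: block new scaling actions for a fixed window."""

    cooldown_seconds: float = 300.0  # matches the 300-second example above
    last_action_at: Optional[float] = None

    def try_scale(self, now: float) -> bool:
        """Return True and record the action if the cooldown has elapsed."""
        if self.last_action_at is not None:
            if now - self.last_action_at < self.cooldown_seconds:
                return False  # still cooling down: ignore this trigger
        self.last_action_at = now
        return True


if __name__ == "__main__":
    gate = CooldownGate()
    for t in (0, 120, 299, 300, 450):
        print(t, gate.try_scale(now=t))
```

Triggers that arrive inside the window are dropped rather than queued, which is why a cooldown that is too long can delay a needed scale-out.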

Worked Examples

Example 1: Calculating Scale-Out

Scenario: An ASG has a minimum of 2 instances and a maximum of 10. The policy is to add 50% more capacity when CPU exceeds 70%.

  1. Current State: 4 instances running.
  2. Event: CPU hits 85%.
  3. Calculation: $4 \times 0.50 = 2$ additional instances.
  4. Result: ASG scales out to 6 instances.
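
The arithmetic in Example 1 can be checked with a short function that also enforces the group's minimum and maximum. This is a sketch, not an AWS API: `scale_out_by_percent` is a hypothetical helper, and the rounding of fractional results (round down, but never below one instance) is an assumption based on how percentage adjustments are commonly documented; confirm it against the EC2 Auto Scaling documentation.

```python
import math


def scale_out_by_percent(current: int, percent: float,
                         min_size: int, max_size: int) -> int:
    """Return the new capacity after a percentage-based scale-out."""
    raw = current * percent / 100
    # Assumed rounding rule: a fractional result below 1 still adds one
    # instance; anything larger is rounded down.
    added = 1 if 0 < raw < 1 else math.floor(raw)
    # The group never goes below its minimum or above its maximum.
    return max(min_size, min(max_size, current + added))


if __name__ == "__main__":
    # Example 1: 4 instances, +50%, limits 2..10.
    print(scale_out_by_percent(4, 50, min_size=2, max_size=10))
```

With 9 running instances the same policy would request 4 more, but the maximum of 10 caps the result.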

Example 2: Predictive Scaling Setup

Scenario: A retail site sees massive spikes every Monday at 9:00 AM.

  1. Analysis: AWS Auto Scaling analyzes the previous 14 days of the site's CloudWatch metric history.
  2. Forecast: It predicts the 9:00 AM spike for the upcoming Monday.
  3. Action: It starts launching instances at 8:45 AM so capacity is warm before the traffic arrives.
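
The pre-launch timing in step 3 is the forecast peak minus a lead time. The sketch below shows only that date arithmetic; `prelaunch_time` and `buffer_minutes` are hypothetical names, and the 15-minute buffer is the value implied by the example (9:00 AM peak, 8:45 AM launch), not a service default.

```python
from datetime import datetime, timedelta


def prelaunch_time(forecast_peak: datetime, buffer_minutes: int) -> datetime:
    """Return when to start launching so capacity is ready before the peak."""
    return forecast_peak - timedelta(minutes=buffer_minutes)


if __name__ == "__main__":
    monday_peak = datetime(2024, 1, 8, 9, 0)  # an arbitrary Monday, 9:00 AM
    print(prelaunch_time(monday_peak, buffer_minutes=15))
```

The buffer should be at least as long as the time an instance needs to boot and pass health checks.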

Checkpoint Questions

  1. What is the minimum amount of historical data required for Predictive Scaling to generate a forecast?
  2. Which tool is used in EKS to scale EC2 nodes efficiently by bypassing the standard ASG overhead?
  3. If an application is "Serverless," do you still need to configure Auto Scaling policies?
  4. Why might you combine Predictive and Dynamic scaling policies?

Muddy Points & Cross-Refs

  • EKS Scaling Confusion: Users often confuse HPA (scaling pods) with Karpenter (scaling nodes). Think of HPA as "buying more groceries" and Karpenter as "buying a bigger fridge."
  • Serverless Limits: While Lambda scales automatically, it is subject to Account Concurrency Limits (usually 1,000 per region). Cross-reference with "Service Quotas" documentation.
  • Cooldown vs. Warm-up: Cooldown is for the whole group; warm-up is for the individual instance being added.
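
To judge whether a workload is close to the Lambda concurrency limit mentioned above, a common estimate is request rate multiplied by average duration. This is a rough sketch: `estimated_concurrency` and `would_throttle` are hypothetical helpers, and the 1,000 default is the per-region figure quoted above, which may differ in a given account.

```python
import math

# Per-region figure quoted above; check Service Quotas for the real value.
DEFAULT_ACCOUNT_CONCURRENCY = 1000


def estimated_concurrency(requests_per_second: float,
                          avg_duration_seconds: float) -> int:
    """Estimate concurrent executions as request rate times duration."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def would_throttle(requests_per_second: float, avg_duration_seconds: float,
                   limit: int = DEFAULT_ACCOUNT_CONCURRENCY) -> bool:
    """Return True if the estimate exceeds the concurrency limit."""
    return estimated_concurrency(requests_per_second,
                                 avg_duration_seconds) > limit


if __name__ == "__main__":
    print(estimated_concurrency(400, 0.5), would_throttle(400, 0.5))
    print(estimated_concurrency(3000, 0.5), would_throttle(3000, 0.5))
```

The estimate ignores burst behavior and other functions sharing the same account limit, so treat it as a first check only.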

Comparison Tables

Dynamic vs. Predictive Scaling

| Feature | Dynamic Scaling | Predictive Scaling |
| --- | --- | --- |
| Mechanism | Reactive (Alarms) | Proactive (ML Forecast) |
| Data Source | Real-time CloudWatch Metrics | 14-day Historical Baseline |
| Best For | Unpredictable bursts | Cyclic, repeating patterns |
| Service Support | EC2, ECS, DynamoDB, Aurora | EC2 ASGs only |
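
When both policy types are attached to the same group, one way to reason about the outcome is that the group follows whichever policy asks for more capacity. The sketch below assumes that behavior; `combined_desired_capacity` is a hypothetical helper, and the rule should be verified against the EC2 Auto Scaling documentation before relying on it.

```python
def combined_desired_capacity(predictive: int, dynamic: int,
                              min_size: int, max_size: int) -> int:
    """Pick the larger of two policy recommendations, within group limits."""
    # Assumption: the policy requesting more capacity wins.
    return max(min_size, min(max_size, max(predictive, dynamic)))


if __name__ == "__main__":
    # Forecast expects 6 instances; real-time load currently needs 4.
    print(combined_desired_capacity(6, 4, min_size=2, max_size=10))
    # An unforecast burst pushes the dynamic policy to 8.
    print(combined_desired_capacity(3, 8, min_size=2, max_size=10))
```

Under this assumption the forecast sets a baseline for the cyclic pattern while the dynamic policy covers bursts the forecast missed.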

EKS Scaling Tools

| Tool | Level | What it Scales |
| --- | --- | --- |
| HPA | Pod | Horizontal count of Pods based on CPU/RAM |
| VPA | Pod | Vertical sizing (CPU/RAM) of existing Pods |
| Karpenter | Infrastructure | Provisions right-sized EC2 nodes directly |
| Cluster Autoscaler | Infrastructure | Adjusts ASG sizes to fit pending Pods |
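
For the HPA row, the Kubernetes documentation describes the replica calculation as the current replica count scaled by the ratio of the observed metric to the target, rounded up. The sketch below follows that formula; `hpa_desired_replicas` is a hypothetical name, and the 10% tolerance is an assumed default that a cluster may override.

```python
import math


def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float, tolerance: float = 0.1) -> int:
    """Return the replica count suggested by the HPA ratio formula."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 Pods at 90% average CPU with a 60% target.
    print(hpa_desired_replicas(4, current_metric=90, target_metric=60))
```

If the new Pods do not fit on existing nodes they stay pending, which is the signal Karpenter or the Cluster Autoscaler acts on.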
