AWS Systems Manager (SSM) Operations: Comprehensive Study Guide

AWS Systems Manager (SSM) is the central hub for operational management in AWS. It allows you to gain operational insights and take action on your AWS resources at scale, reducing the need for manual logins and repetitive tasks.

Learning Objectives

By the end of this module, you should be able to:

  • Explain the critical role of the SSM Agent in managing EC2 instances.
  • Execute and customize SSM Automation runbooks for operational remediation.
  • Design automated patching schedules using SSM Patch Manager.
  • Integrate SSM with Amazon EventBridge to respond to system state changes.
  • Understand how SSM supports external security services like Amazon Inspector.

Key Terms & Glossary

  • Managed Node: Any EC2 instance or on-premises server that is configured for use with Systems Manager. Requires the SSM Agent and an IAM role with appropriate permissions (e.g., AmazonSSMManagedInstanceCore).
  • SSM Agent: Software installed on managed nodes that processes requests from the Systems Manager service.
  • Runbook: An Automation document (JSON or YAML) that defines the actions that Systems Manager performs on your managed nodes and other AWS resources.
  • Maintenance Window: A defined schedule for when disruptive operations (like patching or reboots) can occur.
  • Patch Baseline: A set of rules that define which patches are approved for installation on your managed nodes.
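
To make the Runbook term concrete, here is a minimal sketch of an Automation runbook built as a Python dictionary and printed as JSON. The two-step stop/start flow and the step names are illustrative assumptions, not a copy of any AWS-managed runbook.

```python
import json


def build_restart_runbook() -> dict:
    """Return a minimal Automation runbook (schema version 0.3) as a dict."""
    return {
        "schemaVersion": "0.3",
        "description": "Stop and then start a single EC2 instance.",
        "parameters": {
            "InstanceId": {"type": "String", "description": "Target instance ID"},
        },
        "mainSteps": [
            {
                # Step names are hypothetical; the action is a built-in Automation action.
                "name": "stopInstance",
                "action": "aws:changeInstanceState",
                "inputs": {
                    "InstanceIds": ["{{ InstanceId }}"],
                    "DesiredState": "stopped",
                },
            },
            {
                "name": "startInstance",
                "action": "aws:changeInstanceState",
                "inputs": {
                    "InstanceIds": ["{{ InstanceId }}"],
                    "DesiredState": "running",
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_restart_runbook(), indent=2))
```

The JSON output could be registered as a document of type Automation (for example with the boto3 `create_document` call), which is left out here so the sketch runs without AWS credentials.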

The "Big Idea"

[!IMPORTANT] The core philosophy of SSM is Operations as Code. Instead of manually SSHing into 500 instances to check a configuration or install a patch, you define the desired state or action once and let SSM distribute that execution across your fleet securely and auditably.
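
As a rough illustration of "define once, run across the fleet", the sketch below builds the arguments for a single Run Command request that targets instances by tag instead of listing them one by one. The tag key, tag value, shell command, and rate limits are assumptions chosen for the example.

```python
def build_send_command_request(tag_key: str, tag_value: str, commands: list) -> dict:
    """Build keyword arguments for the SSM SendCommand API, targeting nodes by tag."""
    return {
        "DocumentName": "AWS-RunShellScript",
        # Tag-based targeting: every managed node carrying this tag receives the command.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(commands)},
        # Illustrative rate limits: run on 10% of targets at a time,
        # and stop sending if more than 5% of invocations fail.
        "MaxConcurrency": "10%",
        "MaxErrors": "5%",
    }


if __name__ == "__main__":
    request = build_send_command_request("Environment", "Production", ["uname -r"])
    print(request)
```

With boto3 installed and credentials configured, the dictionary can be passed as `boto3.client("ssm").send_command(**request)`; each invocation is then recorded by Systems Manager, which is where the auditability comes from.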

Formula / Concept Box

| Operational Component | Primary Function | Trigger Mechanism |
| --- | --- | --- |
| Automation | Executes workflows (Runbooks) | EventBridge, CLI, or Manual |
| Patch Manager | Automates software updates | Maintenance Windows |
| Parameter Store | Centralized configuration data | API calls, CloudFormation |
| Fleet Manager | Visual node management | AWS Management Console |

Hierarchical Outline

  1. The SSM Agent Layer
    • Core Dependency: Instances must have the agent installed and running to communicate with the SSM API.
    • IAM Permissions: Each instance needs an IAM role, attached through an instance profile, that allows the SSM Agent to call the Systems Manager APIs.
  2. Automation & Remediation
    • Predefined Runbooks: AWS-provided runbooks for common tasks (e.g., AWS-RestartEC2Instance).
    • Custom Runbooks: Tailored workflows for complex logic and cross-service actions.
    • Event-Driven Actions: Using EventBridge to detect a state change (e.g., EC2 Instance Stop) and trigger an SSM Automation document to fix it.
  3. Fleet Maintenance at Scale
    • Patch Manager: Scans instances for missing patches and installs them based on a Patch Baseline.
    • Compliance: Reporting on which instances are compliant with your patching and configuration standards.
  4. Security Integration
    • Amazon Inspector: Uses the SSM Agent to collect software inventory from each node, which it then checks against known Common Vulnerabilities and Exposures (CVEs).
    • AWS License Manager: Uses the agent to track software license consumption across the fleet.
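
The event-driven pattern from the outline can be sketched as an EventBridge event pattern that matches EC2 instances entering the stopped state. The `matches` helper below is a deliberately simplified, hypothetical stand-in for EventBridge's own matching (it handles only exact values and nested objects) and is included just to show how the pattern selects events.

```python
# Event pattern for an EventBridge rule that fires when an EC2 instance stops.
STOPPED_INSTANCE_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must exist in the event, and the
    event value must be one of the listed values (or match a nested pattern)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    sample = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
    }
    print(matches(STOPPED_INSTANCE_PATTERN, sample))
```

In a real setup the pattern would be attached to an EventBridge rule whose target is an SSM Automation runbook; the instance ID shown is a placeholder.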

Visual Anchors

SSM Automation Workflow

[Diagram: SSM Automation workflow (not rendered in this text version)]

SSM Architecture

\begin{tikzpicture}
  % Cloud boundary
  \draw[dashed, blue, thick] (-1,-1) rectangle (6,4);
  \node[blue] at (2.5, 3.7) {AWS Cloud};
  % SSM Service
  \node[draw, fill=orange!20, minimum width=2cm, minimum height=1cm] (SSM) at (2.5, 2.5) {SSM Service};
  % Managed Instances
  \node[draw, fill=green!20, minimum width=1.5cm] (EC2) at (0, 0) {EC2 Node};
  \node[draw, fill=green!20, minimum width=1.5cm] (ONPREM) at (5, 0) {On-Prem};
  % Agent indicator
  \node[draw, circle, scale=0.6, fill=gray!30] at (0, 0.5) {Agent};
  \node[draw, circle, scale=0.6, fill=gray!30] at (5, 0.5) {Agent};
  % Communication paths
  \draw[<->, thick] (SSM) -- (0, 1) node[midway, left] {HTTPS (443)};
  \draw[<->, thick] (SSM) -- (5, 1) node[midway, right] {Hybrid Link};
\end{tikzpicture}

Definition-Example Pairs

  • SSM Document: A configuration file defining a set of steps.
    • Example: A document that checks if the httpd service is running and starts it if it is stopped.
  • State Manager: A tool to keep instances in a defined state (Configuration Management).
    • Example: Ensuring that a specific monitoring agent is installed on every instance tagged Production every 24 hours.
  • Inventory: A collection of metadata from managed instances.
    • Example: Querying the fleet to see which versions of Python are installed across 1,000 instances.
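
The httpd example above might look like the following Command document, built here as a Python dictionary. This is a sketch that assumes a Linux node using systemd; the step name and description are made up for illustration.

```python
import json


def build_httpd_check_document() -> dict:
    """Return a Command document (schema version 2.2) that starts httpd if it is not running."""
    return {
        "schemaVersion": "2.2",
        "description": "Start the httpd service when it is not active.",
        "mainSteps": [
            {
                "action": "aws:runShellScript",
                "name": "ensureHttpdRunning",  # hypothetical step name
                "inputs": {
                    "runCommand": [
                        # 'is-active --quiet' exits non-zero when the unit is not active,
                        # so the start command only runs in that case.
                        "systemctl is-active --quiet httpd || systemctl start httpd",
                    ]
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_httpd_check_document(), indent=2))
```

Running the same document on a schedule through a State Manager association would turn this one-off check into ongoing enforcement, which is the difference between the first two definitions in the list.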

Worked Examples

Problem: Automating Instance Recovery

Scenario: You have a mission-critical application on EC2. If the hardware underlying the instance fails, you want it to recover automatically without manual intervention.

Step-by-Step Solution:

  1. Configure Status Check: Create an Amazon CloudWatch Alarm based on the StatusCheckFailed_System metric.
  2. Define Action: Within the CloudWatch Alarm, select EC2 Action.
  3. Recover Instance: Choose the "Recover this instance" action. This moves the instance to a new physical host while maintaining its ID, IP addresses, and EBS volume attachments.
  4. SSM Integration: Alternatively, use EventBridge to trigger an SSM Automation Runbook (like AWS-RestartEC2Instance) when the alarm state changes to ALARM.
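
Steps 1 to 3 can be sketched as the arguments for a CloudWatch `PutMetricAlarm` request. The alarm name, the 60-second period, and the two evaluation periods are assumptions for illustration; tune them to your own tolerance for false positives.

```python
def build_recovery_alarm(instance_id: str, region: str) -> dict:
    """Build keyword arguments for a CloudWatch alarm that recovers an EC2 instance."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation period (assumed)
        "EvaluationPeriods": 2,  # consecutive failing periods before ALARM (assumed)
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # EC2 recover action, expressed as an alarm action ARN for the given Region.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


if __name__ == "__main__":
    print(build_recovery_alarm("i-0123456789abcdef0", "us-east-1"))
```

With boto3 available, the dictionary can be passed as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`. The instance ID and Region shown are placeholders.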

Problem: Patching a Fleet with Different OS Types

Scenario: You have a mix of Amazon Linux 2 and Windows Server 2022 instances.

Solution:

  1. Create two Patch Baselines: one for Linux (approving 'Security' patches after a 7-day delay) and one for Windows (approving 'Critical' patches immediately).
  2. Use Patch Groups: Tag Linux instances with PatchGroup: LinuxFleet and Windows with PatchGroup: WinFleet.
  3. Register each patch group with its matching baseline so that instances carrying the tag are evaluated against the correct rules.
  4. Schedule a Maintenance Window for Saturday at 2:00 AM to run the AWS-RunPatchBaseline document.
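
The solution above can be sketched as request payloads for the Patch Manager and Maintenance Window APIs. The baseline names, the mapping of 'Critical' to the MSRC_SEVERITY filter, the window name, and the duration and cutoff values are assumptions made for this example.

```python
def build_patch_baseline(name: str, operating_system: str,
                         filter_key: str, filter_value: str,
                         approve_after_days: int) -> dict:
    """Build keyword arguments for the SSM CreatePatchBaseline API."""
    return {
        "Name": name,
        "OperatingSystem": operating_system,
        "ApprovalRules": {
            "PatchRules": [
                {
                    "PatchFilterGroup": {
                        "PatchFilters": [{"Key": filter_key, "Values": [filter_value]}]
                    },
                    "ApproveAfterDays": approve_after_days,
                }
            ]
        },
    }


# Linux: 'Security' classification, auto-approved 7 days after release.
LINUX_BASELINE = build_patch_baseline(
    "LinuxFleetBaseline", "AMAZON_LINUX_2", "CLASSIFICATION", "Security", 7)

# Windows: 'Critical' severity, approved immediately (0 days).
WINDOWS_BASELINE = build_patch_baseline(
    "WinFleetBaseline", "WINDOWS", "MSRC_SEVERITY", "Critical", 0)

# Maintenance window: every Saturday at 02:00 (UTC unless a time zone is set).
MAINTENANCE_WINDOW = {
    "Name": "weekend-patching",        # hypothetical name
    "Schedule": "cron(0 2 ? * SAT *)",
    "Duration": 3,                     # hours the window stays open (assumed)
    "Cutoff": 1,                       # stop starting new tasks 1 hour before close (assumed)
    "AllowUnassociatedTargets": False,
}

if __name__ == "__main__":
    print(LINUX_BASELINE)
    print(WINDOWS_BASELINE)
    print(MAINTENANCE_WINDOW)
```

With boto3, these would be passed to `create_patch_baseline` and `create_maintenance_window` on an SSM client, followed by `register_patch_baseline_for_patch_group` to tie each baseline to LinuxFleet or WinFleet.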

Checkpoint Questions

  1. What is the minimum requirement for an EC2 instance to be seen in the SSM Console?
    • Answer: It must have the SSM Agent installed and running, an IAM Instance Profile with permissions to communicate with the SSM service, and outbound network connectivity (HTTPS on port 443) to the Systems Manager endpoints.
  2. How does Amazon Inspector use SSM to identify vulnerabilities?
    • Answer: Inspector uses the SSM Agent to collect software inventory from the operating system of the managed node, then compares that inventory against known CVEs.
  3. Which SSM feature would you use to store a database password securely?
    • Answer: SSM Parameter Store (using the SecureString type).
  4. What is the difference between a Patch Baseline and a Maintenance Window?
    • Answer: A Patch Baseline defines what patches are approved; a Maintenance Window defines when those patches (or other tasks) are allowed to run.
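
For question 3, storing and reading a SecureString parameter could look like the sketch below. The helper functions take the SSM client as an argument, and `FakeSsmClient` is a hypothetical in-memory stand-in (it performs no encryption) included only so the example runs without AWS credentials. The parameter name is also made up.

```python
def store_secret(ssm_client, name: str, value: str) -> None:
    """Write a value to Parameter Store as an encrypted SecureString."""
    ssm_client.put_parameter(Name=name, Value=value, Type="SecureString", Overwrite=True)


def read_secret(ssm_client, name: str) -> str:
    """Read a SecureString parameter and return its decrypted value."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


class FakeSsmClient:
    """Hypothetical local stand-in that mimics the two calls used above."""

    def __init__(self):
        self._store = {}

    def put_parameter(self, Name, Value, Type, Overwrite=False):
        if Name in self._store and not Overwrite:
            raise ValueError(f"parameter {Name} already exists")
        self._store[Name] = {"Value": Value, "Type": Type}

    def get_parameter(self, Name, WithDecryption=False):
        entry = self._store[Name]
        return {"Parameter": {"Name": Name, "Type": entry["Type"], "Value": entry["Value"]}}


if __name__ == "__main__":
    client = FakeSsmClient()
    store_secret(client, "/prod/db/password", "example-only")
    print(read_secret(client, "/prod/db/password"))
```

Against a real account, `boto3.client("ssm")` would replace the fake client, and the caller's IAM permissions would need to cover both the parameter and the KMS key used to encrypt it.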
