Lab: Building Self-Healing Infrastructure for Operational Excellence

Determine a strategy to improve overall operational excellence

Operational Excellence is one of the six pillars of the AWS Well-Architected Framework. In this lab, you will move from manual monitoring to automated remediation: AWS Config detects configuration drift (here, a missing tag) and AWS Systems Manager (SSM) Automation corrects it without human intervention. This reflects the "Perform operations as code" and "Refine operations procedures frequently" design principles.

Prerequisites

  • AWS Account: You must have an active AWS account with permissions to create IAM Roles, EC2 instances, AWS Config rules, and S3 buckets.
  • AWS CLI: Installed and configured with AdministratorAccess credentials.
  • Region Selection: This lab is designed for us-east-1 (N. Virginia), though it can be adapted for any region.

Learning Objectives

  • Establish a Configuration Recorder using AWS Config to monitor resource state.
  • Implement Managed Config Rules to identify non-compliant resources.
  • Develop an SSM Automation Document workflow for automated remediation.
  • Verify the "Self-Healing" cycle by intentionally creating non-compliant resources.

Architecture Overview

This architecture demonstrates a closed-loop system where detection automatically triggers correction.

Step-by-Step Instructions

Step 1: Create an S3 Bucket for AWS Config

AWS Config requires an S3 bucket to store configuration history files.

bash
# Generate a unique bucket name
BUCKET_NAME=brainybee-config-$(date +%s)
aws s3 mb s3://$BUCKET_NAME

Console alternative: Navigate to S3 > Create bucket. Enter a unique name and keep default settings. Click Create.

Step 2: Initialize the AWS Config Recorder

You must enable the recorder to start tracking resource changes.

bash
# Check if a recorder already exists
aws configservice describe-configuration-recorders
# If none exists, create one (requires an IAM role, simplified here for brevity)
# Note: In a production environment, use a specific service-linked role.

Console alternative: Navigate to AWS Config > Settings. Click Turn on. Choose to record all resources in the region and select the S3 bucket created in Step 1.
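
The CLI path for this step is left out of the lab for brevity. A minimal sketch of what it might look like follows. It assumes the recorder and delivery channel are both named `default`, that the AWS Config service-linked role is used, and that the bucket from Step 1 already allows AWS Config to write to it (the console wizard normally arranges this; on the CLI path a bucket policy may be needed). The function name `enable_config_recorder` is hypothetical.

```bash
# Sketch: turn on AWS Config recording from the CLI.
# Assumptions (not from the lab text): recorder and delivery channel are named
# "default", and the Config service-linked role is used for recording.
enable_config_recorder() {
  local bucket_name="$1"   # bucket created in Step 1
  local account_id role_arn

  # Look up the account ID so the service-linked role ARN can be assembled.
  account_id=$(aws sts get-caller-identity --query Account --output text) || return 1
  role_arn="arn:aws:iam::${account_id}:role/aws-service-role/config.amazonaws.com/AWSServiceRoleForConfig"

  # Create the service-linked role; an "already exists" error is ignored.
  aws iam create-service-linked-role \
    --aws-service-name config.amazonaws.com >/dev/null 2>&1 || true

  # Define what to record and which role Config uses.
  aws configservice put-configuration-recorder \
    --configuration-recorder "name=default,roleARN=${role_arn}" \
    --recording-group allSupported=true || return 1

  # Tell Config where to deliver configuration history files.
  aws configservice put-delivery-channel \
    --delivery-channel "name=default,s3BucketName=${bucket_name}" || return 1

  # Recording does not begin until the recorder is started.
  aws configservice start-configuration-recorder \
    --configuration-recorder-name default
}
```

With the variable from Step 1 still set, this would be invoked as `enable_config_recorder "$BUCKET_NAME"`.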

Step 3: Deploy a Non-Compliant EC2 Instance

Create a small instance without any tags to serve as our "problem" resource.

bash
aws ec2 run-instances \
  --image-id ami-0c101f26f147fa7fd \
  --count 1 \
  --instance-type t2.micro \
  --region us-east-1

[!TIP] The AMI ID above is for Amazon Linux 2023 in us-east-1. If you are in a different region, find the local AMI ID first.
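
One way to find the local AMI ID is the public SSM parameter that AWS maintains for the latest Amazon Linux 2023 image. The sketch below resolves it for a given region and launches the untagged instance, printing only the instance ID so it can be reused in later steps. The function names are hypothetical, and the sketch assumes the region has a default VPC.

```bash
# Sketch: resolve the current Amazon Linux 2023 AMI for a region instead of
# hard-coding an ID, then launch the untagged instance and print its ID.
latest_al2023_ami() {
  local region="$1"
  # Public SSM parameter for the latest AL2023 x86_64 AMI.
  aws ssm get-parameters \
    --names /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
    --region "$region" \
    --query 'Parameters[0].Value' \
    --output text
}

launch_untagged_instance() {
  local region="$1" ami_id
  ami_id=$(latest_al2023_ami "$region") || return 1
  # No --tag-specifications on purpose: the instance must start non-compliant.
  aws ec2 run-instances \
    --image-id "$ami_id" \
    --count 1 \
    --instance-type t2.micro \
    --region "$region" \
    --query 'Instances[0].InstanceId' \
    --output text
}
```

For example, `INSTANCE_ID=$(launch_untagged_instance us-east-1)` keeps the ID available for the checkpoint and teardown commands.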

Step 4: Create the Config Rule and Remediation

We will use the managed rule required-tags to check EC2 instances for the CostCenter tag. This step only creates the rule; the remediation action is attached in Step 5.

bash
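# Create the Config rule, scoped to EC2 instances so other resource types are not flagged
aws configservice put-config-rule \
  --config-rule '{
    "ConfigRuleName": "check-ec2-tags",
    "Scope": {"ComplianceResourceTypes": ["AWS::EC2::Instance"]},
    "Source": {"Owner": "AWS", "SourceIdentifier": "REQUIRED_TAGS"},
    "InputParameters": "{\"tag1Key\":\"CostCenter\"}"
  }'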

Step 5: Configure Automated Remediation

We will link the rule to the SSM document AWS-CreateTags.

Console Instructions (Recommended for visual linking)
  1. Go to Config > Rules > check-ec2-tags.
  2. Click Actions > Manage Remediation.
  3. Choose Automatic Remediation.
  4. Remediation Action: AWS-CreateTags.
  5. Resource ID Parameter: ResourceId.
  6. Parameters: Tag1Key: CostCenter, Tag1Value: Lab-Auto-Fixed.
  7. Click Save.
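
For readers who prefer the CLI, the console steps could be approximated with `put-remediation-configurations`. This is a sketch under several assumptions: the document name and the parameter names (`ResourceId`, `Tag1Key`, `Tag1Value`) are copied from the console steps and not verified here, the document is assumed to accept an `AutomationAssumeRole` parameter, and the role ARN passed in is a hypothetical role with `ec2:CreateTags` permission. Check the document's parameter list in Systems Manager before applying it.

```bash
# Sketch: CLI counterpart of the console remediation setup.
# Document and parameter names come from the console steps and are unverified.
remediation_config_json() {
  local role_arn="$1"   # hypothetical role assumed by SSM Automation
  cat <<EOF
[
  {
    "ConfigRuleName": "check-ec2-tags",
    "TargetType": "SSM_DOCUMENT",
    "TargetId": "AWS-CreateTags",
    "Automatic": true,
    "MaximumAutomaticAttempts": 3,
    "RetryAttemptSeconds": 60,
    "Parameters": {
      "AutomationAssumeRole": {"StaticValue": {"Values": ["${role_arn}"]}},
      "ResourceId": {"ResourceValue": {"Value": "RESOURCE_ID"}},
      "Tag1Key": {"StaticValue": {"Values": ["CostCenter"]}},
      "Tag1Value": {"StaticValue": {"Values": ["Lab-Auto-Fixed"]}}
    }
  }
]
EOF
}

apply_remediation() {
  local role_arn="$1"
  # Write the configuration to a file and hand it to AWS Config.
  remediation_config_json "$role_arn" > remediation.json || return 1
  aws configservice put-remediation-configurations \
    --remediation-configurations file://remediation.json
}
```

The `RESOURCE_ID` value is the placeholder AWS Config replaces with the ID of each non-compliant resource, which is what the console's "Resource ID Parameter" field selects.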

Checkpoints

  1. Compliance Status: In the Config Console, check if check-ec2-tags shows "Non-compliant" for your new instance.
  2. Remediation Execution: Navigate to Systems Manager > Automation. You should see an execution for AWS-CreateTags that succeeded.
  3. Final Verification: Run the following CLI command. You should see the CostCenter tag attached to your instance.
bash
aws ec2 describe-tags --filters "Name=resource-id,Values=<YOUR_INSTANCE_ID>"
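
The first two checkpoints can also be checked from the CLI. The sketch below, with hypothetical function names, reads the rule's compliance state, lists remediation execution states, and polls until the rule reports COMPLIANT. The default attempt count and delay are arbitrary choices, not values from the lab.

```bash
# Sketch: check compliance and remediation status for the lab rule.
rule_compliance() {
  # Prints COMPLIANT, NON_COMPLIANT, or INSUFFICIENT_DATA.
  aws configservice describe-compliance-by-config-rule \
    --config-rule-names check-ec2-tags \
    --query 'ComplianceByConfigRules[0].Compliance.ComplianceType' \
    --output text
}

remediation_states() {
  # Prints one state per remediated resource (for example QUEUED or SUCCEEDED).
  aws configservice describe-remediation-execution-status \
    --config-rule-name check-ec2-tags \
    --query 'RemediationExecutionStatuses[].State' \
    --output text
}

wait_for_compliance() {
  local attempts="${1:-20}" delay="${2:-30}" i=0 status
  # Ask Config to re-evaluate now rather than waiting for the next trigger.
  aws configservice start-config-rules-evaluation \
    --config-rule-names check-ec2-tags >/dev/null
  while [ "$i" -lt "$attempts" ]; do
    status=$(rule_compliance)
    echo "check-ec2-tags: $status"
    [ "$status" = "COMPLIANT" ] && return 0
    i=$((i + 1))
    # Pause only if another attempt remains.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

Running `rule_compliance` right after Step 4 should print NON_COMPLIANT once the instance has been evaluated; `wait_for_compliance` is meant for after Step 5.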

Concept Review

| Feature | Description | Operational Excellence Value |
| --- | --- | --- |
| AWS Config | Continuous monitoring and assessment service. | Provides visibility and audit trails for the "Check" phase. |
| SSM Automation | Executes common maintenance and deployment tasks. | Reduces human error through "Operations as Code." |
| Remediation | Automatic trigger of actions based on compliance drift. | Minimizes Mean Time to Repair (MTTR). |

Troubleshooting

| Issue | Likely Cause | Solution |
| --- | --- | --- |
| Rule stays "Compliant" | Recorder is not active. | Ensure the Configuration Recorder status is ON. |
| Remediation fails | Missing IAM permissions. | Ensure the SSM service role has ec2:CreateTags permissions. |
| Resource not found | Regional mismatch. | Ensure Config, SSM, and EC2 are all in the same region. |
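
Each of the three issues has a quick CLI check. The helpers below are a sketch with hypothetical function names; they only read state and change nothing.

```bash
# Rule stays "Compliant": is the recorder actually recording?
recorder_is_recording() {
  local recording
  recording=$(aws configservice describe-configuration-recorder-status \
    --query 'ConfigurationRecordersStatus[0].recording' \
    --output text) || return 2
  case "$recording" in
    True|true) return 0 ;;
    *)         return 1 ;;
  esac
}

# Remediation fails: list failed Automation executions with their messages.
failed_automations() {
  aws ssm describe-automation-executions \
    --filters Key=ExecutionStatus,Values=Failed \
    --query 'AutomationExecutionMetadataList[].[AutomationExecutionId,DocumentName,FailureMessage]' \
    --output text
}

# Resource not found: which region does the CLI use by default?
current_region() {
  aws configure get region
}
```

For example, `recorder_is_recording && echo "recorder on" || echo "recorder off"` gives a one-line answer for the first row.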

Stretch Challenge

Instead of just tagging the instance, modify the remediation to stop any EC2 instance that is launched without a Department tag. Use the SSM document AWS-StopEC2Instance. This enforces the kind of strict governance policy common in enterprise environments.

Cost Estimate

[!IMPORTANT] Remember to run the teardown commands to avoid ongoing charges.

  • AWS Config: $0.003 per configuration item recorded. Estimated: <$0.05.
  • EC2: t2.micro is Free Tier eligible. Non-free: ~$0.0116/hr. Estimated: <$0.02.
  • S3: Negligible for this volume of data. Estimated: $0.00.
  • Total: Under $0.10 USD for 30 minutes of lab time.
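
The per-item and hourly rates above can be turned into a quick estimate. In this sketch the count of 10 configuration items is an illustrative assumption, and the EC2 line uses the non-Free-Tier rate; the function name is hypothetical.

```bash
# Sketch: estimate lab cost from the rates listed above.
estimate_lab_cost() {
  local config_items="$1" hours="$2"
  awk -v items="$config_items" -v hours="$hours" 'BEGIN {
    config = items * 0.003     # USD per configuration item recorded
    ec2    = hours * 0.0116    # USD per hour for t2.micro outside Free Tier
    printf "config=%.4f ec2=%.4f total=%.4f\n", config, ec2, config + ec2
  }'
}

# 10 recorded items (assumed) and 30 minutes of instance time.
estimate_lab_cost 10 0.5
# prints: config=0.0300 ec2=0.0058 total=0.0358
```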

Clean-Up / Teardown

  1. Terminate the EC2 Instance:
    bash
    aws ec2 terminate-instances --instance-ids <YOUR_INSTANCE_ID>
  2. Delete the Config Rule:
    bash
    aws configservice delete-config-rule --config-rule-name check-ec2-tags
  3. Delete the S3 Bucket:
    bash
    aws s3 rb s3://$BUCKET_NAME --force
  4. Stop the Configuration Recorder (Optional): If you won't use Config again soon, stop the recorder to avoid costs from other resource changes in your account.
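
The Config-side teardown, including the optional step 4, might be scripted as below. This sketch assumes the remediation action from Step 5 should be detached before the rule is deleted, since AWS Config may refuse to delete a rule that still has one attached, and it looks up the recorder name rather than guessing it. The function name is hypothetical.

```bash
# Sketch: remove the lab's Config resources and stop recording.
teardown_config() {
  local recorder_name

  # Detach the remediation action before deleting the rule.
  aws configservice delete-remediation-configuration \
    --config-rule-name check-ec2-tags

  aws configservice delete-config-rule \
    --config-rule-name check-ec2-tags

  # Find the recorder's name, then stop it (optional step 4).
  recorder_name=$(aws configservice describe-configuration-recorders \
    --query 'ConfigurationRecorders[0].name' \
    --output text) || return 1
  aws configservice stop-configuration-recorder \
    --configuration-recorder-name "$recorder_name"
}
```

Skip the last command by editing the function if the recorder was already in use before the lab.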

[!WARNING] Failure to terminate the EC2 instance may result in hourly charges if you have exceeded your Free Tier limit.
