AWS Lab: Implementing Reliable Architectures with Auto Scaling and Load Balancing

This lab focuses on Task 3.4: Determining a strategy to improve reliability from the AWS SAP-C02 curriculum. You will modernize a single-instance workload into a multi-AZ, self-healing architecture using the AWS Well-Architected Framework's reliability principles.

Prerequisites

An AWS Account with permissions for EC2, Auto Scaling, and ELB.
AWS CLI installed and configured (aws configure).
Basic familiarity with VPC concepts (subnets, security groups).
A default VPC in the Region you are working in (for example, us-east-1 or us-west-2).

Learning Objectives

Design for Failure: Implement a multi-AZ architecture to eliminate single points of failure.
Horizontal Scaling: Configure an Auto Scaling Group (ASG) to manage capacity automatically.
Automated Recovery: Use Health Checks to trigger the replacement of unhealthy instances.
Traffic Distribution: Deploy an Application Load Balancer (ALB) to route traffic across healthy targets.

Architecture Overview

We are moving from a fragile "Single Instance" model to a resilient "Load Balanced ASG" model.

Step-by-Step Instructions

Step 1: Create a Resilient Security Group

This group allows inbound HTTP (port 80) traffic. Later steps attach it to both the web servers and the ALB.

bash

# Create the Security Group
aws ec2 create-security-group \
    --group-name "brainybee-lab-sg" \
    --description "Allow HTTP traffic for reliability lab" \
    --vpc-id <YOUR_VPC_ID>

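# Authorize Port 80 access (use the GroupId returned by the previous command;
# --group-name only resolves for groups in the default VPC)
aws ec2 authorize-security-group-ingress \
    --group-id <YOUR_SG_ID> \
    --protocol tcp \
    --port 80 \
    --cidr 0.0.0.0/0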

Console alternative

Navigate to EC2 > Security Groups > Create security group. Add an Inbound Rule for HTTP (80) from Anywhere (0.0.0.0/0).
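The later steps expect values for `<YOUR_VPC_ID>`, `<YOUR_SG_ID>`, and the two subnet placeholders. Rather than copying them by hand, they can be looked up from the CLI. The following is a minimal sketch, assuming the lab runs in the default VPC and the security group keeps the name used above; the function names (`default_vpc_id`, `lab_sg_id`, `list_subnets`) are hypothetical helpers, not part of the lab.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the original lab) that look up the IDs
# used for <YOUR_VPC_ID>, <YOUR_SG_ID>, and <SUBNET_AZ_A>/<SUBNET_AZ_B>.

# Print the ID of the default VPC in the configured Region.
default_vpc_id() {
    aws ec2 describe-vpcs \
        --filters "Name=is-default,Values=true" \
        --query "Vpcs[0].VpcId" \
        --output text
}

# Print the ID of the lab security group inside the given VPC.
lab_sg_id() {
    local vpc_id="$1"
    aws ec2 describe-security-groups \
        --filters "Name=group-name,Values=brainybee-lab-sg" \
                  "Name=vpc-id,Values=${vpc_id}" \
        --query "SecurityGroups[0].GroupId" \
        --output text
}

# Print one "subnet-id <TAB> availability-zone" pair per line for the VPC.
list_subnets() {
    local vpc_id="$1"
    aws ec2 describe-subnets \
        --filters "Name=vpc-id,Values=${vpc_id}" \
        --query "Subnets[].[SubnetId,AvailabilityZone]" \
        --output text
}
```

A typical session would run `VPC_ID=$(default_vpc_id)`, then `SG_ID=$(lab_sg_id "$VPC_ID")`, and finally `list_subnets "$VPC_ID"` to pick two subnets that sit in different Availability Zones.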

Step 2: Define a Launch Template

The Launch Template ensures that every new instance is configured identically, supporting the reliability principle of "Manage change through automation."

bash

# Create a launch template with a simple web server userdata.
# AMI IDs are Region-specific: replace the ImageId below with a current
# Amazon Linux AMI from your Region (the UserData script relies on yum).
aws ec2 create-launch-template \
    --launch-template-name "ReliabilityTemplate" \
    --version-description "v1" \
    --launch-template-data '{"NetworkInterfaces":[{"DeviceIndex":0,"Groups":["<YOUR_SG_ID>"],"AssociatePublicIpAddress":true}],"ImageId":"ami-0c55b159cbfafe1f0","InstanceType":"t2.micro","UserData":"IyEvYmluL2Jhc2gKeXVtIGluc3RhbGwgLXkgaHR0cGQKc3lzdGVtY3RsIHN0YXJ0IGh0dHBkCnN5c3RlbWN0bCBlbmFibGUgaHR0cGQKZWNobyBcIkhlbGxvIGZyb20gJChob3N0bmFtZSAtZilcIiA+IC92YXIvd3d3L2h0bWwvaW5kZXguaHRtbA=="}'

[!NOTE] The UserData is Base64 encoded. It installs Apache, starts and enables the service, and creates a page showing the instance's hostname.
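Because the UserData is opaque in its encoded form, it helps to be able to decode it for review and to re-encode an edited script. The sketch below shows one way to do both, plus a lookup for a current AMI ID. The function names are hypothetical, and `latest_al2_ami` assumes Amazon Linux 2 is the intended operating system (the lab does not state which image the hard-coded ID refers to).

```bash
#!/usr/bin/env bash
# Sketch: inspect and regenerate the Base64 UserData string from Step 2.

# Decode a Base64 UserData string back into the shell script it contains.
decode_user_data() {
    printf '%s' "$1" | base64 --decode
}

# Encode a script read from stdin as single-line Base64 for the "UserData"
# field. tr removes the line wrapping that some base64 builds insert.
encode_user_data() {
    base64 | tr -d '\n'
}

# Assumption: Amazon Linux 2 is the intended OS. AMI IDs differ per Region,
# so this reads the current ID from the public SSM parameter instead of
# hard-coding one.
latest_al2_ami() {
    aws ssm get-parameter \
        --name "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2" \
        --query "Parameter.Value" \
        --output text
}
```

For example, passing the `UserData` value from the launch template to `decode_user_data` prints the bootstrap script, and `encode_user_data < my-script.sh` (where `my-script.sh` is a placeholder file name) produces a replacement string.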

Step 3: Deploy the Application Load Balancer (ALB)

The ALB is critical for "Scaling horizontally to increase aggregate workload availability."

bash

# 1. Create Target Group
aws elbv2 create-target-group \
    --name "lab-tg" \
    --protocol HTTP \
    --port 80 \
    --vpc-id <YOUR_VPC_ID> \
    --health-check-path "/"

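# 2. Create Load Balancer (select at least two subnets in different AZs)
aws elbv2 create-load-balancer \
    --name "lab-alb" \
    --subnets <SUBNET_AZ_A> <SUBNET_AZ_B> \
    --security-groups <YOUR_SG_ID>

# 3. Create a Listener that forwards HTTP traffic to the Target Group
#    (without a listener the ALB accepts no connections)
aws elbv2 create-listener \
    --load-balancer-arn <YOUR_ALB_ARN> \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=<YOUR_TG_ARN>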
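The Target Group and Load Balancer ARNs are needed again in later commands (`<YOUR_TG_ARN>`, `<YOUR_ALB_ARN>`), and a new ALB spends a short time provisioning before it serves traffic. As a rough sketch, the hypothetical helpers below look up both ARNs by name and wait for the ALB to become available before printing its DNS name.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for resolving the ARNs created in Step 3.

# Print the ARN of a target group, given its name (e.g. "lab-tg").
tg_arn() {
    aws elbv2 describe-target-groups --names "$1" \
        --query "TargetGroups[0].TargetGroupArn" --output text
}

# Print the ARN of a load balancer, given its name (e.g. "lab-alb").
alb_arn() {
    aws elbv2 describe-load-balancers --names "$1" \
        --query "LoadBalancers[0].LoadBalancerArn" --output text
}

# Block until the load balancer reports it is available,
# then print its DNS name.
alb_dns_when_ready() {
    local arn
    arn=$(alb_arn "$1") || return 1
    aws elbv2 wait load-balancer-available --load-balancer-arns "$arn" || return 1
    aws elbv2 describe-load-balancers --load-balancer-arns "$arn" \
        --query "LoadBalancers[0].DNSName" --output text
}
```

Usage might look like `TG_ARN=$(tg_arn lab-tg)` followed by `alb_dns_when_ready lab-alb`.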

Step 4: Create the Auto Scaling Group (ASG)

This step implements "Stop guessing capacity" and "Automatically recover from failure."

bash

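# --health-check-type ELB makes the ASG replace instances that fail the ALB
# health check, not only those that fail EC2 status checks. The grace period
# gives the UserData script time to install and start Apache.
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "reliability-asg" \
    --launch-template "LaunchTemplateName=ReliabilityTemplate" \
    --min-size 2 \
    --max-size 4 \
    --desired-capacity 2 \
    --vpc-zone-identifier "<SUBNET_AZ_A>,<SUBNET_AZ_B>" \
    --target-group-arns "<YOUR_TG_ARN>" \
    --health-check-type ELB \
    --health-check-grace-period 300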
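Once the group exists, it is worth confirming that the two instances really landed in different Availability Zones, since that is what removes the single point of failure. A minimal sketch with hypothetical helper names:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the instances an ASG has launched.

# Print "instance-id  az  lifecycle-state  health-status", one line per instance.
asg_instances() {
    aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$1" \
        --query "AutoScalingGroups[0].Instances[].[InstanceId,AvailabilityZone,LifecycleState,HealthStatus]" \
        --output text
}

# Print the number of distinct Availability Zones the instances span.
asg_az_count() {
    asg_instances "$1" | awk '{print $2}' | sort -u | wc -l | tr -d ' '
}
```

Running `asg_instances reliability-asg` lists the instances, and `asg_az_count reliability-asg` should report 2 when the instances are spread across both subnets.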

Checkpoints

Traffic Verification: Copy the DNS Name of your ALB from the console or from the output of aws elbv2 describe-load-balancers. Paste it into your browser. Refresh several times; you should see the hostname change as the ALB distributes requests across the instances.
Failure Simulation:
- Go to the EC2 console and Terminate one of the running instances.
- Observe the ASG Activity tab. Within a few minutes the ASG should detect that the instance is gone or unhealthy and automatically launch a replacement to restore the desired capacity of 2.
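Both checkpoints can also be run from a terminal. The sketch below is one possible approach, not the lab's prescribed method: `sample_backends` requests the page several times and tallies which hostname answered, and `recent_activities` prints the latest scaling activities so the replacement launch can be observed. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for the two checkpoints.

# Request the page N times (default 10) and print how many
# responses each distinct page body (i.e. each backend) produced.
sample_backends() {
    local dns="$1" count="${2:-10}" i=0
    while [ "$i" -lt "$count" ]; do
        curl -s "http://${dns}/"
        echo                      # guard against a body without a trailing newline
        i=$((i + 1))
    done | grep -v '^$' | sort | uniq -c
}

# Print start time, status, and description of the most recent
# scaling activities for the given ASG.
recent_activities() {
    aws autoscaling describe-scaling-activities \
        --auto-scaling-group-name "$1" \
        --max-items 5 \
        --query "Activities[].[StartTime,StatusCode,Description]" \
        --output text
}
```

With two healthy instances, `sample_backends <ALB_DNS_NAME> 10` should list two different hostnames. After terminating an instance, rerunning `recent_activities reliability-asg` should eventually show a new launch activity.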

Visualizing Health Checks

[Diagram not rendered: the interaction between the Load Balancer and the Target Group's health check mechanism.]
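The health state that the load balancer has recorded for each target can be read directly from the Target Group. The following sketch (hypothetical function names) lists each target with its state, such as `initial`, `healthy`, `unhealthy`, or `draining`, and counts the healthy ones.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for reading target health from a Target Group.

# Print "target-id  state  reason" for every registered target.
# The reason column is "None" for healthy targets.
target_health() {
    aws elbv2 describe-target-health --target-group-arn "$1" \
        --query "TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]" \
        --output text
}

# Print the number of targets whose state is exactly "healthy".
healthy_target_count() {
    target_health "$1" | awk '$2 == "healthy"' | wc -l | tr -d ' '
}
```

Calling `target_health <YOUR_TG_ARN>` right after terminating an instance is a convenient way to watch a target drain and its replacement move from `initial` to `healthy`.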

Troubleshooting

Issue	Possible Cause	Fix
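ALB returns 503	No registered targets (unhealthy targets more often produce 502 or 504)	Confirm the ASG is attached to the Target Group; check if UserData script failed; check SG rules between ALB and EC2.
ASG not launching	Service Quota reached	See Well-Architected question `REL 1` (managing service quotas and constraints). Request an EC2 instance quota increase through Service Quotas.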
Health check fails	Port mismatch	Ensure Target Group is checking port 80 and Apache is running.

Challenge

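Implement a Dynamic Scaling Policy: Currently, the ASG stays at 2 instances. Attach a Target Tracking Policy that keeps the group's average CPU utilization at about 70%, so that the ASG adds instances (up to the maximum of 4) when load pushes CPU above the target and removes them again when load drops.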

Hint

Use aws autoscaling put-scaling-policy with the --policy-type TargetTrackingScaling parameter.
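One possible shape for that command is sketched below. The policy name `cpu-target-tracking` and the wrapper function are illustrative assumptions; `ASGAverageCPUUtilization` is the predefined metric for the group's average CPU.

```bash
#!/usr/bin/env bash
# Sketch of a target tracking policy for the challenge.
# The policy name and function name are illustrative, not mandated by the lab.

# Attach a target tracking policy to the given ASG.
# $1 = ASG name, $2 = target average CPU percentage (default 70.0).
add_cpu_target_tracking() {
    local asg_name="$1" target="${2:-70.0}"
    aws autoscaling put-scaling-policy \
        --auto-scaling-group-name "$asg_name" \
        --policy-name "cpu-target-tracking" \
        --policy-type TargetTrackingScaling \
        --target-tracking-configuration "{\"PredefinedMetricSpecification\":{\"PredefinedMetricType\":\"ASGAverageCPUUtilization\"},\"TargetValue\":${target}}"
}
```

Running `add_cpu_target_tracking reliability-asg` attaches the policy. Target tracking creates and manages the underlying CloudWatch alarms itself, so no separate alarm setup is needed.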

Cost Estimate

Service	Configuration	Estimated Cost (Monthly)
ALB	1 Load Balancer + LCU usage	~$16.00
EC2	2 x t2.micro (on-demand)	~$18.00 (Free Tier eligible)
Data Transfer	Intra-AZ	Free
Total		~$34.00 (less while Free Tier benefits apply)

Teardown

[!WARNING] To avoid ongoing charges, you must delete these resources in order.

bash

# 1. Delete ASG (this terminates the instances)
aws autoscaling delete-auto-scaling-group --auto-scaling-group-name "reliability-asg" --force-delete

# 2. Delete ALB
aws elbv2 delete-load-balancer --load-balancer-arn <YOUR_ALB_ARN>

# 3. Delete Target Group
aws elbv2 delete-target-group --target-group-arn <YOUR_TG_ARN>

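# 4. Delete Launch Template
aws ec2 delete-launch-template --launch-template-name "ReliabilityTemplate"

# 5. Delete Security Group (succeeds only after the instances and ALB are fully deleted)
aws ec2 delete-security-group --group-id <YOUR_SG_ID>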
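The forced ASG deletion returns before the instances have finished terminating, so anything that depends on them being gone, such as removing the security group they use, can fail if attempted immediately. A minimal polling sketch, with hypothetical function names and arbitrary default retry values:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for confirming that the ASG is really gone.

# Print how many ASGs with the given name still exist (0 once deleted).
remaining_asgs() {
    aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$1" \
        --query "length(AutoScalingGroups)" \
        --output text
}

# Poll until the ASG no longer exists.
# $1 = ASG name, $2 = max attempts (default 30), $3 = seconds between attempts (default 10).
# Returns 0 when the group is gone, 1 if it still exists after all attempts.
wait_for_asg_deletion() {
    local name="$1" tries="${2:-30}" delay="${3:-10}"
    while [ "$tries" -gt 0 ]; do
        [ "$(remaining_asgs "$name")" = "0" ] && return 0
        sleep "$delay"
        tries=$((tries - 1))
    done
    return 1
}
```

For instance, `wait_for_asg_deletion reliability-asg && echo "ASG removed"` blocks for up to roughly five minutes with the defaults.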

Concept Review

Reliability Principle	How we applied it in this lab
Test Recovery Procedures	We manually terminated an instance to ensure the ASG replaced it.
Scale Horizontally	We used an ALB to distribute traffic across multiple EC2 instances.
Stop Guessing Capacity	We defined Min/Max/Desired settings for the ASG.
Manage Change via Automation	We used a Launch Template to ensure consistent environment deployment.
