AWS Lab: Implementing Reliable Architectures with Auto Scaling and Load Balancing
Determine a strategy to improve reliability
This lab focuses on Task 3.4: Determining a strategy to improve reliability from the AWS SAP-C02 curriculum. You will modernize a single-instance workload into a multi-AZ, self-healing architecture using the AWS Well-Architected Framework's reliability principles.
Prerequisites
- An AWS Account with permissions for EC2, Auto Scaling, and ELB.
- AWS CLI installed and configured (`aws configure`).
- Basic familiarity with VPC concepts (subnets, security groups).
- A default VPC in your region (usually `us-east-1` or `us-west-2`).
Learning Objectives
- Design for Failure: Implement a multi-AZ architecture to eliminate single points of failure.
- Horizontal Scaling: Configure an Auto Scaling Group (ASG) to manage capacity automatically.
- Automated Recovery: Use Health Checks to trigger the replacement of unhealthy instances.
- Traffic Distribution: Deploy an Application Load Balancer (ALB) to route traffic across healthy targets.
Architecture Overview
We are moving from a fragile "Single Instance" model to a resilient "Load Balanced ASG" model.
Step-by-Step Instructions
Step 1: Create a Resilient Security Group
This group will allow HTTP traffic to our web servers.
# Create the Security Group
aws ec2 create-security-group \
--group-name "brainybee-lab-sg" \
--description "Allow HTTP traffic for reliability lab" \
--vpc-id <YOUR_VPC_ID>
# Authorize Port 80 access
aws ec2 authorize-security-group-ingress \
--group-name "brainybee-lab-sg" \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
Console alternative
Navigate to EC2 > Security Groups > Create security group. Add an Inbound Rule for HTTP (80) from Anywhere (0.0.0.0/0).
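The later steps use the placeholders <YOUR_VPC_ID>, <YOUR_SG_ID>, <SUBNET_AZ_A>, and <SUBNET_AZ_B>. One possible way to resolve them from the CLI is sketched below. The function names are hypothetical helpers, not part of the lab, and the sketch assumes the default VPC from the prerequisites and the security group name used above.

```shell
# Hypothetical helpers (not part of the original lab) for filling in the
# placeholders used in later steps. Assumes a default VPC exists.

# Print the ID of the default VPC in the configured region.
default_vpc_id() {
  aws ec2 describe-vpcs \
    --filters "Name=is-default,Values=true" \
    --query 'Vpcs[0].VpcId' --output text
}

# Print "subnet-id  availability-zone" pairs for the given VPC, one per line.
vpc_subnets() {
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$1" \
    --query 'Subnets[].[SubnetId,AvailabilityZone]' --output text
}

# Print the ID of the lab security group inside the given VPC.
lab_sg_id() {
  aws ec2 describe-security-groups \
    --filters "Name=vpc-id,Values=$1" "Name=group-name,Values=brainybee-lab-sg" \
    --query 'SecurityGroups[0].GroupId' --output text
}

# Example usage:
#   VPC_ID=$(default_vpc_id)
#   vpc_subnets "$VPC_ID"          # pick two subnets in different AZs
#   SG_ID=$(lab_sg_id "$VPC_ID")
```

Choosing the two subnets by hand keeps the "different AZs" requirement visible instead of hiding it in a query.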
Step 2: Define a Launch Template
The Launch Template ensures that every new instance is configured identically, supporting the reliability principle of "Manage change through automation."
# Create a launch template with a simple web server userdata
aws ec2 create-launch-template \
--launch-template-name "ReliabilityTemplate" \
--version-description "v1" \
--launch-template-data '{"NetworkInterfaces":[{"DeviceIndex":0,"Groups":["<YOUR_SG_ID>"],"AssociatePublicIpAddress":true}],"ImageId":"ami-0c55b159cbfafe1f0","InstanceType":"t2.micro","UserData":"IyEvYmluL2Jhc2gKeXVtIGluc3RhbGwgLXkgaHR0cGQKc3lzdGVtY3RsIHN0YXJ0IGh0dHBkCnN5c3RlbWN0bCBlbmFibGUgaHR0cGQKZWNobyBcIkhlbGxvIGZyb20gJChob3N0bmFtZSAtZilcIiA+IC92YXIvd3d3L2h0bWwvaW5kZXguaHRtbA=="}'
[!NOTE] The UserData is Base64 encoded. It installs Apache, starts and enables the service, and writes a page showing the instance's hostname. AMI IDs are region-specific, so replace the ImageId if this one is not available in your region.
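Two parts of the template are worth checking before you rely on it: what the UserData actually runs, and whether the AMI ID exists in your region. The sketch below decodes the UserData string from the template and shows one possible way to look up a current Amazon Linux 2 AMI through the public SSM parameter. The function names are hypothetical, and the AMI lookup assumes your credentials also allow `ssm:GetParameter`, which the prerequisites do not list.

```shell
# UserData string copied from the launch template above.
USER_DATA_B64='IyEvYmluL2Jhc2gKeXVtIGluc3RhbGwgLXkgaHR0cGQKc3lzdGVtY3RsIHN0YXJ0IGh0dHBkCnN5c3RlbWN0bCBlbmFibGUgaHR0cGQKZWNobyBcIkhlbGxvIGZyb20gJChob3N0bmFtZSAtZilcIiA+IC92YXIvd3d3L2h0bWwvaW5kZXguaHRtbA=='

# Decode a Base64 string (first argument) to plain text.
decode_user_data() {
  printf '%s' "$1" | base64 --decode
}

# Encode a script read from stdin as single-line Base64
# (some base64 implementations wrap long output, so newlines are removed).
encode_user_data() {
  base64 | tr -d '\n'
}

# Assumption: requires ssm:GetParameter. Prints the latest Amazon Linux 2
# x86_64 AMI ID for the configured region.
latest_al2_ami() {
  aws ssm get-parameter \
    --name "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2" \
    --query 'Parameter.Value' --output text
}

# Show the script the instances will run at first boot.
decode_user_data "$USER_DATA_B64"
```

Decoding shows that the quotes in the echo line are backslash-escaped, so the served page includes literal double quotes around the greeting. To ship a modified script, pipe it through `encode_user_data` and paste the result into the UserData field.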
Step 3: Deploy the Application Load Balancer (ALB)
The ALB is critical for "Scaling horizontally to increase aggregate workload availability."
# 1. Create Target Group
aws elbv2 create-target-group \
--name "lab-tg" \
--protocol HTTP \
--port 80 \
--vpc-id <YOUR_VPC_ID> \
--health-check-path "/"
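# 2. Create Load Balancer (select at least two subnets in different AZs)
aws elbv2 create-load-balancer \
--name "lab-alb" \
--subnets <SUBNET_AZ_A> <SUBNET_AZ_B> \
--security-groups <YOUR_SG_ID>
# 3. Create Listener (without one, the ALB does not forward any requests to the target group)
aws elbv2 create-listener \
--load-balancer-arn <YOUR_ALB_ARN> \
--protocol HTTP \
--port 80 \
--default-actions Type=forward,TargetGroupArn=<YOUR_TG_ARN>
Step 4: Create the Auto Scaling Group (ASG)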
This step implements "Stop guessing capacity" and "Automatically recover from failure."
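The command in this step takes <YOUR_TG_ARN>, and the teardown later needs <YOUR_ALB_ARN>. One way to look both up by the names used in Step 3 is sketched here; the function names are hypothetical helpers, not part of the lab.

```shell
# Hypothetical helpers: resolve ARNs and the DNS name from the resource
# names ("lab-tg", "lab-alb") created in Step 3.

# Print the ARN of the lab target group.
lab_tg_arn() {
  aws elbv2 describe-target-groups --names "lab-tg" \
    --query 'TargetGroups[0].TargetGroupArn' --output text
}

# Print the ARN of the lab load balancer.
lab_alb_arn() {
  aws elbv2 describe-load-balancers --names "lab-alb" \
    --query 'LoadBalancers[0].LoadBalancerArn' --output text
}

# Print the public DNS name of the lab load balancer.
lab_alb_dns() {
  aws elbv2 describe-load-balancers --names "lab-alb" \
    --query 'LoadBalancers[0].DNSName' --output text
}

# Example usage:
#   TG_ARN=$(lab_tg_arn)
#   ALB_ARN=$(lab_alb_arn)
```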
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name "reliability-asg" \
--launch-template "LaunchTemplateName=ReliabilityTemplate" \
--min-size 2 \
--max-size 4 \
--desired-capacity 2 \
--vpc-zone-identifier "<SUBNET_AZ_A>,<SUBNET_AZ_B>" \
--target-group-arns "<YOUR_TG_ARN>"
Checkpoints
- Traffic Verification: Copy the DNS Name of your ALB from the console or `aws elbv2 describe-load-balancers`. Paste it into your browser. Refresh several times; you should see the hostname change as the ALB alternates between instances.
- Failure Simulation:
  - Go to the EC2 console and Terminate one of the running instances.
  - Observe the ASG Activity tab. It should detect the "unhealthy" state and automatically launch a replacement.
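Both checkpoints can also be approximated from a terminal. The sketch below assumes `curl` is installed; the function names are hypothetical helpers, not part of the lab.

```shell
# Print each registered target with its health state
# (for example: initial, healthy, unhealthy, draining).
target_health() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State]' \
    --output text
}

# Print the five most recent scaling activities for the lab ASG
# (a CLI view of what the Activity tab shows).
recent_activities() {
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "reliability-asg" \
    --query 'Activities[:5].[StartTime,StatusCode,Description]' \
    --output text
}

# Request a URL N times (default 10) and count how often each distinct
# response body appears.
sample_responses() {
  url=$1
  n=${2:-10}
  i=0
  while [ "$i" -lt "$n" ]; do
    # Command substitution strips trailing newlines; printf adds exactly one.
    printf '%s\n' "$(curl -s --max-time 5 "$url")"
    i=$((i + 1))
  done | sort | uniq -c
}

# Example usage:
#   target_health "<YOUR_TG_ARN>"
#   sample_responses "http://<YOUR_ALB_DNS_NAME>/" 20
#   recent_activities
```

With two healthy targets, `sample_responses` should list two different hostnames. After you terminate an instance, `recent_activities` should show a terminate entry followed by a launch entry.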
Visualizing Health Checks
This TikZ diagram illustrates the interaction between the Load Balancer and the Target Group's health check mechanism.
\begin{tikzpicture}[node distance=2cm, every node/.style={fill=white, font=\footnotesize}, scale=0.8]
  \draw[fill=gray!10] (-1,-1) rectangle (8,3);
  \node (ALB) at (0,1) [draw, rectangle, minimum width=1.5cm] {ALB};
  \node (TG) at (3.5,1) [draw, rounded corners] {Target Group};
  \node (EC2) at (7,1) [draw, circle] {EC2};
  \draw[->, thick] (ALB) -- (TG) node[midway, above] {Route};
  \draw[->, thick] (TG) -- (EC2) node[midway, above] {Request};
  \draw[->, blue, dashed] (TG) to [bend left=45] node[midway, above] {HTTP GET /} (EC2);
  \draw[<-, green!60!black, dashed] (TG) to [bend right=45] node[midway, below] {200 OK} (EC2);
  \node at (3.5, -0.5) {\textbf{Health Check Loop}};
\end{tikzpicture}
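The loop in the diagram is the load balancer's health check. By default an Auto Scaling group replaces instances based on EC2 status checks only, so a running instance whose web server has stopped is taken out of rotation by the ALB but not replaced. A minimal sketch of opting the lab's ASG into the load balancer's health checks follows; the function name is hypothetical and the 120-second grace period is an assumed value, not one taken from the lab.

```shell
# Make the ASG treat target group health check failures as instance failures.
# The grace period (seconds, default 120 here) is an assumption; it should
# cover instance boot plus the time the UserData script needs to start Apache.
enable_elb_health_checks() {
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "reliability-asg" \
    --health-check-type ELB \
    --health-check-grace-period "${1:-120}"
}

# Example usage:
#   enable_elb_health_checks        # 120-second grace period
#   enable_elb_health_checks 300    # longer grace period for slow boots
```

A grace period that is too short can cause the ASG to terminate instances that are still installing packages, so err on the longer side.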
Troubleshooting
| Issue | Possible Cause | Fix |
|---|---|---|
| ALB returns 503 | No healthy targets | Check if UserData script failed; check SG rules between ALB and EC2. |
| ASG not launching | Service quota reached | See REL 1 (managing service quotas and constraints) in the Well-Architected Reliability Pillar; request an increase to your EC2 instance quota. |
| Health check fails | Port mismatch | Ensure Target Group is checking port 80 and Apache is running. |
Challenge
Implement a Dynamic Scaling Policy: Currently, the ASG stays at 2 instances. Attach a Target Tracking Policy that holds average CPU utilization near 70%, so the ASG adds instances when load pushes the average above that target (up to the configured maximum of 4).
Hint
Use `aws autoscaling put-scaling-policy` with the `--policy-type TargetTrackingScaling` parameter.
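If you want to compare your answer, one possible solution is sketched below. The function name and the policy name `cpu70-target-tracking` are hypothetical; the metric is the predefined ASG average CPU metric.

```shell
# One possible solution sketch. The policy name is hypothetical.
# Target tracking adds or removes instances to keep the ASG's average
# CPU utilization near TargetValue, within the ASG's min/max bounds.
put_cpu_target_tracking() {
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name "reliability-asg" \
    --policy-name "cpu70-target-tracking" \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
      "PredefinedMetricSpecification": {
        "PredefinedMetricType": "ASGAverageCPUUtilization"
      },
      "TargetValue": 70.0
    }'
}

# Example usage:
#   put_cpu_target_tracking
```

Target tracking also scales in when CPU stays well below the target, so the group returns toward its minimum of 2 once load drops.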
Cost Estimate
| Service | Configuration | Estimated Cost (Monthly) |
|---|---|---|
| ALB | 1 Load Balancer + LCU usage | ~$16.00 |
| EC2 | 2 x t2.micro (on-demand) | ~$18.00 (Free Tier eligible) |
| Data Transfer | Intra-AZ | Free |
| Total | | ~$34.00 (or $0 if Free Tier eligible) |
Teardown
[!WARNING] To avoid ongoing charges, you must delete these resources in order.
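# 1. Delete ASG (this terminates the instances)
aws autoscaling delete-auto-scaling-group --auto-scaling-group-name "reliability-asg" --force-delete
# 2. Delete ALB, then wait until the deletion completes
aws elbv2 delete-load-balancer --load-balancer-arn <YOUR_ALB_ARN>
aws elbv2 wait load-balancers-deleted --load-balancer-arns <YOUR_ALB_ARN>
# 3. Delete Target Group
aws elbv2 delete-target-group --target-group-arn <YOUR_TG_ARN>
# 4. Delete Launch Template
aws ec2 delete-launch-template --launch-template-name "ReliabilityTemplate"
# 5. Delete Security Group (fails with a dependency error until the instances and ALB are fully removed; retry after a minute or two)
aws ec2 delete-security-group --group-id <YOUR_SG_ID>
Concept Review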
| Reliability Principle | How we applied it in this lab |
|---|---|
| Test Recovery Procedures | We manually terminated an instance to ensure the ASG replaced it. |
| Scale Horizontally | We used an ALB to distribute traffic across multiple EC2 instances. |
| Stop Guessing Capacity | We defined Min/Max/Desired settings for the ASG. |
| Manage Change via Automation | We used a Launch Template to ensure consistent environment deployment. |