Mastering Application Health Checks and Readiness Probes

Configure application health checks and readiness probes

[!IMPORTANT] For the AWS Certified Developer - Associate (DVA-C02) exam, understanding health checks is critical for ensuring high availability and automating the replacement of failing application components.

Learning Objectives

By the end of this guide, you will be able to:

  • Differentiate between Liveness and Readiness concepts in AWS environments.
  • Configure Application Load Balancer (ALB) and Network Load Balancer (NLB) target group health checks.
  • Implement a robust health check endpoint within your application code.
  • Define the relationship between Auto Scaling Group (ASG) health checks and Load Balancer health checks.
  • Configure Route 53 health checks for DNS-level failover.

Key Terms & Glossary

  • Health Check: A periodic request sent by an AWS service (like ELB) to an application instance to verify its status.
  • Readiness Probe: A check to determine if an application is ready to accept traffic (e.g., after it has finished loading a cache or connecting to a database).
  • Grace Period: The amount of time an Auto Scaling group waits before checking the health status of a newly launched instance.
  • Healthy/Unhealthy Threshold: The number of consecutive successful or failed probes required to change an instance's status.
  • Deep Health Check: A health check that validates not just the web server, but also downstream dependencies like databases or external APIs.
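The threshold definition above can be illustrated with a small simulation. This is only a sketch of the counting logic, not how ELB is implemented internally; the `TargetHealth` class, its field names, and the assumed starting state are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    """Tracks one target's status from a stream of probe results."""
    healthy_threshold: int = 5
    unhealthy_threshold: int = 2
    healthy: bool = True  # assumed starting state for this sketch
    streak: int = 0       # consecutive results that contradict the current status

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the (possibly updated) status."""
        if probe_ok == self.healthy:
            # The result agrees with the current status, so the opposing streak resets.
            self.streak = 0
            return self.healthy
        self.streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self.streak >= needed:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy


target = TargetHealth()
history = [target.record(ok) for ok in [True, False, True, False, False]]
print(history)  # [True, True, True, True, False]
```

A single failed probe does not change the status; only two failures in a row (the unhealthy threshold used here) flip the target to unhealthy.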

The "Big Idea"

In a distributed system, individual components will inevitably fail. Health checks act as the "system pulse." Without them, a Load Balancer would blindly send traffic to a "zombie" instance—one that is running but unable to process requests—leading to user-facing errors. By effectively configuring health checks, you enable AWS to automatically route traffic away from failing components and trigger self-healing mechanisms.

Formula / Concept Box

| Parameter | Standard Default (ALB) | Purpose |
| --- | --- | --- |
| Path | / | The destination URL for the health check (e.g., /health). |
| Port | traffic-port | The port the load balancer uses to send health checks. |
| Healthy Threshold | 5 | Consecutive successes to mark as "Healthy". |
| Unhealthy Threshold | 2 | Consecutive failures to mark as "Unhealthy". |
| Interval | 30 seconds | Time between individual health check probes. |
| Timeout | 5 seconds | Time to wait for a response before counting as a failure. |
| Success Codes | 200 | The HTTP status code(s) indicating a healthy response. |
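As a rough illustration of how these parameters combine, the time to change a target's status is approximately the interval multiplied by the relevant threshold. The helper name `seconds_to_flip` is hypothetical, and the estimate ignores the timeout and where in the interval the first failure lands.

```python
def seconds_to_flip(interval_s: int, threshold: int) -> int:
    """Approximate time for `threshold` consecutive probes at a fixed interval."""
    return interval_s * threshold


# Using the ALB defaults from the table above.
time_to_unhealthy = seconds_to_flip(30, 2)
time_to_healthy = seconds_to_flip(30, 5)
print(time_to_unhealthy, time_to_healthy)  # 60 150
```

With the defaults, a failing target keeps receiving traffic for about a minute, while a recovered target waits about two and a half minutes before it is used again.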

Hierarchical Outline

  1. Elastic Load Balancing (ELB) Health Checks
    • Target Groups: Health checks are configured at the Target Group level, not the Load Balancer level.
    • Status Codes: You can specify a range (e.g., 200-399) for flexibility.
  2. Auto Scaling Group (ASG) Integration
    • EC2 Status Checks: The default check; it detects instance-level problems (hardware, hypervisor, OS reachability) but not application failures.
    • ELB Health Checks: Must be enabled manually. If the ELB marks an instance unhealthy, the ASG terminates and replaces it.
  3. Application-Level Implementation
    • Shallow Checks: Verifies only the web server is responding (e.g., static file).
    • Deep Checks: Verifies DB connectivity, memory usage, and background thread status.
  4. Route 53 Health Checks
    • Public Endpoints: Monitors public-facing IP addresses or domain names.
    • CloudWatch Alarms: Can trigger DNS failover based on alarm status.
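For the Route 53 item in the outline, a request for an endpoint health check might look like the sketch below. It only builds the parameters; the helper name, the domain, and the `/health` path are assumptions for illustration, while the keys follow the Route 53 `CreateHealthCheck` API.

```python
import uuid


def route53_health_check_request(domain: str, path: str = "/health") -> dict:
    """Build parameters for a Route 53 HTTPS endpoint health check."""
    return {
        # Must be unique per request so retries are not treated as new checks.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "Type": "HTTPS",
            "FullyQualifiedDomainName": domain,
            "Port": 443,
            "ResourcePath": path,
            "RequestInterval": 30,   # seconds between probes
            "FailureThreshold": 3,   # consecutive failures before "unhealthy"
        },
    }


request = route53_health_check_request("app.example.com")  # hypothetical domain
print(request["HealthCheckConfig"]["Type"])
```

Assuming boto3 is installed and credentials are configured, the dictionary could be passed as `boto3.client("route53").create_health_check(**request)`.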

Visual Anchors

Health Check Lifecycle

[Diagram: Health Check Lifecycle]

Load Balancer vs. Target Health

\begin{tikzpicture}[node distance=2cm, every node/.style={draw, rectangle, align=center, minimum width=2.5cm}]
  \node (User) [draw=none] {User};
  \node (ALB) [right of=User, xshift=1cm] {ALB};
  \node (H) [right of=ALB, yshift=1cm, xshift=1cm, fill=green!20] {Target A\\(Healthy)};
  \node (U) [right of=ALB, yshift=-1cm, xshift=1cm, fill=red!20] {Target B\\(Unhealthy)};

  \draw[->, thick] (User) -- (ALB);
  \draw[->, thick] (ALB) -- (H) node[midway, above, sloped, draw=none] {Traffic OK};
  \draw[->, dashed, red] (ALB) -- (U) node[midway, below, sloped, draw=none] {Blocked};
  \draw[<->, blue] (ALB) edge[bend right=45] node[right, draw=none] {Health Probe} (U);
\end{tikzpicture}

Definition-Example Pairs

  • Shallow Health Check: A simple check of the web server availability.
    • Example: Checking if index.html loads. It is fast but doesn't guarantee the app can actually talk to the database.
  • Readiness Probe: A check that ensures the application is fully initialized.
    • Example: A Java Spring Boot application that must wait for its Hibernate connection pool to initialize before it returns a 200 OK on /ready.
  • Health Check Grace Period: The time after launch during which the ASG does not act on failed health checks for a new instance.
    • Example: If your app takes 5 minutes to start, set the grace period to 300 seconds so the ASG doesn't kill it for being "unhealthy" while it's still booting up.
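The readiness idea above can be sketched with only the Python standard library. The `/ready` path comes from the example; the port, the `probe_status` helper, and the `ready` flag are hypothetical names, and a real application would set the flag after its own initialization work completes.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

ready = threading.Event()  # set once startup work has finished


def probe_status(path: str) -> int:
    """Map a request path to the status code the probe should see."""
    if path != "/ready":
        return 404
    return 200 if ready.is_set() else 503


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the probe endpoint; call ready.set() when initialization ends."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Until `ready.set()` is called, `/ready` answers 503, so a load balancer that expects 200 keeps the instance out of rotation while it is still starting.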

Worked Examples

Scenario 1: Configuring an ALB Health Check Path

You have a Python Flask application. You need to ensure the ALB only sends traffic if the database connection is alive.

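  1. Application Code:
```python
from sqlalchemy import text

# Assumes `app` (Flask) and `db` (Flask-SQLAlchemy) are defined elsewhere.
@app.route('/health')
def health_check():
    try:
        # Cheap query that proves the database connection is alive.
        db.session.execute(text('SELECT 1'))
        return "Healthy", 200
    except Exception:
        return "Service Unavailable", 503
```
  2. AWS Configuration: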
    • Navigate to Target Groups in the EC2 Console.
    • Select your target group -> Health checks tab -> Edit.
    • Health check path: /health.
    • Success codes: 200.
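The same console settings can also be expressed programmatically. The sketch below only builds the parameter dictionary for the ELBv2 `ModifyTargetGroup` API; the helper name and the ARN are placeholders, and the values other than the path and success code are the ALB defaults rather than recommendations.

```python
def health_check_settings(target_group_arn: str) -> dict:
    """Build ModifyTargetGroup parameters for the /health check in Scenario 1."""
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPort": "traffic-port",
        "HealthCheckPath": "/health",
        "HealthCheckIntervalSeconds": 30,
        "HealthCheckTimeoutSeconds": 5,
        "HealthyThresholdCount": 5,
        "UnhealthyThresholdCount": 2,
        "Matcher": {"HttpCode": "200"},
    }


# Placeholder ARN; substitute the ARN of your own target group.
settings = health_check_settings("arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/NAME/ID")
print(settings["HealthCheckPath"], settings["Matcher"]["HttpCode"])  # /health 200
```

Assuming boto3 is installed and credentials are configured, the dictionary could be applied with `boto3.client("elbv2").modify_target_group(**settings)`.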

Scenario 2: ASG Integration

You notice that even though your ALB shows instances as "Unhealthy," the Auto Scaling Group is not replacing them.

  • Fix: By default, ASG only uses EC2 status checks (hardware/system level). You must go to the ASG settings and change the Health Check Type to ELB. This allows the ASG to use the ALB's granular application-level health check results to trigger instance replacement.
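The fix can be sketched as the parameters for the Auto Scaling `UpdateAutoScalingGroup` API. The helper name and the group name are hypothetical, and the 300-second grace period is only an illustrative value.

```python
def elb_health_check_update(asg_name: str, grace_period_s: int = 300) -> dict:
    """Build UpdateAutoScalingGroup parameters that switch the ASG to ELB health checks."""
    return {
        "AutoScalingGroupName": asg_name,
        "HealthCheckType": "ELB",  # the default is "EC2"
        "HealthCheckGracePeriod": grace_period_s,
    }


update = elb_health_check_update("my-web-asg")  # hypothetical group name
print(update["HealthCheckType"])  # ELB
```

Assuming boto3 is installed and credentials are configured, this could be applied with `boto3.client("autoscaling").update_auto_scaling_group(**update)`.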

Checkpoint Questions

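  1. What happens to a request that is already in progress when its instance is marked "Unhealthy"? (Answer: The ALB stops routing new requests to that target. When the target is then deregistered, for example because the ASG replaces it, in-flight requests get time to complete; this is the Deregistration Delay, formerly called Connection Draining.)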
  2. If your application takes 2 minutes to download dependencies during startup, what ASG setting must you adjust? (Answer: Increase the Health Check Grace Period).
  3. True or False: Route 53 can use an alias record to evaluate the health of an ALB. (Answer: True, this is called "Evaluate Target Health").
  4. Can an ELB health check be configured to look for a specific string in the response body? (Answer: No, ELB HTTP/HTTPS health checks evaluate only the status code. Route 53 health checks, by contrast, do support string matching on the response body.)
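To make question 3 concrete, an alias record with "Evaluate Target Health" enabled might be described as below. This sketch only builds the change batch for the Route 53 `ChangeResourceRecordSets` API; the helper name, record name, ALB DNS name, and hosted zone ID are all placeholders.

```python
def failover_alias_change(record_name: str, alb_dns_name: str, alb_zone_id: str) -> dict:
    """Build a change batch for a primary failover alias record pointing at an ALB."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "SetIdentifier": "primary",
                "Failover": "PRIMARY",
                "AliasTarget": {
                    "HostedZoneId": alb_zone_id,  # the ALB's zone ID, not your own zone's
                    "DNSName": alb_dns_name,
                    "EvaluateTargetHealth": True,
                },
            },
        }]
    }


# All three values are placeholders for illustration.
batch = failover_alias_change("app.example.com", "my-alb.example.elb.amazonaws.com", "ZONEID")
alias = batch["Changes"][0]["ResourceRecordSet"]["AliasTarget"]
print(alias["EvaluateTargetHealth"])  # True
```

Assuming boto3 is installed and credentials are configured, the batch could be submitted with `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, where the hosted zone ID is that of your own DNS zone.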
