Study Guide

Implementing Custom Metrics with Amazon CloudWatch Embedded Metric Format (EMF)

Implement custom metrics (for example, Amazon CloudWatch embedded metric format [EMF])

Learning Objectives

After studying this guide, you will be able to:

  • Explain the advantages of using Embedded Metric Format (EMF) over traditional PutMetricData API calls.
  • Construct a valid EMF JSON structure for custom metrics.
  • Implement EMF in AWS Lambda and other compute environments using AWS SDKs and libraries.
  • Troubleshoot metric extraction issues within CloudWatch Logs.

Key Terms & Glossary

  • EMF (Embedded Metric Format): A JSON specification used to instruct CloudWatch Logs to automatically extract custom metrics from log streams.
  • High Cardinality: A condition where a dataset has many unique values (e.g., UserIDs). EMF handles high-cardinality data well when it is logged as a property rather than declared as a dimension.
  • Namespace: A container for CloudWatch metrics. Metrics in different namespaces are isolated from each other.
  • Dimensions: Name-value pairs that are part of the identity of a metric (e.g., InstanceId or Region).
  • Synchronous vs. Asynchronous: A traditional PutMetricData call is synchronous (the code waits for a network round trip); EMF is asynchronous (the code only writes a log line, and metric extraction happens afterward).

The "Big Idea"

In high-performance applications, especially AWS Lambda functions, calling the CloudWatch PutMetricData API on every request adds latency and cost: each call is an HTTP round trip, is billed per request, and is subject to API throttling limits. The Embedded Metric Format (EMF) flips this pattern: you print a specially formatted JSON object to standard output (your logs), and CloudWatch Logs parses it in the background to generate your metrics. This provides the best of both worlds: detailed logs for debugging and near-real-time metrics for monitoring, with almost no impact on request latency.

Formula / Concept Box

The EMF JSON Structure

An EMF log entry must contain a _aws root node with the following schema:

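| Field | Description | Requirement |
|---|---|---|
| Timestamp | Epoch milliseconds | Required |
| CloudWatchMetrics | Array of metric definitions | Required |
| Namespace | The custom namespace string (set in each CloudWatchMetrics entry) | Required |
| Dimensions | 2D array of dimension names (set in each CloudWatchMetrics entry) | Required |
| Metrics | Array of objects with Name and optional Unit (set in each CloudWatchMetrics entry) | Required |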

[!IMPORTANT] Every field listed in the Dimensions and Metrics arrays must also exist as a top-level property in the JSON object itself.
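
The rule above is a common cause of missing metrics, so it can help to check an event before logging it. The following is a minimal sketch of such a check, assuming the schema in the table; `findEmfProblems`, `validEvent`, and `brokenEvent` are hypothetical names used only for illustration and are not part of any AWS library.

```javascript
// Hypothetical helper (not an AWS API): returns a list of problems that
// would likely prevent CloudWatch Logs from extracting metrics.
function findEmfProblems(event) {
  const meta = event._aws;
  if (!meta || typeof meta !== "object") {
    return ["missing _aws root node"];
  }
  const problems = [];
  if (typeof meta.Timestamp !== "number") {
    problems.push("_aws.Timestamp must be epoch milliseconds (a number)");
  }
  if (!Array.isArray(meta.CloudWatchMetrics)) {
    problems.push("_aws.CloudWatchMetrics must be an array");
    return problems;
  }
  for (const directive of meta.CloudWatchMetrics) {
    if (typeof directive.Namespace !== "string" || directive.Namespace === "") {
      problems.push("Namespace must be a non-empty string");
    }
    if (!Array.isArray(directive.Dimensions)) {
      problems.push("Dimensions must be a 2D array of dimension names");
    }
    if (!Array.isArray(directive.Metrics)) {
      problems.push("Metrics must be an array of metric definitions");
    }
    // Every dimension name must also exist as a top-level property.
    for (const dimensionSet of directive.Dimensions || []) {
      for (const name of dimensionSet) {
        if (!(name in event)) {
          problems.push(`dimension "${name}" has no top-level value`);
        }
      }
    }
    // Every metric name must also exist as a top-level property.
    for (const metric of directive.Metrics || []) {
      if (!(metric.Name in event)) {
        problems.push(`metric "${metric.Name}" has no top-level value`);
      }
    }
  }
  return problems;
}

const validEvent = {
  _aws: {
    Timestamp: 1700000000000,
    CloudWatchMetrics: [
      {
        Namespace: "App",
        Dimensions: [["Service"]],
        Metrics: [{ Name: "Latency", Unit: "Milliseconds" }],
      },
    ],
  },
  Service: "Checkout",
  Latency: 150,
};

// Same metadata, but the Latency value was never written at the top level.
const brokenEvent = { _aws: validEvent._aws, Service: "Checkout" };

console.log(findEmfProblems(validEvent));
console.log(findEmfProblems(brokenEvent));
```

A check like this is most useful in unit tests, since a malformed event is still stored as an ordinary log line and simply produces no metric.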

Hierarchical Outline

  1. Traditional Monitoring vs. EMF
    • PutMetricData Limitations: API throttling, network latency, individual API costs.
    • EMF Benefits: Asynchronous execution, lower cost (log-based), high cardinality support.
  2. The EMF Specification
    • Metadata Block (_aws): Defines how CloudWatch should interpret the data.
    • Target Members: The actual values (properties) stored alongside the metadata.
  3. Implementation Patterns
    • AWS Lambda: Native support; simply console.log or use the aws-embedded-metrics library.
    • EC2 / ECS / On-Premises: Requires the CloudWatch Agent to be installed and configured to intercept and parse EMF logs.
  4. Best Practices
    • Unit Selection: Always specify standard units (Seconds, Bytes, Count, etc.) to enable proper graphing.
    • Dimension Limits: Limit the number of dimension sets to avoid excessive unique metric combinations.
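
To make the dimension-limit advice concrete, the sketch below counts how many distinct metric time series a batch of EMF events would create. It is an illustration under simplifying assumptions, not a billing calculator: `countTimeSeries` and `orderEvent` are hypothetical helpers, and a series is identified here only by namespace, metric name, and the dimension name/value pairs of one dimension set.

```javascript
// Hypothetical helper: counts distinct series across a batch of EMF events.
function countTimeSeries(events) {
  const series = new Set();
  for (const event of events) {
    for (const directive of event._aws.CloudWatchMetrics) {
      for (const dimensionSet of directive.Dimensions) {
        // Resolve each dimension name to the value logged at the top level.
        const dims = dimensionSet.map((name) => `${name}=${event[name]}`).join(",");
        for (const metric of directive.Metrics) {
          series.add(`${directive.Namespace}|${metric.Name}|${dims}`);
        }
      }
    }
  }
  return series.size;
}

// Builds one event per order; dimensionNames decides what becomes a dimension.
function orderEvent(orderId, dimensionNames) {
  return {
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [
        {
          Namespace: "OrderProcessor",
          Dimensions: [dimensionNames],
          Metrics: [{ Name: "ProcessingTime", Unit: "Milliseconds" }],
        },
      ],
    },
    Region: "us-east-1",
    OrderId: orderId,
    ProcessingTime: 100,
  };
}

const orderIds = Array.from({ length: 1000 }, (_, i) => `ORD-${i}`);

// OrderId logged as a plain property: all events share one series.
const lowCardinality = orderIds.map((id) => orderEvent(id, ["Region"]));

// OrderId declared as a dimension: every order creates its own series.
const highCardinality = orderIds.map((id) => orderEvent(id, ["Region", "OrderId"]));

console.log(countTimeSeries(lowCardinality));  // 1
console.log(countTimeSeries(highCardinality)); // 1000
```

The same 1,000 log lines yield either one series or a thousand, depending only on whether the high-cardinality field is listed in Dimensions.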

Visual Anchors

Metric Ingestion Pipeline

[Diagram: application writes EMF JSON to its logs → CloudWatch Logs extracts the metric values → CloudWatch Metrics]

Conceptual Relationship: Logs to Metrics

\begin{tikzpicture}[node distance=2cm]
  % Left box: the log event, holding the EMF metadata and the logged values
  \draw[thick, fill=blue!10] (0,0) rectangle (6,3) node[midway, yshift=1.2cm] {\textbf{Log Event (JSON)}};
  \node at (3,1.5) [draw, dashed, inner sep=5pt] (spec) {\tiny \_aws: \{ Namespace: "App", Metrics: ["Latency"] \}};
  \node at (3,0.5) {"Latency": 150, "User": "Alice"};

  % Arrow: CloudWatch Logs extracts the metric from the log event
  \draw[->, ultra thick, gray] (6.5,1.5) -- (8.5,1.5) node[midway, above] {\small Extraction};

  % Right box: the resulting metric data point
  \draw[thick, fill=green!10] (9,0.5) rectangle (12,2.5) node[midway] {\textbf{Metric Data}};
  \node at (10.5,0) {\tiny Namespace: App, Value: 150};
\end{tikzpicture}

Definition-Example Pairs

  • Metric Directive: The part of the EMF JSON that tells CloudWatch which keys are metrics.
    • Example: If your JSON is {"Latency": 45}, your directive defines Latency as a metric so it appears on a graph.
  • Unit: The measurement type associated with the metric.
    • Example: Declaring Milliseconds for a ResponseTime metric labels the values so that graphs and statistics such as p99 are read in the correct unit.
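  • Dimension Set: A group of dimension names that together identify one metric time series and are used to filter metrics.
    • Example: Declaring two dimension sets, [Service] and [Service, Stage], publishes the metric once per service and once per service and stage (Prod/Dev), so you can view the whole service or just one environment.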
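
The three definitions above can be combined in one event. The following is a rough sketch, assuming a hypothetical `buildEmfEvent` helper (not an AWS API) and illustrative values; it declares one metric with a unit and two dimension sets, and keeps a high-cardinality field as a plain property.

```javascript
// Hypothetical builder, not part of any AWS library: assembles one EMF event.
function buildEmfEvent({ namespace, dimensionSets, dimensionValues, metrics, properties = {} }) {
  const event = {
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          // 2D array: each inner array is one dimension set.
          Dimensions: dimensionSets,
          // Metric directive: names which top-level keys are metrics.
          Metrics: metrics.map(({ name, unit }) => ({ Name: name, Unit: unit })),
        },
      ],
    },
    ...properties,      // searchable in logs, never a metric or dimension
    ...dimensionValues, // values for every declared dimension name
  };
  // Metric values must sit at the top level, next to the _aws block.
  for (const { name, value } of metrics) {
    event[name] = value;
  }
  return event;
}

const emfEvent = buildEmfEvent({
  namespace: "VideoService",
  dimensionSets: [["Service"], ["Service", "Stage"]],
  dimensionValues: { Service: "Transcoder", Stage: "Prod" },
  metrics: [{ name: "ResponseTime", unit: "Milliseconds", value: 45 }],
  properties: { RequestId: "req-123" },
});

// In Lambda, writing this line to standard output is what emits the metric.
console.log(JSON.stringify(emfEvent));
```

With both dimension sets declared, ResponseTime is recorded once under Service alone and once under Service plus Stage, while RequestId stays available for log searches only.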

Worked Examples

Example 1: Basic Node.js Implementation (Manual JSON)

If you don't want to use a heavy library, you can manually construct the EMF string in a Lambda function.

```javascript
exports.handler = async (event) => {
  const emfPayload = {
    "_aws": {
      "Timestamp": Date.now(),
      "CloudWatchMetrics": [{
        "Namespace": "OrderProcessor",
        "Dimensions": [["Region", "OrderType"]],
        "Metrics": [{ "Name": "ProcessingTime", "Unit": "Milliseconds" }]
      }]
    },
    "Region": "us-east-1",
    "OrderType": "Retail",
    "ProcessingTime": 142,
    "OrderId": "ORD-9921" // High cardinality, not a dimension
  };
  console.log(JSON.stringify(emfPayload));
  return { statusCode: 200 };
};
```

Example 2: Using the aws-embedded-metrics Library

The library handles the boilerplate and ensures the JSON structure is always valid.

```javascript
const { metricScope } = require("aws-embedded-metrics");

const myHandler = metricScope(metrics => async (event) => {
  metrics.setNamespace("VideoService");
  metrics.putDimensions({ Service: "Transcoder" });

  const startTime = Date.now();
  // ... perform logic ...
  const duration = Date.now() - startTime;

  metrics.putMetric("ProcessingDuration", duration, "Milliseconds");
  metrics.setProperty("VideoId", "12345"); // Metadata, not a dimension
});

// Export the wrapped function so Lambda can invoke it.
exports.handler = myHandler;
```
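
To see roughly what a scoped metrics wrapper saves you from writing, here is a dependency-free sketch of the same pattern. It is a simplified stand-in and not the aws-embedded-metrics implementation: `withMetrics`, its `write` parameter, and the fixed duration value are assumptions made for illustration.

```javascript
// Hypothetical stand-in for a metrics scope: collects values during one
// invocation and flushes a single EMF log line when the handler finishes.
function withMetrics(namespace, handler, write = console.log) {
  return async (event) => {
    const dimensions = {};
    const metricValues = {};
    const metricDefs = [];
    const properties = {};
    const metrics = {
      putDimensions: (dims) => Object.assign(dimensions, dims),
      putMetric: (name, value, unit = "None") => {
        metricDefs.push({ Name: name, Unit: unit });
        metricValues[name] = value;
      },
      setProperty: (key, value) => {
        properties[key] = value;
      },
    };
    try {
      return await handler(event, metrics);
    } finally {
      // Flush exactly one line, even if the handler throws.
      write(JSON.stringify({
        _aws: {
          Timestamp: Date.now(),
          CloudWatchMetrics: [{
            Namespace: namespace,
            Dimensions: [Object.keys(dimensions)],
            Metrics: metricDefs,
          }],
        },
        ...properties,
        ...dimensions,
        ...metricValues,
      }));
    }
  };
}

// Capture output in an array here; in Lambda the default console.log is enough.
const lines = [];
const sketchHandler = withMetrics("VideoService", async (event, metrics) => {
  metrics.putDimensions({ Service: "Transcoder" });
  metrics.putMetric("ProcessingDuration", 42, "Milliseconds");
  metrics.setProperty("VideoId", event.videoId); // Metadata, not a dimension
  return { statusCode: 200 };
}, (line) => lines.push(line));

const done = sketchHandler({ videoId: "12345" });
done.then(() => console.log(lines[0]));
```

Flushing in a `finally` block is a design choice of this sketch: it keeps the metric line even when the handler fails, which is often when the data is most needed.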

Checkpoint Questions

  1. What is the primary performance benefit of EMF over the PutMetricData API call in an AWS Lambda function?
  2. In an EMF JSON object, where must the actual values (e.g., the numeric measurement) be located?
  3. True or False: Every property included in the EMF JSON will be charged as a CloudWatch Custom Metric.
  4. If you are running an application on an EC2 instance, what extra component is required to process EMF logs?
  5. Why is it beneficial to include high-cardinality data (like RequestID) as a property but not as a dimension?

Answers

  1. EMF is non-blocking (asynchronous) because it relies on standard logging, eliminating the network latency of a synchronous API call.
  2. They must be top-level properties of the JSON object.
  3. False. Only the fields defined in the Metrics array within the _aws block are processed as metrics.
  4. The CloudWatch Agent.
  5. Properties allow you to search/filter in CloudWatch Logs Insights, while Dimensions create unique metric time series. High-cardinality dimensions lead to "Metric Explosion" and high costs.
