Implementing Custom Metrics with Amazon CloudWatch Embedded Metric Format (EMF)
Learning Objectives
After studying this guide, you will be able to:
- Explain the advantages of using Embedded Metric Format (EMF) over traditional `PutMetricData` API calls.
- Construct a valid EMF JSON structure for custom metrics.
- Implement EMF in AWS Lambda and other compute environments using AWS SDKs and libraries.
- Troubleshoot metric extraction issues within CloudWatch Logs.
Key Terms & Glossary
- EMF (Embedded Metric Format): A JSON specification used to instruct CloudWatch Logs to automatically extract custom metrics from log streams.
- High Cardinality: A condition where a dataset has many unique values (e.g., UserIDs). EMF handles high-cardinality data efficiently when those values are logged as properties rather than dimensions.
- Namespace: A container for CloudWatch metrics. Metrics in different namespaces are isolated from each other.
- Dimensions: Name-value pairs that are part of the identity of a metric (e.g., `InstanceId` or `Region`).
- Synchronous vs. Asynchronous: Traditional `PutMetricData` is synchronous (blocks execution); EMF is asynchronous (non-blocking log emission).
The "Big Idea"
In high-performance applications, especially AWS Lambda functions, calling the CloudWatch PutMetricData API on every request adds latency and cost: each call is a synchronous HTTP round trip, is billed per request, and counts against API throttling limits. The Embedded Metric Format (EMF) flips this pattern: you print a specially formatted JSON object to standard output (your logs), and CloudWatch Logs parses those log events in the background to generate your metrics. This provides the best of both worlds: detailed logs for debugging and near-real-time metrics for monitoring, with negligible impact on request latency.
Formula / Concept Box
The EMF JSON Structure
An EMF log entry must contain an `_aws` root node with the following schema. `Namespace`, `Dimensions`, and `Metrics` belong to each object inside the `CloudWatchMetrics` array:
| Field | Description | Requirement |
|---|---|---|
| Timestamp | Epoch milliseconds | Required |
| CloudWatchMetrics | Array of metric definitions | Required |
| Namespace | The custom namespace string | Required |
| Dimensions | 2D array of dimension names | Required |
| Metrics | Array of objects (Name, Unit) | Required |
[!IMPORTANT] Every field listed in the `Dimensions` and `Metrics` arrays must also exist as a top-level property in the JSON object itself.
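The top-level-property rule can be checked mechanically before a log line is emitted. The following is a minimal sketch rather than part of any AWS library: `findMissingEmfFields` is a hypothetical helper name, and it checks only this one rule, not the full EMF specification.

```javascript
// Hypothetical helper (not part of any AWS SDK): returns the names referenced
// in Dimensions or Metrics that are missing as top-level properties.
function findMissingEmfFields(entry) {
  const directives = (entry._aws && entry._aws.CloudWatchMetrics) || [];
  const missing = new Set();
  for (const directive of directives) {
    // Dimensions is a 2D array: each inner array is one dimension set.
    for (const dimensionSet of directive.Dimensions || []) {
      for (const name of dimensionSet) {
        if (!(name in entry)) missing.add(name);
      }
    }
    // Metrics is an array of { Name, Unit } objects.
    for (const metric of directive.Metrics || []) {
      if (!(metric.Name in entry)) missing.add(metric.Name);
    }
  }
  return [...missing];
}

const entry = {
  _aws: {
    Timestamp: Date.now(),
    CloudWatchMetrics: [{
      Namespace: "OrderProcessor",
      Dimensions: [["Region", "OrderType"]],
      Metrics: [{ Name: "ProcessingTime", Unit: "Milliseconds" }],
    }],
  },
  Region: "us-east-1",
  ProcessingTime: 142, // OrderType is deliberately left out
};

console.log(findMissingEmfFields(entry)); // prints [ 'OrderType' ]
```

Running a check like this in unit tests is one way to catch entries that reference a dimension or metric name without supplying its value.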
Hierarchical Outline
- Traditional Monitoring vs. EMF
  - PutMetricData Limitations: API throttling, network latency, individual API costs.
  - EMF Benefits: Asynchronous execution, lower cost (log-based), high cardinality support.
- The EMF Specification
  - Metadata Block (`_aws`): Defines how CloudWatch should interpret the data.
  - Target Members: The actual values (properties) stored alongside the metadata.
- Implementation Patterns
  - AWS Lambda: Native support; simply `console.log` or use the `aws-embedded-metrics` library.
  - EC2 / ECS / On-Premise: Requires the CloudWatch Agent to be installed and configured to intercept and parse EMF logs.
- Best Practices
  - Unit Selection: Always specify standard units (Seconds, Bytes, Count, etc.) to enable proper graphing.
  - Dimension Limits: Limit the number of dimension sets to avoid excessive unique metric combinations.
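For the Unit Selection practice, a small guard can reject non-standard unit strings before they reach a log line. This is a sketch under assumptions: `isStandardUnit` is a hypothetical helper, and the unit names reflect the CloudWatch standard units as commonly documented, so the list should be checked against the current API reference before being relied on.

```javascript
// Builds the set of CloudWatch standard unit names (assumed list; verify
// against the current CloudWatch API reference).
const prefixes = ["", "Kilo", "Mega", "Giga", "Tera"];
const sizes = prefixes.flatMap((p) =>
  p ? [`${p}bytes`, `${p}bits`] : ["Bytes", "Bits"]
);
const STANDARD_UNITS = new Set([
  "Seconds", "Microseconds", "Milliseconds",
  "Percent", "Count", "Count/Second", "None",
  ...sizes,                           // Bytes, Bits, Kilobytes, Kilobits, ...
  ...sizes.map((s) => `${s}/Second`), // Bytes/Second, Bits/Second, ...
]);

// Hypothetical helper: true only for an exact, case-sensitive match.
function isStandardUnit(unit) {
  return STANDARD_UNITS.has(unit);
}

console.log(isStandardUnit("Milliseconds")); // prints true
console.log(isStandardUnit("ms"));           // prints false
console.log(STANDARD_UNITS.size);            // prints 27
```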
Visual Anchors
Metric Ingestion Pipeline
Conceptual Relationship: Logs to Metrics
\begin{tikzpicture}[node distance=2cm]
  \draw[thick, fill=blue!10] (0,0) rectangle (6,3) node[midway, yshift=1.2cm] {\textbf{Log Event (JSON)}};
  \node at (3,1.5) [draw, dashed, inner sep=5pt] (spec) {\tiny \_aws: \{ Namespace: "App", Metrics: ["Latency"] \}};
  \node at (3,0.5) {"Latency": 150, "User": "Alice"};
  \draw[->, ultra thick, gray] (6.5,1.5) -- (8.5,1.5) node[midway, above] {\small Extraction};
  \draw[thick, fill=green!10] (9,0.5) rectangle (12,2.5) node[midway] {\textbf{Metric Data}};
  \node at (10.5,0) {\tiny Namespace: App, Value: 150};
\end{tikzpicture}
Definition-Example Pairs
- Metric Directive: The part of the EMF JSON that tells CloudWatch which keys are metrics.
  - Example: If your JSON is `{"Latency": 45}`, your directive defines `Latency` as a metric so it appears on a graph.
- Unit: The measurement type associated with the metric.
  - Example: Using `Milliseconds` for a `ResponseTime` metric labels the values correctly, so aggregations like P99 are reported and graphed in the right unit.
- Dimension Set: A collection of attributes used to filter metrics.
  - Example: Storing `Service` and `Stage` (Prod/Dev) as dimensions allows you to view metrics per environment, or for the whole service if `Service` is also declared as its own dimension set.
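To make the Dimension Set example concrete, the sketch below builds an entry that declares two dimension sets, so that one series is extracted per service and another per service-and-stage pair. The namespace `ExampleApp`, the metric name, and the helpers `buildEntry` and `describeSeries` are illustrative assumptions, not part of any AWS library.

```javascript
// Hypothetical helper: builds an EMF entry with two dimension sets.
function buildEntry(service, stage, latencyMs) {
  return {
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [{
        Namespace: "ExampleApp", // assumed namespace for illustration
        // One rollup per service, plus one per service/stage combination.
        Dimensions: [["Service"], ["Service", "Stage"]],
        Metrics: [{ Name: "Latency", Unit: "Milliseconds" }],
      }],
    },
    Service: service,
    Stage: stage,
    Latency: latencyMs,
  };
}

// Hypothetical helper: lists the dimension combinations an entry feeds.
function describeSeries(entry) {
  return entry._aws.CloudWatchMetrics.flatMap((directive) =>
    directive.Dimensions.map((dimensionSet) =>
      dimensionSet.map((name) => `${name}=${entry[name]}`).join(",")
    )
  );
}

const entry = buildEntry("Checkout", "Prod", 45);
console.log(JSON.stringify(entry)); // the EMF log line itself
console.log(describeSeries(entry));
// prints [ 'Service=Checkout', 'Service=Checkout,Stage=Prod' ]
```

Each additional dimension set multiplies the number of series produced per log line, which is why the set list is worth keeping short.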
Worked Examples
Example 1: Basic Node.js Implementation (Manual JSON)
If you don't want to use a heavy library, you can manually construct the EMF string in a Lambda function.
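// Lambda handler that emits one EMF log line per invocation.
exports.handler = async (event) => {
  const emfPayload = {
    "_aws": {
      "Timestamp": Date.now(), // epoch milliseconds
      "CloudWatchMetrics": [{
        "Namespace": "OrderProcessor",
        "Dimensions": [["Region", "OrderType"]],
        "Metrics": [{ "Name": "ProcessingTime", "Unit": "Milliseconds" }]
      }]
    },
    // Every dimension and metric named above must appear here as a top-level property.
    "Region": "us-east-1",
    "OrderType": "Retail",
    "ProcessingTime": 142,
    "OrderId": "ORD-9921" // High cardinality, not a dimension
  };
  // Lambda forwards stdout to CloudWatch Logs, which extracts the metric asynchronously.
  console.log(JSON.stringify(emfPayload));
  return { statusCode: 200 };
};

Example 2: Using the AWS EMF Library (Recommended)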
The library handles the boilerplate and ensures the JSON structure is always valid.
const { metricScope } = require("aws-embedded-metrics");

// metricScope injects a metrics logger and flushes it when the handler completes.
const myHandler = metricScope(metrics => async (event) => {
  metrics.setNamespace("VideoService");
  metrics.putDimensions({ Service: "Transcoder" });
  const startTime = Date.now();
  // ... perform logic ...
  const duration = Date.now() - startTime;
  metrics.putMetric("ProcessingDuration", duration, "Milliseconds");
  metrics.setProperty("VideoId", "12345"); // Metadata, not a dimension
});

// Export the wrapped function so Lambda can invoke it.
exports.handler = myHandler;

Checkpoint Questions
- What is the primary performance benefit of EMF over the `PutMetricData` API call in an AWS Lambda function?
- In an EMF JSON object, where must the actual values (e.g., the numeric measurement) be located?
- True or False: Every property included in the EMF JSON will be charged as a CloudWatch Custom Metric.
- If you are running an application on an EC2 instance, what extra component is required to process EMF logs?
- Why is it beneficial to include high-cardinality data (like `RequestID`) as a property but not as a dimension?
Answers
- EMF is non-blocking (asynchronous) because it relies on standard logging, eliminating the network latency of a synchronous API call.
- They must be top-level properties of the JSON object.
- False. Only the fields defined in the `Metrics` array within the `_aws` block are processed as metrics.
- The CloudWatch Agent.
- Properties allow you to search/filter in CloudWatch Logs Insights, while Dimensions create unique metric time series. High-cardinality dimensions lead to "Metric Explosion" and high costs.
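The "Metric Explosion" in the last answer can be estimated by simple counting. The sketch below assumes that each distinct combination of dimension values becomes one metric time series; `countSeries` and the sample events are hypothetical.

```javascript
// Hypothetical helper: counts the distinct dimension-value combinations
// (and therefore time series) that a batch of events would create.
function countSeries(events, dimensionKeys) {
  const combos = new Set(
    events.map((e) => dimensionKeys.map((k) => `${k}=${e[k]}`).join(","))
  );
  return combos.size;
}

// Made-up sample events for illustration.
const events = [
  { Service: "Checkout", RequestId: "req-1", Latency: 40 },
  { Service: "Checkout", RequestId: "req-2", Latency: 55 },
  { Service: "Checkout", RequestId: "req-3", Latency: 38 },
  { Service: "Search", RequestId: "req-4", Latency: 12 },
];

console.log(countSeries(events, ["Service"]));              // prints 2
console.log(countSeries(events, ["Service", "RequestId"])); // prints 4
```

With `RequestId` kept as a property, the series count stays tied to the number of services. As a dimension, it grows with every request.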