Implementing Custom Metrics in AWS Applications
Implement code that emits custom metrics
Monitoring application-specific behavior is a core requirement for the AWS Certified Developer - Associate (DVA-C02) exam. While AWS provides default metrics (e.g., CPU utilization, disk I/O), custom metrics allow developers to track business-level data like 'Items Sold', 'Cache Misses', or 'User Login Latency'.
Learning Objectives
By the end of this guide, you will be able to:
- Differentiate between standard metrics and custom metrics.
- Implement code using the `PutMetricData` API.
- Use the CloudWatch Embedded Metric Format (EMF) for high-performance metric emission.
- Define and manage metric dimensions to balance granularity and cost.
- Understand metric resolution and its impact on monitoring.
Key Terms & Glossary
- Namespace: A container for CloudWatch metrics. You must provide a namespace when you create a custom metric (e.g., `MyCompany/PaymentService`).
- Dimension: A name/value pair that is part of the identity of a metric. Adding dimensions helps you filter results (e.g., `Environment=Production`).
- Resolution: Defines the granularity of data points. Standard resolution is 1 minute; high resolution is 1 second.
- Embedded Metric Format (EMF): A JSON specification used to instruct CloudWatch Logs to automatically extract metric data from log streams.
- Cardinality: The number of unique combinations of dimensions. High cardinality can lead to increased costs.
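
To make the cardinality term concrete, the sketch below counts how many distinct metrics a single metric name fans out into when every dimension combination is emitted. The dimension names and values are hypothetical, chosen only for illustration.

```python
from math import prod

# Hypothetical dimension values for an imaginary payment service.
dimension_values = {
    "Environment": ["Production", "Staging"],
    "PaymentType": ["CreditCard", "PayPal", "BankTransfer"],
    "Region": ["us-east-1", "eu-west-1"],
}


def count_metric_series(values_by_dimension: dict[str, list[str]]) -> int:
    """Return the number of unique dimension combinations (worst case)."""
    return prod(len(values) for values in values_by_dimension.values())


# Adding a per-user dimension multiplies the count by the number of users.
with_user_id = {**dimension_values, "UserId": [str(i) for i in range(10_000)]}

print(count_metric_series(dimension_values))  # 2 * 3 * 2 combinations
print(count_metric_series(with_user_id))      # the same, times 10,000 users
```

Because each unique combination is a separate metric, unbounded values such as user or request IDs are usually better kept in logs than in dimensions.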
The "Big Idea"
The "Big Idea" behind custom metrics is observability beyond infrastructure. Standard metrics tell you if the server is healthy, but custom metrics tell you if the application is healthy. By emitting metrics directly from your code, you close the gap between resource performance and business logic success.
Formula / Concept Box
| Feature | PutMetricData (SDK) | Embedded Metric Format (EMF) |
|---|---|---|
| Mechanism | Synchronous HTTP API Call | Asynchronous via CloudWatch Logs |
| Throughput | Limited by API Throttling (150 tps) | High (limited only by log ingestion) |
| Cost | Charged per API call | No API charge for metrics (charged for logs) |
| Use Case | Low frequency, simple scripts | High throughput, Lambda functions |
Hierarchical Outline
- Metric Definition Elements
  - MetricName: The specific name (e.g., `OrderCount`).
  - Unit: The type of value (e.g., `Seconds`, `Bytes`, `Count`, `Percent`).
  - Value: The numeric data point.
- The SDK Approach (PutMetricData)
  - Best for real-time tracking of specific events.
  - Requires handling retries and batching.
  - Limits: maximum 1,000 metrics per call (older documentation cites 20); requests are throttled per Region (150 requests per second is the commonly cited default, so confirm the current value in Service Quotas).
- The Log-Based Approach (EMF)
  - Preferred for AWS Lambda to avoid blocking execution.
  - Requires structured JSON in a specific format.
  - Automatically processed by the CloudWatch backend.
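
The log-based approach in the outline above can be sketched in Python without any EMF library. This is a minimal illustration, not the official client: `build_emf_record` and `emit_metric` are hypothetical helper names, and the sketch handles only a single metric with a single dimension set.

```python
import json
import time


def build_emf_record(namespace, metric_name, value, unit, dimensions, timestamp_ms=None):
    """Build one EMF log record as a plain dict (single metric, single dimension set)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # EMF timestamps are epoch milliseconds
    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set listing the dimension *names*;
                    # the values live at the top level of the record.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        metric_name: value,
    }
    record.update(dimensions)
    return record


def emit_metric(**kwargs):
    """Write the record to stdout as a single JSON line."""
    print(json.dumps(build_emf_record(**kwargs)))


emit_metric(
    namespace="App/LambdaNode",
    metric_name="ProcessingTime",
    value=45,
    unit="Milliseconds",
    dimensions={"FunctionName": "processOrder"},
)
```

In a Lambda function, stdout is delivered to CloudWatch Logs, so the `print` call is the only work done on the request path; metric extraction happens afterwards on the CloudWatch side.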
Visual Anchors
Metric Delivery Architecture
Anatomy of a Metric Structure
```latex
\begin{tikzpicture}[node distance=1.5cm, every node/.style={draw, rectangle, rounded corners, inner sep=5pt, align=center}]
  \node (NS) {Namespace\\(e.g., MyApp/Frontend)};
  \node (MN) [below of=NS] {Metric Name\\(e.g., Latency)};
  \node (DIM) [right=of MN] {Dimensions\\(e.g., Region=us-east-1)};
  \node (VAL) [below of=MN] {Value\\(e.g., 250)};
  \node (UNT) [right=of VAL] {Unit\\(e.g., Milliseconds)};
  \draw[->] (NS) -- (MN);
  \draw[->] (MN) -- (VAL);
  \draw[-] (MN) -- (DIM);
  \draw[-] (VAL) -- (UNT);
\end{tikzpicture}
```
Definition-Example Pairs
- Dimension: A label to categorize data. Example: In a billing app, you might use a dimension `PaymentType=CreditCard` to see if credit card processing is slower than other methods.
- High-Resolution Metric: Metrics stored with 1-second granularity. Example: During a flash sale, you monitor `TransactionVolume` at high resolution to detect sub-minute spikes that could crash the DB.
- Metric Batching: Sending multiple data points in one API call. Example: Accumulating 20 different success/failure counts in memory and sending them once every minute to reduce `PutMetricData` costs.
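
The metric batching idea above can be sketched as a small in-memory buffer. This is an illustrative sketch under stated assumptions: `MetricBuffer` and `RecordingClient` are hypothetical names, `batch_size` is a tunable that must stay at or below the per-call limit of `PutMetricData`, and retry handling is left out. In real use you would pass `boto3.client('cloudwatch')` instead of the stand-in client.

```python
from collections import Counter


class MetricBuffer:
    """Accumulate counts in memory and send them in batched PutMetricData calls."""

    def __init__(self, client, namespace, batch_size=20):
        self.client = client          # any object exposing put_metric_data(...)
        self.namespace = namespace
        self.batch_size = batch_size  # assumption: at or below the per-call limit
        self.counts = Counter()

    def increment(self, metric_name, amount=1):
        self.counts[metric_name] += amount

    def flush(self):
        """Send all buffered counts; return the number of API calls made."""
        data = [
            {"MetricName": name, "Value": float(total), "Unit": "Count"}
            for name, total in self.counts.items()
        ]
        calls = 0
        for start in range(0, len(data), self.batch_size):
            self.client.put_metric_data(
                Namespace=self.namespace,
                MetricData=data[start:start + self.batch_size],
            )
            calls += 1
        self.counts.clear()  # only reached if every call succeeded
        return calls


class RecordingClient:
    """Hypothetical stand-in for the CloudWatch client that records calls."""

    def __init__(self):
        self.calls = []

    def put_metric_data(self, **kwargs):
        self.calls.append(kwargs)


buffer = MetricBuffer(RecordingClient(), "App/FileProcessor")
for _ in range(3):
    buffer.increment("Success")
buffer.increment("Failure")
print(buffer.flush())  # both counters fit into a single call
```

Calling `flush()` on a timer (for example once a minute) turns many per-event API calls into one call per interval, at the cost of losing buffered counts if the process dies before the flush.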
Worked Examples
Example 1: Emitting a Metric with Python (boto3)
This example shows a synchronous call to track the number of processed files.
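```python
import boto3

# Region and credentials are resolved from the environment or instance role.
cloudwatch = boto3.client('cloudwatch')

# Synchronous API call: publishes one data point for FilesProcessed.
cloudwatch.put_metric_data(
    Namespace='App/FileProcessor',
    MetricData=[
        {
            'MetricName': 'FilesProcessed',
            'Dimensions': [
                {'Name': 'Environment', 'Value': 'Production'},
                {'Name': 'Type', 'Value': 'PDF'}
            ],
            'Value': 1.0,
            'Unit': 'Count',
            'StorageResolution': 60  # 60 = standard resolution, 1 = high resolution
        }
    ]
)
```

Example 2: EMF JSON Format (Manual)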
If you aren't using the EMF library, you can simply print this JSON to stdout in a Lambda function.
```json
{
  "_aws": {
    "Timestamp": 1615553000000,
    "CloudWatchMetrics": [
      {
        "Namespace": "App/LambdaNode",
        "Dimensions": [["FunctionName"]],
        "Metrics": [{"Name": "ProcessingTime", "Unit": "Milliseconds"}]
      }
    ]
  },
  "FunctionName": "processOrder",
  "ProcessingTime": 45
}
```

Checkpoint Questions
- What is the maximum number of metrics you can include in a single `PutMetricData` call?
- Why is EMF preferred over `PutMetricData` for AWS Lambda functions?
- What happens to the resolution if you omit the `StorageResolution` parameter in your code?
- How does CloudWatch distinguish between the same `MetricName` that has different `Dimensions`?
[!TIP] Answer Key:
- 1,000 metrics (older documentation and study material cite the previous limit of 20).
- It is asynchronous and doesn't introduce network latency to the function execution.
- It defaults to standard resolution (60 seconds).
- It treats them as unique metrics; a metric is defined by its name AND its dimensions.
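
The last answer can be pictured with a small conceptual model. The sketch below is not how CloudWatch is implemented; `metric_identity` is a hypothetical helper that treats a metric's identity as its namespace, name, and set of dimension name/value pairs, ignoring the order of the pairs.

```python
def metric_identity(namespace, metric_name, dimensions):
    """Model a metric's identity as namespace + name + set of dimension pairs."""
    return (namespace, metric_name, frozenset(dimensions.items()))


production = metric_identity("App/FileProcessor", "Latency", {"Environment": "Production"})
staging = metric_identity("App/FileProcessor", "Latency", {"Environment": "Staging"})
no_dimensions = metric_identity("App/FileProcessor", "Latency", {})

# Same MetricName, different dimensions: three separate metrics in this model.
print(len({production, staging, no_dimensions}))
```

One practical consequence is that data published with a dimension is not included when you query the same metric name without that dimension, so dashboards and alarms must name the exact dimension set they expect.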