AWS Networking: Mastering Access Logging for ELB and CloudFront
Access logging is a critical pillar of network observability in AWS. It provides a detailed trail of every request reaching your Load Balancers or CloudFront distributions, enabling security auditing, performance troubleshooting, and traffic pattern analysis.
Learning Objectives
- Configure and enable access logging for Application, Network, and Classic Load Balancers.
- Distinguish between the log formats and data points captured by ALB, NLB, and CloudFront.
- Implement secure S3 bucket policies to allow AWS services to write logs.
- Optimize storage costs for log data using S3 Lifecycle policies.
- Identify appropriate tools (Athena, OpenSearch) for analyzing high-volume log data.
Key Terms & Glossary
- Access Log: A record of every request processed by a service, containing metadata like source IP, latency, and response codes.
- S3 Prefix: A logical grouping (folder-like structure) used to organize logs within an S3 bucket.
- Bucket Policy: A JSON-based resource policy attached to an S3 bucket to grant permissions (e.g., allowing ELB to call s3:PutObject).
- Lifecycle Rule: An S3 configuration that automatically moves old logs to cheaper storage classes (like Glacier) or deletes them.
- Gzip Compression: A method used by ELB to reduce the file size of exported logs to save on storage costs.
The "Big Idea"
In a cloud environment, you don't "own" the wire; you own the metadata. Access logs act as the "black box" flight recorder for your network. While VPC Flow Logs tell you whether packets moved, access logs tell you what happened at the request level (the URL, the status code, the latency). That is the difference between knowing that a connection failed and knowing which request from which client returned a 404.
Formula / Concept Box
| Feature | ELB Access Logs | CloudFront Access Logs |
|---|---|---|
| Default State | Disabled | Disabled |
| Storage Target | Amazon S3 | Amazon S3 |
| Delivery Interval | Every 5 minutes (ALB, NLB); 5 or 60 minutes, configurable (Classic) | Typically within an hour, on a best-effort basis |
| Format | Space-delimited text; gzip-compressed for ALB and NLB | Tab-delimited text (W3C extended log format), gzip-compressed |
| Naming Convention | [AccountID]_elasticloadbalancing_[Region]_[LBName]_[EndTime]_[IPAddress]_[Random].log (with .gz appended for ALB and NLB) | [DistributionID].YYYY-MM-DD-HH.[Unique-ID].gz |
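The CloudFront naming convention in the table can be split mechanically, which is handy when filtering objects by hour before downloading them. The following is a minimal sketch; the function name, the `CloudFrontLogName` type, and the sample key are hypothetical, and it assumes the distribution ID and unique ID contain no dots.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CloudFrontLogName(NamedTuple):
    distribution_id: str
    hour_utc: datetime
    unique_id: str


def parse_cloudfront_log_name(key: str) -> CloudFrontLogName:
    """Split '<DistributionID>.YYYY-MM-DD-HH.<Unique-ID>.gz' into its parts."""
    filename = key.rsplit("/", 1)[-1]  # drop any S3 prefix in front of the file name
    # Raises ValueError if the name does not have exactly four dot-separated parts.
    distribution_id, stamp, unique_id, extension = filename.split(".")
    if extension != "gz":
        raise ValueError(f"unexpected extension in {filename!r}")
    # The date and hour in the file name are in UTC.
    hour_utc = datetime.strptime(stamp, "%Y-%m-%d-%H").replace(tzinfo=timezone.utc)
    return CloudFrontLogName(distribution_id, hour_utc, unique_id)


# Hypothetical object key, used only for illustration.
parsed = parse_cloudfront_log_name("cf-logs/EXAMPLEDIST123.2024-03-09-14.a1b2c3d4.gz")
print(parsed.distribution_id, parsed.hour_utc.isoformat(), parsed.unique_id)
```

Returning a named tuple keeps the three parts addressable by name without committing to a heavier data model.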
Hierarchical Outline
- I. Elastic Load Balancing (ELB) Logging
- Application Load Balancer (ALB): Captures Layer 7 details (HTTP headers, URI, User Agent).
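- Network Load Balancer (NLB): Produces access logs only for TLS listeners; entries capture TLS connection details (cipher, protocol version, handshake time, byte counts). Plain TCP and UDP traffic is not recorded in access logs.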
- Log Storage: Always stored in S3; requires a bucket policy for the ELB service principal.
- II. CloudFront Edge Logging
- Standard Logs: Provides details on edge request activity (Edge location ID, cache status).
- Real-time Logs: (Advanced) Delivers log records to Kinesis Data Streams within seconds of the request, for near-real-time analysis.
- III. Management and Analysis
- Cost Control: Use S3 Intelligent-Tiering or Glacier for long-term audit logs.
- Analysis Tools: Amazon Athena is the preferred serverless way to query logs using SQL.
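The cost-control item in the outline above can be expressed as an S3 lifecycle configuration. Below is a minimal sketch that only builds the configuration dictionary; the helper name, the rule ID, and the 90-day and 365-day values are hypothetical choices, not recommendations from this guide.

```python
def build_log_lifecycle(prefix: str, glacier_after_days: int, expire_after_days: int) -> dict:
    """Build a lifecycle configuration that archives logs, then deletes them."""
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-access-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},             # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical retention values: archive after 90 days, delete after one year.
config = build_log_lifecycle("AWSLogs/", glacier_after_days=90, expire_after_days=365)

# With boto3 installed and credentials configured, the dictionary could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-alb-logs-bucket", LifecycleConfiguration=config)
```

Keeping the builder free of AWS calls makes the retention logic easy to review and test before anything touches a real bucket.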
Visual Anchors
Log Generation Flow
Log Storage Architecture
\begin{tikzpicture}[node distance=2cm, every node/.style={fill=white, font=\small}]
  \draw[thick] (0,0) rectangle (6,4);
  \node at (3,4.3) {S3 Bucket Structure};
  \draw[fill=blue!10] (0.5,2.5) rectangle (5.5,3.5);
  \node at (3,3) {Prefix: /logs/alb-primary/};
  \draw[fill=green!10] (1,0.5) rectangle (5,2);
  \node at (3,1.5) {YYYY/MM/DD/};
  \node at (3,1) {LogFile\_01.log.gz};
  \draw[->, thick] (-1,3) -- (0.4,3);
  \node[left] at (-1,3) {ELB Service};
\end{tikzpicture}
Definition-Example Pairs
- Target Response Time: The time (in seconds) it took for the backend instance to respond to the LB.
  - Example: If your log shows a response time of 5.001, your Python/Node.js app is likely the bottleneck, not the network.
- Edge Location ID: A unique code in CloudFront logs identifying which global data center served the content.
  - Example: IAD89-C1 indicates the request was served from a Dulles, VA edge location.
- Response Code: The HTTP status returned to the client.
  - Example: A sudden spike in 403 codes in the logs could indicate a WAF rule is blocking a specific range of malicious IPs.
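The first and third pairs above can be checked directly against ALB log lines. The sketch below tokenizes each line, names the leading fields in the order ALB writes them, and reports status-code counts plus any request whose target processing time crosses a threshold. The sample lines are synthetic, and the function names and the 1-second threshold are hypothetical.

```python
import re
from collections import Counter

# Leading ALB access-log fields, in order; later fields are ignored in this sketch.
ALB_FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time", "response_processing_time",
    "elb_status_code", "target_status_code", "received_bytes", "sent_bytes", "request",
]

# A token is either a double-quoted string (quotes dropped) or a run of non-spaces.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_alb_line(line: str) -> dict:
    tokens = [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in _TOKEN.finditer(line)
    ]
    if len(tokens) < len(ALB_FIELDS):
        raise ValueError("line has fewer fields than expected")
    return dict(zip(ALB_FIELDS, tokens))


def summarize(lines, slow_threshold_s=1.0):
    """Count ELB status codes and collect requests with a slow target."""
    status_counts = Counter()
    slow_requests = []
    for line in lines:
        entry = parse_alb_line(line)
        status_counts[entry["elb_status_code"]] += 1
        # A value of -1 means the request never reached a target.
        if float(entry["target_processing_time"]) >= slow_threshold_s:
            slow_requests.append(entry["request"])
    return status_counts, slow_requests


# Synthetic sample lines, truncated after the first few fields for brevity.
SAMPLE = [
    'https 2024-03-09T14:00:01.000000Z app/my-alb/0123456789abcdef 203.0.113.10:40000 10.0.0.5:8080 '
    '0.001 5.001 0.000 200 200 120 2048 "GET https://example.com:443/report HTTP/1.1" "curl/8.0" - -',
    'https 2024-03-09T14:00:02.000000Z app/my-alb/0123456789abcdef 203.0.113.11:40001 - '
    '-1 -1 -1 403 - 95 310 "GET https://example.com:443/admin HTTP/1.1" "curl/8.0" - -',
]

status_counts, slow_requests = summarize(SAMPLE)
print(dict(status_counts), slow_requests)
```

For anything beyond a quick spot check, the same logic is usually better expressed as an Athena query over the bucket, since this script reads every line into a single process.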
Worked Examples
Enabling ALB Access Logs via Bucket Policy
To allow an ALB to write to a bucket, you must attach a bucket policy to it. Suppose your account ID is 123456789012 and the ELB regional account ID (for us-east-1) is 127311923021:
- Create Bucket: my-alb-logs-bucket.
- Apply Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::127311923021:root"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-alb-logs-bucket/AWSLogs/123456789012/*"
    }
  ]
}
[!IMPORTANT]
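The Principal varies by Region, so always check the AWS documentation for the ELB account ID that applies to yours. Regions launched from August 2022 onward use the service principal logdelivery.elasticloadbalancing.amazonaws.com in place of a regional account ID.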
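Because only the bucket name, your account ID, and the regional ELB account ID change between environments, the policy above can be generated rather than hand-edited. This is a minimal sketch; the function name and the optional prefix argument are hypothetical, and it covers only the account-ID form of the principal shown in the example.

```python
import json


def build_elb_log_policy(bucket: str, account_id: str, elb_account_id: str,
                         prefix: str = "") -> str:
    """Return a bucket policy (as JSON text) that lets ELB write access logs."""
    cleaned = prefix.strip("/")
    key_prefix = f"{cleaned}/" if cleaned else ""  # optional prefix before AWSLogs/
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{elb_account_id}:root"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/{key_prefix}AWSLogs/{account_id}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Values taken from the worked example: us-east-1 ELB account and the sample bucket.
policy_json = build_elb_log_policy(
    bucket="my-alb-logs-bucket",
    account_id="123456789012",
    elb_account_id="127311923021",
)
print(policy_json)
```

Scoping the Resource to AWSLogs/ under your own account ID keeps the grant as narrow as the original policy.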
Checkpoint Questions
- Are Load Balancer access logs enabled by default? (No, they must be manually enabled).
- Where are CloudFront Standard access logs stored? (In an Amazon S3 bucket of your choice).
- What tool is best for searching through 100GB of ALB logs without provisioning servers? (Amazon Athena).
- Which access log field records the size of the response sent to the client? (sent_bytes. It appears in ALB, NLB, and Classic Load Balancer logs; NLB logs cover TLS listeners only.)
Muddy Points & Cross-Refs
- CloudWatch vs. Access Logs: New learners often confuse CloudWatch Metrics (graphs of counts) with Access Logs (raw request details). Access logs are for deep-dives; Metrics are for high-level alerting.
- Cost Implications: While the logging feature is free, the S3 storage and the GET/PUT requests to S3 are not. High-traffic sites can generate terabytes of logs quickly.
- Cross-Reference: For packet-level investigation of non-HTTP traffic, see Unit 4: VPC Traffic Mirroring.
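The cost point above is easier to reason about with a back-of-the-envelope estimate. The sketch below multiplies request rate by entry size and divides by a compression ratio. Every input in the example call, including the price per GB-month, is a hypothetical placeholder that should be replaced with measured values and current S3 pricing, and request charges are not modeled.

```python
from typing import Tuple


def estimate_monthly_log_storage(requests_per_second: float,
                                 bytes_per_entry: float,
                                 compression_ratio: float,
                                 price_per_gb_month: float,
                                 days: int = 30) -> Tuple[float, float]:
    """Return (stored_gb, storage_cost) for one month of newly written logs."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_bytes = requests_per_second * 86_400 * days * bytes_per_entry
    stored_gb = raw_bytes / compression_ratio / 1e9  # decimal gigabytes
    return stored_gb, stored_gb * price_per_gb_month


# Hypothetical inputs: 1,000 requests/s, 500-byte entries, 5:1 gzip ratio,
# and a placeholder price of 0.02 per GB-month.
stored_gb, cost = estimate_monthly_log_storage(1000, 500, 5, 0.02)
print(f"{stored_gb:.1f} GB stored, {cost:.2f} per month at the placeholder price")
```

With these placeholder inputs the estimate is roughly 259 GB of new log data per month, which shows why lifecycle rules matter once retention stretches to a year or more.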
Comparison Tables
| Log Type | Protocol Support | Main Use Case | Analysis Tool |
|---|---|---|---|
| ALB Access Logs | HTTP/HTTPS | Debugging 5XX errors, tracing URI paths | Athena |
| NLB Access Logs | TLS (TLS listeners only) | TLS handshake troubleshooting, cipher and protocol auditing | Athena / OpenSearch |
| CloudFront Logs | HTTP/HTTPS/WebSockets | Caching efficiency, Edge performance | Athena / CloudFront Reports |
| VPC Flow Logs | IP (All) | Security Group/ACL debugging | CloudWatch Logs Insights |