AWS Messaging Services: SQS, SNS, and Decoupling Patterns
Queuing and messaging concepts (for example, publish/subscribe)
This study guide covers the core concepts of asynchronous communication within AWS, focusing on Amazon Simple Queue Service (SQS) and Amazon Simple Notification Service (SNS) as required for the SAA-C03 exam.
Learning Objectives
- Differentiate between message queuing (point-to-point) and pub/sub (one-to-many) patterns.
- Identify appropriate use cases for SQS Standard vs. FIFO queues.
- Explain the mechanisms of decoupling and how they contribute to application resiliency.
- Configure visibility timeouts and dead-letter queues to handle processing failures.
- Architect fan-out scenarios using SNS and SQS together.
Key Terms & Glossary
- Producer: An application or service that sends data/messages into a queue or SNS topic.
- Consumer/Subscriber: The component that retrieves and processes the message (e.g., an EC2 instance or Lambda function).
- In-flight: A status in SQS where a message has been picked up by a consumer but is not yet deleted.
- Visibility Timeout: A window of time where SQS hides a message from other consumers to prevent duplicate processing.
- Dead-Letter Queue (DLQ): A secondary queue where messages are sent if they fail to process after a specific number of attempts.
- Fan-out: A design pattern where one message sent to an SNS topic is replicated and pushed to multiple endpoints (SQS, Lambda, HTTP).
The "Big Idea"
The core philosophy of modern cloud architecture is Decoupling. By placing a messaging layer between components, you break the direct dependency between them. If your web server (producer) generates work faster than your database tier (consumer) can absorb it, the queue acts as a buffer. If the consumer fails, the messages wait in the queue (up to the retention period) rather than being lost, so a failure in one component does not cascade to the other and any healthy consumer can pick up the work.
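The buffering effect can be traced with a small in-memory sketch. It is not SQS itself: the `simulate_buffer` function, the tick-based clock, and the load numbers are all hypothetical, chosen only to show a burst being absorbed by a queue and drained later.

```python
from collections import deque


def simulate_buffer(arrivals_per_tick, consumer_rate):
    """Toy model of a producer and a slower consumer joined by a queue.

    arrivals_per_tick: messages the producer sends in each tick (hypothetical load).
    consumer_rate: the most messages the consumer can process per tick.
    Returns (processed_ids, backlog_after_each_tick).
    """
    queue = deque()
    processed = []
    backlog_per_tick = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # The producer enqueues and moves on; it never waits for the consumer.
        for _ in range(arrivals):
            queue.append(next_id)
            next_id += 1
        # The consumer drains at its own pace.
        for _ in range(min(consumer_rate, len(queue))):
            processed.append(queue.popleft())
        backlog_per_tick.append(len(queue))
    return processed, backlog_per_tick


# A burst of 10 messages over two ticks, then silence.
processed, backlog = simulate_buffer([5, 5, 0, 0, 0], consumer_rate=2)
print(backlog)         # [3, 6, 4, 2, 0]
print(len(processed))  # 10
```

The backlog grows during the burst and shrinks afterwards, and every message is eventually processed even though the consumer never ran faster than 2 per tick.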
Formula / Concept Box
| Feature | SQS Standard | SQS FIFO | Amazon SNS |
|---|---|---|---|
| Throughput | Nearly Unlimited | 300 msgs/sec (up to 3,000 with batching) | Nearly Unlimited |
| Ordering | Best-effort | Strict First-In-First-Out | No guaranteed order |
| Delivery | At-least-once | Exactly-once | Push-based (Fan-out) |
| Max Size | 256 KB | 256 KB | 256 KB |
| Retention | 1 min - 14 days | 1 min - 14 days | No retention (Transient) |
Hierarchical Outline
- Amazon Simple Queue Service (SQS)
- Architecture: Pull-based system where consumers poll for messages.
- Standard Queues: Default type; emphasizes throughput over strict ordering.
- FIFO Queues: Preserve the order in which messages are sent and deliver each message exactly once.
- Visibility Timeout: Prevents "double-processing"; default is 30 seconds; max 12 hours.
- Polling:
- Short Polling: Returns immediately, even if empty (higher cost).
- Long Polling: Waits up to 20 seconds for a message (lower cost, better efficiency).
- Amazon Simple Notification Service (SNS)
- Architecture: Push-based "Pub/Sub" system.
- Topics: Named logical access points and communication channels.
- Subscribers: Can be SQS, Lambda, Email, SMS, or HTTP/S endpoints.
- Advanced Integration Patterns
- Fan-out: Publishing to SNS and having multiple SQS queues subscribe to handle different workflows.
- Amazon MQ: Managed message broker (Apache ActiveMQ or RabbitMQ) for migrating existing applications that depend on standard protocols such as MQTT or AMQP.
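The cost difference between short and long polling noted in the outline comes down to how many `ReceiveMessage` calls a consumer makes while the queue is empty. The following is a rough arithmetic sketch, not a billing calculator: `receive_calls_while_idle` is a hypothetical helper, the one-second pause for the short-polling loop is an assumption, and network latency is ignored. `wait_time_seconds` stands in for the `WaitTimeSeconds` parameter (0 to 20).

```python
import math


def receive_calls_while_idle(idle_seconds, wait_time_seconds, pause_seconds=0.0):
    """Estimate how many receive calls one consumer makes against an empty queue.

    wait_time_seconds: 0 models short polling; 1-20 models long polling.
    pause_seconds: sleep the consumer adds between calls (an assumption).
    """
    seconds_per_call = wait_time_seconds + pause_seconds
    if seconds_per_call <= 0:
        raise ValueError("each call must take some time in this model")
    return math.ceil(idle_seconds / seconds_per_call)


# One idle hour: short polling once per second vs. long polling for 20 seconds.
short_polling = receive_calls_while_idle(3600, wait_time_seconds=0, pause_seconds=1)
long_polling = receive_calls_while_idle(3600, wait_time_seconds=20)
print(short_polling, long_polling)  # 3600 180
```

Under these assumptions, long polling makes 20 times fewer empty requests over the same idle hour, which is where the lower cost comes from.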
Visual Anchors
The SQS Lifecycle
SNS Fan-out Architecture
\begin{tikzpicture}[node distance=2cm, every node/.style={draw, fill=blue!10, rounded corners, align=center, font=\small}]
\node (sns) [fill=orange!20] {SNS Topic\\(Orders)};
\node (sqs1) [right=of sns, yshift=1cm] {SQS Queue\\(Shipping)};
\node (sqs2) [right=of sns, yshift=0cm] {SQS Queue\\(Invoicing)};
\node (sqs3) [right=of sns, yshift=-1cm] {SQS Queue\\(Analytics)};
\draw [->, thick] (sns) -- (sqs1);
\draw [->, thick] (sns) -- (sqs2);
\draw [->, thick] (sns) -- (sqs3);
\node (app1) [right=of sqs1] {Shipping Service};
\node (app2) [right=of sqs2] {Billing Service};
\node (app3) [right=of sqs3] {Data Warehouse};
\draw [dashed, ->] (sqs1) -- (app1);
\draw [dashed, ->] (sqs2) -- (app2);
\draw [dashed, ->] (sqs3) -- (app3);\end{tikzpicture}
Definition-Example Pairs
- Concept: Statelessness
- Definition: Designing applications so that any server can handle any request because no client data is stored locally.
- Example: A video transcoding app where EC2 instances pick up "transcode jobs" from SQS. If an instance dies mid-task, the message returns to the queue for another instance to finish.
- Concept: Dead-Letter Queue (DLQ)
- Definition: A holding area for messages that cannot be processed successfully after repeated attempts.
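- Example: An order message contains a corrupted character that crashes the consumer code every time it is read. Instead of letting this "poison pill" be retried forever, SQS moves it to a DLQ for manual inspection once it has been received more than the configured number of times.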
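The poison-pill example can be sketched as a toy retry loop. This is an in-memory model, not an SQS client: `deliver_with_dlq` and `handle_order` are hypothetical, and the failure mode (a non-ASCII character) is invented for illustration. The `redrive_policy` helper builds the JSON string that a source queue's `RedrivePolicy` attribute expects; the ARN shown is a placeholder.

```python
import json


def redrive_policy(dlq_arn, max_receive_count):
    """Build the RedrivePolicy attribute value (a JSON string) for a source queue."""
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": str(max_receive_count)}
    )


def deliver_with_dlq(messages, handler, max_receive_count):
    """Toy model: a message is retried until it succeeds or runs out of receives."""
    processed, dead_letter = [], []
    for body in messages:
        for _receive_count in range(1, max_receive_count + 1):
            try:
                handler(body)
            except Exception:
                # Not deleted, so it becomes visible again and is received again.
                continue
            processed.append(body)  # success: the consumer deletes the message
            break
        else:
            dead_letter.append(body)  # receives exhausted: moved to the DLQ
    return processed, dead_letter


attempts = []


def handle_order(body):
    """Hypothetical consumer that crashes on a corrupted (non-ASCII) character."""
    attempts.append(body)
    body.encode("ascii")


orders = ["order-1", "order-2\ufffd", "order-3"]
processed, dead_letter = deliver_with_dlq(orders, handle_order, max_receive_count=3)
print(processed)                 # ['order-1', 'order-3']
print(len(dead_letter))          # 1
print(attempts.count(orders[1])) # 3

# Placeholder ARN for illustration only.
print(redrive_policy("arn:aws:sqs:us-east-1:123456789012:orders-dlq", 3))
```

The healthy orders are processed on the first attempt, while the corrupted one is tried three times and then set aside, so it no longer blocks or wastes consumer capacity.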
Worked Examples
Problem: The Duplicate Payment Issue
Scenario: An e-commerce site is processing payments. Occasionally, due to network retries, the payment service receives the same order ID twice, leading to double-charging customers.
Solution using SQS FIFO:
- Change Queue Type: Replace the Standard payment queue with a new FIFO queue. An existing queue's type cannot be changed, and a FIFO queue's name must end in .fifo.
- Deduplication ID: Each message is sent with a MessageDeduplicationId (e.g., the Order ID).
- Effect: SQS FIFO recognizes a repeated ID within the 5-minute deduplication interval and discards the duplicate message, ensuring exactly-once processing.
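The deduplication behavior can be modeled in a few lines. This is a simplified sketch, not the SQS implementation: `FifoDedupModel` is hypothetical, time is passed in explicitly as seconds, and the exact handling at the 300-second boundary is an assumption. A real FIFO `SendMessage` call also requires a `MessageGroupId`, which this model omits.

```python
DEDUP_WINDOW_SECONDS = 5 * 60  # the 5-minute deduplication interval


class FifoDedupModel:
    """Toy model of FIFO deduplication keyed on a deduplication ID."""

    def __init__(self):
        self._first_accepted_at = {}  # dedup_id -> time the ID was first accepted
        self.accepted = []            # message bodies actually enqueued

    def send(self, body, dedup_id, now):
        """Return True if the message is enqueued, False if dropped as a duplicate."""
        first_seen = self._first_accepted_at.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False
        self._first_accepted_at[dedup_id] = now
        self.accepted.append(body)
        return True


queue = FifoDedupModel()
print(queue.send("charge order 1001", dedup_id="1001", now=0))    # True
print(queue.send("charge order 1001", dedup_id="1001", now=2))    # False (network retry)
print(queue.send("charge order 1002", dedup_id="1002", now=3))    # True
print(queue.send("charge order 1001", dedup_id="1001", now=301))  # True (window expired)
print(len(queue.accepted))                                        # 3
```

The last call shows the limit of the mechanism: a retry that arrives after the window is accepted again, so it is still worth making the payment consumer idempotent on the order ID.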
Problem: Real-time Image Processing
Scenario: When a user uploads a photo, it needs to be resized for mobile, watermarked, and scanned for inappropriate content simultaneously.
Solution using SNS Fan-out:
- SNS Topic: Create an ImageUpload topic.
- Subscriptions: Create three separate SQS queues (Resize, Watermark, Scan) and subscribe them all to the topic.
- Process: The upload service sends one message to SNS. SNS replicates it to all three queues immediately. Three different sets of Lambda functions process the three tasks in parallel.
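The fan-out flow above can be sketched with plain Python objects. This is an in-memory illustration under stated assumptions, not the SNS API: the `Topic` class, the use of `deque` as a stand-in for each SQS queue, and the message fields (`bucket`, `key`) are all hypothetical.

```python
from collections import deque


class Topic:
    """Toy pub/sub topic that pushes a copy of each message to every subscriber."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, queue):
        self._subscribers.append(queue)

    def publish(self, message):
        """Deliver an independent copy to each subscribed queue; return the count."""
        for queue in self._subscribers:
            queue.append(dict(message))
        return len(self._subscribers)


# Three queues standing in for the Resize, Watermark, and Scan SQS queues.
resize, watermark, scan = deque(), deque(), deque()
topic = Topic("ImageUpload")
for q in (resize, watermark, scan):
    topic.subscribe(q)

# The upload service publishes once.
delivered = topic.publish({"bucket": "photos", "key": "cat.jpg"})
print(delivered)                               # 3
print(len(resize), len(watermark), len(scan))  # 1 1 1

# Each workflow consumes its own copy without affecting the others.
job = resize.popleft()
print(job["key"], len(watermark))              # cat.jpg 1
```

The publisher knows only the topic. Adding a fourth workflow means adding one more subscription, with no change to the upload service.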
Checkpoint Questions
- What is the maximum message size for an SQS message? (Answer: 256 KB)
- If a consumer takes 40 seconds to process a message but the Visibility Timeout is 30 seconds, what happens? (Answer: The message becomes visible again and another consumer will likely pick it up, causing duplicate processing).
- Which service would you use to migrate an existing on-premises application that uses the MQTT protocol? (Answer: Amazon MQ).
- How long can a message remain in an SQS queue? (Answer: Up to 14 days; default is 4 days).
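- Does SNS store messages if a subscriber is offline? (Answer: No. SNS does not retain messages for later retrieval. It retries failed deliveries according to the subscription's delivery retry policy, but once the retries are exhausted the message is lost unless the subscription has a dead-letter queue or delivers into an SQS queue that buffers it).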
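The visibility timeout question above (40 seconds of processing against a 30-second timeout) can be traced with a toy queue. This is a simplified model and not SQS itself: `VisibilityQueue` and `deliveries_before_delete` are hypothetical, time is an integer number of seconds supplied by the caller, and a second consumer is assumed to poll once per second.

```python
class VisibilityQueue:
    """Toy queue in which a received message is hidden until its timeout expires."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message_id -> {"body": ..., "visible_at": ...}
        self._next_id = 0

    def send(self, body):
        message_id = self._next_id
        self._next_id += 1
        self._messages[message_id] = {"body": body, "visible_at": 0}
        return message_id

    def receive(self, now):
        """Return (message_id, body) for the first visible message, else None."""
        for message_id, message in self._messages.items():
            if message["visible_at"] <= now:
                # The message goes in flight: hide it from other consumers.
                message["visible_at"] = now + self.visibility_timeout
                return message_id, message["body"]
        return None

    def delete(self, message_id):
        self._messages.pop(message_id, None)


def deliveries_before_delete(visibility_timeout, processing_seconds, poll_interval=1):
    """Count how often one message is handed out before the first consumer deletes it."""
    queue = VisibilityQueue(visibility_timeout)
    queue.send("job")
    first = queue.receive(now=0)  # first consumer starts working at t=0
    deliveries = 1
    # A second consumer keeps polling while the first one is still busy.
    for now in range(poll_interval, processing_seconds, poll_interval):
        if queue.receive(now) is not None:
            deliveries += 1
    queue.delete(first[0])  # first consumer finishes and deletes
    return deliveries


print(deliveries_before_delete(visibility_timeout=30, processing_seconds=40))  # 2
print(deliveries_before_delete(visibility_timeout=60, processing_seconds=40))  # 1
```

With a 30-second timeout the message is handed out twice; raising the timeout above the processing time brings it back to a single delivery.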