Free AWS Certified Solutions Architect - Professional (SAP-C02) Study Resources

Step up to senior AWS architecture — lead multi-account, enterprise-scale cloud design and migration. Tackle organizational complexity, new-solution design, continuous improvement of existing solutions, and workload migration and modernization, with an AI tutor and professional-level mock exams. For experienced architects ready for the profession's most demanding AWS cert.

AWS Certified Solutions Architect - Professional (SAP-C02) Study Notes & Guides

230 AI-generated study notes covering the full AWS Certified Solutions Architect - Professional (SAP-C02) curriculum. Showing 10 complete guides below.

Study Guide (945 words)

Optimizing Operations: Adopting Managed Services & Reducing Infrastructure Overhead

Adopting managed services as needed to reduce infrastructure provisioning and patching overhead

Optimizing Operations: Adopting Managed Services & Reducing Infrastructure Overhead

This guide explores the transition from manual infrastructure management to leveraging AWS managed services. By shifting the burden of provisioning, patching, and scaling to AWS, organizations can focus on application logic and business value.

Learning Objectives

After studying this guide, you should be able to:

Differentiate between mutable and immutable infrastructure strategies.
Explain how Infrastructure as Code (IaC) reduces configuration drift.
Assess the trade-offs between refactoring effort and operational cost savings when moving to managed services.
Design a patching strategy that integrates with CI/CD pipelines for immutable environments.

Key Terms & Glossary

Managed Service: An AWS service where the provider handles the underlying infrastructure, maintenance, and patching (e.g., Amazon RDS, AWS Fargate).
Infrastructure as Code (IaC): The management of infrastructure in a descriptive model, using the same versioning as DevOps teams use for source code (e.g., AWS CloudFormation).
Configuration Drift: The phenomenon where the environment's configuration deviates from the "source of truth" or initial template due to manual ad-hoc changes.
Immutable Infrastructure: A strategy where servers are never modified after deployment. If a change is needed, new servers are built from a common image and replace the old ones.
Undifferentiated Heavy Lifting: Tasks like racking servers or patching OS kernels that are necessary but do not provide a unique competitive advantage to a business.

The "Big Idea"

In traditional on-premises environments, servers are long-lived assets to be amortized. In the cloud, infrastructure is disposable. Adopting managed services is not just about technology; it is a mindset shift from "maintaining servers" to "consuming capabilities." Every hour spent patching an OS is an hour not spent improving your product. AWS managed services allow you to delegate this "undifferentiated heavy lifting" back to the provider.

Formula / Concept Box

Concept	Impact on Overhead	Strategic Requirement
EC2 (Self-Managed)	High (Manual Patching/Ops)	Lowest Refactoring
Containers (Fargate)	Medium (Image Patching)	Moderate Refactoring
Serverless (Lambda)	Low (AWS Managed Runtime)	High Rearchitecting

[!IMPORTANT] The Inverse Rule of Refactoring: The more advanced the managed service (e.g., Lambda), the higher the initial refactoring effort required, but the lower the long-term operational cost.

Hierarchical Outline

I. Infrastructure Provisioning via IaC
- Automation: Use tools like CloudFormation to ensure consistent environments.
- Disaster Recovery: Enables rapid recreation of the stack from a "clean slate" during outages.
II. Patching and Maintenance Strategies
- Mutable Approach: Patching live servers using AWS Systems Manager Patch Manager.
- Immutable Approach: Patching the AMI (Amazon Machine Image) or Container Image in the build phase of a CI/CD pipeline.
III. The Managed Service Spectrum
- Compute Optimization: Transitioning from EC2 to Fargate or Lambda.
- Storage Optimization: Moving from self-managed EBS/EC2 databases to Amazon RDS or DynamoDB.
IV. Modernization Opportunities
- Architecture Shift: Decoupling monoliths into microservices.
- Instruction Sets: Moving from x86 to AWS Graviton (ARM) for better price-performance.

Visual Anchors

Infrastructure Evolution Flow

[Figure 1: Mermaid diagram]

The Shared Responsibility Boundary

[Figure 2: TikZ diagram]

Definition-Example Pairs

Immutable Environment: A setup where updates are performed by replacing the entire stack rather than updating in place.
- Example: Instead of SSH-ing into a server to update Nginx, you trigger a CI/CD pipeline that builds a new AMI with the latest Nginx version and performs a Blue/Green deployment.
Infrastructure Drift: When manual changes make a server different from its original specification.
- Example: An engineer manually installs a security patch on one server in a cluster but forgets the others, causing a version mismatch during the next scaling event.

Worked Examples

Scenario: Migrating a Legacy Web App to Reduce Patching

1. Current State: A Java application runs on 10 EC2 instances. Every month, the sysadmin spends 8 hours manually applying Linux kernel patches and restarting services.

2. Strategy Selection:

Option A (Mutable): Use AWS Systems Manager (SSM) Patch Manager. Result: Automates the patching, but servers are still long-lived and susceptible to drift.
Option B (Immutable): Migrate to AWS Fargate. Result: AWS manages and patches the underlying compute hosts. The team only needs to rebuild the container image periodically to pick up base-image and runtime patches.

3. Implementation Logic (Option B):

Step 1: Containerize the Java application.
Step 2: Use AWS CloudFormation to define the ECS Cluster and Fargate Service.
Step 3: Integrate image scanning in Amazon ECR to detect vulnerabilities.
Step 4: When a patch is needed, update the Dockerfile, push to ECR, and update the Fargate service task definition.

Outcome: Monthly manual patching time drops from 8 hours to near zero, because image rebuilds and redeployments run through the CI/CD pipeline.
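Step 4 above can be sketched in Python as follows. This is a minimal illustration, not a deployment script: the field names follow the ECS RegisterTaskDefinition request shape, while the family name, account ID, repository, and image tags are hypothetical. The point is that a patch produces a new task definition revision and the existing one is never edited in place.

```python
import copy


def with_new_image(task_def: dict, container_name: str, new_image: str) -> dict:
    """Return a copy of task_def whose named container uses new_image.

    The input is never mutated, mirroring how a patch results in a new
    task definition revision rather than an in-place change.
    """
    updated = copy.deepcopy(task_def)
    for container in updated["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = new_image
            return updated
    raise ValueError(f"no container named {container_name!r}")


# Hypothetical task definition (all values are placeholders).
REPO = "123456789012.dkr.ecr.us-east-1.amazonaws.com/java-app"
current = {
    "family": "legacy-java-app",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "1024",
    "memory": "2048",
    "containerDefinitions": [
        {"name": "app", "image": f"{REPO}:1.4.0", "essential": True}
    ],
}

# A patched image was pushed to ECR; build the next revision from the old one.
patched = with_new_image(current, "app", f"{REPO}:1.4.1")
print(patched["containerDefinitions"][0]["image"])
```

In a real pipeline the resulting dictionary would be registered as a new revision and the service updated to point at it; those API calls are omitted here.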

Checkpoint Questions

Why is an immutable infrastructure approach easier to implement in the cloud than on-premises?
If a service is "Serverless," does patching still occur? If so, who performs it?
What is the main risk of performing manual "hot-fixes" on production EC2 instances?
Which AWS service would you use to define your infrastructure as a version-controlled template?

Muddy Points & Cross-Refs

Does Serverless mean NO patching?: No. Patching still happens, but it is performed by AWS. For Lambda, AWS patches the underlying OS and the managed runtimes (you still own your function code and its dependencies). For Fargate, AWS patches the host OS, while you remain responsible for the container image.
Cost vs. Effort: Managed services often have higher per-unit costs but lower Total Cost of Ownership (TCO) because they reduce human labor costs.
Cross-Reference: For deeper dives into reliability and SLAs when using these services, see Chapter 6: Meeting Reliability Requirements.
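The "Cost vs. Effort" point can be made concrete with a small calculation. The sketch below assumes a very simple TCO model (service bill plus staff time), and every number in it is hypothetical, chosen only to show how a higher per-unit price can still produce a lower total.

```python
def monthly_tco(service_cost: float, ops_hours: float, hourly_labor_rate: float) -> float:
    """Monthly TCO = AWS bill for the service + cost of staff time spent operating it."""
    return service_cost + ops_hours * hourly_labor_rate


# Hypothetical inputs for illustration only.
LABOR_RATE = 90.0  # USD per engineer hour

self_managed = monthly_tco(service_cost=700.0, ops_hours=8, hourly_labor_rate=LABOR_RATE)
managed = monthly_tco(service_cost=900.0, ops_hours=1, hourly_labor_rate=LABOR_RATE)

print(self_managed)  # 700 + 8 * 90 = 1420.0
print(managed)       # 900 + 1 * 90 = 990.0
```

With these assumed inputs the managed option costs more on the AWS bill but less overall. A real comparison would also include refactoring effort, licensing, and incident costs.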

Comparison Tables

Feature	Self-Managed (EC2)	Managed Containers (Fargate)	Serverless (Lambda)
OS Patching	Customer	AWS	AWS
Runtime Patching	Customer	Customer (in Image)	AWS
Scaling	Manual/Auto-Scaling Groups	Automatic (Task-based)	Fully Native/Automatic
Refactoring Need	Minimal (Lift & Shift)	Moderate	High
Cost Model	Hourly / Savings Plans	Per vCPU/GB per hour	Per Request / Duration

[!TIP] When evaluating services for the SAP-C02 exam, prioritize managed services (Fargate/Lambda/RDS) unless the requirement specifically mentions OS-level customization or legacy software that cannot be containerized.

Study Guide (850 words)

Study Guide: Alerting and Automatic Remediation Strategies

Alerting and automatic remediation strategies

Alerting and Automatic Remediation Strategies

This study guide focuses on the design and implementation of automated responses to operational and security incidents within AWS, a core requirement for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.

Learning Objectives

After studying this guide, you should be able to:

Evaluate the necessity of automation in scaling incident response for large-scale environments.
Design alerting workflows using Amazon CloudWatch, AWS Config, and Amazon EventBridge.
Implement automatic remediation strategies using AWS Systems Manager (SSM) Automation and AWS Lambda.
Distinguish between configuration-based remediation (AWS Config) and security-finding remediation (Security Hub/GuardDuty).
Leverage the Automated Security Response on AWS library for pre-built playbooks.

Key Terms & Glossary

AWS Config: A service that enables you to assess, audit, and evaluate the configurations of your AWS resources. It acts as a managed CMDB.
SSM Automation Runbook: A document that defines the actions that Systems Manager performs on your managed instances and other AWS resources.
EventBridge (formerly CloudWatch Events): A serverless event bus that makes it easy to connect applications using data from your own applications, integrated SaaS applications, and AWS services.
Security Hub: A cloud security posture management (CSPM) service that performs security best practice checks, aggregates alerts, and enables automated remediation.
Remediation Action: A predefined or custom task (like a Lambda function or SSM runbook) triggered automatically when a resource is found to be non-compliant.

The "Big Idea"

In modern cloud architectures, manual intervention is the enemy of scale and reliability. Automatic remediation shifts the operational burden from humans to code. Instead of waiting for an engineer to receive an email and log in to a console, the system detects a drift from the "ideal state" (compliance or health) and executes a predefined script to correct it instantly. This reduces Mean Time to Remediation (MTTR) and ensures security policies are enforced 24/7 without exception.

Formula / Concept Box

Component	Role in Strategy	Key Service Example
Detection	Monitors state and identifies deviations.	AWS Config, Amazon GuardDuty
Routing	Connects the detection event to the logic.	Amazon EventBridge
Logic/Action	The actual code/steps to fix the issue.	AWS Systems Manager Automation, AWS Lambda
Notification	Informing stakeholders of the action taken.	Amazon SNS

Hierarchical Outline

Monitoring and Detection
- AWS Config: Tracks Configuration Items (CIs); evaluates compliance against rules (e.g., "Is encryption enabled?").
- Amazon GuardDuty: Intelligent threat detection monitoring for malicious activity (e.g., crypto-mining, unauthorized access).
- AWS Security Hub: Centralizes findings from GuardDuty, Macie, Inspector, and Config.
Alerting Mechanisms
- Event-Driven Architecture: Use EventBridge to route findings based on pattern matching.
- Custom Actions: Security Hub "Custom Actions" allow manual triggering of automated workflows from the console.
Remediation Execution
- SSM Automation: Preferred for infrastructure-level changes (e.g., stopping an instance, modifying S3 bucket policies).
- AWS Lambda: Preferred for complex, multi-step logic or calling external APIs.
Scaling and Best Practices
- Automated Security Response on AWS: A library of pre-built playbooks for security standards such as AWS Foundational Security Best Practices (FSBP) and PCI DSS.
- Risk-Based Remediation: Choosing between "Immediate Block" (high risk) vs. "Notify and Wait" (low risk).
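The routing step in the outline above can be illustrated with an EventBridge event pattern. The pattern below uses the source and detail-type that Security Hub emits for imported findings. The `matches` function is a deliberately simplified, local stand-in for EventBridge's matching (exact values and nested objects only), written here for illustration and not part of any AWS SDK.

```python
# Event pattern that selects only high-severity Security Hub findings.
PATTERN = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Severity": {"Label": ["HIGH", "CRITICAL"]}}},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must be present and match.

    Leaves are lists of allowed values; nested dicts are matched recursively.
    When the event holds an array, any one element may satisfy the pattern.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        candidates = actual if isinstance(actual, list) else [actual]
        if isinstance(expected, dict):
            if not any(isinstance(c, dict) and matches(expected, c) for c in candidates):
                return False
        elif not any(c in expected for c in candidates):
            return False
    return True


def finding_event(label: str) -> dict:
    """Build a minimal, hypothetical event carrying one finding."""
    return {
        "source": "aws.securityhub",
        "detail-type": "Security Hub Findings - Imported",
        "detail": {"findings": [{"Severity": {"Label": label}}]},
    }


print(matches(PATTERN, finding_event("HIGH")))  # True
print(matches(PATTERN, finding_event("LOW")))   # False
```

Filtering on severity in the rule itself is one way to implement the risk-based split: high-risk findings go to an automated runbook, while lower-risk ones can be routed to a notification target by a second rule.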

Visual Anchors

Incident Response Flowchart

[Figure 1: Mermaid diagram]

Resource Monitoring State Diagram

[Figure 2: TikZ diagram]

Definition-Example Pairs

Configuration Drift: When a resource's settings change from the approved baseline.
- Example: An engineer disables the account's "EBS encryption by default" setting in a Region to test a performance issue and forgets to turn it back on, so new volumes are created unencrypted.
Remediation Playbook: A documented and automated set of steps to resolve a specific security issue.
- Example: A playbook that identifies an S3 bucket with public "Read" access and immediately applies the PutPublicAccessBlock API call.
Idempotency: The property where an operation can be applied multiple times without changing the result beyond the initial application.
- Example: An SSM Runbook that ensures a specific IAM policy is attached. If the policy is already there, it does nothing and reports success.
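The idempotency example can be sketched without any AWS calls. Here a Python set stands in for the policies attached to a role, and `ensure_policy_attached` is a hypothetical helper name; a real runbook would query and modify IAM instead.

```python
def ensure_policy_attached(attached_policies: set, policy_arn: str) -> bool:
    """Attach policy_arn if it is missing.

    Returns True when a change was made and False when the desired state
    already held, so repeated runs converge on the same result.
    """
    if policy_arn in attached_policies:
        return False
    attached_policies.add(policy_arn)
    return True


# In-memory stand-in for the policies currently attached to a role.
role_policies = {"arn:aws:iam::aws:policy/ReadOnlyAccess"}
target = "arn:aws:iam::aws:policy/SecurityAudit"

first_run = ensure_policy_attached(role_policies, target)
second_run = ensure_policy_attached(role_policies, target)

print(first_run, second_run)  # True False
```

Because the second run changes nothing and still succeeds, the remediation is safe to trigger repeatedly, for example when the same finding is delivered more than once.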

Worked Examples

Case: Automating S3 Public Access Block

Scenario: Your organization prohibits public S3 buckets. You need to ensure any bucket that becomes public is automatically remediated.

Step 1: Detection: Enable the AWS Config managed rule s3-bucket-public-read-prohibited.
Step 2: Association: Link this rule to a Remediation Action.
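Step 3: Action Choice: Select an SSM Automation document that applies the bucket-level public access block, such as AWSConfigRemediation-ConfigureS3BucketPublicAccessBlock.
Step 4: Parameters: Map the document's BucketName parameter to RESOURCE_ID so that Config passes the name of the noncompliant bucket to the runbook, and supply the automation role as a static value.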
Result: When a user makes a bucket public, AWS Config detects it within minutes, triggers the SSM Runbook, and the bucket is set back to private automatically.
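The association described in Steps 2 through 4 can be expressed as a remediation configuration. The sketch below only builds the request dictionary in the shape accepted by the AWS Config PutRemediationConfigurations API. The runbook name and its parameter names (`BucketName`, `AutomationAssumeRole`) are assumptions to verify against the document you actually select, and the role ARN is a placeholder.

```python
def build_remediation_config(rule_name: str, automation_role_arn: str) -> dict:
    """Build one entry for the RemediationConfigurations list."""
    return {
        "ConfigRuleName": rule_name,
        "TargetType": "SSM_DOCUMENT",
        # Assumed runbook name; confirm it exists in your Region before use.
        "TargetId": "AWSConfigRemediation-ConfigureS3BucketPublicAccessBlock",
        "ResourceType": "AWS::S3::Bucket",
        "Automatic": True,
        # Both retry settings are needed when Automatic is True.
        "MaximumAutomaticAttempts": 3,
        "RetryAttemptSeconds": 60,
        "Parameters": {
            # RESOURCE_ID tells Config to substitute the noncompliant bucket's name.
            "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
            "AutomationAssumeRole": {"StaticValue": {"Values": [automation_role_arn]}},
        },
    }


config = build_remediation_config(
    rule_name="s3-bucket-public-read-prohibited",
    automation_role_arn="arn:aws:iam::123456789012:role/ConfigRemediationRole",  # placeholder
)
print(config["Parameters"]["BucketName"])
```

With boto3, a dictionary like this would be passed as `RemediationConfigurations=[config]` to the Config client's `put_remediation_configurations` method. The call is left out so the sketch runs without credentials.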

Checkpoint Questions

What is the primary difference between a "managed" AWS Config rule and a "custom" AWS Config rule?
How does Amazon EventBridge facilitate cross-account remediation strategies?
In Security Hub, what is required to trigger a "Custom Action"?
Why is AWS Systems Manager Automation often preferred over Lambda for simple resource modifications?

Muddy Points & Cross-Refs

Config vs. Security Hub: Students often confuse these. Remember: Config is for resource properties (is the setting right?); Security Hub is for findings (did something bad happen?).
EventBridge vs. SNS: Use SNS if a human needs to read an email. Use EventBridge if a system (Lambda/SSM) needs to take an action.
Permissions: Remediation fails most often due to the SSM Automation Role lacking the specific permissions (e.g., s3:PutBucketPolicy) to perform the fix.

Comparison Tables

Feature	AWS Config Remediation	Security Hub Remediation
Primary Trigger	Configuration change (Resource state)	Security finding (Alert/Event)
Automation Tool	SSM Automation (direct integration)	EventBridge -> Lambda/SSM
Best For	Compliance and Governance	Incident Response & Threat Hunting
Manual Option	Not typical (usually auto-triggered)	Custom actions (Manual trigger from console)

Study Guide (925 words)

AWS Usage Analysis & Resource Optimization Study Guide

Analyzing usage reports to identify underutilized and overutilized resources

AWS Usage Analysis & Resource Optimization Study Guide

This guide focuses on the critical skill of analyzing usage reports to identify underutilized and overutilized resources, a core competency for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.

Learning Objectives

After studying this guide, you should be able to:

Configure and interpret AWS Cost and Usage Reports (CUR) for granular analysis.
Use AWS Cost Explorer to identify spending patterns and anomalies.
Define the process of right-sizing and explain its importance in cloud economics.
Identify metrics that signal underutilization versus overutilization.
Implement a tagging strategy to facilitate cost allocation and reporting.

Key Terms & Glossary

Right-sizing: The process of matching instance types and sizes to your workload performance and capacity requirements at the lowest possible cost.
AWS Cost and Usage Reports (CUR): The most granular AWS billing tool, delivering CSV or Parquet files to an S3 bucket with hourly or daily detail.
Over-provisioning: Allocating more resources (CPU, RAM, Storage) than a workload actually requires, leading to wasted spend.
Under-provisioning: Allocating fewer resources than required, leading to performance bottlenecks or application failure.
Cost Allocation Tags: Metadata assigned to AWS resources used to track costs on a detailed level (e.g., by Department or Project).

The "Big Idea"

In traditional on-premises environments, over-provisioning is a "safety net" because hardware procurement is slow and expensive. In the cloud, this habit becomes a financial liability. Effective AWS architecture requires shifting from "capacity guessing" to "data-driven rightsizing." By analyzing usage reports, an architect transforms a static infrastructure into a dynamic, cost-efficient organism that scales with actual demand rather than theoretical peaks.

Formula / Concept Box

Concept	Metric / Rule of Thumb	Action
Idle Resources	CPU < 5% and Max Network < 5 KBps over 7 days	Terminate or Downsize
Underutilized	CPU < 20% consistently	Downsize (e.g., m5.xlarge to m5.large)
Overutilized	CPU > 80% or Memory Paging > 0	Upsize or Scale Out (Add instances)
CUR Delivery	S3 Bucket + Bucket Policy + CUR Definition	Enable hourly granularity with resource IDs
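The rules of thumb in the table can be written as a small classifier. This is a sketch that applies the table's thresholds literally. The function name and the "right-sized" label are illustrative, and real decisions should look at longer observation windows and at memory and disk metrics as well.

```python
def classify(avg_cpu_pct: float, max_network_kbps: float, memory_paging: float = 0.0) -> str:
    """Classify an instance using the rule-of-thumb thresholds from the table."""
    if avg_cpu_pct > 80 or memory_paging > 0:
        return "overutilized"    # upsize or scale out
    if avg_cpu_pct < 5 and max_network_kbps < 5:
        return "idle"            # terminate or downsize
    if avg_cpu_pct < 20:
        return "underutilized"   # downsize
    return "right-sized"


# Hypothetical metric summaries for four instances.
print(classify(avg_cpu_pct=3, max_network_kbps=2))      # idle
print(classify(avg_cpu_pct=12, max_network_kbps=400))   # underutilized
print(classify(avg_cpu_pct=85, max_network_kbps=1000))  # overutilized
print(classify(avg_cpu_pct=45, max_network_kbps=800))   # right-sized
```

The overutilization check runs first on purpose: an instance that is paging memory needs more capacity even when its CPU looks quiet.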

Hierarchical Outline

Usage Analysis Tools
- AWS Cost Explorer: Best for visual trends and 12-month forecasting.
- AWS CUR: Best for deep-dives using Amazon Athena or QuickSight.
- AWS Compute Optimizer: Uses Machine Learning to suggest specific right-sizing moves.
The Right-sizing Process
- Monitor: Collect CloudWatch metrics (CPU, RAM, Disk, Network).
- Analyze: Identify patterns (Steady state vs. Bursting).
- Optimize: Change instance families (e.g., T-series for bursty, M-series for general).
Governance and Metadata
- Tagging: Mandatory for mapping usage to business units.
- Billing Alarms: Proactive notification of unexpected usage spikes.

Visual Anchors

The Optimization Lifecycle

[Figure 1: Mermaid diagram]

Cost-Performance Trade-off

[Figure 2: TikZ diagram]

Definition-Example Pairs

Term: Horizontal Scaling
Definition: Adding or removing similar resources (e.g., more EC2 instances) to a pool.
Example: A web server group that adds 2 more instances during a Black Friday sale to handle high traffic and terminates them afterward.
Term: Vertical Scaling (Rightsizing)
Definition: Increasing or decreasing the power (CPU/RAM) of a single resource.
Example: Upgrading an RDS instance from db.t3.medium to db.r5.large because the database cache hit ratio is too low.

Worked Examples

Analyzing a CUR for EC2 Instances

Scenario: You notice a spike in your monthly bill. You query the CUR in Amazon Athena to find the culprit.

Step 1: Filter CUR data by line_item_usage_type. You see BoxUsage:m5.4xlarge accounts for 60% of spend.
Step 2: Correlate with CloudWatch. You find the CPUUtilization for these instances averages 4% over 30 days.
Step 3: Remediation. You determine the workload is memory-bound but only needs 16GB. You switch from m5.4xlarge (64GB RAM/16 vCPU) to r5.large (16GB RAM/2 vCPU).
Result: Performance remains stable while costs drop by approximately 80%.
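The "approximately 80%" figure can be checked with a short calculation. The hourly prices below are assumed On-Demand Linux rates used only for illustration; actual prices vary by Region and change over time, so check current pricing before relying on the result.

```python
HOURS_PER_MONTH = 730

# Assumed USD-per-hour rates, for illustration only.
PRICES = {"m5.4xlarge": 0.768, "r5.large": 0.126}


def monthly_cost(instance_type: str, count: int = 1) -> float:
    """Monthly cost of running `count` instances around the clock."""
    return PRICES[instance_type] * HOURS_PER_MONTH * count


before = monthly_cost("m5.4xlarge")
after = monthly_cost("r5.large")
savings_pct = (before - after) / before * 100

print(f"{savings_pct:.1f}%")  # 83.6%
```

Under these assumed prices the saving is about 84% per instance, in line with the scenario's estimate.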

Checkpoint Questions

What is the primary difference in data availability between Cost Explorer and CUR?
Why is "Lift and Shift" often the cause of over-provisioning in the cloud?
Which AWS service provides ML-based recommendations for right-sizing EC2 and Lambda?
True or False: To set up CUR, you must first create an S3 bucket and apply a specific bucket policy.

Muddy Points & Cross-Refs

[!TIP] Common Confusion: Students often confuse Cost Explorer with Trusted Advisor.

Cost Explorer is for analysis and reporting.

Trusted Advisor provides specific checks (e.g., "You have 5 idle load balancers").

Cross-References:

For automation of these tasks, see AWS Auto Scaling and AWS Instance Scheduler.
For purchasing models, review Savings Plans vs. Reserved Instances.

Comparison Tables

Feature	AWS Cost Explorer	AWS Cost & Usage Report (CUR)
Primary Use	Visual trends, quick insights	Granular data mining, deep analytics
Data Format	Dashboard/Graphs	CSV / Parquet (in S3)
Retention	12 months (standard)	Continuous (as long as S3 exists)
Granularity	Daily/Monthly (Hourly optional)	Hourly / Resource-level
Setup	Enabled by default	Requires S3 and IAM configuration

Study Guide (1,145 words)

AWS Application Integration: Architecting for Decoupling and Resiliency

Application integration (for example, Amazon SNS, Amazon SQS, AWS Step Functions)

AWS Application Integration: Architecting for Decoupling and Resiliency

This guide explores the essential AWS integration services used to build modern, scalable, and resilient cloud-native applications. Understanding the nuances between messaging, event-driven architectures, and workflow orchestration is a core requirement for the AWS Certified Solutions Architect - Professional (SAP-C02) exam.

Learning Objectives

After studying this guide, you should be able to:

Distinguish between synchronous and asynchronous communication patterns.
Select the appropriate AWS integration service (SQS, SNS, EventBridge, Step Functions) based on specific architectural requirements.
Design decoupled architectures using the "Fan-out" and "Messaging" patterns.
Evaluate opportunities for modernization using serverless integration tools.

Key Terms & Glossary

Decoupling: The practice of ensuring that application components can operate independently. If one component fails or slows down, the others remain functional.
Fan-out: A pattern where a single message sent to a topic is pushed to multiple endpoints (e.g., SQS queues, Lambda functions, or HTTP endpoints) simultaneously.
Idempotency: The property of certain operations where they can be applied multiple times without changing the result beyond the initial application. Crucial for retry logic in distributed systems.
Orchestration: A centralized approach to managing complex workflows where a "coordinator" (like Step Functions) manages the state and sequence of tasks.
Choreography: A decentralized approach where components communicate via events (like EventBridge) without a central coordinator.
Dead Letter Queue (DLQ): A specialized SQS queue used to store messages that cannot be processed successfully after a certain number of retries.
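The DLQ behavior can be simulated in memory. The sketch below is a simplified model, not the SQS API: `deque` objects stand in for the source queue and the DLQ, and `max_receive_count` plays the role of the redrive policy's maxReceiveCount. A message that keeps failing is set aside after the allowed number of attempts so it cannot block the rest of the queue.

```python
from collections import deque


def drain(queue: deque, dlq: deque, handler, max_receive_count: int = 3) -> None:
    """Process every message; move repeatedly failing ones to the DLQ."""
    while queue:
        message = queue.popleft()
        message["receive_count"] += 1
        try:
            handler(message["body"])
        except Exception:
            if message["receive_count"] >= max_receive_count:
                dlq.append(message)    # give up and keep it for inspection
            else:
                queue.append(message)  # make it available for another attempt


processed = []


def handler(body: str) -> None:
    if body == "poison":
        raise ValueError("cannot process this message")
    processed.append(body)


source = deque({"body": b, "receive_count": 0} for b in ["a", "poison", "b"])
dead_letters = deque()
drain(source, dead_letters, handler)

print(processed)                             # ['a', 'b']
print([m["body"] for m in dead_letters])     # ['poison']
```

The healthy messages are processed normally, while the failing one ends up in the DLQ with its receive count preserved for troubleshooting.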

The "Big Idea"

In traditional monolithic architectures, components are tightly coupled; a failure in the "Order Service" might bring down the "Shipping Service." The Big Idea of application integration is to move from a synchronous "chain" to an asynchronous "web." By using AWS integration services as buffers and translators, you build systems that are highly resilient, elastically scalable, and easier to modernize because each piece can evolve independently.

Formula / Concept Box

Feature	Amazon SQS	Amazon SNS	Amazon EventBridge	AWS Step Functions
Primary Model	Pull (Polling)	Push (Pub/Sub)	Push (Event Bus)	State Machine
Persistence	Durable (up to 14 days)	Ephemeral (Immediate)	Ephemeral (Retry up to 24h)	Durable State
Ordering	FIFO available	FIFO topics available (paired with SQS FIFO queues)	No	Strict Sequencing
Target Count	1 consumer per message	Many (Fan-out)	Many (Rules/Filtering)	1 Workflow Path

Hierarchical Outline

Asynchronous Messaging Patterns
- Point-to-Point (Queueing): Buffering requests between producers and consumers (Amazon SQS).
- Publish/Subscribe (Broadcasting): Delivering one message to multiple interested parties (Amazon SNS).
Event-Driven Architectures
- Event Buses: Routing events based on content/rules (Amazon EventBridge).
- Schema Registry: Managing event structures to ensure compatibility.
Workflow Management
- Standard Workflows: For long-running, auditable processes (AWS Step Functions).
- Express Workflows: High-volume, short-duration executions (AWS Step Functions).
API & Specialized Integration
- GraphQL Integration: Unified data access (AWS AppSync).
- Legacy Protocols: Managed message brokers (Amazon MQ for ActiveMQ/RabbitMQ).

Visual Anchors

The Fan-out Pattern

This diagram illustrates how SNS acts as a dispatcher to multiple downstream SQS queues for parallel processing.

[Figure 1: Mermaid diagram]

SQS Queue Structure

The following TikZ diagram visualizes the buffer mechanism of an SQS queue where messages wait to be polled by consumers.

[Figure 2: TikZ diagram]

Definition-Example Pairs

Standard SQS Queue
- Definition: A queue offering near-unlimited throughput and at-least-once delivery, but no guarantee of strict ordering.
- Example: A photo-sharing app where users upload high-res images; SQS holds the image metadata while a background worker resizes them at its own pace.
Step Functions (Standard)
- Definition: A visual workflow service that uses state machines to coordinate multiple AWS services into serverless workflows.
- Example: An e-commerce checkout process that must check inventory, charge a credit card, and update a shipping database in a specific sequence with error handling.
Amazon EventBridge
- Definition: A serverless event bus that makes it easy to connect applications using data from your own apps, integrated SaaS apps, and AWS services.
- Example: When an S3 bucket receives a new file, EventBridge triggers a specific Lambda function only if the file name ends in ".pdf".
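The checkout example for Step Functions can be sketched as a state machine definition. The dictionary below follows the Amazon States Language structure (`StartAt`, `States`, `Retry`, `Catch`). The state names and Lambda ARNs are hypothetical, and `happy_path` is a small local helper written here to show the sequence, not an AWS API.

```python
import json


def task(resource_arn: str, next_state=None) -> dict:
    """Build a Task state with retry and a catch-all error handler."""
    state = {
        "Type": "Task",
        "Resource": resource_arn,
        "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 2,
                   "MaxAttempts": 3, "BackoffRate": 2.0}],
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


ARN = "arn:aws:lambda:us-east-1:123456789012:function:"  # placeholder prefix
definition = {
    "StartAt": "CheckInventory",
    "States": {
        "CheckInventory": task(ARN + "check-inventory", "ChargeCard"),
        "ChargeCard": task(ARN + "charge-card", "UpdateShipping"),
        "UpdateShipping": task(ARN + "update-shipping"),
        "HandleFailure": {"Type": "Fail", "Error": "CheckoutFailed",
                          "Cause": "A checkout step failed after retries"},
    },
}


def happy_path(machine: dict) -> list:
    """Follow Next pointers from StartAt until a terminal state."""
    order, name = [], machine["StartAt"]
    while True:
        order.append(name)
        state = machine["States"][name]
        if state.get("End") or state["Type"] in ("Fail", "Succeed"):
            return order
        name = state["Next"]


print(happy_path(definition))
print(len(json.dumps(definition)) > 0)  # the definition serializes to JSON
```

The sequencing and error handling live in the definition rather than in the task code, which is the main difference from chaining Lambda functions by hand.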

Worked Examples

Scenario: Modernizing a Monolithic Order System

The Problem: A company has a monolithic "OrderManager" that processes payments, sends emails, and updates inventory in a single synchronous function. If the payment gateway is slow, the whole application hangs.

The Solution:

Step 1: Use Amazon API Gateway to receive the order request.
Step 2: The API triggers a Lambda function that publishes the order data to an Amazon SNS topic.
Step 3 (The Fan-out): Three SQS queues subscribe to the SNS topic:
- PaymentQueue: Processed by a Payment Worker.
- InventoryQueue: Processed by an Inventory Worker.
- EmailQueue: Processed by a Notification Worker.
Step 4 (Resiliency): If the PaymentQueue worker fails, the message stays in the queue (or goes to a DLQ) without affecting the EmailQueue or InventoryQueue.
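The fan-out and its isolation property can be simulated in memory. In this sketch the `Topic` class and `deque` queues are local stand-ins for SNS and SQS, and the worker functions are hypothetical. It shows that a failing payment worker leaves its message waiting in its own queue while the other two queues drain normally.

```python
from collections import deque


class Topic:
    """Minimal stand-in for an SNS topic with queue subscribers."""

    def __init__(self):
        self._queues = {}

    def subscribe(self, name: str) -> deque:
        queue = deque()
        self._queues[name] = queue
        return queue

    def publish(self, message: dict) -> None:
        for queue in self._queues.values():
            queue.append(dict(message))  # each subscriber gets its own copy


def drain(queue: deque, worker) -> list:
    """Run worker on each message; on failure, leave it queued for a retry."""
    handled = []
    while queue:
        try:
            worker(queue[0])
        except Exception:
            break
        handled.append(queue.popleft()["order_id"])
    return handled


def failing_payment_worker(message: dict) -> None:
    raise TimeoutError("payment gateway is slow")


def ok_worker(message: dict) -> None:
    pass  # pretend the work succeeded


orders = Topic()
payment_q = orders.subscribe("PaymentQueue")
inventory_q = orders.subscribe("InventoryQueue")
email_q = orders.subscribe("EmailQueue")
orders.publish({"order_id": "A-1001", "total": 59.90})

paid = drain(payment_q, failing_payment_worker)
stocked = drain(inventory_q, ok_worker)
mailed = drain(email_q, ok_worker)

print(paid, stocked, mailed)  # [] ['A-1001'] ['A-1001']
print(len(payment_q))         # 1
```

The payment message is still in its queue after the failure, so it can be retried later or redirected to a DLQ without the customer's email or the inventory update being delayed.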

Checkpoint Questions

Which service should you choose if you need to ensure that messages are processed exactly once and in the strict order they were received?
- Answer: Amazon SQS FIFO (First-In-First-Out) queue.
What is the primary difference between SNS and EventBridge for message routing?
- Answer: SNS is better for high-throughput fan-out to thousands of subscribers; EventBridge is better for complex rule-based filtering (content-based routing) and integrating with 3rd-party SaaS applications.
True or False: SQS consumers must poll the queue to retrieve messages.
- Answer: True. SQS is a pull-based service, unlike SNS which is push-based.

Muddy Points & Cross-Refs

SNS vs. SQS: A common point of confusion. Remember: SQS is a container (holds messages until you pull them); SNS is a post office (delivers copies immediately to anyone who asked).
Step Functions vs. Lambda: Use Lambda for short, discrete tasks; use Step Functions to stitch those tasks together into a "stateful" journey.
Further Study: Check the "AWS Well-Architected Framework: Reliability Pillar" for more on loose coupling.

Comparison Tables

Orchestration (Step Functions) vs. Choreography (EventBridge)

Feature	Orchestration (Step Functions)	Choreography (EventBridge)
Control	Centralized (The "Brain")	Decentralized (The "Network")
Visibility	Visualizes flow state and history	Events flow without a single visual path
Coupling	Slightly tighter (The coordinator knows all)	Very loose (Services just listen for events)
Best For	Complex multi-step business logic	Decoupling microservices and SaaS apps

[!TIP] For the Professional exam, look for keywords like "ordering," "high throughput," or "retry logic" to decide between SQS Standard and FIFO. If the requirement mentions "third-party SaaS" or "event schemas," lean toward EventBridge.

Study Guide (1,050 words)

Mastering AWS Application Migration Tools: SAP-C02 Study Guide

Application migration tools (for example, AWS Application Discovery Service, AWS Application Migration Service)

Mastering AWS Application Migration Tools

This study guide covers the essential tools and strategies for migrating application workloads to AWS, specifically focusing on the AWS Application Discovery Service (ADS) and the AWS Application Migration Service (MGN) as required for the SAP-C02 exam.

Learning Objectives

After studying this module, you should be able to:

Differentiate between agent-based and agentless discovery methods using AWS Application Discovery Service.
Evaluate workloads according to the 7Rs migration strategy (Re-host, Re-platform, Refactor, etc.).
Explain the architectural flow of data in AWS Application Migration Service (MGN).
Apply security best practices, including encryption at rest and in transit, to migration workflows.
Select the appropriate tool (MGN vs. VMC on AWS) based on source infrastructure and business requirements.

Key Terms & Glossary

AWS MGN (Application Migration Service): The primary service recommended for lift-and-shift (re-host) migrations to AWS.
Agent-based Discovery: A method of collecting deep performance and dependency data by installing software directly on source servers.
Block-level Replication: A data transfer method that copies disk blocks rather than individual files, ensuring byte-for-byte consistency.
Staging Area VPC: A temporary environment in AWS where replication servers receive and write data to EBS volumes before the final cutover.
7Rs: A framework for categorizing migration strategies: Re-host, Re-platform, Refactor, Re-purchase, Retire, Retain, and Relocate.

The "Big Idea"

Migration is not a single event but a lifecycle. It begins with Discovery (understanding what you have), moves to Assessment (deciding the strategy via the 7Rs), and concludes with Execution (using tools like MGN to move bits). The "Professional" level architect must ensure this lifecycle is secure, cost-effective, and causes minimal downtime by selecting the right orchestration tool for the specific source environment.

Formula / Concept Box

Strategy (The 7Rs)	Key Characteristic	Tooling Example
Re-host	"Lift and Shift" with no changes	AWS MGN
Relocate	Move hypervisor-to-hypervisor	VMware Cloud on AWS
Re-platform	"Lift, tinker, and shift" (e.g., move to RDS)	AWS DMS / SCT
Refactor	Re-architect for cloud-native (Lambda/S3)	Manual Rewrite
Re-purchase	Switch to a SaaS model	Marketplace
Retire	Decommission the application	N/A
Retain	Keep on-premises for now	N/A

Hierarchical Outline

Phase 1: Discovery & Assessment
- AWS Application Discovery Service (ADS)
  - Agentless: Uses a connector on VMware; identifies VM inventory.
  - Agent-based: Installed on OS; identifies processes and network dependencies.
- AWS Migration Hub
  - Centralized dashboard to track migration progress across different tools.
Phase 2: Server Migration
- AWS Application Migration Service (MGN)
  - Replaces SMS (Server Migration Service) and CloudEndure.
  - Continuous block-level replication.
- VMware Cloud (VMC) on AWS
  - Specific for VMware-to-VMware "Relocate" strategy.
Phase 3: Security & Governance
- Encryption in Transit: Secured via TLS 1.2.
- Encryption at Rest: Managed via AWS KMS on Amazon EBS volumes.

Visual Anchors

The Migration Workflow

[Figure 1: Mermaid diagram]

MGN Architecture Detail

[Figure 2: TikZ diagram]

Definition-Example Pairs

Continuous Replication: The process of copying changes to the cloud in real-time as they happen at the source.
- Example: An e-commerce database server on-premises constantly writes new orders; MGN captures these sub-second changes so the cloud version is always up-to-date for cutover.
Test Mode: A state in MGN where a source server is launched in AWS for validation without stopping the original server.
- Example: Launching a production web server in a test VPC to ensure the database connection strings work in the new network environment before the actual migration weekend.

Worked Examples

Problem: Migrating a Legacy SQL Server

Scenario: A company has a 10TB SQL Server running on an old physical machine. They need to migrate it with less than 30 minutes of downtime.

Solution Steps:

Assessment: Use AWS Application Discovery Service (ADS) Agents to verify process dependencies (e.g., which apps talk to this SQL server).
Setup: Install the AWS MGN Replication Agent on the physical SQL server.
Replication: MGN begins a "Baseline" sync of the 10TB. This may take days, but the source remains live.
Continuous Sync: Once the baseline is done, MGN keeps the EBS volumes in the AWS Staging Area synced with new writes.
Cutover: During a maintenance window, stop the source SQL service, allow the final bits to sync (seconds), and trigger the "Cutover" launch in MGN. This converts the server into an EC2 instance.
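
To make the "this may take days" estimate for the baseline sync concrete, here is a rough back-of-the-envelope sketch. The link speeds and the 80% efficiency factor are illustrative assumptions, not values from the scenario, and `baseline_sync_hours` is a hypothetical helper rather than anything MGN provides.

```python
def baseline_sync_hours(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to copy data_tb (decimal terabytes) over a link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually available for replication (protocol overhead, other traffic).
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb >= 0, link_mbps > 0 and 0 < efficiency <= 1 required")
    total_bits = data_tb * 1e12 * 8              # terabytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Assumed link speeds for the 10 TB server in the scenario.
    for mbps in (100, 500, 1000):
        hours = baseline_sync_hours(10, mbps)
        print(f"{mbps} Mbps: about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Under these assumptions a 100 Mbps link needs well over a week for the baseline, which is why the source server stays live during replication and only the short final sync falls inside the 30-minute downtime budget.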

Checkpoint Questions

Which service is the successor to AWS SMS and CloudEndure for re-hosting servers?
What is the primary difference in data collection between the ADS Agentless and Agent-based discovery?
Why is a "Staging Area VPC" used in AWS MGN instead of launching directly into production?
Which migration strategy (from the 7Rs) applies specifically to moving VMs to VMware Cloud on AWS?

Muddy Points & Cross-Refs

MGN vs. DMS: Use MGN for full server migrations (OS + Apps + Data). Use DMS (Database Migration Service) if you are only moving the database and want to change the engine (e.g., SQL Server to Aurora).
Agentless vs. Agent-based: Remember that Agentless discovery is fast but only sees metadata (CPU, RAM, Disk). Agent-based discovery is deep (it sees what software is installed and which network ports are active).
Network Security: The MGN replication agent sends replication data to the replication servers in the staging area over TCP port 1500 and communicates with the MGN service endpoint over TCP port 443 (HTTPS). Ensure firewall rules allow both.

Comparison Tables

Discovery Options

Feature	Agentless (Connector)	Agent-based
Platform Support	VMware vCenter only	Windows & Linux (Physical/Virtual)
Data Depth	Infrastructure Metadata	Process & Network Dependencies
Installation	Single OVA appliance	Every individual server
Use Case	Rapid initial inventory	Detailed dependency mapping

Server Migration Tools

Service	Strategy	Use Case
AWS MGN	Re-host	Default choice for most server migrations
VMC on AWS	Relocate	Rapid move for VMware clusters with no change in hypervisor
AWS App2Container	Re-platform	Converting existing ASP.NET or Java apps into containers

Study Guide (950 words)

Performance Optimization: Caching, Buffering, and Replicas

Applying design patterns to meet performance objectives with caching, buffering, and replicas

This guide covers the essential design patterns for meeting performance objectives in high-scale AWS environments, focusing on reducing latency and managing resource contention.

Learning Objectives

Evaluate the differences between cache-aside and write-through caching patterns.
Design architectures that utilize read replicas to eliminate resource contention between read and write operations.
Implement buffering mechanisms to smooth out traffic spikes and prevent system overload.
Select appropriate AWS services (ElastiCache, DAX, RDS, SQS) based on specific performance requirements.

Key Terms & Glossary

TTL (Time to Live): The duration for which an item is stored in a cache before it is considered expired and deleted.
Cache Hit/Miss: A 'hit' occurs when the requested data is found in the cache; a 'miss' occurs when the data must be fetched from the primary data store.
Read Replica: A copy of a database instance that handles read-only queries, reducing the load on the primary (source) database.
Throttling: The process of limiting the rate of requests a service accepts in order to maintain stability.
Asynchronous Replication: A data-syncing method where the primary database does not wait for the replica to acknowledge receipt of data before proceeding.

The "Big Idea"

Performance optimization is not just about raw speed; it is about resource management. In a high-traffic system, the database is often the primary bottleneck. Design patterns like caching, buffering, and replicas act as "pressure relief valves" that move data closer to the user, distribute the workload across multiple nodes, or decouple the timing of requests from the timing of processing.

Formula / Concept Box

Concept	Metric / Rule	Significance
Cache Hit Ratio	$\text{Hit Ratio} = \frac{\text{Cache Hits}}{\text{Cache Hits} + \text{Cache Misses}}$	Higher ratios indicate a more effective caching strategy.
Sub-millisecond Latency	$< 1\text{ms}$	Required for real-time applications; necessitates in-memory solutions.
Read Contention Rule	$\text{Writes} \uparrow \implies \text{Reads} \downarrow$	High write volume locks tables/rows, slowing down reads.
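
As a quick illustration of the hit-ratio formula in the table, the sketch below computes it from raw counters. The function name and the choice to return 0.0 for an empty cache are assumptions made for this example.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return hits / (hits + misses); 0.0 when no requests have been seen."""
    if hits < 0 or misses < 0:
        raise ValueError("hits and misses must be non-negative")
    total = hits + misses
    if total == 0:
        return 0.0  # assumed convention: no traffic means no hits
    return hits / total


if __name__ == "__main__":
    print(cache_hit_ratio(90, 10))  # 90 hits out of 100 requests
```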

Hierarchical Outline

Caching Strategies
- In-Memory Storage: Using RAM for sub-millisecond access (e.g., Redis, Memcached).
- Cache-Aside (Lazy Loading): Application manages the cache. Data is only loaded on a miss.
- Write-Through: Data is written to the cache and the database simultaneously.
Database Scaling & Replicas
- Vertical Scaling: Increasing CPU/RAM (simple but limited).
- Read Replicas: Offloading read traffic (RDS, Aurora).
- DAX (DynamoDB Accelerator): Integrated cache for DynamoDB.
Buffering & Decoupling
- SQS (Simple Queue Service): Buffer for spikes in write traffic.
- Kinesis/Firehose: Buffering streaming data before ingestion.

Visual Anchors

Caching Logic Flow

Figure 1 — Mermaid diagram

Multi-Layer Performance Architecture

Figure 2 — TikZ diagram

Definition-Example Pairs

Pattern: Cache-Aside
- Definition: The application checks the cache first. If data is missing, it fetches it from the DB and writes it to the cache for future use.
- Example: A news website caching the top story of the day only after the first visitor requests it.
Pattern: Buffering
- Definition: Using a message queue to store incoming requests so the downstream system can process them at its own pace.
- Example: An e-commerce system using SQS to hold order requests during a Black Friday sale to prevent the database from crashing.
Pattern: Read Replicas
- Definition: Creating read-only copies of a database to serve analytics or reporting queries.
- Example: A mobile app where users update their profiles (Primary) but millions of others view those profiles (Replicas).
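
The cache-aside pattern above can be sketched in a few lines. This is a minimal in-process stand-in, assuming a plain dictionary in place of Redis or Memcached; the `CacheAside` class, the injected `load_from_db` callable, and the injectable clock are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Lazy-loading cache: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db        # stand-in for a real database query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1               # cache hit: no database call
            return entry[0]
        # Miss or expired entry: read from the database, then populate the cache.
        self.misses += 1
        value = self._load(key)
        self._store[key] = (value, now + self._ttl)
        return value


if __name__ == "__main__":
    cache = CacheAside(lambda key: f"story-for-{key}", ttl_seconds=60)
    cache.get("top-story")   # first visitor: miss, loads from the "database"
    cache.get("top-story")   # later visitors: served from the cache
    print(cache.hits, cache.misses)
```

The TTL bounds how stale an entry can become, which is the main trade-off of lazy loading compared with write-through.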

Worked Examples

Scenario: The Overloaded Catalog

Problem: An e-commerce catalog page is loading slowly. CloudWatch shows 90% CPU usage on the RDS instance, specifically during read-heavy hours. Writes are steady but low.

Step-by-Step Solution:

Identify the Pattern: The bottleneck is read-contention.
Option A (Read Replica): Create an RDS Read Replica. Point the web application's "GET /catalog" endpoint to the replica endpoint. This offloads the high-CPU reads from the primary instance.
Option B (Caching): Deploy Amazon ElastiCache (Redis). Implement the Cache-Aside pattern for the catalog items.
Result: The database CPU drops to 20%, and the catalog page load time drops from 2 seconds to 50ms (for cached hits).
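
Note that the 50 ms figure applies only to cache hits, so the average page load depends on the hit ratio. The sketch below is a simplified model that assumes a miss costs the original 2 seconds and ignores the small extra cost of checking the cache first; the function name is hypothetical.

```python
def average_latency_ms(hit_ratio: float, hit_ms: float = 50.0, miss_ms: float = 2000.0) -> float:
    """Weighted average latency for a given cache hit ratio (simplified model)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


if __name__ == "__main__":
    for ratio in (0.5, 0.9, 0.99):
        print(f"hit ratio {ratio:.2f}: about {average_latency_ms(ratio):.0f} ms on average")
```

Even at a 90% hit ratio the average stays well above 50 ms in this model, which is why the hit ratio is worth monitoring alongside database CPU.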

Checkpoint Questions

What is the main disadvantage of a Write-Through cache compared to Cache-Aside?
In Amazon RDS, does creating a Read Replica in a Single-AZ environment cause downtime?
Which AWS service provides sub-millisecond response times for DynamoDB?
When should you use a buffer (SQS) instead of a cache (ElastiCache)?

[!TIP] Answers: 1. Increased write latency (must write to two places). 2. It may cause a short I/O suspension. 3. DAX. 4. Use SQS when you need to smooth out spikes in writes or decouple processing; use ElastiCache to speed up reads.

Muddy Points & Cross-Refs

Caching vs. Replicas: Learners often confuse these. Remember: Caching is for speed (in-memory); Replicas are for volume (distribution of database load).
Asynchronous Lag: Read replicas are asynchronous. This means a user might write data to the primary and immediately try to read it from the replica, but the data hasn't arrived yet (Eventual Consistency).
See Also: Well-Architected Framework - Performance Efficiency Pillar.

Comparison Tables

Feature	Read Replicas	Caching (ElastiCache)	Buffering (SQS)
Primary Goal	Offload Reads	Reduce Latency	Decouple/Smooth Spikes
Data Type	Structured (Relational)	Key-Value / Objects	Messages/Tasks
Consistency	Eventual	Depends on Pattern	N/A (Processing order)
Code Change	Low (New endpoint)	Medium (Logic for hits/misses)	High (Async processing)

Study Guide (925 words)

AWS Migration Security: Best Practices & Implementation Guide

Applying the appropriate security methods to migration tools

This guide explores the critical security methods required when utilizing AWS migration tools such as AWS Application Migration Service (MGN), AWS Database Migration Service (DMS), and AWS Storage Gateway. Securing the migration path is essential to ensure data integrity and confidentiality during the transition from on-premises to the cloud.

Learning Objectives

By the end of this guide, you should be able to:

Implement network isolation for migration services using custom-managed VPCs.
Configure private connectivity via AWS PrivateLink and Direct Connect for secure data transfer.
Apply the principle of Least Privilege using IAM roles and attribute-based access control (ABAC).
Enforce multi-factor authentication (MFA) and tagging strategies to govern migration tool access.

Key Terms & Glossary

AWS PrivateLink: A technology that provides private connectivity between VPCs, AWS services, and on-premises applications on the Amazon network.
Least Privilege: The security discipline of granting only the minimum permissions necessary to perform a task.
ABAC (Attribute-Based Access Control): An authorization strategy that defines permissions based on attributes (tags) attached to users and AWS resources.
Interface VPC Endpoint: An elastic network interface with a private IP address from the IP address range of your subnet that serves as an entry point for traffic destined to a supported service.
AWS MGN (Application Migration Service): The primary service used to lift-and-shift applications to AWS with minimal changes.

The "Big Idea"

Security in migration is not just about the final destination; it is about protecting the transit lane. If migration tools are deployed in default VPCs or with overly permissive IAM roles, the data being moved is at risk before it even arrives. A secure migration treats the migration tool itself as a high-security workload, isolating it from the public internet and strictly controlling who (and what) can interact with it.

Formula / Concept Box

Principle	Implementation Method	Goal
Network Isolation	Custom VPC + PrivateLink	Prevent exposure to the public internet.
Identity Governance	IAM Roles + MFA	Ensure only authenticated, authorized actors can trigger migrations.
Resource Control	Tagging + ABAC	Scale security by allowing access based on project/environment tags.
Secure Transport	Direct Connect / VPN	Provide a dedicated private path (Direct Connect, not encrypted by default) or an encrypted tunnel (VPN) for massive data volumes.

Hierarchical Outline

Network Security for Migration
- VPC Placement: Avoid default VPCs; use customer-managed VPCs with specific NACLs.
- Private Connectivity:
  - Use AWS PrivateLink for interface endpoints.
  - Leverage Direct Connect for consistent, private bandwidth.
Identity and Access Management (IAM)
- Least Privilege: Avoid * permissions; use service-specific actions.
- Identity-Based Policies: Use conditions to restrict access based on tags.
- MFA Enforcement: Required for high-privilege migration actions (e.g., deleting replication instances).
Data Protection & Tool Configuration
- AWS DMS: Launch replication instances within private subnets.
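- AWS MGN: Combines block-level replication into a staging area with automated server conversion at launch; keep replication traffic on private connectivity and encrypt the staging EBS volumes with AWS KMS.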

Visual Anchors

Secure Migration Architecture

Figure 1 — Mermaid diagram

IAM Policy Evaluation Logic

Figure 2 — TikZ diagram

Definition-Example Pairs

Interface VPC Endpoint: A private entry point for AWS services without requiring an Internet Gateway.
- Example: Creating an interface endpoint for AWS DMS so that calls to the DMS API from your VPC (or from on-premises over Direct Connect or VPN) stay on private connectivity instead of traversing the public internet.
Attribute-Based Access Control (ABAC): Using tags to grant permissions.
- Example: An IAM policy that allows a user to start an AWS MGN migration only if the target server has the tag Environment: Development.

Worked Examples

Scenario: Securing AWS Storage Gateway with Tag-Based Policies

You need to ensure that only authorized administrators can describe file shares for resources tagged for migration.

Step 1: Tag the Resource. Apply a tag to your Storage Gateway resource: AllowAccess: yes.

Step 2: Create the IAM Policy

json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "storagegateway:ListTagsForResource",
        "storagegateway:ListFileShares",
        "storagegateway:DescribeNFSFileShares"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/AllowAccess": "yes"
        }
      }
    }
  ]
}

Step 3: Verification. If the user attempts to list shares on a gateway tagged AllowAccess: no, the request is implicitly denied. The policy lists storagegateway:ListFileShares with "Resource": "*", but the Allow takes effect only when the Condition matches, and no other statement grants access.
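
The tag check in this scenario can be mimicked with a small sketch. This is a deliberately simplified model of a single Allow statement with one StringEquals condition, not the real IAM evaluation engine (which also handles explicit denies, SCPs, permission boundaries, and more); `is_allowed` and the constants are hypothetical names for illustration.

```python
from typing import Dict

# Actions and condition copied from the policy above.
ALLOWED_ACTIONS = {
    "storagegateway:ListTagsForResource",
    "storagegateway:ListFileShares",
    "storagegateway:DescribeNFSFileShares",
}
REQUIRED_TAG_KEY = "AllowAccess"
REQUIRED_TAG_VALUE = "yes"


def is_allowed(action: str, resource_tags: Dict[str, str]) -> bool:
    """Return True only if the action is listed AND the tag condition matches.

    Anything not explicitly allowed falls through to an implicit deny.
    """
    if action not in ALLOWED_ACTIONS:
        return False
    # StringEquals is an exact, case-sensitive comparison.
    return resource_tags.get(REQUIRED_TAG_KEY) == REQUIRED_TAG_VALUE


if __name__ == "__main__":
    print(is_allowed("storagegateway:ListFileShares", {"AllowAccess": "yes"}))
    print(is_allowed("storagegateway:ListFileShares", {"AllowAccess": "no"}))
```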

Checkpoint Questions

Why should you avoid using the "Default VPC" for AWS DMS replication instances?
What is the benefit of using an Interface VPC Endpoint for AWS MGN compared to an Internet Gateway?
How does MFA enhance the security of the migration process?

Muddy Points & Cross-Refs

VPC Peering vs. PrivateLink: Students often confuse these. Remember: VPC Peering connects two entire networks; PrivateLink exposes a specific service (like a migration tool) privately into your VPC.
Least Privilege Overkill: It is tempting to use AdministratorAccess during a migration because it is a "temporary" project. Do not do this. Use Condition keys to limit the scope to specific migration regions or tags.

Comparison Tables

Feature	Public Internet	AWS Client VPN	AWS Direct Connect
Security Level	Low (Encrypted but exposed)	Medium (Private tunnel)	High (Physical isolation)
Performance	Unpredictable	Variable	Consistent / Dedicated
Cost	Low	Moderate	High
Best Use Case	Small, non-sensitive data	Remote admin access	Large-scale enterprise migration

Study Guide (1,050 words)

Architecting for Resilience: Automated Backups and Business Continuity

Architecting a backup solution that is automated, is cost-effective, and supports business continuity across multiple Availability Zones or AWS Regions

This study guide focuses on designing automated, cost-effective backup solutions that ensure business continuity (BC) across multiple Availability Zones (AZs) and AWS Regions, aligned with the AWS Certified Solutions Architect - Professional (SAP-C02) domain.

Learning Objectives

By the end of this module, you should be able to:

Define and apply Recovery Time Objective (RTO) and Recovery Point Objective (RPO) to architectural decisions.
Compare and contrast the four primary Disaster Recovery (DR) strategies: Backup & Restore, Pilot Light, Warm Standby, and Multi-site Active/Active.
Design automated backup workflows using AWS Backup and Amazon S3.
Implement Infrastructure as Code (IaC) using AWS CloudFormation to ensure consistent multi-region environment replication.
Evaluate when to use Multi-AZ versus Multi-Region architectures based on workload requirements.

Key Terms & Glossary

RTO (Recovery Time Objective): The maximum acceptable delay between the interruption of service and restoration of service. (Example: An RTO of 2 hours means the system must be back up within 2 hours of a failure.)
RPO (Recovery Point Objective): The maximum acceptable amount of data loss measured in time. (Example: An RPO of 15 minutes means you can afford to lose at most 15 minutes of data updates.)
Cross-Region Replication (CRR): An S3 feature that automatically, asynchronously copies objects across buckets in different AWS Regions.
Infrastructure as Code (IaC): Managing and provisioning infrastructure through machine-readable definition files (e.g., CloudFormation) rather than manual hardware configuration.
Zonal vs. Regional Services: Zonal services (like EC2) are tied to a specific AZ; Regional services (like DynamoDB or S3) are managed by AWS across multiple AZs automatically.

The "Big Idea"

Business Continuity is not merely about having a copy of your data; it is about orchestration and automation. In the cloud, reliability is achieved by assuming failure will happen. By using Infrastructure as Code (IaC) to recreate the environment and automated data replication to keep it current, organizations can transition from expensive "idle" hardware to cost-effective, "on-demand" recovery environments.

Formula / Concept Box

Metric/Concept	Definition	Architectural Impact
RTO	"How long to fix it?"	Determines the level of automation and environment readiness (e.g., Pilot Light vs. Warm Standby).
RPO	"How much data loss?"	Determines the frequency and method of data replication (e.g., Snapshot frequency vs. Synchronous replication).
S3 Durability	99.999999999% (11 9's)	Makes S3 the definitive target for backup storage and CRR.

Hierarchical Outline

Foundational Backup Strategy
- Automation First: Use AWS Backup for centralized policy management across RDS, EBS, and DynamoDB.
- S3 as the Backbone: Leverage S3 for high durability and Lifecycle Policies for cost-optimization (transitioning to Glacier).
- Data Security: Implement KMS (Key Management Service) for server-side or client-side encryption of backups.
Disaster Recovery (DR) Patterns
- Backup & Restore: Lower cost, higher RTO/RPO. Manual or scripted restoration.
- Pilot Light: Minimal version of environment always running (Databases/Live data), while App servers are scaled on-demand via IaC.
- Warm Standby: Scaled-down but functional version of the full environment.
- Multi-site Active/Active: Near-zero downtime; traffic split between regions via Route 53 or Global Accelerator.
Cross-Region Continuity
- Identity & Access: Use IAM Roles and cross-account access for isolated recovery environments.
- Global Networking: Use Route 53 routing policies (Latency, Failover, Geoproximity) to manage traffic during regional disruptions.

Visual Anchors

DR Strategy Decision Flow

Figure 1 — Mermaid diagram

Multi-AZ vs. Multi-Region Scope

Figure 2 — TikZ diagram

Definition-Example Pairs

Definition: Pilot Light Strategy — Keeping a minimal version of a workload functional in a second region, primarily the data layer.
- Example: An application has its database replicated to a second region, but the EC2 instances are only provisioned via an Auto Scaling Group triggered by a Route 53 Health Check failure.
Definition: Drift Detection — A CloudFormation feature that identifies when resources have been modified outside of the stack template.
- Example: A developer manually changes a security group rule in the DR region; CloudFormation Drift Detection flags this so the IaC template can re-enforce the standard.

Worked Examples

Scenario: Optimizing Cost for a 4-Hour RTO

Problem: A company currently uses a Multi-site Active/Active setup for a non-critical internal tool. The monthly cost is $5,000. The business determines that a 4-hour RTO is acceptable. How should the architect redesign this for cost-effectiveness?

Step-by-Step Solution:

Analyze RTO: A 4-hour RTO does not require resources to be running in the second region (Warm/Multi-site).
Select Pattern: Transition to Pilot Light or Backup & Restore.
Implement Automation:
- Store all environment definitions in AWS CloudFormation.
- Use AWS Backup to create daily snapshots and copy them to the DR region.
Cost Result: By terminating the idle EC2 and RDS instances in the DR region and relying on S3 storage + on-demand restoration, the monthly cost drops to ~$200 for storage.
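
The "pick the cheapest pattern that still meets the RTO" reasoning can be sketched as a lookup. The RTO thresholds below are illustrative assumptions only (real values depend on the workload and how much of the restore is automated), and `cheapest_strategy` is a hypothetical helper.

```python
from typing import List, Tuple

# (strategy, assumed achievable RTO in minutes), ordered from cheapest to most expensive.
STRATEGIES: List[Tuple[str, float]] = [
    ("Backup & Restore", 240),
    ("Pilot Light", 30),
    ("Warm Standby", 5),
    ("Multi-site Active/Active", 0),
]


def cheapest_strategy(rto_minutes: float) -> str:
    """Return the first (cheapest) strategy whose assumed RTO fits the target."""
    if rto_minutes < 0:
        raise ValueError("rto_minutes must be non-negative")
    for name, achievable_rto in STRATEGIES:
        if achievable_rto <= rto_minutes:
            return name
    return STRATEGIES[-1][0]


if __name__ == "__main__":
    print(cheapest_strategy(4 * 60))  # the 4-hour RTO from the scenario
```

With these assumed thresholds a 4-hour RTO lands on Backup & Restore, consistent with the scenario's conclusion that nothing needs to be running in the DR region.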

Checkpoint Questions

What is the primary difference between a Pilot Light and a Warm Standby strategy?
Which AWS service would you use to centrally manage backup policies across multiple AWS accounts in an Organization?
True or False: Using Infrastructure as Code (IaC) is only beneficial for initial deployment, not for Disaster Recovery.
Why is S3 considered the "backup destination of choice" for AWS services?

Muddy Points & Cross-Refs

Fate Sharing: A common confusion is why Multi-AZ isn't enough. Remember: While Multi-AZ protects against hardware/data center failure, Multi-Region protects against regional service outages or natural disasters.
Cross-Region Data Transfer Costs: Replication is not free. Always account for data transfer out (DTO) costs when architecting multi-region replication.
Deep Dive Reference: For more on automated recovery, see the AWS Well-Architected Framework: Reliability Pillar.

Comparison Tables

Strategy	RTO / RPO	Relative Cost	Complexity
Backup & Restore	Hours / 24h+	Low	Simple
Pilot Light	Tens of minutes / Real-time data	Medium-Low	Moderate
Warm Standby	Minutes / Real-time data	Medium-High	High
Multi-site	Near Zero	Very High	Very High

[!IMPORTANT] Automation (IaC) is the bridge that makes low-cost strategies (Backup & Restore) viable by ensuring that environment restoration is repeatable and fast.

Hands-On Lab (820 words)

Lab: Building a Scalable Hub-and-Spoke Network with AWS Transit Gateway

Architect network connectivity strategies

This hands-on lab guides you through architecting a scalable network using AWS Transit Gateway (TGW). You will connect two separate VPCs (Spoke A and Spoke B) through a central hub to enable transitive routing, a core requirement for the AWS Certified Solutions Architect - Professional exam.

[!WARNING] Remember to run the teardown commands at the end of this lab to avoid ongoing charges for Transit Gateway attachments.

Prerequisites

An active AWS Account.
AWS CLI installed and configured with Administrator access.
Basic knowledge of VPC CIDR blocks and Route Tables.
Region: We will use us-east-1, but you can substitute with your preferred region.

Learning Objectives

Provision a hub-and-spoke network topology using AWS Transit Gateway.
Configure VPC route tables to enable communication across the Transit Gateway.
Verify transitive connectivity between isolated workloads.
Understand the operational and scalability benefits of Transit Gateway over complex VPC peering meshes.

Architecture Overview

We will build a hub-and-spoke model where the Transit Gateway acts as the central router connecting two isolated VPCs.

Figure 1 — Mermaid diagram

Step-by-Step Instructions

Step 1: Create Spoke VPCs

First, we need two VPCs with non-overlapping IP ranges.

bash

# Create Spoke VPC A
aws ec2 create-vpc --cidr-block 10.1.0.0/16 --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=brainybee-spoke-a}]'

# Create Spoke VPC B
aws ec2 create-vpc --cidr-block 10.2.0.0/16 --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=brainybee-spoke-b}]'

Console alternative: Navigate to VPC > Your VPCs > Create VPC. Use Name: brainybee-spoke-a and CIDR: 10.1.0.0/16. Repeat for Spoke B with Name: brainybee-spoke-b and CIDR: 10.2.0.0/16.
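
Because Transit Gateway routing depends on the spokes having non-overlapping ranges, it can help to check a CIDR plan before creating anything. The sketch below uses Python's standard `ipaddress` module; `overlapping_pairs` is a hypothetical helper, not part of the AWS CLI or SDK.

```python
import ipaddress
from itertools import combinations
from typing import List, Tuple


def overlapping_pairs(cidrs: List[str]) -> List[Tuple[str, str]]:
    """Return every pair of CIDR blocks that overlap; an empty list means the plan is clean."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]


if __name__ == "__main__":
    # The two spoke ranges used in this lab.
    print(overlapping_pairs(["10.1.0.0/16", "10.2.0.0/16"]))
```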

Step 2: Provision the Transit Gateway

The Transit Gateway will serve as our regional network hub.

bash

aws ec2 create-transit-gateway --description "Hub for Spoke A and B" --tag-specifications 'ResourceType=transit-gateway,Tags=[{Key=Name,Value=brainybee-tgw}]'

[!TIP] Note the TransitGatewayId from the output; you will need it for the next steps.

Step 3: Attach VPCs to the Transit Gateway

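We must "plug" our VPCs into the hub using TGW Attachments. Each attachment needs at least one subnet in the VPC, so create a subnet in each spoke first (for example with aws ec2 create-subnet) and wait until the Transit Gateway state is available before attaching.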

bash

# Attach Spoke A
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <TGW_ID> \
    --vpc-id <VPC_A_ID> \
    --subnet-ids <SUBNET_A_ID>

# Attach Spoke B
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <TGW_ID> \
    --vpc-id <VPC_B_ID> \
    --subnet-ids <SUBNET_B_ID>

Step 4: Configure VPC Routing

Even with an attachment, instances don't know where to send traffic. We must update the VPC Route Tables to point the CIDR of the other VPC to the Transit Gateway.

bash

# In VPC A's Route Table: Route to VPC B goes to TGW
aws ec2 create-route --route-table-id <RT_A_ID> --destination-cidr-block 10.2.0.0/16 --transit-gateway-id <TGW_ID>

# In VPC B's Route Table: Route to VPC A goes to TGW
aws ec2 create-route --route-table-id <RT_B_ID> --destination-cidr-block 10.1.0.0/16 --transit-gateway-id <TGW_ID>

Checkpoints

Verify TGW State: Run aws ec2 describe-transit-gateways. The state should be available.
Verify Attachments: Run aws ec2 describe-transit-gateway-vpc-attachments. You should see two attachments in the available state.
Ping Test: If you launch EC2 instances in both VPCs (with appropriate Security Groups allowing ICMP), a ping from 10.1.x.x to 10.2.x.x should succeed.

Visualizing the Route Logic

Below is a TikZ diagram representing the packet flow decision for an instance in VPC A trying to reach VPC B.

Figure 2 — TikZ diagram
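
The same route decision can be sketched in code. This is a simplified model of a VPC route table using longest-prefix matching; the route list mirrors what VPC A's table should contain after Step 4, and the target labels ("local", "tgw") are placeholders rather than real resource IDs.

```python
import ipaddress
from typing import List, Optional, Tuple

# VPC A's route table after Step 4: its own CIDR stays local, VPC B's CIDR goes to the TGW.
VPC_A_ROUTES: List[Tuple[str, str]] = [
    ("10.1.0.0/16", "local"),
    ("10.2.0.0/16", "tgw"),
]


def next_hop(routes: List[Tuple[str, str]], destination: str) -> Optional[str]:
    """Return the target of the most specific matching route, or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    best_prefix = -1
    best_target = None
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if dest in network and network.prefixlen > best_prefix:
            best_prefix = network.prefixlen
            best_target = target
    return best_target


if __name__ == "__main__":
    print(next_hop(VPC_A_ROUTES, "10.2.0.15"))  # instance in VPC B
    print(next_hop(VPC_A_ROUTES, "10.1.4.4"))   # instance in VPC A
    print(next_hop(VPC_A_ROUTES, "8.8.8.8"))    # no matching route
```

A destination with no matching route is dropped, which is the situation the spokes are in before Step 4.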

Troubleshooting

Problem	Potential Cause	Fix
Ping Timeout	Security Group / NACL	Ensure SG allows inbound ICMP from the other VPC's CIDR range.
Attachment "Pending"	AWS internal provisioning	Wait 2-3 minutes; TGW attachments take longer than VPC creation.
Route "Blackhole"	Deleted TGW	Ensure the TGW ID in the route table still exists and is attached.

Stretch Challenge

Scenario: You need to provide internet access to both spokes through a single centralized Inspection VPC.

Create a third VPC called Inspection-VPC with an Internet Gateway.
Modify Spoke route tables to point 0.0.0.0/0 to the TGW.
Configure TGW route table to route default traffic to the Inspection-VPC attachment.

Cost Estimate

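Transit Gateway (us-east-1): No separate hourly charge for the gateway itself; billing is per attachment.
TGW VPC Attachment: $0.05 per hour per attachment (Total $0.10/hr for this lab's two attachments).
Data Processing: $0.02 per GB processed by the TGW.
Estimated Total for 1 Hour: ~$0.10 plus data processing (Free-tier does not cover Transit Gateway). Rates vary by Region, so check the current AWS pricing page.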

Concept Review

Feature	VPC Peering	Transit Gateway
Topology	Point-to-Point (Mesh)	Hub-and-Spoke
Transitive Routing	No	Yes
Scalability	Complex at scale (N*(N-1)/2 peerings)	Highly Scalable (up to 5,000 attachments per TGW)
Complexity	High (many peerings)	Low (central management)
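
To see why a full peering mesh becomes hard to manage, the sketch below compares the N*(N-1)/2 peering connections a full mesh needs with the one attachment per VPC that a hub-and-spoke design needs. The function names are hypothetical.

```python
def peering_connections(n_vpcs: int) -> int:
    """Connections needed for a full mesh of n VPCs: N*(N-1)/2."""
    if n_vpcs < 0:
        raise ValueError("n_vpcs must be non-negative")
    return n_vpcs * (n_vpcs - 1) // 2


def tgw_attachments(n_vpcs: int) -> int:
    """Attachments needed when every VPC connects to a single Transit Gateway."""
    if n_vpcs < 0:
        raise ValueError("n_vpcs must be non-negative")
    return n_vpcs


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(f"{n} VPCs: {peering_connections(n)} peerings vs {tgw_attachments(n)} attachments")
```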

Clean-Up / Teardown

To avoid ongoing costs, delete resources in this specific order:

Delete EC2 Instances (if any were created for testing).
Delete VPC Attachments:
bash
aws ec2 delete-transit-gateway-vpc-attachment --transit-gateway-attachment-id <ATTACH_ID_A>
aws ec2 delete-transit-gateway-vpc-attachment --transit-gateway-attachment-id <ATTACH_ID_B>
Delete Transit Gateway:
bash
aws ec2 delete-transit-gateway --transit-gateway-id <TGW_ID>
Delete VPCs:
bash
aws ec2 delete-vpc --vpc-id <VPC_A_ID>
aws ec2 delete-vpc --vpc-id <VPC_B_ID>

Study Guide (980 words)

Mastering AWS Network Connectivity Strategies (SAP-C02)

Architect network connectivity strategies

Learning Objectives

After studying this guide, you should be able to:

Evaluate and select appropriate connectivity options for multiple VPCs (Peering vs. Transit Gateway).
Design resilient hybrid architectures using AWS Direct Connect (DX) and Site-to-Site VPN.
Calculate IPv4 subnet requirements while accounting for AWS-reserved addresses and future growth.
Implement high-availability patterns for DNS resolution and service integration using PrivateLink.
Optimize network performance using Equal Cost Multi-Path (ECMP) and Transit Gateway.

Key Terms & Glossary

Transit Gateway (TGW): A network transit hub that connects VPCs and on-premises networks through a central managed gateway.
Direct Connect (DX): A dedicated, private network connection from a corporate data center to AWS, bypassing the public internet.
AWS PrivateLink: Technology that provides private connectivity between VPCs, AWS services, and on-premises applications without exposing traffic to the public internet.
Route 53 Resolver: A regional service that enables recursive DNS queries between VPCs and on-premises networks in a hybrid cloud environment.
ECMP (Equal Cost Multi-Path): A routing strategy that allows for increased bandwidth by balancing traffic across multiple paths (e.g., multiple VPN tunnels).

The "Big Idea"

In a complex organizational environment, network connectivity is the "nervous system" of the architecture. It is not just about moving bits; it is about creating a future-proof, scalable, and resilient topology that balances performance requirements with cost and operational complexity. Choosing a hub-and-spoke model (Transit Gateway) over a mesh model (VPC Peering) is a "one-way door" decision that dictates how the organization scales for years to come.

Formula / Concept Box

Concept	Rule / Constraint
Subnet Reservations	AWS reserves 5 IP addresses per subnet: the first four and the last (e.g., .0, .1, .2, .3, and .255 in a /24).
VPN Bandwidth	Each Site-to-Site VPN tunnel is limited to 1.25 Gbps.
Scaling VPN	Total Bandwidth = $1.25\text{ Gbps} \times n$ (where $n$ is the number of tunnels using ECMP).
Direct Connect Speed	Available in 1 Gbps, 10 Gbps, or 100 Gbps (Hosted: 50 Mbps to 10 Gbps).

Hierarchical Outline

I. Inter-VPC Connectivity
- VPC Peering: Point-to-point, non-transitive, no bottleneck, lowest cost.
- Transit Gateway (TGW): Hub-and-spoke, supports transitive routing, simplifies management at scale.
II. Hybrid Connectivity
- Site-to-Site VPN: Fast to deploy, encrypted over public internet, 1.25 Gbps limit per tunnel.
- Direct Connect (DX): Consistent performance, high bandwidth, private (not encrypted by default).
- Resiliency Patterns: DX as primary with VPN as cost-effective failover.
III. Service Integration & DNS
- Interface Endpoints (PrivateLink): Private access to AWS services via ENIs in your subnets.
- Route 53 Resolver: Inbound/Outbound endpoints for hybrid DNS resolution.
IV. IP Address Management
- CIDR Planning: Ensure non-overlapping blocks across the organization.
- Expansion: Leave room for Elastic Load Balancers (ELB), RDS, and container services.

Visual Anchors

Transit Gateway Hub-and-Spoke Topology

Figure 1 — Mermaid diagram

Hybrid Connectivity Architecture

Figure 2 — TikZ diagram

Definition-Example Pairs

Transitive Routing: The ability for traffic to pass through a middle-hop to reach a destination.
- Example: If VPC A is connected to a Transit Gateway, and VPC B is also connected, VPC A can reach VPC B through the TGW without a direct peer.
Interface VPC Endpoint: A private entry point to an AWS service using an ENI with a private IP address.
- Example: Allowing an EC2 instance in a private subnet to upload files to an S3 bucket without using an Internet Gateway.
Anycast Routing: A network addressing and routing method in which incoming requests can be routed to a variety of different nodes.
- Example: Route 53 uses Anycast to ensure DNS queries are answered from the closest edge location to the user.

Worked Examples

Example 1: Calculating Usable IPs

Scenario: You create a subnet with a CIDR of 10.0.1.0/28. How many EC2 instances can you launch?

Step 1: Calculate total addresses: $2^{(32-28)} = 2^4 = 16$.
Step 2: Subtract AWS reserved addresses: $16 - 5 = 11$.
Answer: 11 usable IP addresses.
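The same arithmetic can be reproduced for any subnet size. This is a small sketch using Python's standard `ipaddress` module; the function name `usable_ips` is an illustrative choice, and the constant reflects the five reserved addresses stated in the example.

```python
import ipaddress

# AWS reserves 5 addresses in every subnet (as stated in the worked example)
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr):
    """Total addresses in the subnet minus the AWS-reserved ones."""
    total = ipaddress.ip_network(cidr).num_addresses
    return total - AWS_RESERVED_PER_SUBNET


print(usable_ips("10.0.1.0/28"))  # 16 - 5 = 11
print(usable_ips("10.0.2.0/24"))  # 256 - 5 = 251
```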

Example 2: High Bandwidth VPN Failover

Scenario: A company needs 4 Gbps of failover bandwidth for its Direct Connect link. A single VPN tunnel provides at most 1.25 Gbps.

Solution:

Deploy an AWS Transit Gateway.
Establish 4 Site-to-Site VPN connections to the TGW, using dynamic (BGP) routing.
Enable ECMP (Equal Cost Multi-Path) on the TGW.
Traffic is then balanced across the connections. Counting one 1.25 Gbps tunnel per connection, the aggregate is $4 \times 1.25 = 5$ Gbps, which covers the 4 Gbps requirement. Any single flow is still limited to 1.25 Gbps.
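The sizing in this example is a ceiling division. A minimal sketch, assuming one 1.25 Gbps path per VPN connection as the scenario does; the function names are illustrative.

```python
import math

TUNNEL_GBPS = 1.25  # per-tunnel limit given in the scenario


def connections_needed(target_gbps, per_path_gbps=TUNNEL_GBPS):
    """Smallest number of equal-cost paths whose sum meets the target."""
    return math.ceil(target_gbps / per_path_gbps)


def aggregate_gbps(paths, per_path_gbps=TUNNEL_GBPS):
    """Aggregate bandwidth when ECMP spreads flows across all paths."""
    return paths * per_path_gbps


n = connections_needed(4)
print(n, aggregate_gbps(n))  # 4 connections, 5.0 Gbps aggregate
```

Three connections would only reach 3.75 Gbps, which is why the example rounds up to four.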

Checkpoint Questions

What are the 5 specific IP addresses reserved by AWS in every subnet?
Why is Transit Gateway preferred over VPC Peering for large-scale organizations with hundreds of VPCs?
If a workload requires consistent 10 Gbps throughput and low latency, which connectivity option should be selected?
How does AWS PrivateLink improve the security posture of an application?

Muddy Points & Cross-Refs

Transitive Routing (VPC Peering): A common mistake is assuming VPC Peering is transitive. If VPC A peers with B, and B peers with C, A cannot talk to C. You must either peer A with C directly or attach the VPCs to a Transit Gateway.
DX vs. DX Gateway: Remember that Direct Connect is the physical/logical link, while the DX Gateway is the global resource that allows a single DX to connect to VPCs in any AWS region.
Public vs. Private VIFs: A Private Virtual Interface (VIF) is for VPC resources; a Public VIF is for public endpoints like S3 or DynamoDB over Direct Connect.
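The non-transitive behavior of VPC Peering described in the first point can be modeled in a few lines. This is a deliberately simplified sketch: the function names are hypothetical, and the Transit Gateway model assumes every attachment shares one route table with routes propagated.

```python
def can_reach_via_peering(peerings, src, dst):
    """VPC Peering is non-transitive: only a direct peering counts."""
    return frozenset((src, dst)) in {frozenset(p) for p in peerings}


def can_reach_via_tgw(attachments, src, dst):
    """Simplified model: VPCs attached to the same TGW can reach each other
    (assumes a single shared route table with propagated routes)."""
    return src in attachments and dst in attachments


peerings = [("A", "B"), ("B", "C")]

print(can_reach_via_peering(peerings, "A", "B"))       # True: direct peering
print(can_reach_via_peering(peerings, "A", "C"))       # False: B does not forward
print(can_reach_via_tgw({"A", "B", "C"}, "A", "C"))    # True: TGW is transitive
```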

Comparison Tables

VPC Peering vs. Transit Gateway

Feature	VPC Peering	Transit Gateway
Topology	Mesh (Point-to-Point)	Hub-and-Spoke
Management	Difficult at scale	Centralized/Simple
Transitive	No	Yes
Cost	No hourly charge (Data only)	Hourly charge + Data processing
Performance	No aggregate bottleneck	50 Gbps per VPC attachment

Security Groups vs. Network ACLs

Feature	Security Groups	Network ACLs
Level	Instance (ENI)	Subnet
Statefulness	Stateful (Return traffic allowed)	Stateless (Must allow both ways)
Rules	Allow rules only	Allow and Deny rules
Processing	All rules evaluated before a decision	Rules evaluated in ascending number order; first match applies
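The "Rules" and "Processing" rows can be illustrated with a toy evaluator. This is a sketch under simplifying assumptions: rules match on port only, the `Rule` type and function names are hypothetical, and statefulness is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int   # rule number; only meaningful for network ACLs
    port: int
    action: str   # "allow" or "deny"


def nacl_decision(rules, port):
    """Network ACL: evaluate in ascending rule number, first match wins."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.action
    return "deny"  # nothing matched: traffic is denied


def sg_decision(allowed_ports, port):
    """Security group: allow rules only, so any matching rule permits traffic."""
    return "allow" if port in allowed_ports else "deny"


nacl = [Rule(200, 22, "allow"), Rule(100, 22, "deny"), Rule(300, 443, "allow")]

print(nacl_decision(nacl, 22))   # deny: rule 100 matches before rule 200
print(nacl_decision(nacl, 443))  # allow
print(sg_decision({443}, 22))    # deny: no allow rule covers port 22
```

The port 22 case shows why rule numbering matters for network ACLs: the lower-numbered deny shadows the later allow.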

More Study Notes (190)

AWS Rightsizing Strategy & Performance Optimization Guide

Assessing solutions and applying rightsizing based on requirements

945 words

AWS Asset Planning & Workload Migration Study Guide

Asset planning

880 words

Mastering the Principle of Least Privilege: Auditing and Implementation Guide

Auditing an environment for least privilege access

948 words

Automated Monitoring and Remediation Strategies in AWS

Automated monitoring and remediation strategies (for example, AWS Config rules)

985 words

AWS Auto Scaling Policies and Events: Master Study Guide

Auto scaling policies and events

945 words

Mastering AWS Cost Management & Monitoring Tools

AWS cost and usage monitoring tools (for example, AWS Cost Explorer, AWS Trusted Advisor, AWS Pricing Calculator)

890 words

Mastering AWS Cost Management: Monitoring, Analysis, and Optimization Tools

AWS cost and usage monitoring tools (for example, AWS Trusted Advisor, AWS Pricing Calculator, AWS Cost Explorer, AWS Budgets)

985 words

AWS Global Infrastructure: A Foundation for Resilient Architectures

AWS Global Infrastructure

920 words

AWS Global Infrastructure: Architecture for High Availability and Resilience

AWS Global Infrastructure

945 words

AWS Global Infrastructure: Design for Reliability and Performance

AWS Global Infrastructure

945 words

Mastering AWS Global Infrastructure for Resilience and Performance

AWS Global Infrastructure

985 words

AWS Identity and Access Management (IAM) & Identity Center Study Guide

AWS Identity and Access Management (IAM) and AWS IAM Identity Center

1,050 words

AWS Managed Security Services: Shield, WAF, GuardDuty, and Security Hub

AWS managed security services (for example, AWS Shield, AWS WAF, Amazon GuardDuty, AWS Security Hub)

925 words

AWS Managed Service Offerings: Modernization & Efficiency

AWS managed service offerings

1,045 words

AWS Global Networking & Route 53: SAP-C02 Study Guide

AWS networking concepts (for example, Amazon Route 53, routing methods)

985 words

Advanced AWS Networking and Hybrid Connectivity: SAP-C02 Study Guide

AWS networking concepts (for example, Amazon Virtual Private Cloud [Amazon VPC], AWS Direct Connect, AWS VPN, transitive routing, AWS container services)

1,142 words

AWS Networking & DNS: Architecting for Organizational Complexity

AWS networking services and DNS (for example, AWS Direct Connect, AWS Site-to-Site VPN, Amazon Route 53)

925 words

Study Guide: AWS Organizations and AWS Control Tower

AWS Organizations and AWS Control Tower

1,150 words

AWS Purchasing Options: Cost Optimization Strategy Guide

AWS purchasing options (for example, Reserved Instances, Savings Plans, Spot Instances)

945 words

AWS Resource Sharing Across Environments: Study Guide

AWS resource sharing across environments

945 words

AWS Rightsizing Visibility: AWS Compute Optimizer and S3 Storage Lens

AWS rightsizing visibility tools (for example, AWS Compute Optimizer, Amazon Simple Storage Service [Amazon S3] Storage Lens)

1,350 words

AWS Security, Identity, and Compliance: Tools & Governance

AWS security, identity, and compliance tools (for example, AWS CloudTrail, AWS Identity and Access Management Access Analyzer, AWS Security Hub, Amazon Inspector)

1,050 words

Mastering AWS Service Endpoints: A Comprehensive Study Guide

AWS service endpoints

920 words

AWS Storage Services and Replication Strategies: SAP-C02 Study Guide

AWS storage services and replication strategies (for example Amazon S3, Amazon RDS, Amazon ElastiCache)

1,142 words

AWS Storage Services & Hybrid Integration Study Guide

AWS storage services (for example, Amazon EBS, Amazon EFS, Amazon FSx, Amazon S3, AWS Storage Gateway Volume Gateway)

1,150 words

AWS Storage Services Strategy: S3, EFS, EBS, and FSx

AWS storage services (for example, Amazon S3, Amazon EFS)

948 words

AWS Backup Practices & Methods: Comprehensive Study Guide

Backup practices and methods

945 words

Mastering AWS Change Management Processes

Change management processes

920 words

CI/CD Pipelines and Advanced Deployment Strategies

CI/CD pipelines and deployment strategies (for example, blue/green, all-at-once, rolling)

820 words

Mastering Application Migration Assessment: AWS SAP-C02 Study Guide

Completing an application migration assessment

945 words

AWS Compute Services: EC2, Elastic Beanstalk, and Beyond

Compute services (for example, Amazon EC2, AWS Elastic Beanstalk)

1,050 words

AWS Configuration Management & Systems Administration

Configuration management tools (for example, AWS Systems Manager)

1,050 words

AWS Configuration Management: Systems Manager, Config, and OpsWorks

Configuration management tools (for example, AWS Systems Manager)

1,150 words

AWS Database Replication: Mastery Guide for DMS and SCT

Configuring data and database replication

1,145 words

Comprehensive Study Guide: Configuring Disaster Recovery Solutions on AWS

Configuring disaster recovery solutions

1,055 words

AWS Container Services: Comprehensive Study Guide (SAP-C02)

Containers (for example, Amazon ECS, Amazon EKS, AWS Fargate, Amazon ECR)

925 words

AWS Container Services: ECS, EKS, and Fargate Study Guide

Containers (for example, Amazon ECS, Amazon EKS, Fargate)

895 words

CI/CD: Strategies for High-Velocity Software Delivery

Continuous integration and continuous delivery (CI/CD)

945 words

AWS Cost-Conscious Architecture Study Guide

Cost-conscious architecture choices (for example, using Spot Instances, scaling policies, and rightsizing resources)

985 words

Mastering AWS Cost Management: Alerting and Reporting

Cost management, alerting, and reporting

1,055 words

Credential Management Services: Secure Strategies & Implementation

Credential management services

945 words

Data Backup and Restoration: AWS Business Continuity Study Guide

Data backup and restoration

1,152 words

Mastering AWS Database Migration: AWS DMS and AWS SCT

Database migration tools (for example, AWS DMS, AWS SCT)

925 words

Mastering AWS Database Architectures: SAP-C02 Study Guide

Databases (for example, Amazon DynamoDB, Amazon OpenSearch Service, Amazon RDS, self-managed databases on Amazon EC2)

1,120 words

AWS Data Migration: Online and Offline Strategies

Data migration options and tools (for example, AWS DataSync, AWS Transfer Family, AWS Snow Family, Amazon S3 Transfer Acceleration)

920 words

AWS Data Replication Methods & Disaster Recovery Strategy

Data replication methods

945 words

Mastering Data Protection: Classification, Retention, and Compliance

Data retention, data sensitivity, and data regulatory requirements

875 words

Mastering AWS Data Transfer Costs: Architect's Study Guide

Data transfer costs

948 words

AWS Encryption Strategies: Protecting Data at Rest and in Transit

Deploying encryption strategies for data at rest and data in transit

1,145 words

AWS Lab: Implementing Blue/Green Deployments with CloudFormation and Route 53

Design a deployment strategy to meet business requirements

895 words

Mastering AWS Deployment Strategies: SAP-C02 Study Guide

Design a deployment strategy to meet business requirements

925 words

Lab: Architecting a Secure Multi-Account Environment with AWS Organizations

Design a multi-account AWS environment

890 words

Mastering Multi-Account AWS Architecture: SAP-C02 Study Guide

Design a multi-account AWS environment

945 words

AWS SAP-C02: Designing for Business Continuity

Design a solution to ensure business continuity

920 words

Lab: Implementing a Pilot Light Disaster Recovery Strategy on AWS

Design a solution to ensure business continuity

945 words

Lab: Designing for Performance with Auto Scaling and ElastiCache

Design a solution to meet performance objectives

820 words

Mastering Performance: Designing High-Efficiency AWS Architectures

Design a solution to meet performance objectives

1,150 words

AWS Certified Solutions Architect - Professional: Designing for Reliability

Design a strategy to meet reliability requirements

920 words

Lab: Designing and Testing a Reliable Multi-AZ Web Architecture

Design a strategy to meet reliability requirements

1,054 words

Resilience and Availability: Designing for Disruption in AWS

Designing an architecture that provides application and infrastructure availability in the event of a disruption

1,150 words

Comprehensive Guide to Designing and Implementing a Backup Process

Designing and implementing a backup process

1,150 words

Study Guide: Designing and Implementing a Patch and Update Process

Designing and implementing a patch and update process

865 words

Designing an Effective Backup and Restoration Strategy

Designing an effective backup and restoration strategy

895 words

Designing Elastic & Performance-Optimized Architectures for Business Objectives

Designing an elastic architecture based on business objectives

925 words

Designing a Rightsizing Strategy: AWS Cost Optimization Study Guide

Designing a rightsizing strategy

860 words

AWS Study Guide: Designing Billing Alarms and Usage Monitoring

Designing billing alarms based on expected usage patterns

825 words

Mastering Disaster Recovery: RTO and RPO Strategy Guide

Designing disaster recovery solutions based on RTO and RPO requirements

920 words

Designing Highly Available Application Environments

Designing highly available application environments based on business requirements

1,050 words

Mastering Large-Scale Application Architectures: Performance and Scalability (SAP-C02)

Designing large-scale application architectures for a variety of access patterns

1,240 words

Design Reliable and Resilient Architectures (SAP-C02)

Design reliable and resilient architectures

850 words

Lab: Building a Resilient Multi-AZ Architecture on AWS

Design reliable and resilient architectures

925 words

Lab: Implementing AWS Cost Optimization and Governance

Determine a cost optimization strategy to meet solution goals and objectives

950 words

Mastering Cost Optimization: AWS Solutions Architect Professional (SAP-C02)

Determine a cost optimization strategy to meet solution goals and objectives

1,142 words

Architectural Design for Existing Workloads (SAP-C02)

Determine a new architecture for existing workloads

945 words

Lab: Re-architecting Legacy Workloads to AWS Managed Services

Determine a new architecture for existing workloads

850 words

Lab: Building Self-Healing Infrastructure for Operational Excellence

Determine a strategy to improve overall operational excellence

920 words

Mastering Operational Excellence: AWS SAP-C02 Study Guide

Determine a strategy to improve overall operational excellence

1,050 words

Lab: Implementing a High-Performance Auto-Scaling Architecture on AWS

Determine a strategy to improve performance

950 words

Optimizing Performance for Existing Solutions (SAP-C02)

Determine a strategy to improve performance

1,150 words

AWS Lab: Implementing Reliable Architectures with Auto Scaling and Load Balancing

Determine a strategy to improve reliability

925 words

Continuous Improvement: Strategies for Improving Reliability

Determine a strategy to improve reliability

850 words

Continuous Security Improvement: Strategies & Automation (SAP-C02)

Determine a strategy to improve security

1,085 words

Lab: Implementing Automated Security Remediation and Secrets Management

Determine a strategy to improve security

925 words

Lab: Implementing AWS Cost Visibility and Governance

Determine cost optimization and visibility strategies

865 words

Mastering AWS Cost Optimization and Visibility (SAP-C02)

Determine cost optimization and visibility strategies

925 words

AWS Modernization and Enhancements: Decoupling and Microservices

Determine opportunities for modernization and enhancements

920 words

Lab: Modernizing Monolithic Workloads using Serverless Decoupling

Determine opportunities for modernization and enhancements

920 words

AWS Certified Solutions Architect - Professional: Determining Security Controls

Determine security controls based on requirements

1,100 words

Lab: Implementing Least Privilege and Private Connectivity on AWS

Determine security controls based on requirements

1,150 words

AWS Migration Strategy Guide: Determining the Optimal Migration Approach

Determine the optimal migration approach for existing workloads

870 words

Selecting the Optimal Migration Path: AWS Migration Hub & Assessment Lab

Determine the optimal migration approach for existing workloads

1,342 words

Modernization and Upgrade Paths for AWS Workloads

Determining an application or upgrade path for new services and features

920 words

AWS Certified Solutions Architect Professional: Logging and Monitoring Strategy

Determining the most appropriate logging and monitoring strategy

925 words

Mastering Multi-Account Governance on AWS

Developing a multi-account governance model

985 words

AWS Tagging Strategy: Mapping Costs to Business Units

Developing an effective tagging strategy that maps costs to business units

820 words

Methodology for Selecting Purpose-Built AWS Services: A Strategic Study Guide

Developing a process methodology for selecting purpose-built services for required tasks

1,085 words

AWS Expenditure & Usage Awareness Strategy

Developing a strategy and implementing controls for expenditure and usage awareness

945 words

Strategic Centralization: Security Event Notifications and Auditing in AWS

Developing a strategy for centralized security event notifications and auditing

940 words

Comprehensive Attack Mitigation Strategies for Large-Scale Web Applications

Developing attack mitigation strategies for large-scale web applications

860 words

AWS Encryption Strategies: Protecting Data at Rest and in Transit

Developing encryption strategies for data at rest and data in transit

1,342 words

AWS Patch Management & Compliance Strategies

Developing strategies for patch management to remain compliant with organizational standards

985 words

Scalability Strategies: Mastering Scale-Up vs. Scale-Out for Optimal AWS Architecture

Developing the optimal architecture by considering scale-up and scale-out options

1,085 words

Mastering Disaster Recovery on AWS: Methods, Tools, and Strategies

Disaster recovery methods and tools

1,450 words

Mastering Disaster Recovery Planning: AWS SAP-C02 Study Guide

Disaster recovery planning

912 words

AWS Disaster Recovery Strategies: A Comprehensive Study Guide

Disaster recovery scenarios (for example, backup and restore, pilot light, warm standby, multi-site)

1,054 words

AWS Disaster Recovery and Business Continuity

Disaster recovery solutions on AWS

925 words

AWS Disaster Recovery: Architecting for Business Continuity

Disaster recovery strategies (for example, using AWS Elastic Disaster Recovery, pilot light, warm standby, and multi-site)

920 words

AWS Remediation Techniques and Automated Response Strategies

Employing remediation techniques

950 words

Architectural Resilience: Data Replication, Self-Healing, and Elasticity

Enabling data replication, self-healing, and elastic features and services

980 words

Mastering AWS Encryption and Certificate Management (SAP-C02)

Encryption keys and certificate management (for example, AWS Key Management Service [AWS KMS], AWS Certificate Manager [ACM])

980 words

Mastering AWS Data Encryption: At Rest and In Transit

Encryption options for data at rest and data in transit

860 words

Engineering Failure Scenarios and Recovery Exercises

Engineering failure scenario activities to support and exercise an understanding of recovery actions

925 words

AWS Migration Strategies: The 7Rs Master Study Guide

Evaluating applications according to the seven common migration strategies (7Rs)

1,150 words

Evaluating Strategies for Secure Secrets and Credentials Management

Evaluating a strategy for the secure management of secrets and credentials

890 words

Study Guide: Evaluating Connectivity Options for Multiple VPCs

Evaluating connectivity options for multiple VPCs

865 words

Hybrid Connectivity Strategies: On-Premises to AWS Integration

Evaluating connectivity options for on-premises, co-location, and cloud integration

1,185 words

Evaluating Cross-Account Access Management: SAP-C02 Study Guide

Evaluating cross-account access management

920 words

Optimizing Deployment Processes for Operational Excellence

Evaluating current deployment processes for improvement opportunities

945 words

Architectural Reliability Evaluation and Improvement

Evaluating existing architecture to determine areas that are not sufficiently reliable

1,056 words

AWS Multi-Account Governance: Evaluating and Implementing Organizational Structures

Evaluating the most appropriate account structure for organizational requirements

1,054 words

Evaluating Total Cost of Ownership (TCO) and Cost Optimization

Evaluating total cost of ownership (TCO)

1,250 words

AWS Global Services Study Guide: CloudFront, Global Accelerator, & Edge Computing

Global service offerings (for example, AWS Global Accelerator, Amazon CloudFront, edge computing services)

945 words

Governance at Scale: AWS Organizations and Control Tower

Governance tools (for example, AWS Control Tower, AWS Organizations)

890 words

Mastering High Availability and Resiliency on AWS

High availability and resiliency

945 words

High-Performing Systems Architectures: Elasticity, Fleets, and Placement Groups

High-performing systems architectures (for example, auto scaling, instance fleets, placement groups)

925 words

Architecting Hybrid DNS: Route 53 Resolver and On-Premises Integration

Hybrid DNS concepts (for example, Amazon Route 53 Resolver, on-premises DNS integration)

1,152 words

Deep Dive: AWS Identity and Access Management (IAM)

IAM

945 words

AWS SAP-C02 Study Guide: Identifying and Examining Performance Bottlenecks

Identifying and examining performance bottlenecks

945 words

Mastering AWS Pricing Models: A Comprehensive SAP-C02 Study Guide

Identifying appropriate pricing models

1,250 words

Identifying Opportunities for Purpose-Built Databases: A Modernization Guide

Identifying opportunities for purpose-built databases

1,084 words

Identifying Opportunities for Serverless Solutions: Study Guide

Identifying opportunities for serverless solutions

875 words

Identifying Opportunities to Decouple Application Components

Identifying opportunities to decouple application components

892 words

Optimizing Infrastructure: Selection & Rightsizing for Cost-Efficiency

Identifying opportunities to select and rightsize infrastructure for cost-effective resources

1,084 words

AWS Lab: Identifying and Implementing Cost Optimization Opportunities

Identify opportunities for cost optimizations

890 words

Mastering Cost Optimization: Strategies for the AWS Solutions Architect Professional

Identify opportunities for cost optimizations

940 words

Mastering AWS Identity Services: IAM Identity Center & Directory Service

Identity services (for example, AWS IAM Identity Center, AWS Directory Service)

985 words

Architectural Resiliency: Automatically Recovering from Failure

Implementing architectures to automatically recover from failure

1,085 words

Amazon Route 53 Routing Policies: A Solutions Architect's Guide

Implementing DNS routing policies (for example, Route 53 latency-based routing, geolocation routing, simple routing)

1,150 words

Mastering Loosely Coupled Dependencies for AWS Architecting

Implementing loosely coupled dependencies

925 words

Study Guide: Infrastructure as Code (IaC) and AWS CloudFormation

Infrastructure as code (IaC) (for example, AWS CloudFormation)

890 words

Mastering AWS EC2: Instance Families, Sizing, and Optimization

Instance families and use cases

965 words

Mastering Identity Federation: Integrating Third-Party IdPs with AWS

Integrating with third-party identity providers

940 words

AWS Application Integration Services: Decoupling and Orchestration

Integration services (for example, Amazon SQS, Amazon SNS, Amazon EventBridge, AWS Step Functions)

1,150 words

Mastering AWS Cost and Usage Reports (CUR) for Granular Analysis

Investigating AWS Cost and Usage Reports at a granular level

920 words

Modernizing with AWS: Delegating Development and Deployment Tasks

Making advanced technologies accessible by delegating complex development and deployment tasks to AWS

1,050 words

AWS Migration Assessment and Tracking: Mastering AWS Migration Hub

Migration assessment and tracking tools (for example, AWS Migration Hub)

925 words

AWS Monitoring and Logging Solutions: Comprehensive Study Guide

Monitoring and logging solutions (for example, Amazon CloudWatch)

925 words

Mastering AWS Cost and Usage Monitoring

Monitoring cost and usage with AWS tools

925 words

Mastering AWS Monitoring: CloudWatch and Beyond (SAP-C02 Study Guide)

Monitoring tool sets and services (for example, CloudWatch)

945 words

AWS Multi-Account Event Notifications: Architecting Centralized Observability

Multi-account event notifications

945 words

Comprehensive Study Guide: Multi-AZ and Multi-Region Architectures

Multi-AZ and multi-Region architectures

985 words

AWS Networking & Data Transfer Cost Optimization Study Guide

Networking and data transfer costs

925 words

AWS Network Segmentation and Connectivity: Architect's Study Guide

Network segmentation (for example, subnetting, IP addressing, connectivity among VPCs)

1,084 words

Mastering Network Traffic Monitoring on AWS

Network traffic monitoring

940 words

Operating and Maintaining High-Availability Architectures

Operating and maintaining high-availability architectures (for example, application failovers, database failovers)

1,050 words

Mastering Patching Practices in AWS: Strategies for Mutable and Immutable Infrastructure

Patching practices

925 words

AWS Performance Monitoring & Objectives Study Guide

Performance monitoring technologies

890 words

AWS Data Transfer Modeling and Cost Optimization

Performing data transfer modeling and selecting services to reduce data transfer costs

1,056 words

Mastering Disaster Recovery Testing: Strategy and Execution

Performing disaster recovery testing

945 words

AWS Portfolio Assessment & Migration Strategy

Portfolio assessment

820 words

Lab: Automated Remediation of Security Controls with AWS Config

Prescribe security controls

920 words

Mastering Security Controls: AWS SAP-C02 Study Guide

Prescribe security controls

1,250 words

AWS Price Model Adoptions: Reserved Instances & Savings Plans

Price model adoptions (for example, Reserved Instances, AWS Savings Plans)

925 words

Comprehensive Study Guide: AWS Pricing Models and Cost Optimization

Pricing models (for example, Reserved Instances, AWS Savings Plans)

920 words

Study Guide: Principle of Least Privilege (PoLP) in AWS

Principle of least privilege access

1,084 words

Prioritization and Migration: Wave Planning and Portfolio Assessment

Prioritization and migration of workloads (for example, wave planning)

920 words

Mastering Automated Vulnerability Response in AWS

Prioritizing automated responses to the detection of vulnerabilities

915 words

Prioritizing Automation in the AWS Solution Stack

Prioritizing opportunities for automation within a solution stack

1,050 words

Modernizing AWS Architectures: Adopting New Technologies and Managed Services

Proposing opportunities for the adoption of new technologies and managed services

985 words

Comprehensive Study Guide: Purpose-Built AWS Databases

Purpose-built databases

875 words

AWS Purpose-Built Databases: Architectural Selection and Modernization

Purpose-built databases (for example, DynamoDB, Amazon Aurora Serverless, Amazon ElastiCache)

948 words

AWS Strategy: Central Logging and Event Notifications

Recommending a strategy for central logging and event notifications

945 words

AWS Configuration Management & Automation Study Guide

Recommending the appropriate AWS solution to enable configuration management automation

985 words

Mastering Disaster Recovery Metrics: RTO and RPO

Recovery time objectives (RTOs) and recovery point objectives (RPOs)

920 words

Remediating Single Points of Failure: Architectural Strategies

Remediating single points of failure

985 words

Comprehensive Traceability of Users and Services

Reviewing comprehensive traceability of users and services

1,285 words

Reviewing Multi-Layered Security Solutions in AWS

Reviewing implemented solutions to ensure security at every layer

940 words

Mastering AWS Network Security: Route Tables, Security Groups, and NACLs

Route tables, security groups, and network ACLs

875 words

Mastering AWS Network Security: Route Tables, Security Groups, and NACLs

Route tables, security groups, and network ACLs

1,150 words

Mastering Disaster Recovery: Understanding RTO and RPO

RTOs and RPOs

945 words

AWS Scaling Methodologies: Load Balancing & Auto Scaling

Scaling methodologies (for example, load balancing, auto scaling)

985 words

AWS Secrets Management: Systems Manager & Secrets Manager

Secrets management (for example, Systems Manager, AWS Secrets Manager)

920 words

Mastering Security-Specific AWS Solutions: A Professional Study Guide

Security-specific AWS solutions

890 words

Lab: Assessing and Prioritizing Workloads with AWS Migration Hub

Select existing workloads and processes for potential migration

845 words

SAP-C02 Study Guide: Selecting Workloads for Migration

Select existing workloads and processes for potential migration

1,050 words

AWS Infrastructure Design: Region and AZ Selection for Performance

Selecting AWS Regions and Availability Zones based on network and latency requirements

1,180 words

AWS Advanced Deployment Strategies and Rollback Mechanisms

Selecting services to develop deployment strategies and implement appropriate rollback mechanisms

890 words

Selecting the Appropriate AWS Application Integration Service

Selecting the appropriate application integration service

1,180 words

Study Guide: Selecting Appropriate Application Transfer Mechanisms

Selecting the appropriate application transfer mechanism

980 words

AWS Compute Selection: Migration & Modernization Guide

Selecting the appropriate compute platform

940 words

Showing 200 of 230 study notes.


AWS Certified Solutions Architect - Professional (SAP-C02) Practice Questions

Try 15 sample questions from a bank of 1,035. Answers and detailed explanations included.

Q1 (medium)

A solutions architect is designing a strategy to improve security by automating the remediation of non-compliant resources. Which of the following best explains the mechanism AWS Config uses to execute these automated remediations?

A. AWS Config directly calls the resource's service API to revert any unauthorized changes to the last recorded 'Compliant' state in the configuration history.

B. AWS Config triggers an associated AWS Systems Manager (SSM) Automation runbook, which executes predefined actions to bring the resource back into compliance.

C. AWS Config identifies the specific IAM user who made the change and uses AWS CloudTrail to automatically deny their future 'Write' permissions for that resource.

D. AWS Config sends a mandatory approval request to AWS Security Hub, which must be authorized by an administrator before any configuration changes are applied.

Correct Answer: B

AWS Config provides a remediation feature that allows you to associate an AWS Systems Manager (SSM) Automation runbook with a Config rule. When a resource is flagged as non-compliant, AWS Config can trigger the SSM runbook (either manually or automatically) to perform corrective actions, such as encrypting an unencrypted disk or disabling public access on an S3 bucket.

Q2 (medium)

AWS Systems Manager Run Command is a key feature for automating administrative tasks across a fleet of managed nodes. Which of the following best explains how Run Command facilitates this automation at scale while minimizing operational risk?

A. It requires the manual opening of inbound ports 22 (SSH) or 3389 (RDP) on all target instances to allow the Systems Manager service to push and execute administrative scripts.

B. It utilizes the SSM Agent on managed nodes to execute commands without inbound ports, and allows for the definition of concurrency levels and error thresholds to control fleet-wide rollouts.

C. It automatically creates an AWS Lambda function for each target node that uses the AWS SDK to log in and execute a set of hardcoded configuration changes.

D. It uses Amazon Inspector to first scan each node for vulnerabilities and then automatically applies patches using a centralized AWS CloudFormation stack for the entire fleet.

Correct Answer: B

AWS Systems Manager Run Command is designed for secure, fleet-wide automation. It eliminates the need to maintain open inbound ports (such as 22 for SSH or 3389 for RDP) because it communicates through the SSM Agent already installed on the managed nodes. To manage risk during automation at scale, Run Command provides 'concurrency control' (which limits the number of nodes processing the command simultaneously) and 'error thresholds' (which halt the operation if a certain number or percentage of nodes fail to execute the command).
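The concurrency and error-threshold controls map to the `MaxConcurrency` and `MaxErrors` parameters of the SSM `SendCommand` API. The sketch below only builds the request as a plain dictionary so it runs without AWS access; the helper name, the tag key and value, the shell command, and the percentage defaults are all illustrative assumptions.

```python
def build_run_command_request(tag_key, tag_value, commands,
                              max_concurrency="10%", max_errors="5%"):
    """Assemble keyword arguments for the SSM SendCommand API call."""
    return {
        "DocumentName": "AWS-RunShellScript",
        # Target nodes by tag instead of listing instance IDs
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(commands)},
        # At most this share of targets runs the command at once
        "MaxConcurrency": max_concurrency,
        # Stop sending to new targets once this share has failed
        "MaxErrors": max_errors,
    }


# Hypothetical tag and command, for illustration only
request = build_run_command_request("Environment", "staging", ["uname -r"])
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("ssm").send_command(**request)
```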

Q3 (medium)

A retail company currently hosts its legacy web application on a fleet of x86-based Amazon EC2 instances and a self-managed relational database on a separate EC2 instance. The IT team reports that they are spending over 20 hours a week on OS patching, manual backups, and scaling the infrastructure for holiday sales. The CTO wants a proposal that adopts managed services and new technologies to reduce this operational burden and improve performance with minimal application refactoring. Which proposal should the Solutions Architect present?

A. Re-architect the application into a serverless model using AWS Lambda and migrate the data to Amazon DynamoDB.

B. Replatform the application to AWS Elastic Beanstalk using AWS Graviton-based instances and migrate the database to Amazon Aurora.

C. Implement AWS Systems Manager to automate patching and use AWS Backup to schedule snapshots for the existing EC2 instances.

D. Scale the existing x86-based EC2 instances vertically and implement an Amazon CloudFront distribution to cache static content.

Correct Answer: B

To meet the goal of reducing operational overhead (managed services) and improving performance (new technologies) with minimal refactoring, Option B is the best choice. AWS Elastic Beanstalk and Amazon Aurora are managed services that handle infrastructure provisioning, patching, and backups. Adopting AWS Graviton-based instances (C6g/M6g/R6g) introduces a new technology that provides better price-performance over traditional x86 instances. Replatforming to Elastic Beanstalk/Aurora typically requires much less code change than a full serverless refactor (Option A). While Option C improves automation, it does not shift the application to a higher-level managed hosting service. Option D addresses performance but does not reduce the operational burden of managing the underlying servers. Answer: B

Q4 (easy)

In disaster recovery planning, what is the primary purpose of the Recovery Point Objective (RPO)?

A. To define the maximum acceptable duration of downtime before a service is restored.

B. To specify the maximum acceptable amount of data loss measured in time.

C. To calculate the total financial cost associated with a system failure.

D. To determine the number of personnel required to perform a manual failover.

Correct Answer: B

The Recovery Point Objective (RPO) defines the amount of data a workload is allowed to lose in the event of a disaster, typically expressed as a measure of time (e.g., 15 minutes or 1 hour). It determines how far back in time data must be recoverable from the latest backup or replica. Option A describes the Recovery Time Objective (RTO), which focuses on downtime rather than data loss. Answer: B

Q5 (medium)

A security architect is designing an automated compliance system to ensure that all Amazon S3 buckets within a production environment remain private. The system must automatically detect when a bucket is made public, remediate the configuration immediately, and notify the Security Operations Center (SOC). Which strategy best applies automated policy enforcement to meet these requirements?

A. Configure an AWS Config rule to monitor S3 bucket configurations, associate an AWS Systems Manager (SSM) Automation runbook as a remediation action, and use Amazon SNS for alerting.

B. Set up Amazon GuardDuty to monitor S3 data plane events and trigger an Amazon EventBridge rule that executes an AWS Lambda function to delete the non-compliant S3 bucket.

C. Implement an AWS Organizations service control policy (SCP) at the organization level that denies the s3:PutBucketPublicAccessBlock permission for all IAM users and roles.

D. Enable AWS CloudTrail to log all S3 API calls and create a scheduled Amazon EventBridge rule that triggers a Lambda function to scan and fix all buckets every 24 hours.

Correct Answer: A

To implement automated policy enforcement, the most effective pattern involves using AWS Config to continuously monitor resource configurations against defined rules (e.g., s3-bucket-public-read-prohibited). When a resource is found to be non-compliant, AWS Config can automatically trigger a remediation action via AWS Systems Manager (SSM) Automation runbooks, such as AWS-DisableS3BucketPublicReadWrite. This ensures the resource is corrected near real-time. Notifications can be integrated via Amazon SNS or EventBridge to alert the SOC. Answer: A
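
As a rough illustration of how the rule and the runbook are tied together, the sketch below builds the kind of remediation configuration that associates a Config rule with an SSM Automation document. It only constructs a dictionary and makes no AWS calls. The field names follow the author's reading of the AWS Config `PutRemediationConfigurations` request and should be checked against the current API reference; the role ARN and the retry values are hypothetical.

```python
import json


def build_remediation_config(rule_name, automation_role_arn):
    """Return a remediation configuration linking a Config rule to an SSM runbook.

    Field names are assumed from the PutRemediationConfigurations request shape;
    verify them against the API reference before use.
    """
    return {
        "ConfigRuleName": rule_name,
        "TargetType": "SSM_DOCUMENT",
        "TargetId": "AWS-DisableS3BucketPublicReadWrite",
        "Parameters": {
            # Role the Automation runbook assumes (hypothetical ARN supplied by caller).
            "AutomationAssumeRole": {"StaticValue": {"Values": [automation_role_arn]}},
            # RESOURCE_ID is substituted with the non-compliant bucket at run time.
            "S3BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
        },
        "Automatic": True,              # remediate without manual approval
        "MaximumAutomaticAttempts": 3,  # illustrative retry settings
        "RetryAttemptSeconds": 60,
    }


config = build_remediation_config(
    "s3-bucket-public-read-prohibited",
    "arn:aws:iam::111122223333:role/ExampleRemediationRole",  # hypothetical
)
print(json.dumps(config, indent=2))
```

Setting `Automatic` to `False` instead would leave the remediation as a manual action for an operator to trigger after review.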

Q6 (medium)

A retail company operates a legacy monolithic application that processes inventory updates from suppliers via CSV files. These updates occur at irregular intervals; sometimes the system receives 100 files in an hour, while at other times it receives none for several days. Each file takes approximately 30 seconds to process. The company wants to modernize this specific process to minimize operational overhead and ensure they only pay for the compute time used. Which architectural change provides the best opportunity to leverage a serverless solution?

A. Refactor the processing logic into an AWS Lambda function triggered by Amazon S3 'Object Created' events.

B. Rehost the monolithic application on Amazon EC2 instances within an Auto Scaling group using a 'Scheduled Scaling' policy.

C. Deploy the inventory processing service as a Docker container on Amazon ECS using an Amazon EC2 launch type.

D. Migrate the processing logic to a dedicated Amazon EC2 instance using a 'Burstable Performance' instance family like T3.

Correct Answer: A

1. Analyze the Workload: The processing occurs at irregular intervals (100 files vs. zero for days), which is a classic 'bursty' workload.
2. Evaluate Serverless Benefits: AWS Lambda is event-driven and scales quickly to handle spikes while costing nothing during idle time.
3. Compare Alternatives: EC2 and ECS (with EC2 launch type) require provisioning servers that would sit idle for days, increasing costs and management overhead.
4. Select Optimal Solution: Refactoring the logic to Lambda triggered by S3 events is the most effective application of serverless modernization.

Answer: A
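
A minimal sketch of what such a function might look like is shown below. It assumes the standard S3 event notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`). The `fetch_object` parameter is a hypothetical hook standing in for an S3 `GetObject` call so the example runs without AWS access, and the CSV columns are invented for illustration.

```python
import csv
import io
from urllib.parse import unquote_plus


def parse_inventory_csv(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def handler(event, context, fetch_object=None):
    """Process each uploaded CSV referenced in an S3 'Object Created' event.

    fetch_object(bucket, key) -> str is a hypothetical hook; in a real function
    it would wrap an S3 GetObject call.
    """
    if fetch_object is None:
        raise RuntimeError("supply fetch_object, e.g. a wrapper around S3 GetObject")
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        rows = parse_inventory_csv(fetch_object(bucket, key))
        processed.append({"bucket": bucket, "key": key, "rows": len(rows)})
    return processed


# Local usage with a fake event and an in-memory stand-in for S3.
fake_event = {"Records": [{"s3": {"bucket": {"name": "example-inventory"},
                                  "object": {"key": "supplier+a/inventory.csv"}}}]}
fake_store = {("example-inventory", "supplier a/inventory.csv"): "sku,qty\nA1,5\nB2,7\n"}
summary = handler(fake_event, None, fetch_object=lambda b, k: fake_store[(b, k)])
print(summary)
```

Each uploaded file results in one short invocation, so nothing runs (and nothing is billed for compute) during the days when no files arrive.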

Q7 (easy)

Which of the following best describes the primary purpose of AWS DataSync?

A. To provide physical hardware for shipping petabytes of data offline to AWS data centers.

B. To automate and accelerate the online transfer of data between on-premises storage systems and AWS services.

C. To create a managed endpoint for external users to upload files via SFTP, FTPS, or FTP.

D. To provide a local cache for on-premises applications to access data stored in Amazon S3 with low latency.

Correct Answer: B

AWS DataSync is a versatile online data transfer service designed to simplify, automate, and accelerate the movement of data between on-premises storage (such as NFS, SMB, or HDFS) and AWS storage services like Amazon S3, Amazon EFS, and Amazon FSx. It handles tasks like encryption, data validation, and bandwidth management. Answer: B

Q8 (hard)

A global fintech company is designing a disaster recovery (DR) strategy for its core transaction processing engine. The business requirements specify a Recovery Time Objective (RTO) of less than 10 seconds and a Recovery Point Objective (RPO) of less than 1 second. The application must serve traffic from two AWS Regions simultaneously to provide low latency and high availability. Which architecture meets these requirements while ensuring that a regional outage does not rely on client-side DNS cache expiration for failover?

A. Route 53 with Latency-based routing, Amazon Aurora Global Database, and Application Load Balancers (ALBs) in both regions.

B. Route 53 with Geoproximity routing, Amazon DynamoDB Global Tables, and Amazon EC2 instances in a Warm Standby configuration.

C. AWS Global Accelerator, Amazon DynamoDB Global Tables, and fully scaled-up Auto Scaling Groups (ASGs) in both regions.

D. AWS Global Accelerator, Amazon RDS Multi-Region Read Replicas, and Amazon ECS tasks with a Pilot Light configuration in the secondary region.

Correct Answer: C

A Multi-site Active-Active architecture requires fully functional and scaled-up environments in both regions. For an RTO of less than 10 seconds, DNS-based failover (Route 53) is often insufficient because client-side DNS caching and TTL values can delay traffic redirection. AWS Global Accelerator provides a faster failover mechanism by using static Anycast IP addresses and routing traffic over the AWS backbone network, bypassing DNS limitations. Amazon DynamoDB Global Tables provide an active-active, multi-master database with sub-second replication, meeting the RPO requirement of less than 1 second and providing near-zero RTO. Amazon Aurora Global Database (Option A) typically requires cluster promotion during failover, which can take up to 1 minute, exceeding the 10-second RTO limit. Warm Standby (Option B) and Pilot Light (Option D) are not active-active architectures and do not have the secondary compute resources fully scaled to handle immediate production traffic. Answer: C

Q9 (medium)

An IT administrator is reviewing a resource usage report for two production EC2 instances over a 30-day period to assess cost efficiency and performance. The report reveals the following average utilization data:

Metric	Instance A	Instance B
Average CPU Utilization	8%	92%
Peak CPU Utilization	12%	100%
Memory Utilization	15%	88%

Which of the following best explains the identification of these resources and the most appropriate right-sizing action to optimize for cost and performance?

A. Instance A is underutilized because its consumption is far below provisioned capacity; it should be downsized or terminated to reduce costs. Instance B is overutilized because it is nearly saturated; it should be upsized or scaled out to avoid performance issues.

B. Instance A is right-sized because the 12% peak provides a necessary safety buffer for growth; Instance B is underutilized because its high efficiency indicates it is perfectly matched to the workload and requires no further intervention.

C. Instance A is over-provisioned and requires a move to a compute-optimized instance family; Instance B is over-provisioned and requires a move to a burstable instance family like T3 to handle occasional spikes.

D. Instance A is underutilized due to a potential network bottleneck; Instance B is right-sized because 92% average utilization represents the most cost-effective use of cloud hardware for any production environment.

Correct Answer: A

In cloud resource management, underutilization is identified when metrics (such as CPU or memory) consistently stay at a low percentage (for example, below roughly 10% to 20%), indicating that the instance is larger than required. The correct action is to downsize (right-size) or terminate it to save costs. Overutilization is identified when metrics consistently reach or exceed 90% to 100%, indicating a risk of performance bottlenecks or application crashes. The correct action is to upsize the instance (vertical scaling) or scale out by adding more instances (horizontal scaling). Answer: A
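
The rule of thumb above can be written as a small classifier. This is a sketch only: the function name and the 20% and 90% cut-offs are illustrative assumptions drawn from the ranges in the explanation, not official AWS thresholds.

```python
def classify_utilization(avg_cpu, peak_cpu, memory, low=20.0, high=90.0):
    """Classify an instance from 30-day utilization percentages.

    low/high are illustrative cut-offs, not AWS-defined thresholds.
    """
    if max(avg_cpu, peak_cpu, memory) < low:
        return "underutilized: downsize or terminate"
    if avg_cpu >= high or memory >= high:
        return "overutilized: upsize or scale out"
    return "right-sized: no action"


# Values taken from the table in the question.
instance_a = classify_utilization(avg_cpu=8, peak_cpu=12, memory=15)
instance_b = classify_utilization(avg_cpu=92, peak_cpu=100, memory=88)
print("Instance A ->", instance_a)
print("Instance B ->", instance_b)
```

Checking the peak as well as the average for the low case avoids flagging an instance that idles most of the time but genuinely needs its capacity during short bursts.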

Q10 (easy)

In disaster recovery planning, which metric specifically describes the maximum acceptable amount of time that a workload can be offline before it must be restored?

A. Recovery Point Objective (RPO)

B. Recovery Time Objective (RTO)

C. Mean Time to Repair (MTTR)

D. Service Level Objective (SLO)

Correct Answer: B

The Recovery Time Objective (RTO) is the maximum acceptable length of time that a service, application, or function can be unavailable after a disaster occurs. In contrast, the Recovery Point Objective (RPO) defines the maximum amount of data loss an organization is willing to tolerate, measured in time from the point of the disaster back to the last successful backup. Answer: B
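
The difference between the two metrics is easy to see when both are computed from the same incident timeline. The sketch below uses hypothetical timestamps and targets: RPO is compared against the gap between the last good backup and the outage, and RTO against the length of the outage itself.

```python
from datetime import datetime, timedelta


def assess_recovery(last_backup, outage_start, service_restored, rpo, rto):
    """Compare an incident timeline against RPO and RTO targets."""
    data_loss_window = outage_start - last_backup   # measured against the RPO
    downtime = service_restored - outage_start      # measured against the RTO
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical incident: backup at 11:50, outage at 12:00, restored at 13:30.
report = assess_recovery(
    last_backup=datetime(2024, 1, 1, 11, 50),
    outage_start=datetime(2024, 1, 1, 12, 0),
    service_restored=datetime(2024, 1, 1, 13, 30),
    rpo=timedelta(minutes=15),
    rto=timedelta(hours=1),
)
print(report)
```

Here ten minutes of data are lost, which is within the 15-minute RPO, but the 90-minute outage misses the one-hour RTO. The two objectives are met or missed independently.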

Q11 (medium)

A solutions architect is designing a workload that requires multiple Linux-based Amazon EC2 instances to have concurrent access to a shared file system. The instances are distributed across three Availability Zones (AZs) for high availability. Which of the following accurately explains why Amazon EFS is the most suitable storage choice compared to Amazon EBS with Multi-Attach?

A. Amazon EFS provides sub-millisecond block-level access across multiple regions, whereas Amazon EBS is limited to file-level access within a single region.

B. Amazon EFS is a regional service that supports concurrent access across multiple Availability Zones, whereas Amazon EBS Multi-Attach is restricted to a single Availability Zone.

C. Amazon EFS natively supports both Windows and Linux mounting using the SMB protocol, whereas Amazon EBS Multi-Attach only supports the NFS protocol.

D. Amazon EFS requires manual capacity provisioning to handle growth, whereas Amazon EBS Multi-Attach volumes automatically scale their storage size based on usage.

Correct Answer: B

Amazon EFS (Elastic File System) is a managed Network File System (NFS) that is regional in scope, allowing it to be mounted simultaneously by instances in multiple Availability Zones. In contrast, Amazon EBS Multi-Attach allows a single Provisioned IOPS volume to be attached to up to 16 Linux instances, but those instances must reside within the same Availability Zone. Furthermore, EFS automatically scales its storage capacity as files are added or removed, while EBS volumes have a fixed provisioned size. Answer: B

Q12 (medium)

A solutions architect is managing a multi-account environment using AWS Organizations. The central Networking account owns a VPC and wants to share a specific private subnet with a Development account using AWS Resource Access Manager (RAM). Which of the following describes a correct application of this configuration when a developer in the Development account attempts to launch resources?

A. The developer can create and manage their own security groups within the Development account to control traffic to their instances in the shared subnet.

B. The Development account must first accept a resource share invitation in the RAM console before the shared subnet becomes visible.

C. The Networking account owner is responsible for creating and managing all security groups used by the Development account's resources.

D. The developer in the Development account can modify the network access control list (NACL) associated with the shared subnet to allow specific traffic.

Correct Answer: A

In AWS RAM subnet sharing, the VPC owner (Networking account) shares subnets with participants (Development account). Participants can see and use the subnets but cannot modify VPC-level resources like NACLs, Route Tables, or the VPC itself. However, participants are responsible for and have full control over their own resources, including creating their own security groups. Within an AWS Organization with sharing enabled, no invitation is required; the resource becomes available immediately. Answer: A

Q13 (easy)

Which AWS feature allows you to capture and log information about the IP traffic going to and from network interfaces in your Virtual Private Cloud (VPC)?

A. AWS CloudTrail

B. VPC Flow Logs

C. AWS Config

D. Amazon Inspector

Correct Answer: B

VPC Flow Logs is the standard feature for capturing IP traffic metadata in a VPC. It is commonly used for troubleshooting network connectivity, auditing security group rules, and monitoring traffic patterns. AWS CloudTrail focuses on API activity, while AWS Config tracks resource configuration changes. Answer: B

Q14 (easy)

Which of the following is the primary benefit of implementing a cost-allocation tagging strategy to map cloud expenses to business units?

A. It automatically updates and patches the operating systems of all tagged resources.

B. It provides the granular visibility needed for internal cost chargebacks and accurate financial reporting.

C. It replaces the need for using AWS Organizations and consolidated billing features.

D. It increases the raw processing speed of compute instances assigned to specific projects.

Correct Answer: B

As described in cloud governance best practices, cost-allocation tags are key-value pairs assigned to resources to monitor usage and expenditure granularly. By tagging resources with business unit identifiers, an organization can categorize costs and generate reports that facilitate internal chargeback processes, ensuring financial accountability across different departments. Answer: B

Q15 (easy)

Which of the following best describes the primary use case for the AWS Snow Family of services?

A. Providing a dedicated 10 Gbps private network connection between on-premises data centers and AWS.

B. Physically migrating large volumes of data to the AWS cloud using ruggedized hardware devices.

C. Automatically scaling Amazon EC2 instances based on real-time CPU utilization metrics.

D. Hosting serverless web applications using AWS Lambda and Amazon API Gateway.

Correct Answer: B

The AWS Snow Family (which includes AWS Snowcone, AWS Snowball, and AWS Snowmobile) is designed for offline data migration and edge computing. These services allow customers to move large datasets—ranging from terabytes to exabytes—into and out of AWS by physically shipping ruggedized devices, bypassing the limitations of internet bandwidth. Answer: B

These are 15 of 1,035 questions available.

AWS Certified Solutions Architect - Professional (SAP-C02) Flashcards

824 flashcards for spaced-repetition study. Showing 30 sample cards below.

Adopting Managed Services for Reduced Overhead (4 cards shown)

Question

Immutable Infrastructure

Answer

An infrastructure management strategy where servers or resources are never modified after deployment. When a change (like a patch) is needed, the old resources are destroyed and replaced by new ones built from a common image (like an AMI).

[!TIP] Think of this as "Cattle, not Pets." Instead of healing a sick server, you replace it with a healthy one.

Question

What is the primary operational trade-off when migrating from self-managed EC2 instances to high-level managed services like AWS Lambda?

Answer

The primary trade-off is reduced operational overhead (patching, scaling, provisioning) in exchange for increased initial refactoring or rearchitecting effort.

Aspect	Self-Managed (EC2)	Managed (Lambda/Fargate)
OS Patching	Customer Responsibility	AWS Responsibility
Infrastructure Code	Manage OS, scaling, networking	Focus on application logic
Complexity	Higher operational complexity	Higher architectural complexity

[!NOTE] Managed services allow you to delegate "undifferentiated heavy lifting" to AWS.

Question

In the cloud, customers can achieve better long-term cost savings and reduced drift by moving applications to ___ services, though this typically requires more ___ of the application compared to a lift-and-shift approach.

Answer

higher-level managed (or serverless)
refactoring (or rearchitecting)

By moving to services like AWS Fargate or Lambda, you limit infrastructure configuration drift because you no longer manage the long-lived underlying virtual machines.

Question

Explain how Adopting Managed Services impacts the Shared Responsibility Model regarding patching.

Answer

As you move from IaaS (Infrastructure as a Service) to PaaS/Serverless (Platform/Function as a Service), the line of responsibility moves upward.

Key Benefit: By using managed services (e.g., Amazon RDS, AWS Fargate), the customer is no longer responsible for the underlying OS, which eliminates the need for manual patching schedules and reduces the risk of security vulnerabilities due to unpatched systems.

Alerting and Automatic Remediation Strategies (4 cards shown)

Question

AWS Config

Answer

A managed service that acts as a Configuration Management Database (CMDB) by recording and tracking AWS resource configurations.

It evaluates whether resource settings align with desired configurations through Config Rules.

[!NOTE] It is the primary tool for detecting configuration drift and compliance violations in near real time.

Question

When an AWS Config rule violation is detected, the service can trigger a remediation action using ___ ___ ___ Automation runbooks.

Answer

AWS Systems Manager (SSM)

SSM Automation runbooks define the specific steps (scripts or API calls) required to resolve a non-compliant state.

Example: Using the predefined runbook AWS-DisableS3BucketPublicReadWrite to automatically block public access when a bucket is incorrectly configured.

[!TIP] Remediation can be set to occur automatically upon detection or manually after review.

Question

How does AWS Security Hub facilitate automated remediation for security findings?

Answer

Security Hub aggregates findings and routes them to Amazon EventBridge.

The workflow involves:

Finding Generation: Security Hub identifies a vulnerability (e.g., PCI-DSS violation).
Event Routing: The finding is sent to EventBridge as an event.
Target Trigger: EventBridge rules match the finding and trigger a target, such as an AWS Lambda function or an SSM Automation document to execute the fix.

Component	Role
Security Hub	Detection & Aggregation
EventBridge	Routing & Filtering
Lambda/SSM	Execution of Remediation
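
The routing and execution stages can be sketched as an EventBridge event pattern plus the handler logic a Lambda target might run. The pattern fields and the finding fields (`Id`, `Severity.Label`, `Resources[].Type`, `Resources[].Id`) follow the author's understanding of Security Hub finding events and should be verified against current documentation. The `REMEDIATIONS` mapping, the action names, and the sample event are hypothetical.

```python
# Event pattern an EventBridge rule could use to select high-severity findings.
EVENT_PATTERN = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Severity": {"Label": ["HIGH", "CRITICAL"]}}},
}

# Hypothetical mapping from resource type to a remediation action name.
REMEDIATIONS = {"AwsS3Bucket": "block-public-access"}


def plan_remediations(event):
    """Turn a Security Hub finding event into a list of planned actions."""
    plans = []
    for finding in event.get("detail", {}).get("findings", []):
        for resource in finding.get("Resources", []):
            plans.append({
                "finding_id": finding["Id"],
                "resource_id": resource["Id"],
                "severity": finding["Severity"]["Label"],
                # Fall back to notification when no automated fix is defined.
                "action": REMEDIATIONS.get(resource["Type"], "notify-only"),
            })
    return plans


sample_event = {
    "source": "aws.securityhub",
    "detail-type": "Security Hub Findings - Imported",
    "detail": {"findings": [{
        "Id": "finding-1",  # hypothetical identifier
        "Severity": {"Label": "HIGH"},
        "Resources": [
            {"Type": "AwsS3Bucket", "Id": "arn:aws:s3:::example-bucket"},
            {"Type": "AwsEc2Instance", "Id": "i-0123456789abcdef0"},
        ],
    }]},
}
plans = plan_remediations(sample_event)
print(plans)
```

Keeping the planning step separate from the code that executes a fix makes it straightforward to start in notify-only mode and enable automated actions one resource type at a time.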

Question

Automated Security Response on AWS (Playbooks)

Answer

A library of pre-built remediations for common security standards (CIS, PCI-DSS, etc.) supported by AWS Security Hub.

Key Characteristics:

Contextual Awareness: Remediation can be adapted based on risk (e.g., notify first for dev, block immediately for prod).
Scalability: Critical for organizations with hundreds of AWS accounts where manual intervention is impossible.

[!WARNING] Always ensure your remediation logic accounts for business continuity (e.g., don't block a production bucket without proper alerting/exception handling).

Amazon Route 53 Routing Policies (4 cards shown)

Question

Latency-Based Routing (LBR)

Answer

A routing policy used when you have resources in multiple AWS Regions and want to route traffic to the region that provides the lowest network latency for the end-user.

[!NOTE] Route 53 measures latency (Round Trip Time) over time and maintains a database to determine the best region.

Question

How does Geolocation Routing differ from Geoproximity Routing?

Answer

While both use geographic data, they serve different primary purposes:

Feature	Geolocation Routing	Geoproximity Routing
Basis	Based on the user's physical location (continent, country, or state).	Based on the geographic distance between user and resource.
Control	Allows for localized content or restricted distribution.	Uses a Bias value to expand or shrink the size of a geographic region.
Use Case	Compliance, language-specific sites.	Complex traffic shifting across global regions.

[!TIP] Remember: Geolocation = Where the user is. Geoproximity = Where the resource is + a 'bias' adjustment.

Question

Failover Routing Policy (Active-Passive)

Answer

This policy is used to configure active-passive failover, where one resource (Primary) handles traffic as long as it is healthy, and Route 53 switches to a backup resource (Secondary) if the primary fails.

Components required:

Primary Record: The main resource (e.g., an ALB in us-east-1).
Secondary Record: The disaster recovery resource (e.g., a static S3 site).
Health Check: Monitored by Route 53 to trigger the switch.

[!WARNING] If you don't associate a health check with the primary record, Route 53 will continue to route traffic to it even if it is down.

Question

In Route 53, the ___ routing policy allows you to return up to eight healthy records of the same type (such as A records) in response to a DNS query, providing a basic form of load balancing and high availability.

Answer

Multivalue Answer Routing

Unlike Simple Routing, which returns all values regardless of health, Multivalue Answer only returns values for healthy resources.

Key Characteristics:

Returns up to 8 records.
Not a substitute for an ELB, but provides DNS-level availability.
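Relies on Route 53 health checks for health-aware answers; a record without an associated health check is always treated as healthy.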
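
The selection behavior can be mimicked in a few lines. This is a toy model of the idea rather than Route 53's implementation; the record list, the `healthy` flag, and the function name are hypothetical.

```python
import random


def multivalue_answer(records, max_answers=8, rng=random):
    """Return up to max_answers IPs, chosen at random from healthy records only."""
    healthy = [r["ip"] for r in records if r["healthy"]]
    return rng.sample(healthy, k=min(max_answers, len(healthy)))


# Ten hypothetical A records; one fails its health check.
records = [{"ip": f"192.0.2.{n}", "healthy": n != 3} for n in range(1, 11)]
answer = multivalue_answer(records)
print(answer)
```

Because the client receives several addresses and picks among them, traffic is spread only approximately, which is why the card notes that this is not a substitute for a load balancer.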

Analyzing AWS Usage Reports for Cost Optimization (4 cards shown)

Question

Right-sizing

Answer

The process of matching instance types and sizes to your workload performance and capacity requirements at the lowest possible cost.

[!TIP] Organizations often over-provision during "lift-and-shift" migrations to ensure performance; right-sizing is the corrective step to align resources with actual cloud performance benefits.

Question

What is the primary difference between AWS Cost Explorer and AWS Cost and Usage Reports (CUR) in terms of data granularity and delivery?

Answer

Feature	AWS Cost Explorer	AWS Cost and Usage Reports (CUR)
Granularity	Daily/Monthly (Hourly available for a fee)	Most granular (Hourly, daily, or monthly)
Delivery	AWS Management Console / API	CSV or Parquet files delivered to an S3 Bucket
Use Case	Visualizing trends and quick filtering	In-depth analysis and integration with BI tools like Amazon QuickSight

[!NOTE] CUR provides the cost breakup of resources by tags, products, and services at the most comprehensive level available.

Question

To enable AWS Cost and Usage Reports (CUR), you must apply an S3 bucket policy that grants the ___ service principal the permissions s3:GetBucketAcl and s3:GetBucketPolicy.

Answer

billingreports.amazonaws.com

These two permissions let the AWS Billing service verify bucket ownership. Delivering the report files also requires granting the same principal s3:PutObject on the objects in the bucket.

json

{
  "Principal": {
    "Service": "billingreports.amazonaws.com"
  },
  "Action": [
    "s3:GetBucketAcl",
    "s3:GetBucketPolicy"
  ]
}

Question

How can an architect identify underutilized resources using the native reporting tools in AWS?

Answer

Architects can identify inefficiencies through a systematic review of usage reports:

Analyze Cost Explorer Reports: Use the "Monthly costs by service" report and filter by specific Regions or Accounts to find high-cost/low-usage anomalies.
Examine CUR Data: Investigate granular hourly usage and cost line items in S3 to find resources that are billed continuously but see little use, then confirm consistently low CPU/memory utilization with Amazon CloudWatch metrics.
Review Built-in Reports: Use out-of-the-box reports such as the Reserved Instance (RI) utilization report to see if committed capacity is being fully used.

Applying Design Patterns for Performance: Caching, Buffering, and Replicas (4 cards shown)

Question

Cache-Aside (Lazy Loading)

Answer

A caching strategy where the application is responsible for managing the cache.

Process:

Check the cache.
If Cache Hit: Return data.
If Cache Miss: Query database, update cache, then return data.

[!TIP] This pattern is highly effective for read-heavy workloads where data is not frequently updated, but it can result in stale data if the database is updated without invalidating the cache.
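
The three steps above, and the stale-data caveat from the tip, can be sketched with an in-memory dictionary standing in for the cache. The class name and the `load_from_db` callback are hypothetical; a real deployment would use a cache client and a database query instead.

```python
class CacheAside:
    """Cache-aside (lazy loading): the application manages the cache itself."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # hypothetical callback that queries the database
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:          # cache hit: return cached data
            self.hits += 1
            return self._cache[key]
        self.misses += 1                # cache miss: query DB, update cache, return
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key so the next read reloads it from the database."""
        self._cache.pop(key, None)


database = {"sku-1": 10}
store = CacheAside(database.__getitem__)
first = store.get("sku-1")       # miss, loaded from the database
second = store.get("sku-1")      # hit
database["sku-1"] = 12           # database updated behind the cache's back
stale = store.get("sku-1")       # still the old value
store.invalidate("sku-1")
fresh = store.get("sku-1")       # reloaded after invalidation
print(first, second, stale, fresh, store.hits, store.misses)
```

The `stale` read shows why writes need to invalidate (or update) the cached entry, or why entries are usually given a time-to-live.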

Question

How do Read Replicas differ from a Caching Layer (like ElastiCache) in addressing read performance bottlenecks?

Answer

While both offload reads from the primary database, they serve different performance profiles:

Feature	Read Replicas (e.g., RDS)	Caching Layer (e.g., ElastiCache)
Latency	Milliseconds	Sub-millisecond (In-memory)
Query Capability	Supports complex SQL queries	Key-Value or simple data structures
Data Freshness	Asynchronous replication (Lag)	Depends on pattern (Write-through vs. Lazy)
Use Case	Scaling analytical/complex reads	Offloading frequent simple lookups

[!NOTE] Read replicas reduce the load on the primary DB instance, whereas caching significantly reduces the response time for individual requests.

Question

To achieve transparent, in-memory acceleration for Amazon DynamoDB without modifying complex application logic, an architect should use ___.

Answer

DynamoDB Accelerator (DAX)

DAX is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement—from milliseconds to microseconds—even at millions of requests per second.

[!TIP] Use DAX for eventually consistent read-intensive workloads where you don't want to manage cache-aside logic in your application code.

Question

Write-Through Caching Pattern

Answer

In this pattern, data is written to the cache and the backend database simultaneously.

Pros:

Data in the cache is always up-to-date with the database.
Simple read logic (reads always hit the cache first).

Cons:

Write Penalty: Higher latency for write operations because they must hit two systems.
Cache Churn: May populate the cache with data that is rarely read, wasting memory.

[!WARNING] Use this pattern when data consistency between the cache and the database is critical.
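
For contrast with lazy loading, here is a minimal write-through sketch. Two dictionaries stand in for the cache and the database, and the class name is hypothetical. It writes to the database first and then to the cache, which is one common ordering rather than the only one.

```python
class WriteThroughStore:
    """Write-through: every write updates the database and the cache together."""

    def __init__(self):
        self.database = {}
        self.cache = {}

    def put(self, key, value):
        # Both systems are written on every update (the "write penalty").
        self.database[key] = value
        self.cache[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Only reached for keys that predate the cache or were evicted.
        value = self.database[key]
        self.cache[key] = value
        return value


store = WriteThroughStore()
store.put("sku-1", 10)
store.put("sku-1", 12)          # the update reaches the cache immediately
value = store.get("sku-1")
print(value, store.cache == store.database)
```

Every key that is written ends up in the cache whether or not it is ever read, which is the cache churn listed under the cons.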

Architecting AWS Network Connectivity Strategies (4 cards shown)

Question

AWS Transit Gateway (TGW)

Answer

A fully managed network hub used to interconnect Virtual Private Clouds (VPCs) and on-premises networks.

[!TIP] Think of it as a "Cloud Router" that simplifies network topology by replacing complex peering meshes with a hub-and-spoke model.

Key Benefits:

Transitive Routing: Easily connect multiple VPCs through a single hub.
Centralized Management: Simplify edge connectivity to on-premises via VPN or Direct Connect.
Scalability: Supports thousands of VPCs and massive throughput.

Question

How can you increase the aggregate bandwidth of AWS Site-to-Site VPN connections beyond the 1.25 Gbps per-tunnel limit?

Answer

By using Equal Cost Multi-Path (ECMP) routing on an AWS Transit Gateway (TGW).

How it works:

Establish multiple VPN tunnels between your on-premises customer gateway and the TGW.
Enable ECMP on the Transit Gateway.
The TGW will load balance traffic across the multiple tunnels.

Bandwidth Calculation: $\text{Total Bandwidth} = \text{Number of Tunnels} \times 1.25\text{ Gbps}$

[!NOTE] Your on-premises router must also support ECMP to utilize the full aggregate bandwidth for outbound traffic.
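
The bandwidth formula can be applied in both directions: to estimate the aggregate ceiling for a given tunnel count, or to estimate how many tunnels a target needs. The helper names are hypothetical, and the results are theoretical upper bounds, since real throughput depends on how individual flows are distributed across tunnels.

```python
import math

PER_TUNNEL_GBPS = 1.25  # per-tunnel figure quoted in the card above


def aggregate_bandwidth_gbps(tunnels):
    """Theoretical aggregate bandwidth with ECMP across the given tunnel count."""
    return tunnels * PER_TUNNEL_GBPS


def tunnels_needed(target_gbps):
    """Smallest tunnel count whose theoretical aggregate meets the target."""
    return math.ceil(target_gbps / PER_TUNNEL_GBPS)


print(aggregate_bandwidth_gbps(4))  # 5.0
print(tunnels_needed(3))            # 3
```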

Question

When planning VPC subnets, AWS reserves ___ IP addresses in every subnet CIDR block for internal use.

Answer

Five. AWS reserves the first four addresses and the last address in every subnet CIDR block (shown here for a /24 subnet):

x.x.x.0: Network address.
x.x.x.1: Reserved by AWS for the VPC router.
x.x.x.2: Reserved by AWS for mapping to the Amazon-provided DNS.
x.x.x.3: Reserved by AWS for future use.
x.x.x.255: Network broadcast address (AWS does not support broadcast, but the address is reserved).

[!WARNING] Always account for these 5 addresses when calculating the required size for your subnets (e.g., a /28 subnet has 16 addresses but only 11 are usable).
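
The arithmetic in the warning can be checked with Python's standard `ipaddress` module. The function name and the example CIDR are illustrative; the rule applied is the one stated above (first four addresses plus the last one are reserved).

```python
import ipaddress


def subnet_plan(cidr):
    """Return total, usable, and reserved addresses for an IPv4 subnet CIDR."""
    net = ipaddress.ip_network(cidr)
    # First four addresses and the last address are reserved in every subnet.
    reserved = [net[0], net[1], net[2], net[3], net[-1]]
    return {
        "total": net.num_addresses,
        "usable": net.num_addresses - len(reserved),
        "reserved": [str(address) for address in reserved],
    }


plan = subnet_plan("10.0.0.0/28")
print(plan["total"], plan["usable"])  # 16 11
print(plan["reserved"])
```

For the /28 example the reserved last address is 10.0.0.15 rather than an address ending in .255, a reminder that the x.x.x.255 entry in the list only applies literally to a /24.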

Question

Hybrid Connectivity: DX vs. VPN Failover Strategy

Answer

To balance cost and reliability, architects often use a combination of AWS Direct Connect (DX) and VPN.

Connection Type	Primary Use Case	Performance	Cost
Direct Connect (DX)	Primary link for heavy workloads	High/Consistent	Higher
Site-to-Site VPN	Backup/Failover via Public Internet	Variable	Lower

[!TIP] This is considered a "two-way door" decision because you can start with VPN and migrate to DX as traffic grows.

Assessing Solutions and Rightsizing for AWS (4 cards shown)

Question

Rightsizing

Answer

The process of matching instance types and sizes to your workload performance and capacity requirements at the lowest possible cost.

[!NOTE] Rightsizing is a continuous process. It should be performed both before migration (using on-premises metrics) and after migration (using cloud-native monitoring).

Key Goals:

Maximize performance efficiency
Minimize unnecessary expenditure
Eliminate idle or over-provisioned resources

Question

Which five AWS resources are currently supported by AWS Compute Optimizer for machine learning-based rightsizing recommendations?

Answer

AWS Compute Optimizer analyzes utilization metrics to provide recommendations for the following:

Amazon EC2 instances
Amazon EC2 Auto Scaling groups
Amazon EBS volumes
AWS Lambda functions
Amazon ECS services on AWS Fargate

Question

Compare the rightsizing approach: Pre-migration vs. Post-migration.

Answer

Phase	Data Sources	Goal
Pre-migration	VMware vSphere, Microsoft Hyper-V metrics	Map on-premises workloads to the correct AWS instance family (e.g., Compute, Memory Optimized).
Post-migration	CloudWatch, Trusted Advisor, Cost Explorer, Compute Optimizer	Continuous optimization based on real-time AWS usage patterns and ML recommendations.

[!TIP] Use Compute Optimizer to automate the analysis of over-provisioned and under-provisioned resources once the workload is in the cloud.
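To make the over-provisioned versus under-provisioned distinction concrete, here is a minimal sketch that labels a resource from its CPU utilization samples. The 20% and 80% thresholds and the `classify` function are assumptions for illustration only; Compute Optimizer applies its own models across more metrics (memory, network, disk) and does not expose this logic.

```python
from statistics import mean


def classify(cpu_samples, low=20.0, high=80.0):
    """Label a resource from CPU utilization percentages.

    The thresholds are hypothetical defaults, not values used by AWS.
    """
    if not cpu_samples:
        raise ValueError("at least one utilization sample is required")
    average = mean(cpu_samples)
    peak = max(cpu_samples)
    if peak < low:
        # Even the busiest sample leaves most capacity unused.
        return "over-provisioned"
    if average > high:
        # Sustained load near the ceiling suggests a larger size is needed.
        return "under-provisioned"
    return "optimized"


print(classify([5, 10, 12]))
print(classify([85, 90, 95]))
print(classify([30, 50, 70]))
```

In practice the samples would come from CloudWatch after migration, or from hypervisor metrics before it, matching the two phases in the table above.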

Question

Applications with a ___ consumption pattern are ideal for Reserved Instances or Savings Plans, while those with ___ requirements are better suited for Spot Instances.

Answer

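steady-state; fault-tolerant (interruptible)

Reasoning:

Steady State: Committing to a 1-year or 3-year term provides deep discounts (up to 72%) via RIs or Savings Plans.
Fault-Tolerant/Interruptible: Spot Instances use spare AWS capacity for up to 90% savings but can be reclaimed by AWS with a 2-minute warning. Spiky or variable demand alone does not make a workload a good fit for Spot; it must also tolerate interruption.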

[!WARNING] Do not run workloads that cannot tolerate interruption on Spot Instances. Use Spot only when the architecture is stateless (or checkpoints its work) so that a reclaimed instance can be replaced without data loss.
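The steady-state reasoning can be worked through as simple arithmetic. The sketch below assumes a hypothetical On-Demand rate of $0.10 per hour and the 72% upper-bound discount quoted above; real discounts vary with term, payment option, instance family, and Region. A commitment is billed for every hour whether or not the instance runs, so it pays off only above a break-even utilization of 1 minus the discount.

```python
HOURS_PER_YEAR = 8760


def committed_annual_cost(on_demand_hourly, discount):
    """Yearly cost of a commitment, billed for every hour regardless of use."""
    return on_demand_hourly * (1 - discount) * HOURS_PER_YEAR


def on_demand_annual_cost(on_demand_hourly, utilization):
    """Yearly On-Demand cost when the instance runs only part of the time."""
    return on_demand_hourly * HOURS_PER_YEAR * utilization


def break_even_utilization(discount):
    """Fraction of the year above which the commitment is cheaper."""
    return 1 - discount


rate = 0.10  # hypothetical On-Demand price in USD per hour
print(round(committed_annual_cost(rate, 0.72), 2))
print(round(on_demand_annual_cost(rate, 1.0), 2))
print(round(break_even_utilization(0.72), 2))
```

With the maximum discount, the commitment is cheaper whenever the instance would otherwise run more than about 28% of the year; smaller discounts push that break-even point higher.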

Asset Planning and Workload Migration (AWS SAP-C02) (2 cards shown)

Question

The 7Rs of Migration

Answer

The seven common migration strategies for moving workloads to the cloud:

Strategy	Action	Description
Retire	Decommission	Turn off applications no longer needed.
Retain	Keep	Leave apps on-premises (compliance/latency).
Rehost	Lift and Shift	Move to cloud without changes (EC2).
Relocate	Transfer	Move VMware VMs or containers to a cloud equivalent (e.g., VMware Cloud on AWS) without buying new hardware or rewriting applications.
Repurchase	Drop and Shop	Switch to a SaaS version (e.g., Salesforce).
Replatform	Lift and Reshape	Minor optimization (e.g., move to Amazon RDS).
Re-architect	Refactor	Full redesign to be cloud-native (Lambda/S3).

[!NOTE] Re-architecting can deliver the greatest long-term benefit (scalability, agility, lower operating cost), but it also involves the most complexity, effort, and up-front cost.
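One way to memorize the table is to read it as an ordered set of questions asked of each workload. The sketch below is a study aid only: the `Workload` fields, the `suggest_strategy` function, and the order of the checks are hypothetical simplifications, not an AWS-defined decision procedure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # All fields are illustrative flags an assessment might record.
    still_needed: bool = True
    must_stay_on_prem: bool = False
    saas_alternative: bool = False
    needs_cloud_native_redesign: bool = False
    can_adopt_managed_services: bool = False
    runs_on_vmware: bool = False


def suggest_strategy(w: Workload) -> str:
    """Return one of the 7Rs using a simplified, hypothetical ordering."""
    if not w.still_needed:
        return "Retire"
    if w.must_stay_on_prem:
        return "Retain"
    if w.saas_alternative:
        return "Repurchase"
    if w.needs_cloud_native_redesign:
        return "Re-architect"
    if w.can_adopt_managed_services:
        return "Replatform"
    if w.runs_on_vmware:
        return "Relocate"
    return "Rehost"  # default: lift and shift with no changes


print(suggest_strategy(Workload(still_needed=False)))
print(suggest_strategy(Workload(can_adopt_managed_services=True)))
print(suggest_strategy(Workload()))
```

Real portfolio assessments weigh cost, risk, and timelines together, so a workload can reasonably land on a different strategy than this ordering suggests.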

Question

What is the primary function of AWS Migration Hub in the context of portfolio assessment and asset planning?

Answer

AWS Migration Hub provides a centralized location to discover, plan, and track migrations across multiple AWS and partner tools.

Key Capabilities:

Discovery: Collects inventory and utilization data from on-premises servers via Discovery Agents or Connectors.
Planning: Organizes servers into applications and helps determine the best migration strategy.
Tracking: Monitors the status of migrations regardless of which tool (e.g., Application Migration Service, Database Migration Service) is being used.

[!TIP] It is the "Single Pane of Glass" for migration visibility.

