Optimizing Performance: Sizes, Speeds, and Business Requirements

Performance efficiency in the cloud is not a static destination but a continuous cycle of matching resource sizes and speeds to the shifting needs of a business. This guide focuses on characterizing workloads, selecting the right compute and storage classes, and understanding the metrics that define success in AWS architectures.

Learning Objectives

Characterize workloads as compute-, storage-, or memory-driven to inform architectural choices.
Evaluate business requirements using RTO (Recovery Time Objective) and RPO (Recovery Point Objective) metrics.
Calculate required IOPS and throughput for storage volumes based on database engine page sizes.
Differentiate between vertical and horizontal scaling and their impact on system performance.
Identify networking limits for both internal AWS traffic and on-premises connectivity.

Key Terms & Glossary

IOPS (Input/Output Operations Per Second): A measurement of the number of read and write operations a storage device can perform per second.
Throughput: The amount of data (usually in MB/s or Gbps) that can be transferred from one place to another in a given time.
RTO (Recovery Time Objective): The maximum acceptable delay between a service interruption and restoration of service (measured in time).
RPO (Recovery Point Objective): The maximum acceptable amount of data loss measured in time (e.g., "we can lose up to 15 minutes of data").
Vertical Scaling: Increasing the capacity of a single resource, such as upgrading an EC2 instance to a larger size.
Horizontal Scaling: Increasing capacity by adding more resources, such as adding more EC2 instances to an Auto Scaling group.
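
As a minimal sketch of how RTO and RPO can be turned into a concrete check, the Python below compares a backup interval and a restore time against the business targets. The function name `meets_objectives` and all sample numbers are illustrative assumptions, not part of any AWS API. It also assumes a simple periodic-backup strategy where worst-case data loss equals the time between backups.

```python
from datetime import timedelta


def meets_objectives(backup_interval, restore_time, rpo, rto):
    """Return (rpo_ok, rto_ok) for a simple periodic-backup strategy.

    Assumption: worst-case data loss equals the time between backups,
    and worst-case downtime equals the time needed to restore.
    """
    rpo_ok = backup_interval <= rpo
    rto_ok = restore_time <= rto
    return rpo_ok, rto_ok


# Hypothetical targets: lose at most 15 minutes of data, be back within 1 hour.
result = meets_objectives(
    backup_interval=timedelta(minutes=5),
    restore_time=timedelta(minutes=90),
    rpo=timedelta(minutes=15),
    rto=timedelta(hours=1),
)
print(result)  # (True, False): the RPO is met, the RTO is not
```

In this hypothetical case the backups are frequent enough, but the restore procedure is too slow for the stated RTO.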

The "Big Idea"

[!IMPORTANT] Success in AWS performance design hinges on right-sizing. Over-provisioning leads to wasted cost, while under-provisioning leads to bottlenecks. By monitoring workload metrics and using the elasticity of the cloud, architects can move away from "guessing" capacity and move toward data-driven, automated scaling that meets business-defined performance goals.

Formula / Concept Box

Concept	Formula / Rule	Notes
Required IOPS	$\text{IOPS} = \frac{\text{Required Throughput (KB/s)}}{\text{Page Size (KB)}}$	Page sizes vary (MySQL=16KB, Oracle=8KB)
gp2 Baseline IOPS	$3 \times \text{Storage Size (GB)}$	Minimum 100, Maximum 16,000 IOPS
Burst Duration	$\frac{\text{Credit Balance}}{3000 - (3 \times \text{Storage Size (GB)})}$	For gp2 volumes under 1 TB; result is in seconds
VPN Throughput	Max 1.25 Gbps	Per VPN tunnel
AWS Internal Network	Up to 200 Gbps	Private networking between resources inside AWS
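
The gp2 baseline rule in the table, including its floor and ceiling, can be sketched in a few lines of Python. The function name `gp2_baseline_iops` is a hypothetical helper chosen for illustration; the constants come straight from the table.

```python
def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume: 3 IOPS per GB, clamped to 100..16,000."""
    return max(100, min(16_000, 3 * size_gb))


for size_gb in (20, 200, 1000, 6000):
    print(size_gb, gp2_baseline_iops(size_gb))
# 20 100      (floor applies)
# 200 600
# 1000 3000
# 6000 16000  (ceiling applies)
```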

Hierarchical Outline

Workload Characterization
- Compute-Oriented: High CPU needs (e.g., batch processing).
- Memory-Driven: High RAM needs (e.g., in-memory databases, caching).
- Storage-Focused: High I/O or throughput needs (e.g., data warehousing).
Compute & Networking Optimization
- Vertical Scaling: Resizing instances to change CPU/RAM/Network specs.
- Horizontal Scaling: Adding instances to work in parallel (e.g., SQS workers, read replicas).
- Networking: AWS internal speeds (200 Gbps) vs. On-premises VPN (1.25 Gbps).
Storage Performance (RDS & EBS)
- IOPS vs. Throughput: For a fixed throughput, page size and IOPS are inversely related (larger pages need fewer operations).
- Storage Types: General Purpose SSD (gp2) vs. Provisioned IOPS (io1/io2).
- The Nitro System: New generation instance classes (m6i, r5) that offload virtualization to hardware.
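
To make the horizontal-scaling idea in the outline more concrete, here is a rough sketch of several workers draining one shared queue. It uses Python's in-process `queue.Queue` and threads as a stand-in for Amazon SQS and worker instances; it does not call any AWS service. The names `process` and `run_workers` are hypothetical.

```python
import queue
import threading


def process(message):
    """Placeholder for real work; here it just squares a number."""
    return message * message


def worker(tasks, results):
    """Pull messages until the queue is empty, then exit."""
    while True:
        try:
            message = tasks.get_nowait()
        except queue.Empty:
            return
        results.put(process(message))


def run_workers(messages, worker_count):
    """Process all messages with `worker_count` independent workers."""
    messages = list(messages)
    tasks, results = queue.Queue(), queue.Queue()
    for message in messages:
        tasks.put(message)
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Completion order is not deterministic, so sort before returning.
    return sorted(results.get() for _ in messages)


print(run_workers(range(1, 6), worker_count=3))  # [1, 4, 9, 16, 25]
```

Adding capacity here means raising `worker_count` rather than making any single worker faster, which is the essence of scaling out.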

Visual Anchors

Scaling Strategy Decision Tree

[Figure: scaling strategy decision tree]

RPO and RTO Visualized

[Figure: RPO and RTO visualized]

Definition-Example Pairs

Parallelism: Designing systems so many tasks run simultaneously rather than one after another.
- Example: Using Amazon SQS to decouple an application, allowing multiple worker instances to process messages independently from a queue.
Nitro System: A collection of dedicated hardware and a lightweight hypervisor that delivers high performance and security for EC2 instances.
- Example: Moving from a db.m4 to a db.m6i instance to leverage the Nitro System for higher network bandwidth (up to 40 Gbps) and disk throughput.
Burst Balance: A credit system that allows volumes to temporarily exceed their baseline performance.
- Example: A 200 GB gp2 volume with a baseline of 600 IOPS can burst to 3,000 IOPS. Starting from a full balance of 5.4 million I/O credits, the burst lasts $5,400,000 \div (3000 - 600) = 2,250$ seconds, or approximately 37.5 minutes.
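
The burst-duration arithmetic can be sketched as follows. The helper `gp2_burst_seconds` is hypothetical, and the default credit balance assumes a completely full gp2 credit bucket of 5.4 million I/O credits.

```python
GP2_BURST_IOPS = 3_000
GP2_MAX_CREDITS = 5_400_000  # assumed starting point: a full gp2 credit bucket


def gp2_burst_seconds(size_gb, credit_balance=GP2_MAX_CREDITS):
    """Seconds a gp2 volume can sustain 3,000 IOPS before credits run out.

    Returns None when the baseline already meets or exceeds the burst rate,
    because such a volume never draws down its credits.
    """
    baseline = max(100, 3 * size_gb)
    if baseline >= GP2_BURST_IOPS:
        return None
    return credit_balance / (GP2_BURST_IOPS - baseline)


seconds = gp2_burst_seconds(200)
print(seconds / 60)  # 37.5 minutes for a 200 GB volume
```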

Worked Examples

Example 1: Calculating IOPS for Throughput

Scenario: A MariaDB database (16 KB page size) requires a disk throughput of 100 MB/s. How many IOPS must the storage volume support?

Convert Throughput to KB/s: $100 \text{ MB/s} \times 1024 = 102,400 \text{ KB/s}$.
Divide by Page Size: $102,400 \text{ KB/s} \div 16 \text{ KB} = 6,400 \text{ IOPS}$.
Conclusion: The architect must choose an EBS volume or RDS storage configuration that supports at least 6,400 IOPS.
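
The two steps above can be sketched as a small Python function. The name `required_iops` is a hypothetical helper; it assumes every I/O operation transfers exactly one database page and that 1 MB = 1,024 KB, as in the worked example.

```python
def required_iops(throughput_mb_s, page_size_kb):
    """IOPS needed to sustain a throughput, assuming one page per operation."""
    throughput_kb_s = throughput_mb_s * 1024
    return throughput_kb_s / page_size_kb


print(required_iops(100, 16))  # 6400.0  (16 KB pages, as in this example)
print(required_iops(100, 8))   # 12800.0 (8 KB pages need twice the IOPS)
```

The second call shows the inverse relationship: halving the page size doubles the IOPS needed for the same throughput.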

Example 2: Sizing gp2 for Baseline Performance

Scenario: A company needs 1,200 IOPS consistently without relying on burst credits. What is the minimum size of a gp2 volume required?

Use the 3 IOPS per GB Rule: $\text{Size} = \frac{1200 \text{ IOPS}}{3 \text{ IOPS/GB}}$.
Calculation: $400 \text{ GB}$.
Conclusion: A 400 GB gp2 volume will provide a steady 1,200 IOPS baseline.
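
Running the same rule in reverse gives the smallest volume for a target baseline. This is a minimal sketch; `min_gp2_size_gb` is a hypothetical helper, and it rounds up to a whole GB because volumes are provisioned in whole units.

```python
GP2_MAX_IOPS = 16_000


def min_gp2_size_gb(target_iops):
    """Smallest gp2 size (GB) whose 3 IOPS/GB baseline meets target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("gp2 baseline cannot exceed 16,000 IOPS")
    # Integer ceiling division: round up to the next whole GB.
    return (target_iops + 2) // 3


print(min_gp2_size_gb(1200))  # 400
print(min_gp2_size_gb(1000))  # 334 (333 GB would give only 999 IOPS)
```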

Checkpoint Questions

What is the maximum network throughput of a single VPN tunnel to AWS?
If an Oracle database uses an 8 KB page size, how many I/O operations occur when writing 16 KB of data?
Which metric defines the amount of data loss a business can tolerate: RTO or RPO?
What is the minimum storage size for an Amazon RDS volume?
How does horizontal scaling differ from vertical scaling in terms of resource management?

Answers

1.25 Gbps per VPN tunnel.
Two I/O operations (16 KB / 8 KB = 2).
RPO (Recovery Point Objective).
20 GB.
Horizontal scaling adds more instances (e.g., Auto Scaling), while vertical scaling increases the size of an existing instance (e.g., moving from m5.large to m5.xlarge).