Data Access Patterns for Multi-Tenant Applications
This guide explores the architectural patterns and security mechanisms used to manage data for multiple customers (tenants) within a single application environment, a critical skill for the AWS Certified Developer - Associate exam (Skill 2.3.6).
Learning Objectives
After studying this guide, you should be able to:
- Differentiate between Silo, Bridge, and Pool models for multi-tenancy.
- Implement tenant isolation using IAM Policy Variables.
- Configure DynamoDB Fine-Grained Access Control (FGAC) for row-level security.
- Select the appropriate storage pattern based on compliance and cost requirements.
Key Terms & Glossary
- Multi-tenancy: A software architecture where a single instance of software runs on a server and serves multiple tenants.
- Tenant Isolation: The security practice of ensuring one tenant cannot access or modify another tenant's data.
- Noisy Neighbor: A scenario where one tenant consumes a disproportionate amount of shared resources (e.g., IOPS), impacting others.
- FGAC (Fine-Grained Access Control): Security controls that restrict access to specific items or attributes within a database table.
- Policy Variables: Dynamic placeholders in IAM policies (e.g., ${cognito-identity.amazonaws.com:sub}) used to create reusable, tenant-aware permissions.
The "Big Idea"
In a multi-tenant world, the developer's primary challenge is the trade-off between Isolation and Efficiency. While giving every customer their own database (Silo) is the most secure, it is expensive and hard to manage. Using a single shared table (Pool) is highly efficient but requires rigorous application-level or IAM-level logic to prevent "data leakage." The goal is to move the security boundary as close to the data as possible.
Formula / Concept Box
| Pattern | Storage Strategy | Isolation Strength | Cost & Complexity |
|---|---|---|---|
| Silo | Dedicated resources (e.g., 1 DB per tenant) | Highest (Physical) | High (Resource waste) |
| Bridge | Shared DB, separate Schemas/Tables | Medium (Logical) | Moderate |
| Pool | Shared Table (Partitioned by tenant_id) | Lowest (Software-defined) | Low (Highest Scale) |
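The table can be made concrete by showing where one tenant's data would live under each model. This is a minimal sketch: the `orders` resource name, the naming scheme, and the `locate` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataLocation:
    resource: str              # database, schema-qualified table, or shared table
    tenant_key: Optional[str]  # partition key value; set only when tenants share a table


def locate(model: str, tenant_id: str) -> DataLocation:
    """Return where a tenant's order data lives under a given model (hypothetical naming)."""
    if model == "silo":
        # Dedicated resource per tenant: the resource name itself carries the tenant.
        return DataLocation(resource=f"orders-{tenant_id}", tenant_key=None)
    if model == "bridge":
        # Shared database instance, one schema per tenant.
        return DataLocation(resource=f"shared_db.{tenant_id}.orders", tenant_key=None)
    if model == "pool":
        # One shared table; rows are told apart only by the tenant key.
        return DataLocation(resource="orders", tenant_key=tenant_id)
    raise ValueError(f"unknown model: {model!r}")
```

In the Silo and Bridge cases the tenant is part of the resource name, so a resource-level permission is enough to isolate it. In the Pool case every tenant resolves to the same resource, which is why item-level controls become necessary.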
Visual Anchors
Choosing a Multi-Tenant Pattern
Logical Data Isolation (TikZ)
\begin{tikzpicture}[node distance=1.5cm, every node/.style={fill=white, font=\footnotesize}, align=center]
  % Table Outline
  \draw[thick] (0,0) rectangle (6,4);
  \draw[thick] (0,3) -- (6,3);
  \node at (3,3.5) {\textbf{Shared DynamoDB Table}};

  % Rows for Tenant A
  \draw[fill=blue!10] (0.2,2.1) rectangle (5.8,2.8);
  \node at (3,2.45) {Tenant A Data (PK: TenantA_ID_1)};

  % Separation Wall
  \draw[red, dashed, ultra thick] (0,2) -- (6,2);
  \node[text=red] at (7.5,2) {IAM Policy Boundary};

  % Rows for Tenant B
  \draw[fill=green!10] (0.2,1.1) rectangle (5.8,1.8);
  \node at (3,1.45) {Tenant B Data (PK: TenantB_ID_1)};

  % Logic
  \draw[->, thick] (-1.5,2.45) -- (0.2,2.45) node[midway, left] {App User A};
  \draw[->, thick] (-1.5,1.45) -- (0.2,1.45) node[midway, left] {App User B};
\end{tikzpicture}
Hierarchical Outline
- I. Storage Isolation Patterns
- Silo Model: Physical isolation. Example: One S3 Bucket per customer. Best for PII/PHI compliance.
- Bridge Model: Shared instance, separate logical schemas. Best for RDS (PostgreSQL/MySQL) environments.
- Pool Model: Shared everything. Data is distinguished by a TenantID attribute. Best for DynamoDB and SaaS scaling.
- II. Implementing IAM-Based Isolation
- IAM Policy Variables: Allows one policy to be applied to many users. The variable ${aws:PrincipalTag/TenantID} is resolved at runtime.
- DynamoDB Leading Keys: Using the dynamodb:LeadingKeys condition to restrict users to only items where the Partition Key matches their ID.
- III. Integration with Identity Providers
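- Amazon Cognito: Use the sub (Subject) claim in the JWT as the TenantID for data partitioning. Because sub identifies an individual user, this fits the case where each tenant is a single user; when a tenant has many users, carry the tenant ID in a separate attribute instead.
- Stitching Identity: Mapping Cognito identity IDs to IAM roles to enable temporary credential access to tenant data.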
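The two mechanisms in section II can be combined in a single policy. What follows is a minimal sketch, assuming a hypothetical table ARN and assuming each role session carries a TenantID principal tag. The policy is built as a Python dict so it can be serialized with `json.dumps` and attached with whatever tooling you use.

```python
import json

# Hypothetical table ARN, used for illustration only.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/TenantData"


def leading_keys_policy(table_arn: str) -> dict:
    """Build a policy that limits item access to the caller's own partition key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": [table_arn],
                "Condition": {
                    # Every partition key in the request must equal the caller's tag value.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${aws:PrincipalTag/TenantID}"]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(leading_keys_policy(TABLE_ARN), indent=2))
```

The variable is left unresolved in the document on purpose: IAM substitutes the caller's tag value at evaluation time, so the same policy serves every tenant. Scan is deliberately not listed, since a scan is not limited to one partition key.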
Definition-Example Pairs
- Leading Keys: An IAM condition key that restricts access to DynamoDB items based on the value of the partition key.
- Example: A banking app where a user can only GetItem if the Partition Key exactly matches their User_ID stored in their IAM session.
- Prefix-based Isolation: Organizing data in S3 using folder-like prefixes representing tenants.
- Example: s3://my-app-data/customer-alpha/report.pdf vs s3://my-app-data/customer-beta/report.pdf.
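Prefix-based isolation depends on the application building object keys consistently. The helpers below are a hypothetical sketch of that application-side step. They do not replace an IAM policy; they only show why the trailing slash in the prefix matters.

```python
def tenant_object_key(tenant_id: str, filename: str) -> str:
    """Build an S3 object key under the tenant's prefix (hypothetical helper)."""
    if not tenant_id or "/" in tenant_id:
        # A slash in the tenant ID would let one tenant's prefix nest inside another's.
        raise ValueError("tenant_id must be a single non-empty path segment")
    return f"{tenant_id}/{filename}"


def belongs_to_tenant(key: str, tenant_id: str) -> bool:
    """Check that a key sits under the tenant's prefix, including the trailing slash."""
    return key.startswith(f"{tenant_id}/")
```

Without the trailing slash, a prefix check for customer-alpha would also match keys that belong to a tenant named customer-alpha2.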
Worked Examples
Scenario: Restricting Access to S3 by Tenant
Problem: You have a single S3 bucket for all tenants. You need an IAM policy that ensures a tenant can only access their specific folder.
Solution: Use a policy variable based on the Cognito identity ID.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::my-tenant-bucket/${cognito-identity.amazonaws.com:sub}/*"
      ]
    }
  ]
}
Explanation: When the user authenticates via Cognito, the ${cognito-identity.amazonaws.com:sub} variable is replaced by their unique ID (e.g., us-east-1:a1b2c3d4). They can only access files within my-tenant-bucket/us-east-1:a1b2c3d4/.
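The policy above covers reading and writing objects but not listing them. Listing is a bucket-level action, so it needs its own statement scoped with the s3:prefix condition key. The sketch below builds both statements as a Python dict. The function name is an assumption, and the bucket name is the one from the scenario.

```python
import json

# Policy variable that IAM replaces with the caller's Cognito identity ID.
IDENTITY_VAR = "${cognito-identity.amazonaws.com:sub}"


def tenant_bucket_policy(bucket: str) -> dict:
    """Build a policy allowing object access and listing only under the caller's prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access, limited by the object ARN.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{IDENTITY_VAR}/*"],
            },
            {
                # Bucket-level listing, limited by the requested prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{IDENTITY_VAR}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(tenant_bucket_policy("my-tenant-bucket"), indent=2))
```

With this condition in place, a list request that omits the prefix or names another tenant's prefix is not allowed by the statement, so a tenant cannot enumerate the other folders in the bucket.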
Checkpoint Questions
- Which multi-tenant pattern is most susceptible to the "Noisy Neighbor" effect?
- What IAM condition key is used to implement row-level security in DynamoDB?
- Why would a company choose a Silo pattern over a Pool pattern despite higher costs?
- In an S3 prefix-based isolation strategy, how do you prevent one tenant from listing all folders in the bucket?
- True/False: IAM Policy Variables can be used to dynamically resolve the TenantID for RDS SQL queries automatically.