Data Access Patterns for Multi-Tenant Applications

Multi-Tenant Data Access Patterns in AWS

This guide explores the architectural patterns and security mechanisms used to manage data for multiple customers (tenants) within a single application environment, a critical skill for the AWS Certified Developer - Associate exam (Skill 2.3.6).

Learning Objectives

After studying this guide, you should be able to:

  • Differentiate between Silo, Bridge, and Pool models for multi-tenancy.
  • Implement tenant isolation using IAM Policy Variables.
  • Configure DynamoDB Fine-Grained Access Control (FGAC) for row-level security.
  • Select the appropriate storage pattern based on compliance and cost requirements.

Key Terms & Glossary

  • Multi-tenancy: A software architecture where a single instance of software runs on a server and serves multiple tenants.
  • Tenant Isolation: The security practice of ensuring one tenant cannot access or modify another tenant's data.
  • Noisy Neighbor: A scenario where one tenant consumes a disproportionate amount of shared resources (e.g., IOPS), impacting others.
  • FGAC (Fine-Grained Access Control): Security controls that restrict access to specific items or attributes within a database table.
  • Policy Variables: Dynamic placeholders in IAM policies (e.g., ${cognito-identity.amazonaws.com:sub}) used to create reusable, tenant-aware permissions.

The "Big Idea"

In a multi-tenant world, the developer's primary challenge is the trade-off between Isolation and Efficiency. While giving every customer their own database (Silo) is the most secure, it is expensive and hard to manage. Using a single shared table (Pool) is highly efficient but requires rigorous application-level or IAM-level logic to prevent "data leakage." The goal is to move the security boundary as close to the data as possible.

Formula / Concept Box

| Pattern | Storage Strategy | Isolation Strength | Cost & Complexity |
| --- | --- | --- | --- |
| Silo | Dedicated resources (e.g., 1 DB per tenant) | Highest (Physical) | High (Resource waste) |
| Bridge | Shared DB, separate Schemas/Tables | Medium (Logical) | Moderate |
| Pool | Shared Table (Partitioned by tenant_id) | Lowest (Software-defined) | Low (Highest Scale) |
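
As a rough illustration of how the three patterns differ in where a tenant's data lives, the following Python sketch maps a pattern and a tenant ID to a storage location. The names `db-<tenant>`, `shared-db`, and `orders` are hypothetical placeholders chosen for this example, not AWS conventions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Pattern(Enum):
    SILO = "silo"
    BRIDGE = "bridge"
    POOL = "pool"


@dataclass(frozen=True)
class DataLocation:
    database: str                 # which database or instance holds the data
    table: str                    # table name, schema-qualified for Bridge
    tenant_filter: Optional[str]  # key value every query must be scoped by


def locate(pattern: Pattern, tenant_id: str) -> DataLocation:
    """Return where a tenant's 'orders' data would live (hypothetical names)."""
    if pattern is Pattern.SILO:
        # Dedicated database per tenant: isolation comes from the resource itself.
        return DataLocation(f"db-{tenant_id}", "orders", None)
    if pattern is Pattern.BRIDGE:
        # Shared database, one schema per tenant.
        return DataLocation("shared-db", f"{tenant_id}.orders", None)
    # Pool: one shared table; every read and write must carry the tenant key.
    return DataLocation("shared-db", "orders", tenant_id)
```

Only the Pool branch returns a `tenant_filter`, which reflects why that model depends on software-defined isolation: nothing about the resource itself separates tenants.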

Visual Anchors

Choosing a Multi-Tenant Pattern

[Diagram: choosing a multi-tenant pattern]

Logical Data Isolation (TikZ)

\begin{tikzpicture}[node distance=1.5cm, every node/.style={fill=white, font=\footnotesize}, align=center]
  % Table Outline
  \draw[thick] (0,0) rectangle (6,4);
  \draw[thick] (0,3) -- (6,3);
  \node at (3,3.5) {\textbf{Shared DynamoDB Table}};

  % Rows for Tenant A
  \draw[fill=blue!10] (0.2,2.1) rectangle (5.8,2.8);
  \node at (3,2.45) {Tenant A Data (PK: TenantA_ID_1)};

  % Separation Wall
  \draw[red, dashed, ultra thick] (0,2) -- (6,2);
  \node[text=red] at (7.5,2) {IAM Policy Boundary};

  % Rows for Tenant B
  \draw[fill=green!10] (0.2,1.1) rectangle (5.8,1.8);
  \node at (3,1.45) {Tenant B Data (PK: TenantB_ID_1)};

  % Logic
  \draw[->, thick] (-1.5,2.45) -- (0.2,2.45) node[midway, left] {App User A};
  \draw[->, thick] (-1.5,1.45) -- (0.2,1.45) node[midway, left] {App User B};
\end{tikzpicture}

Hierarchical Outline

  • I. Storage Isolation Patterns
    • Silo Model: Physical isolation. Example: One S3 Bucket per customer. Best for PII/PHI compliance.
    • Bridge Model: Shared instance, separate logical schemas. Best for RDS (PostgreSQL/MySQL) environments.
    • Pool Model: Shared everything. Data is distinguished by a TenantID attribute. Best for DynamoDB and SaaS scaling.
  • II. Implementing IAM-Based Isolation
    • IAM Policy Variables: Allow one policy to be applied to many users. A variable such as ${aws:PrincipalTag/TenantID} is resolved when each request is evaluated, using values from the caller's session.
    • DynamoDB Leading Keys: Use the dynamodb:LeadingKeys condition key to restrict callers to items whose Partition Key matches their ID.
  • III. Integration with Identity Providers
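    • Amazon Cognito: Use the sub (Subject) claim in the user's JWT as the partition value when each user is their own tenant. When a tenant has many users, sub identifies only the individual user, so the tenant identifier must come from a separate claim (for example, a custom attribute).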
    • Stitching Identity: Mapping Cognito identity IDs to IAM roles to enable temporary credential access to tenant data.
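
The combination of a policy variable and dynamodb:LeadingKeys described in section II can be sketched as a policy document built in Python. This is a minimal sketch under stated assumptions: the table ARN and the list of actions are hypothetical, the caller's session is assumed to carry a TenantID principal tag, and the partition key value is assumed to equal that tag exactly.

```python
import json


def leading_keys_policy(table_arn: str) -> dict:
    """Build an IAM policy limiting DynamoDB access to the caller's own partition."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Hypothetical action list; trim it to what the app needs.
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": [table_arn],
                "Condition": {
                    # Every partition key in the request must equal the tag value.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${aws:PrincipalTag/TenantID}"]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and table name, for illustration only.
    arn = "arn:aws:dynamodb:us-east-1:123456789012:table/TenantData"
    print(json.dumps(leading_keys_policy(arn), indent=2))
```

Because the tenant value is a variable rather than a literal, the same policy document can be attached to one role shared by all tenants.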

Definition-Example Pairs

  • Leading Keys: An IAM condition key (dynamodb:LeadingKeys) that restricts access to DynamoDB items based on the value of the partition key.
    • Example: A banking app where a user can only GetItem if the Partition Key exactly matches the User_ID carried in their IAM session.
  • Prefix-based Isolation: Organizing data in S3 using folder-like prefixes representing tenants.
    • Example: s3://my-app-data/customer-alpha/report.pdf vs s3://my-app-data/customer-beta/report.pdf.
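
Prefix-based isolation depends on building and checking keys consistently. The sketch below shows one way an application might do that in Python; the function names are hypothetical helpers for this guide, and the check is an application-side safeguard that complements IAM rather than replacing it.

```python
def tenant_prefix(tenant_id: str) -> str:
    """Return the folder-like prefix for a tenant, always ending in '/'."""
    if not tenant_id or "/" in tenant_id:
        raise ValueError("tenant_id must be a non-empty single segment")
    return f"{tenant_id}/"


def tenant_key(tenant_id: str, filename: str) -> str:
    """Build the object key for a file stored under the tenant's prefix."""
    return tenant_prefix(tenant_id) + filename.lstrip("/")


def key_belongs_to(key: str, tenant_id: str) -> bool:
    """Check that an object key sits under the tenant's prefix."""
    # The trailing '/' matters: without it, 'customer-alpha' would also
    # match keys that belong to a tenant named 'customer-alpha2'.
    return key.startswith(tenant_prefix(tenant_id))
```

For example, `tenant_key("customer-alpha", "report.pdf")` yields `customer-alpha/report.pdf`, matching the layout in the example above.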

Worked Examples

Scenario: Restricting Access to S3 by Tenant

Problem: You have a single S3 bucket for all tenants. You need an IAM policy that ensures a tenant can only access their specific folder.

Solution: Use a policy variable based on the Cognito identity ID.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::my-tenant-bucket/${cognito-identity.amazonaws.com:sub}/*"
      ]
    }
  ]
}
```

Explanation: When the user authenticates via Cognito, the ${cognito-identity.amazonaws.com:sub} variable is replaced by their unique ID (e.g., us-east-1:a1b2c3d4). They can only access files within my-tenant-bucket/us-east-1:a1b2c3d4/.
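
To make the substitution concrete, the following Python sketch mimics it locally. It is only an illustration of the matching behavior described above, not how IAM is implemented, and it handles just the trailing `*` wildcard used in this policy.

```python
VARIABLE = "${cognito-identity.amazonaws.com:sub}"
RESOURCE_TEMPLATE = "arn:aws:s3:::my-tenant-bucket/" + VARIABLE + "/*"


def resolve_resource(template: str, identity_id: str) -> str:
    """Replace the policy variable with the caller's Cognito identity ID."""
    return template.replace(VARIABLE, identity_id)


def is_allowed(identity_id: str, object_arn: str) -> bool:
    """Return True if object_arn matches the resolved Resource pattern."""
    resolved = resolve_resource(RESOURCE_TEMPLATE, identity_id)
    # Simplification: treat the trailing '*' as "anything after this prefix".
    prefix = resolved[:-1]
    return object_arn.startswith(prefix)


if __name__ == "__main__":
    me = "us-east-1:a1b2c3d4"
    print(resolve_resource(RESOURCE_TEMPLATE, me))
    print(is_allowed(me, "arn:aws:s3:::my-tenant-bucket/us-east-1:a1b2c3d4/report.pdf"))
    print(is_allowed(me, "arn:aws:s3:::my-tenant-bucket/us-east-1:ffff0000/report.pdf"))
```

The second check fails because the other tenant's prefix does not match the resolved pattern, which is the isolation property the policy relies on.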

Checkpoint Questions

  1. Which multi-tenant pattern is most susceptible to the "Noisy Neighbor" effect?
  2. What IAM condition key is used to implement row-level security in DynamoDB?
  3. Why would a company choose a Silo pattern over a Pool pattern despite higher costs?
  4. In an S3 prefix-based isolation strategy, how do you prevent one tenant from listing all folders in the bucket?
  5. True/False: IAM Policy Variables can be used to dynamically resolve the TenantID for RDS SQL queries automatically.
