Hands-On Lab

Lab: Building a Scalable Data Store with Amazon DynamoDB and S3

Use data stores in application development

This lab provides hands-on experience in implementing data stores for application development, a core requirement for the AWS Certified Developer - Associate (DVA-C02) exam. You will configure a DynamoDB table, explore the performance differences between Query and Scan operations, and utilize S3 for object storage.

[!WARNING] Remember to run the teardown commands at the end of this lab to avoid ongoing charges. While these services are Free Tier eligible, costs can accrue if resources are left running.

Prerequisites

  • An active AWS Account.
  • AWS CLI installed and configured with appropriate permissions (AdministratorAccess recommended for lab environments).
  • Basic familiarity with JSON and command-line interfaces.
  • <YOUR_REGION>: Use a consistent region throughout (e.g., us-east-1).

Learning Objectives

  • Create and configure an Amazon DynamoDB table with optimized Partition and Sort keys.
  • Differentiate between Query and Scan operations in a live environment.
  • Implement Amazon S3 for static data storage and retrieval.
  • Understand the impact of consistency models (Eventually vs. Strongly Consistent).

Architecture Overview

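[Diagram: the AWS CLI creates an S3 bucket for application assets and a DynamoDB table (TodoTable), which is then read with Query and Scan operations.]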

Step-by-Step Instructions

Step 1: Create an Amazon S3 Bucket for Application Assets

In this step, you will create an S3 bucket to store application metadata or static assets, as a serverless frontend typically would.

bash
# Replace <UNIQUE_SUFFIX> with your name or a random string
# (bucket names allow only lowercase letters, digits, and hyphens here)
aws s3 mb s3://brainybee-lab-assets-<UNIQUE_SUFFIX> --region <YOUR_REGION>

Console alternative
  1. Navigate to S3 in the AWS Console.
  2. Click Create bucket.
  3. Enter a unique name: brainybee-lab-assets-<UNIQUE_SUFFIX>.
  4. Select your preferred Region and click Create bucket (keep default settings).
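
The learning objectives include storing and retrieving data in S3, so it helps to put at least one object into the new bucket. The sketch below is one way to do that with aws s3 cp and aws s3 ls. The file name app-config.json, the config/ key prefix, and the s3_round_trip helper are illustrative assumptions, not part of the original lab.

```shell
# Create a small sample asset locally (hypothetical file name and content).
printf '{"app": "todo", "version": "1.0"}\n' > app-config.json

# s3_round_trip BUCKET: upload the sample file, list the bucket contents,
# and stream the object back to stdout. Helper name and key prefix are
# assumptions made for this example.
s3_round_trip() {
  bucket="$1"
  aws s3 cp app-config.json "s3://${bucket}/config/app-config.json" &&
  aws s3 ls "s3://${bucket}/" --recursive &&
  aws s3 cp "s3://${bucket}/config/app-config.json" -
}

# Usage (needs AWS credentials and the bucket created in this step):
#   s3_round_trip "brainybee-lab-assets-<UNIQUE_SUFFIX>"
```

The teardown commands at the end of the lab empty the bucket recursively, so this object is removed along with everything else.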

Step 2: Create a DynamoDB Table for "Todo" Tasks

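We will create a table for a Todo application. UserId, a high-cardinality attribute, serves as the Partition Key so that items spread evenly across partitions, and TaskId serves as the Sort Key so that all of one user's tasks can be retrieved with a single Query.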

bash
aws dynamodb create-table \
  --table-name TodoTable \
  --attribute-definitions \
    AttributeName=UserId,AttributeType=S \
    AttributeName=TaskId,AttributeType=S \
  --key-schema \
    AttributeName=UserId,KeyType=HASH \
    AttributeName=TaskId,KeyType=RANGE \
  --provisioned-throughput \
    ReadCapacityUnits=5,WriteCapacityUnits=5 \
  --region <YOUR_REGION>

[!TIP] Choosing a high-cardinality Partition Key (like UserId) ensures that data is distributed evenly across multiple physical partitions, preventing "hot partitions."
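
create-table returns while the table is still in the CREATING state, and writes can fail until it becomes ACTIVE. A minimal sketch of waiting for that with the CLI's built-in waiter follows. The wait_for_table helper name is an assumption for illustration.

```shell
# wait_for_table TABLE REGION: block until the table exists and is usable,
# then print its status. Helper name is hypothetical.
wait_for_table() {
  table="$1"
  region="$2"
  # The waiter polls describe-table until the table reports ACTIVE.
  aws dynamodb wait table-exists --table-name "$table" --region "$region" &&
  aws dynamodb describe-table \
    --table-name "$table" \
    --region "$region" \
    --query 'Table.TableStatus' \
    --output text
}

# Usage: wait_for_table TodoTable <YOUR_REGION>
```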

Step 3: Populate the Table (Data Serialization)

We will insert a few items into the table using the put-item command. Note how we specify the data types (S for String).

bash
aws dynamodb put-item \
  --table-name TodoTable \
  --item '{"UserId": {"S": "user_123"}, "TaskId": {"S": "T-001"}, "TaskName": {"S": "Complete AWS Lab"}, "Status": {"S": "In-Progress"}}' \
  --region <YOUR_REGION>

aws dynamodb put-item \
  --table-name TodoTable \
  --item '{"UserId": {"S": "user_123"}, "TaskId": {"S": "T-002"}, "TaskName": {"S": "Prepare for DVA-C02"}, "Status": {"S": "Pending"}}' \
  --region <YOUR_REGION>
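
Both items above belong to user_123, so a Query for that user and a full Scan would return the same rows. The sketch below adds one item for a second, hypothetical user (user_456) and loads it from a JSON file through the CLI's file:// prefix, which sidesteps shell-quoting problems. The file name item-user456.json, the item contents, and the put_item_from_file helper are assumptions.

```shell
# Write the item in DynamoDB's typed JSON format to a local file.
cat > item-user456.json <<'EOF'
{
  "UserId":   {"S": "user_456"},
  "TaskId":   {"S": "T-001"},
  "TaskName": {"S": "Review S3 notes"},
  "Status":   {"S": "Pending"}
}
EOF

# put_item_from_file FILE REGION: insert the item stored in FILE.
# Helper name is hypothetical.
put_item_from_file() {
  aws dynamodb put-item \
    --table-name TodoTable \
    --item "file://$1" \
    --region "$2"
}

# Usage: put_item_from_file item-user456.json <YOUR_REGION>
```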

Step 4: Compare Query vs. Scan

A Query finds items based on primary key values, while a Scan examines every item in the table.

Perform a Query (Efficient):

bash
aws dynamodb query \
  --table-name TodoTable \
  --key-condition-expression "UserId = :v1" \
  --expression-attribute-values '{":v1": {"S": "user_123"}}' \
  --region <YOUR_REGION>

Perform a Scan (Expensive):

bash
aws dynamodb scan \
  --table-name TodoTable \
  --region <YOUR_REGION>
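
Adding a filter does not make a Scan cheaper: DynamoDB still reads every item and discards the non-matching ones afterwards, which shows up as a gap between ScannedCount and Count. The sketch below assumes you want only tasks with a given Status. Because Status is a DynamoDB reserved word, it is aliased as #s. The scan_by_status helper name is hypothetical.

```shell
# scan_by_status STATUS REGION: scan the whole table, keep only items whose
# Status matches, and print the counts plus the capacity consumed.
scan_by_status() {
  status="$1"
  region="$2"
  aws dynamodb scan \
    --table-name TodoTable \
    --filter-expression "#s = :s" \
    --expression-attribute-names '{"#s": "Status"}' \
    --expression-attribute-values "{\":s\": {\"S\": \"${status}\"}}" \
    --return-consumed-capacity TOTAL \
    --query '{Count: Count, ScannedCount: ScannedCount, Capacity: ConsumedCapacity.CapacityUnits}' \
    --region "$region"
}

# Usage: scan_by_status Pending <YOUR_REGION>
```

Count should come back smaller than ScannedCount whenever some items in the table do not match the filter, while the consumed capacity reflects everything that was scanned.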

Checkpoints

  1. S3 Check: Run aws s3 ls. Do you see your bucket listed?
  2. DynamoDB Check: Run aws dynamodb describe-table --table-name TodoTable. Is the TableStatus marked as ACTIVE?
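  3. Efficiency Check: Compare ScannedCount and Count in the output of both commands. A Query on a single UserId reads only that user's items, so its ScannedCount equals the number of that user's tasks. A Scan reads every item, so its ScannedCount equals the total number of items in the table (for a table this small, which fits in a single 1 MB page).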

Teardown

To avoid costs, delete all resources created during this lab.

bash
# 1. Delete the DynamoDB Table
aws dynamodb delete-table --table-name TodoTable --region <YOUR_REGION>

# 2. Empty and Delete the S3 Bucket
aws s3 rm s3://brainybee-lab-assets-<UNIQUE_SUFFIX> --recursive
aws s3 rb s3://brainybee-lab-assets-<UNIQUE_SUFFIX>
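
Table deletion is asynchronous, so it can be worth confirming that both resources are really gone. This is a sketch using the table-not-exists waiter and s3api head-bucket. The confirm_teardown helper name and its messages are assumptions.

```shell
# confirm_teardown BUCKET REGION: succeed only when the table and the bucket
# are both gone. Helper name is hypothetical.
confirm_teardown() {
  bucket="$1"
  region="$2"
  # Blocks until describe-table reports that the table no longer exists.
  aws dynamodb wait table-not-exists --table-name TodoTable --region "$region" || return 1
  # head-bucket exits non-zero when the bucket is missing or inaccessible.
  if aws s3api head-bucket --bucket "$bucket" >/dev/null 2>&1; then
    echo "Bucket ${bucket} still exists"
    return 1
  fi
  echo "Teardown complete"
}

# Usage: confirm_teardown "brainybee-lab-assets-<UNIQUE_SUFFIX>" <YOUR_REGION>
```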

Troubleshooting

| Error | Likely Cause | Fix |
|---|---|---|
| ResourceNotFoundException | Table/Bucket is in a different region. | Add --region <YOUR_REGION> explicitly to the command. |
| AccessDenied | IAM User lacks DynamoDB/S3 permissions. | Attach AmazonDynamoDBFullAccess or check IAM policies. |
| ValidationException | Incorrect JSON syntax in the --item flag. | Ensure quotes are escaped correctly or use a JSON file. |

Stretch Challenge

Objective: Implement a Global Secondary Index (GSI).

Currently, you can only efficiently search by UserId. Add a GSI to the TodoTable that allows you to search for tasks by Status.

Hint

Use the update-table command with --attribute-definitions and --global-secondary-index-updates. This allows you to perform Query operations on non-key attributes.
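
If you get stuck, one possible shape of a solution is sketched here. The index name StatusIndex, the file name gsi-status.json, the 5/5 index throughput, and both helper names are assumptions. Status is aliased as #s in the query because it is a DynamoDB reserved word. Note that Status is a low-cardinality key, which is fine for this lab but runs against the partition-key advice from Step 2 at real scale.

```shell
# Index definition passed to --global-secondary-index-updates.
# A GSI on a provisioned table needs its own throughput settings.
cat > gsi-status.json <<'EOF'
[
  {
    "Create": {
      "IndexName": "StatusIndex",
      "KeySchema": [{"AttributeName": "Status", "KeyType": "HASH"}],
      "Projection": {"ProjectionType": "ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
    }
  }
]
EOF

# add_status_index REGION: create the index on the existing table.
add_status_index() {
  aws dynamodb update-table \
    --table-name TodoTable \
    --attribute-definitions AttributeName=Status,AttributeType=S \
    --global-secondary-index-updates file://gsi-status.json \
    --region "$1"
}

# query_by_status STATUS REGION: query the index instead of scanning.
query_by_status() {
  aws dynamodb query \
    --table-name TodoTable \
    --index-name StatusIndex \
    --key-condition-expression "#s = :s" \
    --expression-attribute-names '{"#s": "Status"}' \
    --expression-attribute-values "{\":s\": {\"S\": \"$1\"}}" \
    --region "$2"
}

# Usage:
#   add_status_index <YOUR_REGION>
#   query_by_status Pending <YOUR_REGION>   # once the index is ACTIVE
```

The index is backfilled in the background, so queries against it only work after its status becomes ACTIVE. Deleting the table during teardown removes the index as well.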

Cost Estimate

  • Amazon S3: Standard storage is about $0.023 per GB per month in us-east-1 (first 5 GB/month free under the Free Tier). Lab usage: $0.00.
  • Amazon DynamoDB: 25 GB of storage plus 25 WCUs and 25 RCUs of provisioned capacity are free. Lab usage: $0.00.
  • Total Estimated Lab Cost: $0.00 (within Free Tier limits).

Concept Review

Data Store Comparison Table

| Feature | Amazon S3 | Amazon DynamoDB | Amazon RDS |
|---|---|---|---|
| Type | Object Storage | NoSQL (Key-Value/Document) | Relational (SQL) |
| Best Use Case | Static files, backups, logs | High-speed, high-scale apps | Complex joins, transactions |
| Scalability | Virtually unlimited | Provisioned or On-Demand | Vertical & Horizontal (Read Replicas) |
| Consistency | Strong (since late 2020) | Eventual (Default) / Strong | Strong |

Key DVA-C02 Concept: Consistency Models

\begin{tikzpicture}[node distance=2cm]
  \node (start) [draw, rectangle] {Read Request};
  \node (eventual) [draw, rounded corners, right of=start, xshift=3cm] {\textbf{Eventual Consistency}};
  \node (strong) [draw, rounded corners, below of=eventual, yshift=-1cm] {\textbf{Strong Consistency}};

  \draw [->] (start) -- node[anchor=south] {Default} (eventual);
  \draw [->] (start) |- node[anchor=west, yshift=0.5cm] {ConsistentRead=true} (strong);

  \node [below of=eventual, xshift=1.5cm, yshift=1.2cm, text width=4cm] {\tiny Potential stale data, lower cost.};
  \node [below of=strong, xshift=1.5cm, yshift=1.2cm, text width=4cm] {\tiny Most recent data, double RCU cost.};
\end{tikzpicture}

By default, DynamoDB uses Eventually Consistent Reads. If your application requires the absolute latest data immediately after a write, you must set ConsistentRead to true, which consumes twice the Read Capacity Units (RCUs).
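
To make the "twice the RCUs" rule concrete, the sketch below estimates the cost of reading a single item and shows how a strongly consistent Query could be requested from the CLI. It assumes the usual rule that one RCU covers one strongly consistent read of up to 4 KB (taken here as 4096 bytes) and that an eventually consistent read costs half as much. Both helper names are hypothetical.

```shell
# rcu_for_read SIZE_BYTES MODE: estimate RCUs for reading one item.
# MODE is "strong" or "eventual". Assumes 4 KB = 4096 bytes.
rcu_for_read() {
  awk -v size="$1" -v mode="$2" 'BEGIN {
    blocks = int((size + 4095) / 4096)   # round up to whole 4 KB blocks
    if (blocks < 1) blocks = 1
    if (mode == "eventual") printf "%.1f\n", blocks / 2
    else                    printf "%.1f\n", blocks
  }'
}

rcu_for_read 3000 strong     # prints 1.0
rcu_for_read 3000 eventual   # prints 0.5
rcu_for_read 9000 strong     # prints 3.0

# query_user_consistent USER_ID REGION: the Step 4 Query with a strongly
# consistent read, reporting the capacity it consumed.
query_user_consistent() {
  aws dynamodb query \
    --table-name TodoTable \
    --key-condition-expression "UserId = :v1" \
    --expression-attribute-values "{\":v1\": {\"S\": \"$1\"}}" \
    --consistent-read \
    --return-consumed-capacity TOTAL \
    --region "$2"
}

# Usage: query_user_consistent user_123 <YOUR_REGION>
```

Running the same Query with and without --consistent-read and comparing the ConsumedCapacity values is a simple way to observe the difference on the lab table.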
