Hands-On Lab: Implementing AWS Object Storage and Archival (S3 & Glacier)
Prerequisites
Before beginning this lab, ensure you have the following:
- AWS Account: An active AWS account with billing enabled (this lab falls under the AWS Free Tier, but a card is required).
- IAM Permissions: An IAM user or role with `AmazonS3FullAccess` permissions.
- CLI Tools: The AWS Command Line Interface (`aws-cli`) installed and configured with your credentials.
- Knowledge: Basic understanding of object-based storage versus block-based storage.
> [!NOTE]
> Cost Estimate: $0.00. Amazon S3 offers 5GB of standard storage for 12 months under the Free Tier. S3 Glacier is extremely low cost ($0.004/GB). However, you must follow the teardown instructions to avoid any unexpected ongoing charges.
Learning Objectives
Upon completing this lab, you will be able to:
- Deploy Object Storage: Create an Amazon Simple Storage Service (S3) bucket to store unstructured data.
- Upload and Manage Data: Transfer files from your local environment to the AWS Cloud.
- Optimize Storage Costs: Configure a Lifecycle Policy to automatically transition infrequently accessed data to S3 Glacier.
Architecture Overview
Amazon S3 is Amazon's flagship object-based cloud storage service, designed for 99.999999999% (11 nines) of data durability. In this lab, you will build a pipeline that ingests data into standard storage and automatically archives it to S3 Glacier.
The following diagram illustrates the storage class tiering you will implement, moving from frequent access to archival storage to save costs.
Step-by-Step Instructions
Step 1: Create an S3 Bucket
Amazon S3 stores data as objects within containers called "buckets." Bucket names must be globally unique across all AWS accounts.
```shell
aws s3api create-bucket \
  --bucket brainybee-lab-storage-<YOUR_RANDOM_NUM> \
  --region us-east-1
```

> [!TIP]
> Replace `<YOUR_RANDOM_NUM>` with a unique string of numbers (e.g., `837492`) to ensure your bucket name does not conflict with anyone else's in AWS.
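If you prefer not to pick a suffix by hand, you can generate one in the shell. A minimal sketch, assuming a bash-like shell where `$RANDOM` is available (the `BUCKET` variable name is just for illustration):

```shell
# Build a likely-unique bucket name using the shell's $RANDOM built-in.
# Any globally unique, DNS-compatible name works; the prefix follows this lab.
SUFFIX=$(( RANDOM * RANDOM ))
BUCKET="brainybee-lab-storage-${SUFFIX}"
echo "$BUCKET"
```

You can then pass `"$BUCKET"` to `--bucket` instead of typing the name out each time.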
▶ Console alternative

- Navigate to the S3 Console.
- Click Create bucket.
- Under Bucket name, enter `brainybee-lab-storage-<YOUR_RANDOM_NUM>`.
- Leave the default region (e.g., `us-east-1`) and all other default settings (Block Public Access should be enabled).
- Scroll to the bottom and click Create bucket.
📸 Screenshot: The S3 bucket list showing your newly created bucket with a "Success" banner.
Step 2: Upload a File to the Bucket
You will now create a sample file locally and upload it to your newly created S3 bucket.
```shell
# 1. Create a sample text file
echo "This is my highly valuable corporate data." > valuable_data.txt

# 2. Upload the file to S3
aws s3 cp valuable_data.txt s3://brainybee-lab-storage-<YOUR_RANDOM_NUM>/
```

▶ Console alternative

- In the S3 Console, click on your bucket name (`brainybee-lab-storage-<YOUR_RANDOM_NUM>`).
- Click the Upload button.
- Click Add files and select any small text or image file from your computer.
- Click the Upload button at the bottom of the screen.
- Click Close once the green success banner appears.
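As an optional integrity check, you can compute the file's MD5 digest locally before uploading. For small, single-part uploads, the ETag that S3 reports for the object equals this hex digest (this does not hold for multipart uploads). A sketch, assuming GNU coreutils' `md5sum` is installed:

```shell
# Recreate the sample file and compute its MD5 digest locally.
# For small single-part uploads, S3's ETag matches this hex digest.
echo "This is my highly valuable corporate data." > valuable_data.txt
md5sum valuable_data.txt | cut -d' ' -f1
```

Compare the output against the object's ETag in the console (Object overview) or via `aws s3api head-object`.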
Step 3: Create a Lifecycle Rule for Archiving
Data that isn't accessed frequently shouldn't sit in standard storage. We will create a rule that automatically moves data to S3 Glacier after 30 days.
```shell
# 1. Create a JSON file defining the lifecycle rule
echo '{
  "Rules": [
    {
      "ID": "ArchiveToGlacier",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}' > lifecycle.json

# 2. Apply the lifecycle rule to your bucket
aws s3api put-bucket-lifecycle-configuration \
  --bucket brainybee-lab-storage-<YOUR_RANDOM_NUM> \
  --lifecycle-configuration file://lifecycle.json
```

▶ Console alternative

- In your bucket, click the Management tab.
- Under Lifecycle rules, click Create lifecycle rule.
- Rule name: `ArchiveToGlacier`.
- Choose Apply to all objects in the bucket and check the acknowledgment box.
- Under Lifecycle rule actions, select Move current versions of objects between storage classes.
- In the transition dropdown, select Glacier Flexible Retrieval and set Days after object creation to `30`.
- Click Create rule.
📸 Screenshot: The Lifecycle rules table displaying the active "ArchiveToGlacier" rule.
Checkpoints
Verify that your configuration was successful by running the following checkpoint commands.
Checkpoint 1: Verify the object upload (After Step 2)
```shell
aws s3 ls s3://brainybee-lab-storage-<YOUR_RANDOM_NUM>/
```

Expected Result: You should see a timestamp, the file size, and the filename (`valuable_data.txt`) output in the terminal.
Checkpoint 2: Verify the Lifecycle Rule (After Step 3)
```shell
aws s3api get-bucket-lifecycle-configuration \
  --bucket brainybee-lab-storage-<YOUR_RANDOM_NUM>
```

Expected Result: You should receive a JSON response showing the rule ID `ArchiveToGlacier`, confirming that files transition to `GLACIER` after 30 days.
Teardown
> [!WARNING]
> Remember to run the teardown commands to avoid ongoing charges. You must empty an S3 bucket before AWS allows you to delete it.
Clean up your environment by executing the following terminal commands:
```shell
# 1. Delete all objects inside the bucket
aws s3 rm s3://brainybee-lab-storage-<YOUR_RANDOM_NUM> --recursive

# 2. Delete the bucket itself
aws s3api delete-bucket \
  --bucket brainybee-lab-storage-<YOUR_RANDOM_NUM> \
  --region us-east-1

# 3. Clean up local files
rm valuable_data.txt lifecycle.json
```

Verify the deletion by ensuring the following command returns a `NoSuchBucket` error:

```shell
aws s3 ls s3://brainybee-lab-storage-<YOUR_RANDOM_NUM>
```

Troubleshooting
| Common Error Message | Likely Cause | Solution |
|---|---|---|
| `BucketAlreadyExists` | S3 bucket names must be globally unique; someone else is using this name. | Change `<YOUR_RANDOM_NUM>` to a different, more random string and try again. |
| `AccessDenied` | Your AWS CLI user lacks the necessary S3 permissions. | Ensure your IAM user has the `AmazonS3FullAccess` policy attached in the AWS IAM Console. |
| `BucketNotEmpty` (during teardown) | You attempted to delete a bucket that still contains objects. | Run the `aws s3 rm ... --recursive` command first to clear the bucket before deleting it. |
| `InvalidRequest` on lifecycle rule | The lifecycle configuration JSON is malformed or uses an unsupported value. | Check the JSON syntax (matching braces, valid `Days` and `StorageClass` values). Transitions to Glacier Flexible Retrieval can occur as early as day 0; this lab uses the common 30-day practice. |
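If the `InvalidRequest` error persists, a quick way to rule out syntax problems is to validate `lifecycle.json` locally before re-applying it. A sketch assuming `python3` is available (it recreates the file from Step 3 and uses the standard-library `json.tool`, which exits non-zero on malformed JSON):

```shell
# Recreate the lifecycle rule from Step 3 and validate its JSON syntax locally.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "ArchiveToGlacier",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
EOF
python3 -m json.tool lifecycle.json > /dev/null && echo "lifecycle.json is valid JSON"
```

If validation passes but the API still rejects the configuration, the problem is a value (e.g., an unsupported `StorageClass` string) rather than the JSON syntax.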