Hands-On Lab: Training and Fine-Tuning Foundation Models on AWS
This hands-on lab walks you through the training and fine-tuning process for foundation models (FMs) on AWS.
Prerequisites
Before starting this lab, ensure you have the following prerequisites in place:
- Cloud Account: An AWS Account with Administrator access.
- CLI Tools: AWS CLI installed and configured (`aws configure`) with your access keys.
- Model Access: Amazon Titan Text Lite model access enabled in the Amazon Bedrock console (under Model access).
- Knowledge: Basic understanding of JSON structures and foundational AI concepts (Pre-training vs. Fine-tuning).
Learning Objectives
By completing this lab, you will be able to:
- Prepare and format a dataset for instruction fine-tuning (JSONL format).
- Upload training data to an Amazon S3 bucket using the AWS CLI.
- Configure and launch a model customization (fine-tuning) job in Amazon Bedrock.
- Understand the deployment phase of a fine-tuned model (Provisioned Throughput) and associated cost considerations.
Architecture Overview
The following diagram illustrates the flow of data and services used in this lab to fine-tune a Foundation Model (FM).
To understand where fine-tuning fits into model customization, keep the key distinction in mind: fine-tuning updates the model's weights using your labeled examples, while retrieval-based approaches (such as RAG) leave the weights unchanged and supply external context at inference time.
Step-by-Step Instructions
Step 1: Create an Amazon S3 Bucket for Training Data
First, we need a secure storage location for our fine-tuning dataset. We will create an S3 bucket.
```shell
aws s3 mb s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID> --region us-east-1
```

▶💻 Console alternative
- Navigate to the S3 Console.
- Click Create bucket.
- Enter the bucket name `brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>`.
- Leave all other settings as default and click Create bucket.
> [!TIP]
> S3 bucket names must be globally unique. Replace `<YOUR_ACCOUNT_ID>` with your actual AWS account number or a random string.
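Because bucket names must be globally unique and follow strict naming rules, it can help to sanity-check a candidate name locally before calling `aws s3 mb`. The following is a minimal sketch (the helper is ours, not an AWS API, and the regex covers only the common rules: 3-63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit):

```python
import re

# Sketch of an S3 bucket-name pre-check. Not exhaustive: it skips the
# "no IP-address-shaped names" and dotted-name rules, and it cannot
# check global uniqueness (only a real CreateBucket call can).
BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")

def looks_like_valid_bucket_name(name):
    return bool(BUCKET_RE.match(name))

print(looks_like_valid_bucket_name("brainybee-lab-finetuning-123456789012"))  # True
print(looks_like_valid_bucket_name("BrainyBee_Bucket"))  # False: uppercase and underscore
```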
Step 2: Prepare the Fine-Tuning Dataset
Instruction fine-tuning requires data formatted in JSON Lines (.jsonl). Each line represents a single training example with a prompt and the expected completion.
Create a file named training-data.jsonl and add the following sample content:
```shell
cat <<EOF > training-data.jsonl
{"prompt": "Classify this ticket: The billing page is returning a 500 error.", "completion": "Category: Technical Support | Priority: High"}
{"prompt": "Classify this ticket: I want to upgrade my subscription to the premium tier.", "completion": "Category: Sales | Priority: Medium"}
{"prompt": "Classify this ticket: How do I change my profile picture?", "completion": "Category: General Inquiry | Priority: Low"}
{"prompt": "Classify this ticket: The database is down and no one can log in!", "completion": "Category: Technical Support | Priority: Critical"}
EOF
```
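Bedrock rejects customization jobs whose dataset contains malformed lines, so it is worth validating the file before upload. A hedged Python sketch (the helper name is ours, not an AWS API):

```python
import json

# Each line must be a JSON object with exactly these keys for
# prompt/completion fine-tuning.
REQUIRED_KEYS = {"prompt", "completion"}

def count_valid_records(lines):
    """Count records that are JSON objects with exactly the required keys."""
    count = 0
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {n}: not valid JSON ({exc})") from exc
        if set(record) != REQUIRED_KEYS:
            raise ValueError(f"line {n}: keys must be exactly {sorted(REQUIRED_KEYS)}")
        count += 1
    return count

sample = ['{"prompt": "Classify this ticket: App is slow.", "completion": "Category: Technical Support | Priority: Medium"}']
print(count_valid_records(sample))  # 1
```

To check the lab file itself, pass `open("training-data.jsonl")` to the function; it raises `ValueError` naming the first bad line.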
Upload this file to your S3 bucket:

```shell
aws s3 cp training-data.jsonl s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/data/training-data.jsonl
```

Step 3: Create the Fine-Tuning Job in Amazon Bedrock
We will now instruct Bedrock to train a custom model using our data. While this can be done via CLI, the Console is highly recommended for beginners as it automatically provisions the necessary IAM roles.
If using the CLI, you must first create an IAM trust policy and role. For this step, we will use the console path.
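For reference, if you later take the CLI or SDK path, the service role you create must trust Bedrock. Here is a sketch of that trust policy as a Python dict (the role name and the commented boto3 call are illustrative; the console wizard creates an equivalent role for you):

```python
import json

# Trust policy allowing the Bedrock service to assume the fine-tuning role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}
print(json.dumps(trust_policy, indent=2))

# To create the role (requires AWS credentials; role name is a placeholder):
# import boto3
# boto3.client("iam").create_role(
#     RoleName="BedrockFineTuningRole",
#     AssumeRolePolicyDocument=json.dumps(trust_policy),
# )
```

The role also needs `s3:GetObject`/`s3:PutObject` permissions on the training and output buckets.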
▶💻 Console Instructions (Recommended)
- Navigate to the Amazon Bedrock Console.
- In the left navigation pane, under Foundation models, select Custom models.
- Click Customize model > Create Fine-tuning job.
- Model details:
- Base model: Select Amazon Titan Text Lite.
- Custom model name: `ticket-classifier-model`.
- Job configuration:
- Job name: `ticket-classifier-job`.
- Input data: Provide the S3 URI `s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/data/training-data.jsonl`.
- Hyperparameters: Leave defaults (Epochs, Batch size, Learning rate).
- Output data: Specify the same S3 bucket: `s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/output/`.
- Service access: Select Create and use a new service role.
- Click Create model customization job.
📸 Screenshot Placeholder: Amazon Bedrock Model Customization Job configuration screen.
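The console steps above can also be expressed programmatically. A hedged boto3 sketch follows; the parameter names follow the `bedrock` client, the role ARN is a placeholder you must supply, and the base model identifier assumes the standard Titan Text Lite model ID:

```python
# Request parameters for a Bedrock model customization (fine-tuning) job.
# <YOUR_ACCOUNT_ID> and <YOUR_BEDROCK_ROLE> are placeholders.
params = {
    "jobName": "ticket-classifier-job",
    "customModelName": "ticket-classifier-model",
    "roleArn": "arn:aws:iam::<YOUR_ACCOUNT_ID>:role/<YOUR_BEDROCK_ROLE>",
    "baseModelIdentifier": "amazon.titan-text-lite-v1",
    "trainingDataConfig": {
        "s3Uri": "s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/data/training-data.jsonl"
    },
    "outputDataConfig": {
        "s3Uri": "s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/output/"
    },
}
print(sorted(params))

# To submit (requires credentials, model access, and a valid role ARN):
# import boto3
# boto3.client("bedrock", region_name="us-east-1").create_model_customization_job(**params)
```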
Step 4: Monitor the Customization Job
Fine-tuning takes time (typically 30-60 minutes for small datasets). You can monitor the status using the CLI.
```shell
aws bedrock list-model-customization-jobs --query "modelCustomizationJobSummaries[0].[jobName, status]"
```

> [!NOTE]
> Wait until the status changes from `InProgress` to `Completed` before moving to the next step.
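Rather than re-running the CLI command by hand, you can poll in a loop. A generic sketch follows; `get_status` is injected so the same loop works with a real status call (for example, one that wraps boto3's `get_model_customization_job`) or with a test stub:

```python
import time

def wait_for_job(get_status, poll_seconds=60, max_polls=120):
    """Call get_status() until the job leaves 'InProgress', then return the status."""
    for _ in range(max_polls):
        status = get_status()
        if status != "InProgress":
            return status  # e.g. "Completed" or "Failed"
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling window")

# Example with a stub that completes on the third poll:
states = iter(["InProgress", "InProgress", "Completed"])
print(wait_for_job(lambda: next(states), poll_seconds=0))  # Completed
```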
Step 5: (Optional) Provision Throughput for Inference
> [!WARNING]
> COST ALERT: To query a custom model in Bedrock, you must purchase Provisioned Throughput. This involves an hourly charge and usually requires a 1-month commitment. DO NOT run this step in a personal account unless you are prepared for the cost.
If you are in a provided sandbox environment:
```shell
aws bedrock create-provisioned-model-throughput \
  --provisioned-model-name ticket-classifier-throughput \
  --model-id arn:aws:bedrock:us-east-1:<YOUR_ACCOUNT_ID>:custom-model/ticket-classifier-model \
  --model-units 1
```

Once provisioned, you can test the model in the Bedrock Playground by selecting your Custom Model from the dropdown list.
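Before committing, it helps to estimate what the commitment will cost. A back-of-the-envelope sketch (the hourly rate below is a made-up placeholder, NOT an actual AWS price; look up the current Provisioned Throughput pricing for your model and region):

```python
# Rough monthly cost for Provisioned Throughput: units x hourly rate x hours.
HOURLY_RATE_USD = 20.00   # placeholder rate, not an actual AWS price
MODEL_UNITS = 1
HOURS_PER_MONTH = 24 * 30

monthly_cost = MODEL_UNITS * HOURLY_RATE_USD * HOURS_PER_MONTH
print(f"Estimated monthly cost: ${monthly_cost:,.2f}")  # $14,400.00
```

Even at modest hourly rates, a month-long commitment adds up quickly, which is why the warning above applies.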
Checkpoints
Verify your progress after Step 2 by checking the S3 bucket contents:
```shell
aws s3 ls s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>/data/
```

Expected Output: `... training-data.jsonl`
Verify your progress after Step 4 by checking the custom models list:
```shell
aws bedrock list-custom-models --query "modelSummaries[?modelName=='ticket-classifier-model'].[modelName, creationTime]"
```

Expected Output: An array containing `ticket-classifier-model` and a timestamp.
Clean-Up / Teardown
> [!IMPORTANT]
> Failure to clean up resources, especially Provisioned Throughput, will result in significant ongoing AWS charges.
Execute the following commands to tear down the lab environment:
1. Delete Provisioned Throughput (If created in Step 5):
```shell
aws bedrock delete-provisioned-model-throughput --provisioned-model-id ticket-classifier-throughput
```

2. Delete the Custom Model:
```shell
aws bedrock delete-custom-model --model-identifier arn:aws:bedrock:us-east-1:<YOUR_ACCOUNT_ID>:custom-model/ticket-classifier-model
```

3. Delete the S3 Bucket and its contents:
```shell
aws s3 rm s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID> --recursive
aws s3 rb s3://brainybee-lab-finetuning-<YOUR_ACCOUNT_ID>
```

Troubleshooting
| Common Error | Cause | Fix |
|---|---|---|
| `ValidationException` during job creation | The JSONL format is incorrect or contains invalid keys. | Ensure your dataset strictly uses the `{"prompt": "...", "completion": "..."}` format without trailing commas. |
| `AccessDeniedException` | The IAM role Bedrock is using lacks permissions to read from your S3 bucket. | If using the CLI, ensure the trust policy allows `bedrock.amazonaws.com` to assume the role, and that the role has `s3:GetObject` on the bucket. Using the console wizard fixes this automatically. |
| `ResourceNotFound` | You have not requested access to the base Titan model in Bedrock. | Navigate to Bedrock > Model access, and request access to the Amazon Titan Text models. |
| Job fails after 5 minutes | Dataset is too small for the selected hyperparameters. | Amazon Bedrock requires a minimum number of tokens/lines depending on the model. Ensure you have at least 100-200 valid JSONL rows for real training. |