
Hands-On Lab: Exploring Foundation Model Design Considerations with Amazon Bedrock

Design considerations for applications that use foundation models (FMs)


  • Estimated Time: 30 minutes
  • Difficulty: Guided
  • Cloud Provider: AWS

Foundation models (FMs) act as sophisticated universal translators that can understand and generate human-like text, code, and multimodal content. In this lab, we will explore the practical design considerations of using FMs—specifically how model selection, prompt design, and inference parameters (like temperature and max tokens) impact output, cost, and latency.


Prerequisites

Before starting this lab, ensure you have the following:

  • AWS Account: An active AWS account with Administrator or PowerUser permissions.
  • AWS CLI: Installed and configured locally with your credentials (aws configure).
  • Familiarity: Basic understanding of JSON and terminal/command-line operations.
  • Region Selection: Use us-east-1 (N. Virginia) or us-west-2 (Oregon) as Amazon Bedrock model availability is highest in these regions.

Learning Objectives

By the end of this lab, you will be able to:

  1. Request and manage Foundation Model access within Amazon Bedrock.
  2. Invoke an FM using the AWS CLI and the AWS Management Console.
  3. Modify inference parameters (e.g., temperature, input/output length) and observe their effect on model responses.
  4. Evaluate design considerations regarding token-based pricing, latency, and response accuracy.

Architecture Overview

The architecture for this lab is straightforward, focusing on the interaction between a client interface and the managed Amazon Bedrock service.


Step-by-Step Instructions

Step 1: Request Model Access in Amazon Bedrock

By default, access to Foundation Models in Amazon Bedrock is not enabled. You must explicitly request access, which represents a key compliance and governance design consideration.

```bash
# Check currently available models in your region
aws bedrock list-foundation-models \
  --query "modelSummaries[*].[modelId, modelName]" \
  --output table
```
Console alternative: Requesting Access
  1. Log in to the AWS Management Console.
  2. Navigate to Amazon Bedrock.
  3. In the left navigation pane, scroll down and click on Model access.
  4. Click the Manage model access button (top right).
  5. Check the box next to Titan Text G1 - Lite (under Amazon).
  6. Scroll to the bottom and click Save changes.
  7. Wait a few moments until the Access status changes to Access granted.

📸 Screenshot: Checkbox selected next to "Titan Text G1 - Lite" with "Access granted" badge.

[!IMPORTANT] For granting model access, use the Console rather than the CLI: AWS requires accepting End User License Agreements (EULAs) for certain third-party models, and this cannot easily be done via the CLI.

Step 2: Invoke a Model with Default Parameters

Now that you have access, let's invoke the amazon.titan-text-lite-v1 model. We will ask it to explain a complex topic to evaluate its default behavior.

Create a file named payload-default.json with the following content:

```bash
cat <<EOF > payload-default.json
{
  "inputText": "Explain the business value of Generative AI in two sentences.",
  "textGenerationConfig": {
    "maxTokenCount": 100,
    "temperature": 0.7
  }
}
EOF
```

Now, invoke the model using the Bedrock Runtime:

```bash
aws bedrock-runtime invoke-model \
  --model-id amazon.titan-text-lite-v1 \
  --body fileb://payload-default.json \
  --cli-binary-format raw-in-base64-out \
  --accept "application/json" \
  --content-type "application/json" \
  output-default.txt

# View the result
cat output-default.txt
```
Console alternative: Bedrock Playground
  1. In the Amazon Bedrock console, navigate to Playgrounds > Text.
  2. Click Select model, choose Amazon, and then select Titan Text G1 - Lite.
  3. Click Apply.
  4. In the chat box, type: "Explain the business value of Generative AI in two sentences."
  5. Click Run and observe the output.

📸 Screenshot: The Bedrock Text Playground showing the prompt and the model's generated response.
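The raw response saved to output-default.txt is plain JSON. The sketch below (assuming the Titan text response schema, where the generated text lives at results[0].outputText) shows how to pull out just the text; a sample file stands in for a real response so nothing here calls AWS:

```bash
# Sample Titan-style response (assumed schema) standing in for output-default.txt
cat <<'EOF' > sample-output.json
{"inputTextTokenCount": 12, "results": [{"tokenCount": 42, "outputText": "Generative AI accelerates content creation.", "completionReason": "FINISH"}]}
EOF

# Extract just the generated text (jq -r '.results[0].outputText' works too)
python3 -c "import json; print(json.load(open('sample-output.json'))['results'][0]['outputText'])"
```

Point the same one-liner at output-default.txt to clean up the real invocation result.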

Step 3: Observe the Effect of Inference Parameters (Temperature)

Temperature controls the randomness (or creativity) of the model. A lower temperature (e.g., 0.0) produces deterministic, factual answers. A higher temperature (e.g., 0.9) produces more creative but potentially unpredictable answers (increasing the risk of hallucinations).

Create a new payload with a temperature of 0.0:

```bash
cat <<EOF > payload-strict.json
{
  "inputText": "Explain the business value of Generative AI in two sentences.",
  "textGenerationConfig": {
    "maxTokenCount": 100,
    "temperature": 0.0
  }
}
EOF

aws bedrock-runtime invoke-model \
  --model-id amazon.titan-text-lite-v1 \
  --body fileb://payload-strict.json \
  --cli-binary-format raw-in-base64-out \
  --accept "application/json" \
  --content-type "application/json" \
  output-strict.txt

cat output-strict.txt
```

[!TIP] Compare output-default.txt and output-strict.txt. If you run the 0.0 temperature prompt multiple times, the output will remain nearly identical. This consistency is a critical design consideration for enterprise applications like customer support bots.
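To extend the comparison, you can generate one payload per temperature in a loop; this is a local sketch only (the payload-temp-* file names are illustrative, and no AWS calls are made):

```bash
# Generate payload files across a range of temperatures (local files only)
for temp in 0.0 0.5 0.9; do
  cat <<EOF > "payload-temp-${temp}.json"
{
  "inputText": "Explain the business value of Generative AI in two sentences.",
  "textGenerationConfig": { "maxTokenCount": 100, "temperature": ${temp} }
}
EOF
done

ls payload-temp-*.json
```

Invoking the model once per file, then diffing the outputs, makes the determinism of temperature 0.0 easy to see.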

Step 4: Evaluate Output Length and Cost Trade-offs

Generative AI models are billed based on token consumption (input tokens + output tokens). If you set maxTokenCount too high, a verbose model might generate unnecessarily long answers, driving up costs and latency.

Let's constrain the model to a very short token count:

```bash
cat <<EOF > payload-short.json
{
  "inputText": "Explain the business value of Generative AI in two sentences.",
  "textGenerationConfig": {
    "maxTokenCount": 20,
    "temperature": 0.5
  }
}
EOF

aws bedrock-runtime invoke-model \
  --model-id amazon.titan-text-lite-v1 \
  --body fileb://payload-short.json \
  --cli-binary-format raw-in-base64-out \
  --accept "application/json" \
  --content-type "application/json" \
  output-short.txt

cat output-short.txt
```

[!NOTE] Notice how the response in output-short.txt is likely cut off mid-sentence. When designing applications, you must balance cost (lower max tokens) with performance requirements (ensuring complete, coherent answers).
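The cost trade-off can be roughed out locally. This sketch assumes the common approximation of ~4 characters per token (actual tokenization varies by model) and the Titan Text Lite input rate listed in the Cost Estimate section of this lab:

```bash
# Rough token count for the lab prompt (~4 characters per token, illustrative only)
PROMPT="Explain the business value of Generative AI in two sentences."
CHARS=${#PROMPT}
TOKENS=$(( (CHARS + 3) / 4 ))

# Apply the Titan Text Lite input rate of ~$0.0003 per 1,000 tokens
awk -v t="$TOKENS" 'BEGIN { printf "~%d tokens -> ~$%.7f input cost\n", t, t * 0.0003 / 1000 }'
```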


Checkpoints

Use these commands to verify your progress:

Checkpoint 1: Verify Payload Creation

```bash
ls -l payload-*.json
# Expected result: payload-default.json, payload-strict.json, and payload-short.json.
```

Checkpoint 2: Verify Successful Invocations

```bash
grep -q "results" output-short.txt
# Exit code 0 means your JSON response was successfully captured.
```

Concept Review: Customization vs. Prompting

As you explored inference parameters, remember that modifying inputs is just one way to control FMs. Here is a brief comparison of how you might adapt models for business applications:

| Approach | Cost/Effort | Use Case | Example |
| --- | --- | --- | --- |
| Prompt Engineering | Low | Formatting outputs, basic context | Zero-shot, few-shot learning |
| RAG (Retrieval-Augmented Generation) | Medium | Providing up-to-date, domain-specific facts to prevent hallucinations | Amazon Bedrock Knowledge Bases |
| Fine-Tuning | High | Adapting tone, style, or highly specialized domain language | Instruction tuning a model for medical terminology |
| Pre-training | Very High | Creating a brand-new foundation model from scratch | Training a new multilingual model |
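To make the Prompt Engineering row concrete, here is a hedged sketch of a few-shot prompt packaged as a Titan payload; the sentiment task and file name are illustrative:

```bash
# Few-shot prompt: the worked examples in inputText teach the pattern without any fine-tuning
cat <<'EOF' > payload-fewshot.json
{
  "inputText": "Classify the sentiment.\nReview: Great service! -> Positive\nReview: Far too slow. -> Negative\nReview: Setup was painless. ->",
  "textGenerationConfig": { "maxTokenCount": 5, "temperature": 0.0 }
}
EOF

# Sanity-check that the payload is valid JSON before invoking
python3 -m json.tool payload-fewshot.json > /dev/null && echo "payload-fewshot.json is valid"
```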

Teardown

[!WARNING] While keeping Model Access enabled in Bedrock does not incur ongoing charges, generating tokens does. Clean up your local environment to ensure no sensitive data or credentials are left behind.

Run the following commands to delete the local artifacts generated during this lab:

```bash
# Remove all generated payload and output files
rm payload-default.json payload-strict.json payload-short.json
rm output-default.txt output-strict.txt output-short.txt
echo "Cleanup complete!"
```
Console alternative: Revoking Model Access

If you wish to cleanly revoke access:

  1. Navigate back to Amazon Bedrock > Model access.
  2. Click Manage model access.
  3. Uncheck Titan Text G1 - Lite.
  4. Click Save changes.

Troubleshooting

| Common Error | Cause | Fix |
| --- | --- | --- |
| AccessDeniedException | You have not requested access to the specific model in the Bedrock console. | Go to the Bedrock Console -> Model access, and explicitly enable the model. |
| ValidationException | The JSON payload structure is incorrect for the chosen model. | Different models (e.g., Claude vs. Titan) require different JSON schemas. Check the Bedrock documentation for the specific model's payload structure. |
| UnrecognizedClientException | The AWS CLI is not configured, or credentials have expired. | Run aws configure and provide valid access keys. |
| ThrottlingException | Too many requests were sent to the model in a short period. | Wait a few seconds and try the invocation again. |
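For ThrottlingException in scripts, a small retry wrapper with exponential backoff is a common mitigation. This is a hypothetical shell helper, not a built-in AWS CLI feature:

```bash
# Retry a command with exponential backoff (2s, then 4s) before giving up
invoke_with_retry() {
  attempt=1
  delay=2
  until "$@"; do
    if [ "$attempt" -ge 3 ]; then
      return 1              # exhausted all attempts
    fi
    sleep "$delay"          # back off before retrying
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}

# Usage (any command works):
# invoke_with_retry aws bedrock-runtime invoke-model ... output.txt
```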

Stretch Challenge

Challenge: Try invoking a different model, such as Anthropic's anthropic.claude-v2 or anthropic.claude-3-haiku-20240307-v1:0 (if available in your region).

Constraint: You will need to research the specific JSON payload format required by Anthropic models, as it differs from Amazon Titan's. Attempt to prompt the model to adopt a specific "persona" (e.g., "You are a helpful AWS Cloud Architect...") using prompt engineering techniques.

Show Solution
```bash
# Note: Ensure you have requested access to Anthropic Claude in the Console first.
cat <<EOF > claude-payload.json
{
  "anthropic_version": "bedrock-2023-05-31",
  "max_tokens": 200,
  "messages": [
    {
      "role": "user",
      "content": "You are a helpful AWS Cloud Architect. Explain RAG to a beginner."
    }
  ],
  "temperature": 0.5
}
EOF

aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-haiku-20240307-v1:0 \
  --body fileb://claude-payload.json \
  --cli-binary-format raw-in-base64-out \
  --accept "application/json" \
  --content-type "application/json" \
  claude-output.txt

cat claude-output.txt
```

Cost Estimate

Amazon Bedrock charges based on tokens processed (both input and output).

  • Amazon Titan Text Lite: ~$0.0003 per 1,000 input tokens / ~$0.0004 per 1,000 output tokens.
  • The prompts and responses in this lab consist of less than 500 tokens total.
  • Total estimated cost for this lab: < $0.01 (Virtually free).
  • Note: There are no hourly provisioning charges unless you are using Provisioned Throughput, which we did not use in this lab (we used On-Demand).
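As a quick check on the bullet math, assume a 500-token exchange split evenly between input and output:

```bash
# Worked example at the listed Titan Text Lite rates:
#   250 input tokens  * $0.0003 / 1,000 = $0.0000750
#   250 output tokens * $0.0004 / 1,000 = $0.0001000
awk 'BEGIN {
  total = 250 / 1000 * 0.0003 + 250 / 1000 * 0.0004
  printf "Total for a 500-token exchange: $%.7f\n", total
}'
```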
