Hands-On Lab: Exploring Specialized AWS AI Services
This guided lab provides hands-on experience with AWS's specialized AI services, which are purpose-built to integrate powerful machine learning capabilities—like natural language processing (NLP), text-to-speech, and computer vision—into applications without requiring deep data science expertise.
Prerequisites
Before starting this lab, ensure you have the following:
- AWS Account: An active AWS account with billing enabled (this lab is Free-Tier eligible).
- IAM Permissions: An IAM user or role with permissions for `comprehend:*`, `polly:*`, `rekognition:*`, and `s3:*`.
- AWS CLI: Installed and configured locally (`aws configure`).
- Prior Knowledge: Basic familiarity with terminal/command prompt navigation and JSON data structures.
Learning Objectives
By completing this lab, you will be able to:
- Extract sentiment and entities from unstructured text using Amazon Comprehend.
- Synthesize lifelike speech from text strings using Amazon Polly.
- Detect labels and objects in an image using Amazon Rekognition.
- Navigate between CLI and Console to interact with managed AI services.
Architecture Overview
Data flows from your local terminal (via the AWS CLI) to each managed AI service's API endpoint. Comprehend and Polly accept text directly in the request, while Rekognition reads the image from an S3 bucket. Each service returns JSON results, except Polly, which returns an MP3 audio stream.
AI Service Taxonomy
AWS organizes its specialized AI services by modality. The services we will use today map as follows:
- Language (NLP): Amazon Comprehend — sentiment, entities, and key phrases from text.
- Speech: Amazon Polly — text-to-speech synthesis.
- Vision: Amazon Rekognition — label and object detection in images.
Step-by-Step Instructions
Step 1: Analyze Text Sentiment with Amazon Comprehend
Amazon Comprehend uses natural language processing (NLP) to extract insights from text. In this step, we will analyze the sentiment of a customer review.
```shell
aws comprehend detect-sentiment \
  --region us-east-1 \
  --language-code en \
  --text "I absolutely love the new specialized AI services from AWS, they made my project a breeze!"
```

> [!TIP]
> Notice how the API requires a `--language-code`. Comprehend supports multiple languages, but you must specify which one it is analyzing, or use the `detect-dominant-language` API first.
📸 Console alternative
- Log into the AWS Management Console.
- Search for Amazon Comprehend.
- On the left navigation pane, choose Real-time analysis.
- Scroll down to Input text, paste the sentence above, and click Analyze.
- View the Sentiment tab in the results pane.
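In an application you would usually read the `Sentiment` and `SentimentScore` fields programmatically. The sketch below parses a response shaped like the `detect-sentiment` output above; the score values are illustrative placeholders, not captured from a live call.

```python
import json

# A response shaped like Comprehend's detect-sentiment output
# (scores here are illustrative, not from a live API call).
sample_response = json.loads("""
{
  "Sentiment": "POSITIVE",
  "SentimentScore": {
    "Positive": 0.98,
    "Negative": 0.01,
    "Neutral": 0.01,
    "Mixed": 0.0
  }
}
""")

def dominant_sentiment(response):
    """Return the sentiment label and the score backing it."""
    label = response["Sentiment"]
    # Score keys are title-case ("Positive"), labels upper-case ("POSITIVE").
    score = response["SentimentScore"][label.capitalize()]
    return label, score

label, score = dominant_sentiment(sample_response)
print(f"{label} ({score:.0%})")
```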
Step 2: Extract Entities with Amazon Comprehend
Beyond sentiment, Comprehend can identify entities like Organizations, Locations, and Persons from unstructured text.
```shell
aws comprehend detect-entities \
  --region us-east-1 \
  --language-code en \
  --text "Jeff Bezos founded Amazon in Seattle, Washington in 1994."
```

> [!NOTE]
> The output is a JSON array of `Entities`. Look at the `Type` field for each entity to see how the model classified "Amazon" (ORGANIZATION) and "Seattle" (LOCATION).
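A common follow-up is to group the detected entities by `Type` and drop low-confidence hits. Here is a minimal sketch operating on a list shaped like the `Entities` array; the entity scores are illustrative assumptions, not live output.

```python
from collections import defaultdict

# Entities shaped like Comprehend's detect-entities output
# (scores are illustrative, not from a live API call).
sample_entities = [
    {"Text": "Jeff Bezos", "Type": "PERSON", "Score": 0.99},
    {"Text": "Amazon", "Type": "ORGANIZATION", "Score": 0.97},
    {"Text": "Seattle, Washington", "Type": "LOCATION", "Score": 0.98},
    {"Text": "1994", "Type": "DATE", "Score": 0.99},
]

def group_entities(entities, min_score=0.9):
    """Group entity text by Type, keeping only confident detections."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)

print(group_entities(sample_entities))
```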
Step 3: Synthesize Speech with Amazon Polly
Amazon Polly converts text into lifelike speech. We will synthesize an MP3 audio file using a neural voice named "Joanna".
```shell
aws polly synthesize-speech \
  --region us-east-1 \
  --output-format mp3 \
  --voice-id Joanna \
  --engine neural \
  --text "Welcome to the AWS Certified AI Practitioner lab. You are doing great!" \
  speech_output.mp3
```

> [!TIP]
> The `--engine neural` flag produces higher-quality, more natural-sounding speech compared to the standard engine. The resulting `speech_output.mp3` file will be saved in your current working directory.
📸 Console alternative
- Navigate to Amazon Polly in the AWS Console.
- Select Text-to-Speech.
- Under Engine, select Neural.
- Select Joanna as the voice.
- Paste the text into the text box and click Listen or Download MP3.
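Polly caps the amount of text accepted by a single `synthesize-speech` request (on the order of a few thousand characters). For longer scripts you can split the text at sentence boundaries and make one call per chunk. The sketch below shows only the splitting logic; the `max_chars` value is an illustrative assumption, not the exact service quota.

```python
def chunk_sentences(text, max_chars=2500):
    """Split text into chunks of whole sentences, each <= max_chars."""
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if len(candidate) > max_chars and current:
            # Current chunk is full; start a new one with this sentence.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# A small max_chars to demonstrate the splitting behavior:
chunks = chunk_sentences("First sentence. Second sentence. Third sentence.", max_chars=20)
print(chunks)
```

Each returned chunk could then be passed as the `--text` argument of a separate `synthesize-speech` call.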
Step 4: Prepare an Image for Computer Vision
To use Amazon Rekognition, we first need an image. We will create an S3 bucket and upload a sample image containing common objects (like a city street or a park).
First, create a bucket (replace <YOUR_ACCOUNT_ID> with your 12-digit AWS account ID, or any unique suffix, to ensure global uniqueness):
```shell
aws s3 mb s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID> --region us-east-1
```

Next, download a sample image from the internet (or use one from your computer) and upload it to the bucket:

```shell
# Example command to upload a local image named 'sample.jpg'
aws s3 cp sample.jpg s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>/sample.jpg
```

Step 5: Detect Image Labels with Amazon Rekognition
Now, we'll ask Rekognition to analyze the image stored in S3 and return the top 5 labels (objects, scenes, or concepts) it detects.
```shell
aws rekognition detect-labels \
  --region us-east-1 \
  --image '{"S3Object":{"Bucket":"brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>","Name":"sample.jpg"}}' \
  --max-labels 5
```

> [!TIP]
> Review the JSON output. Each label includes a `Confidence` score (0-100). In production applications, you typically set a confidence threshold (e.g., > 90%) before acting on a label.
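Applying such a threshold in code is a one-liner. The sketch below filters a list shaped like the `Labels` array from `detect-labels`; the label names and scores are illustrative, not from a live call.

```python
# Labels shaped like Rekognition's detect-labels output
# (names and scores are illustrative, not from a live API call).
sample_labels = [
    {"Name": "Car", "Confidence": 99.2},
    {"Name": "Street", "Confidence": 95.7},
    {"Name": "Person", "Confidence": 91.4},
    {"Name": "Bicycle", "Confidence": 72.3},
]

def confident_labels(labels, threshold=90.0):
    """Keep only label names at or above the confidence threshold."""
    return [label["Name"] for label in labels if label["Confidence"] >= threshold]

print(confident_labels(sample_labels))  # Bicycle falls below the 90% threshold
```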
📸 Console alternative
- Navigate to Amazon Rekognition in the Console.
- In the left menu, select Label detection.
- Click Upload image and select your `sample.jpg` file.
- Expand the Results pane to see the detected labels and their confidence scores.
Checkpoints
Verify your progress by running the following validation steps:
- Checkpoint 1 (Comprehend): Does the JSON output from Step 1 contain `"Sentiment": "POSITIVE"`?
- Checkpoint 2 (Polly): Run `ls -l speech_output.mp3` (Mac/Linux) or `dir speech_output.mp3` (Windows). Do you see a file size greater than 0 bytes? Play the file using your computer's media player.
- Checkpoint 3 (Rekognition): Did Step 5 return an array of `"Labels"` with `"Confidence"` scores?
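Checkpoint 2 can also be scripted. A minimal sketch of the "file exists and is non-empty" check:

```python
import os

def file_is_nonempty(path):
    """True if the path exists and contains at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0

# After Step 3 this should print True in the directory
# where speech_output.mp3 was saved:
print(file_is_nonempty("speech_output.mp3"))
```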
Teardown / Clean-Up
> [!WARNING]
> Remember to run the teardown commands to avoid ongoing charges. While S3 storage is cheap, leaving resources running is a bad practice.
Execute the following commands to delete the resources provisioned in this lab:
```shell
# 1. Delete the image from S3
aws s3 rm s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>/sample.jpg

# 2. Delete the S3 bucket
aws s3 rb s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID> --force

# 3. Delete the local MP3 file
rm speech_output.mp3  # On Windows use: del speech_output.mp3
```

Troubleshooting
| Error / Issue | Probable Cause | Fix / Solution |
|---|---|---|
| `AccessDeniedException` | The IAM user lacks permissions for the specific service. | Attach the relevant managed policies (e.g., `AmazonRekognitionFullAccess`, `ComprehendFullAccess`, `AmazonPollyFullAccess`) to your IAM user. |
| `BucketAlreadyExists` | S3 bucket names must be globally unique across all AWS accounts. | Add more random numbers or your name to the end of the bucket name in Step 4. |
| `InvalidParameterValue` in Polly | The requested voice or engine is not available in your region. | Ensure you are using `--region us-east-1` and a valid `VoiceId` like `Joanna`. |
| `InvalidS3ObjectException` | Rekognition cannot find the image in the specified S3 bucket. | Double-check the exact bucket name and image name in the Step 5 JSON payload. |
Cost Estimate
For a single run-through of this lab, the estimated cost is well within the AWS Free Tier. If you do not have Free Tier eligibility, the costs are approximately:
| Service | Usage in this Lab | Estimated Cost |
|---|---|---|
| Amazon Comprehend | 2 API requests (2 units) | $0.0002 |
| Amazon Polly | < 100 characters synthesized | $0.0004 |
| Amazon Rekognition | 1 image analyzed | $0.0010 |
| Amazon S3 | < 1 MB stored for a few minutes | $0.0000 |
| Total Estimated Spend | | < $0.01 |
Concept Review
To solidify your understanding for the AWS Certified AI Practitioner (AIF-C01) exam, consider how these services fit into the broader ML ecosystem:
- Specialized AI Services (Comprehend, Polly, Rekognition) provide pre-trained models accessible via API. They require no machine learning expertise to use.
- Amazon SageMaker is used for the custom ML build-train-deploy lifecycle. You would use SageMaker if you needed to build a custom computer vision model from scratch, rather than using Rekognition's pre-trained labels.
- Generative AI Services (Amazon Bedrock, Amazon Q) are used for creating new content, whereas specialized AI services are primarily used for analyzing or translating existing content.