Hands-On Lab: Exploring Specialized AWS AI Services
This guided lab provides hands-on experience with AWS's specialized AI services, which are purpose-built to integrate powerful machine learning capabilities—like natural language processing (NLP), text-to-speech, and computer vision—into applications without requiring deep data science expertise.
Prerequisites
Before starting this lab, ensure you have the following:
- AWS Account: An active AWS account with billing enabled (this lab is Free-Tier eligible).
- IAM Permissions: An IAM user or role with permissions for `comprehend:*`, `polly:*`, `rekognition:*`, and `s3:*`.
- AWS CLI: Installed and configured locally (`aws configure`).
- Prior Knowledge: Basic familiarity with terminal/command prompt navigation and JSON data structures.
Learning Objectives
By completing this lab, you will be able to:
- Extract sentiment and entities from unstructured text using Amazon Comprehend.
- Synthesize lifelike speech from text strings using Amazon Polly.
- Detect labels and objects in an image using Amazon Rekognition.
- Navigate between CLI and Console to interact with managed AI services.
Architecture Overview
Data flows from your local terminal (via the AWS CLI) to each managed AI service's API endpoint. Comprehend and Polly accept text directly in the request, while Rekognition reads the image from an S3 bucket. Each service returns JSON results, except Polly, which returns an MP3 audio stream.
AI Service Taxonomy
AWS organizes its specialized AI services by modality. The services we will use today map as follows:
- Language (NLP): Amazon Comprehend — sentiment, entities, and key phrases from text.
- Speech: Amazon Polly — text-to-speech synthesis.
- Vision: Amazon Rekognition — label and object detection in images.
Step-by-Step Instructions
Step 1: Analyze Text Sentiment with Amazon Comprehend
Amazon Comprehend uses natural language processing (NLP) to extract insights from text. In this step, we will analyze the sentiment of a customer review.
```shell
aws comprehend detect-sentiment \
  --region us-east-1 \
  --language-code en \
  --text "I absolutely love the new specialized AI services from AWS, they made my project a breeze!"
```

> [!TIP]
> Notice how the API requires a `--language-code`. Comprehend supports multiple languages, but you must specify which one it is analyzing, or use the `detect-dominant-language` API first.
📸 Console alternative
- Log into the AWS Management Console.
- Search for Amazon Comprehend.
- On the left navigation pane, choose Real-time analysis.
- Scroll down to Input text, paste the sentence above, and click Analyze.
- View the Sentiment tab in the results pane.
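In an application you would usually read the `Sentiment` and `SentimentScore` fields programmatically. The sketch below parses a response shaped like the `detect-sentiment` output above; the score values are illustrative placeholders, not captured from a live call.

```python
import json

# A response shaped like Comprehend's detect-sentiment output
# (scores here are illustrative, not from a live API call).
sample_response = json.loads("""
{
  "Sentiment": "POSITIVE",
  "SentimentScore": {
    "Positive": 0.98,
    "Negative": 0.01,
    "Neutral": 0.01,
    "Mixed": 0.0
  }
}
""")

def dominant_sentiment(response):
    """Return the sentiment label and the score backing it."""
    label = response["Sentiment"]
    # Score keys are title-case ("Positive"), labels upper-case ("POSITIVE").
    score = response["SentimentScore"][label.capitalize()]
    return label, score

label, score = dominant_sentiment(sample_response)
print(f"{label} ({score:.0%})")
```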
Step 2: Extract Entities with Amazon Comprehend
Beyond sentiment, Comprehend can identify entities like Organizations, Locations, and Persons from unstructured text.
```shell
aws comprehend detect-entities \
  --region us-east-1 \
  --language-code en \
  --text "Jeff Bezos founded Amazon in Seattle, Washington in 1994."
```

> [!NOTE]
> The output is a JSON array of `Entities`. Look at the `Type` field for each entity to see how the model classified "Amazon" (ORGANIZATION) and "Seattle" (LOCATION).
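A common follow-up is to group the detected entities by `Type` and drop low-confidence hits. Here is a minimal sketch operating on a list shaped like the `Entities` array; the entity scores are illustrative assumptions, not live output.

```python
from collections import defaultdict

# Entities shaped like Comprehend's detect-entities output
# (scores are illustrative, not from a live API call).
sample_entities = [
    {"Text": "Jeff Bezos", "Type": "PERSON", "Score": 0.99},
    {"Text": "Amazon", "Type": "ORGANIZATION", "Score": 0.97},
    {"Text": "Seattle, Washington", "Type": "LOCATION", "Score": 0.98},
    {"Text": "1994", "Type": "DATE", "Score": 0.99},
]

def group_entities(entities, min_score=0.9):
    """Group entity text by Type, keeping only confident detections."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)

print(group_entities(sample_entities))
```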
Step 3: Synthesize Speech with Amazon Polly
Amazon Polly converts text into lifelike speech. We will synthesize an MP3 audio file using a neural voice named "Joanna".
```shell
aws polly synthesize-speech \
  --region us-east-1 \
  --output-format mp3 \
  --voice-id Joanna \
  --engine neural \
  --text "Welcome to the AWS Certified AI Practitioner lab. You are doing great!" \
  speech_output.mp3
```

> [!TIP]
> The `--engine neural` flag produces higher-quality, more natural-sounding speech compared to the standard engine. The resulting `speech_output.mp3` file will be saved in your current working directory.
📸 Console alternative
- Navigate to Amazon Polly in the AWS Console.
- Select Text-to-Speech.
- Under Engine, select Neural.
- Select Joanna as the voice.
- Paste the text into the text box and click Listen or Download MP3.
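Polly caps the amount of text accepted by a single `synthesize-speech` request (on the order of a few thousand characters). For longer scripts you can split the text at sentence boundaries and make one call per chunk. The sketch below shows only the splitting logic; the `max_chars` value is an illustrative assumption, not the exact service quota.

```python
def chunk_sentences(text, max_chars=2500):
    """Split text into chunks of whole sentences, each <= max_chars."""
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = (current + " " + sentence).strip()
        if len(candidate) > max_chars and current:
            # Current chunk is full; start a new one with this sentence.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# A small max_chars to demonstrate the splitting behavior:
chunks = chunk_sentences("First sentence. Second sentence. Third sentence.", max_chars=20)
print(chunks)
```

Each returned chunk could then be passed as the `--text` argument of a separate `synthesize-speech` call.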
Step 4: Prepare an Image for Computer Vision
To use Amazon Rekognition, we first need an image. We will create an S3 bucket and upload a sample image containing common objects (like a city street or a park).
First, create a bucket (replace <YOUR_ACCOUNT_ID> with your 12-digit AWS account ID, or any unique suffix, to ensure global uniqueness):
```shell
aws s3 mb s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID> --region us-east-1
```

Next, download a sample image from the internet (or use one from your computer) and upload it to the bucket:

```shell
# Example command to upload a local image named 'sample.jpg'
aws s3 cp sample.jpg s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>/sample.jpg
```

Step 5: Detect Image Labels with Amazon Rekognition
Now, we'll ask Rekognition to analyze the image stored in S3 and return the top 5 labels (objects, scenes, or concepts) it detects.
```shell
aws rekognition detect-labels \
  --region us-east-1 \
  --image '{"S3Object":{"Bucket":"brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>","Name":"sample.jpg"}}' \
  --max-labels 5
```

> [!TIP]
> Review the JSON output. Each label includes a `Confidence` score (0-100). In production applications, you typically set a confidence threshold (e.g., > 90%) before acting on a label.
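Applying such a threshold in code is a one-liner. The sketch below filters a list shaped like the `Labels` array from `detect-labels`; the label names and scores are illustrative, not from a live call.

```python
# Labels shaped like Rekognition's detect-labels output
# (names and scores are illustrative, not from a live API call).
sample_labels = [
    {"Name": "Car", "Confidence": 99.2},
    {"Name": "Street", "Confidence": 95.7},
    {"Name": "Person", "Confidence": 91.4},
    {"Name": "Bicycle", "Confidence": 72.3},
]

def confident_labels(labels, threshold=90.0):
    """Keep only label names at or above the confidence threshold."""
    return [label["Name"] for label in labels if label["Confidence"] >= threshold]

print(confident_labels(sample_labels))  # Bicycle falls below the 90% threshold
```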
📸 Console alternative
- Navigate to Amazon Rekognition in the Console.
- In the left menu, select Label detection.
- Click Upload image and select your `sample.jpg` file.
- Expand the Results pane to see the detected labels and their confidence scores.
Checkpoints
Verify your progress by running the following validation steps:
- Checkpoint 1 (Comprehend): Does the JSON output from Step 1 contain `"Sentiment": "POSITIVE"`?
- Checkpoint 2 (Polly): Run `ls -l speech_output.mp3` (Mac/Linux) or `dir speech_output.mp3` (Windows). Do you see a file size greater than 0 bytes? Play the file using your computer's media player.
- Checkpoint 3 (Rekognition): Did Step 5 return an array of `"Labels"` with `"Confidence"` scores?
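Checkpoint 2 can also be scripted. A minimal sketch of the "file exists and is non-empty" check:

```python
import os

def file_is_nonempty(path):
    """True if the path exists and contains at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0

# After Step 3 this should print True in the directory
# where speech_output.mp3 was saved:
print(file_is_nonempty("speech_output.mp3"))
```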
Teardown / Clean-Up
> [!WARNING]
> Remember to run the teardown commands to avoid ongoing charges. While S3 storage is cheap, leaving resources running is a bad practice.
Execute the following commands to delete the resources provisioned in this lab:
```shell
# 1. Delete the image from S3
aws s3 rm s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID>/sample.jpg

# 2. Delete the S3 bucket
aws s3 rb s3://brainybee-lab-ai-rekognition-<YOUR_ACCOUNT_ID> --force

# 3. Delete the local MP3 file
rm speech_output.mp3  # On Windows use: del speech_output.mp3
```

Troubleshooting
| Error / Issue | Probable Cause | Fix / Solution |
|---|---|---|
| `AccessDeniedException` | The IAM user lacks permissions for the specific service. | Attach the relevant managed policies (e.g., `AmazonRekognitionFullAccess`, `ComprehendFullAccess`, `AmazonPollyFullAccess`) to your IAM user. |
| `BucketAlreadyExists` | S3 bucket names must be globally unique across all AWS accounts. | Add more random numbers or your name to the end of the bucket name in Step 4. |
| `InvalidParameterValue` in Polly | The requested voice or engine is not available in your region. | Ensure you are using `--region us-east-1` and a valid `VoiceId` like `Joanna`. |
| `InvalidS3ObjectException` | Rekognition cannot find the image in the specified S3 bucket. | Double-check the exact bucket name and image name in the Step 5 JSON payload. |
Cost Estimate
For a single run-through of this lab, the estimated cost is well within the AWS Free Tier. If you do not have Free Tier eligibility, the costs are approximately:
| Service | Usage in this Lab | Estimated Cost |
|---|---|---|
| Amazon Comprehend | 2 API requests (2 units) | $0.0002 |
| Amazon Polly | < 100 characters synthesized | $0.0004 |
| Amazon Rekognition | 1 image analyzed | $0.0010 |
| Amazon S3 | < 1 MB stored for a few minutes | $0.0000 |
| Total Estimated Spend | | < $0.01 |
Concept Review
To solidify your understanding for the AWS Certified AI Practitioner (AIF-C01) exam, consider how these services fit into the broader ML ecosystem:
- Specialized AI Services (Comprehend, Polly, Rekognition) provide pre-trained models accessible via API. They require no machine learning expertise to use.
- Amazon SageMaker is used for the custom ML build-train-deploy lifecycle. You would use SageMaker if you needed to build a custom computer vision model from scratch, rather than using Rekognition's pre-trained labels.
- Generative AI Services (Amazon Bedrock, Amazon Q) are used for creating new content, whereas specialized AI services are primarily used for analyzing or translating existing content.