Mastering Computer Vision Workloads: AI-900 Curriculum Overview
This document outlines the training path for mastering computer vision workloads as defined by the Microsoft Azure AI Fundamentals (AI-900) curriculum. Computer vision is the domain of Artificial Intelligence that enables software to interpret and understand visual information such as images and video. It accounts for approximately 15–20% of the AI-900 exam content.
Prerequisites
Before diving into computer vision workloads, learners should possess a foundational understanding of the following:
- Basic Cloud Concepts: Familiarity with cloud computing models (IaaS, PaaS, SaaS) and general Azure architecture.
- General AI Awareness: Understanding that AI models are trained using data and used to make predictions.
- Data Literacy: Basic knowledge of digital file types (JPEG, PNG, MP4) and how data is structured.
- Mathematics Basics: High-level awareness of coordinate systems ((x, y) coordinates), which are used for bounding boxes in image analysis.
Module Breakdown
The curriculum follows a logical progression, moving from high-level workload identification to specific Azure service implementation.
| Module | Topic | Difficulty | Focus Area |
|---|---|---|---|
| 1 | Foundations of Vision | Beginner | Identifying what computer vision can and cannot do. |
| 2 | Image Analysis & Classification | Intermediate | Categorizing images and tagging content. |
| 3 | Object Detection & Spatial Analysis | Intermediate | Locating specific items and defining boundaries. |
| 4 | Optical Character Recognition (OCR) | Intermediate | Extracting text from images and documents. |
| 5 | Facial Analysis & Responsible AI | Advanced | Facial detection vs. recognition and ethical guardrails. |
Learning Objectives per Module
Module 1: Foundations of Computer Vision
- Define the concept of "Machine Seeing."
- Understand how digital images are processed as arrays of pixel values.
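To make the "arrays of pixel values" idea concrete, here is a minimal sketch in plain Python. The tiny 2x3 image and the helper names `dimensions` and `to_grayscale` are invented for illustration, and the grayscale weights are the common BT.601 luma coefficients, which is one convention among several.

```python
# A tiny 2x3 RGB image: rows -> pixels -> (red, green, blue), each value 0-255.
# The pixel values are made up for illustration.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]


def dimensions(img):
    """Return (height, width) of a row-major image."""
    return len(img), len(img[0])


def to_grayscale(img):
    """Collapse each RGB pixel to one brightness value (BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in img
    ]


gray = to_grayscale(image)
print(dimensions(image))  # height and width of the grid
print(gray)               # one brightness number per pixel
```

Everything a vision model "sees" starts as a grid of numbers like this; real images simply have many more rows and columns.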
Module 2: Image Analysis and Classification
- Image Classification: Identifying the primary subject of an image (e.g., "This is a car").
- Image Tagging: Generating descriptive metadata for an image.
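The difference between the two objectives above can be sketched with a single set of confidence scores. The `scores` dictionary, the 0.5 threshold, and the function names are hypothetical and do not reflect the response format of any specific Azure service.

```python
# Hypothetical confidence scores a vision model might produce for one image.
scores = {"car": 0.94, "road": 0.81, "tree": 0.42, "outdoor": 0.77, "person": 0.08}


def classify(scores):
    """Image classification: one label for the whole image (the highest score)."""
    return max(scores, key=scores.get)


def tag(scores, threshold=0.5):
    """Image tagging: every label at or above the threshold, most confident first."""
    kept = [label for label, score in scores.items() if score >= threshold]
    return sorted(kept, key=scores.get, reverse=True)


print(classify(scores))  # a single primary subject
print(tag(scores))       # several descriptive labels
```

Classification answers "what is this image?" with one label, while tagging returns a list of descriptive metadata for the same image.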
Module 3: Object Detection and Spatial Analysis
- Object Detection: Identifying individual objects within an image and providing their location via bounding boxes.
- Semantic Segmentation: Classifying each individual pixel by the object it belongs to (more granular than the bounding boxes of object detection).
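One way to see why a pixel-level result is more granular than a bounding box is to derive the box from the mask. The sketch below uses an invented 4x5 mask and hypothetical helpers `mask_to_box` and `pixel_count`; the `(x, y, width, height)` box layout is an assumption chosen for illustration.

```python
# Hypothetical 4x5 segmentation mask: 0 = background, 1 = "car".
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]


def mask_to_box(mask, class_id):
    """Return the tightest (x, y, width, height) box around one class, or None."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == class_id
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    left, top = min(xs), min(ys)
    return (left, top, max(xs) - left + 1, max(ys) - top + 1)


def pixel_count(mask, class_id):
    """Count the pixels that the mask assigns to one class."""
    return sum(row.count(class_id) for row in mask)


box = mask_to_box(mask, 1)
print(box)                   # rectangle that encloses the object
print(box[2] * box[3])       # pixels covered by the rectangle
print(pixel_count(mask, 1))  # pixels that actually belong to the object
```

The rectangle covers more pixels than the object itself occupies, which is the detail a bounding box discards and a segmentation mask keeps.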
Module 4: Optical Character Recognition (OCR)
- Use Azure AI Vision to extract printed or handwritten text.
- Differentiate between simple OCR and Document Intelligence.
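The contrast between simple OCR and Document Intelligence is largely a contrast in output shape: raw text lines versus named fields. The sketch below illustrates that contrast only. The `ocr_lines` data, the field names, and the regular expression are all hypothetical, and the parsing step is a stand-in, not a description of how either Azure service works internally.

```python
import re

# Hypothetical OCR output for a receipt: text lines with their pixel positions.
ocr_lines = [
    {"text": "Total $12.50", "top": 120, "left": 10},
    {"text": "Contoso Market", "top": 10, "left": 10},
    {"text": "Coffee $4.50", "top": 60, "left": 10},
    {"text": "Bagel $8.00", "top": 90, "left": 10},
]


def plain_text(lines):
    """Simple OCR view: the text in reading order (top to bottom, left to right)."""
    ordered = sorted(lines, key=lambda line: (line["top"], line["left"]))
    return "\n".join(line["text"] for line in ordered)


def structured_fields(text):
    """Document-style view: named fields pulled out of the raw text (illustrative)."""
    rows = text.splitlines()
    match = re.search(r"Total \$(\d+\.\d{2})", text)
    return {
        "merchant": rows[0] if rows else None,
        "total": float(match.group(1)) if match else None,
    }


text = plain_text(ocr_lines)
print(text)
print(structured_fields(text))
```

If a scenario only needs the words on the page, OCR is enough; if it needs values such as a merchant name or a total as labeled fields, that points toward Document Intelligence.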
Module 5: Facial Detection and Analysis
- Distinguish between Facial Detection (finding a face) and Facial Recognition (identifying who the face belongs to).
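- Analyze facial attributes (e.g., head pose, glasses, occlusion). Note that Microsoft has retired attributes that infer emotion or age from Azure AI Face under its Responsible AI standard.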
Success Metrics
To demonstrate mastery of this curriculum, the learner must be able to:
- Workload Identification: Correctly match a business scenario (e.g., "A retailer needs to count people entering a store") to the appropriate AI workload (Object Detection).
- Service Selection: Choose between Azure AI Vision and Azure AI Face based on the specific requirements of the project.
- Concept Differentiation: Explain the difference between classification (what) and object detection (what and where).
- Responsible AI Application: Identify potential biases in facial analysis and suggest mitigation strategies according to Microsoft's guiding principles.
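The store-entrance scenario above can be sketched as a filter over object detection output. The `detections` list, its field names, and the 0.5 confidence cutoff are assumptions for illustration; a real service defines its own response format.

```python
# Hypothetical object detection output for one camera frame.
# Each detection says what was found (label) and where (x, y, width, height).
detections = [
    {"label": "person", "confidence": 0.91, "box": (40, 60, 80, 200)},
    {"label": "person", "confidence": 0.34, "box": (300, 70, 75, 190)},
    {"label": "shopping cart", "confidence": 0.88, "box": (150, 180, 120, 100)},
    {"label": "person", "confidence": 0.78, "box": (420, 55, 85, 210)},
]


def count_label(detections, label, min_confidence=0.5):
    """Count detections of one label that meet the confidence cutoff."""
    return sum(
        1
        for d in detections
        if d["label"] == label and d["confidence"] >= min_confidence
    )


people = count_label(detections, "person")
print(people)
```

Counting is only possible because detection returns one entry per object; a classifier would return a single label for the whole frame and could not say how many people are in it.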
[!IMPORTANT] Success in the AI-900 exam requires knowing that Azure AI Vision is the primary multi-purpose service, while Azure AI Face is a specialized service for facial analysis.
Real-World Application
Computer vision is no longer theoretical; it is embedded in modern industry. Understanding these workloads allows for the following implementations:
- Manufacturing: Using Object Detection on assembly lines to identify defective parts in real-time.
- Healthcare: Applying Image Classification to X-rays or MRI scans to assist radiologists in identifying anomalies.
- Retail: Implementing OCR to automatically scan receipts for loyalty point programs.
- Public Safety: Utilizing Facial Detection in smart doorbells to alert homeowners when a person (rather than a pet) is at the door.
[!TIP] When studying for the exam, focus on the outputs of each workload. If the output includes a set of coordinates, it is almost certainly Object Detection or Semantic Segmentation.