Curriculum Overview: Identifying Document Processing Workloads

This curriculum focuses on the specific domain of Document Processing within the Microsoft Azure AI Fundamentals (AI-900) framework. It bridges the gap between basic Computer Vision (seeing text) and Natural Language Processing (understanding language) by focusing on the extraction of structured data from physical and digital documents.

Prerequisites

Before engaging with this curriculum, students should have a baseline understanding of the following:

Cloud Fundamentals: Basic knowledge of cloud computing and the Microsoft Azure ecosystem.
General AI Concepts: Familiarity with the definitions of Artificial Intelligence and Machine Learning.
Computer Vision Basics: A high-level understanding of how AI "sees" pixels, specifically the concept of Optical Character Recognition (OCR).

Module Breakdown

Module	Topic	Difficulty	Focus Area
1	Foundations of OCR	Beginner	General text extraction from images (signs, labels).
2	Intelligent Document Processing (IDP)	Intermediate	Moving from raw text to structured data and relationships.
3	Azure Document Intelligence	Intermediate	Using pre-built models for invoices, IDs, and forms.
4	Workload Differentiation	Advanced	Distinguishing document processing from NLP and Computer Vision.

Learning Objectives per Module

Module 1: Foundations of OCR

Identify the features of Read OCR for fast text extraction from "in the wild" images.
Understand the difference between synchronous (real-time) and asynchronous (high-volume) APIs.
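
The synchronous versus asynchronous distinction can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `FakeOcrService`, `read_sync`, `submit`, `get_status`, and `read_document` are hypothetical names, not part of any Azure SDK, and the "images" are plain strings standing in for real image data. The synchronous call returns text in the same response, while the asynchronous call returns an operation ID that the client polls until the job finishes.

```python
import itertools


class FakeOcrService:
    """Hypothetical in-memory stand-in for an OCR service (not a real Azure client)."""

    def __init__(self, polls_until_done=2):
        self._polls_until_done = polls_until_done
        self._jobs = {}
        self._ids = itertools.count(1)

    def read_sync(self, image_text):
        # Synchronous style: one request, and the text lines come back immediately.
        return image_text.splitlines()

    def submit(self, pages):
        # Asynchronous style: accept the job and return an operation ID right away.
        op_id = f"op-{next(self._ids)}"
        self._jobs[op_id] = {
            "pages": list(pages),
            "polls_left": self._polls_until_done,
        }
        return op_id

    def get_status(self, op_id):
        # Report "running" for the first few polls, then "succeeded" with the result.
        job = self._jobs[op_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return {"status": "running"}
        return {"status": "succeeded", "pages": job["pages"]}


def read_document(service, pages, max_polls=10):
    """Submit a multi-page job and poll until it finishes or max_polls is reached."""
    op_id = service.submit(pages)
    for attempt in range(1, max_polls + 1):
        response = service.get_status(op_id)
        if response["status"] == "succeeded":
            return response["pages"], attempt
        # A real client would wait between polls instead of looping immediately.
    raise TimeoutError(f"operation {op_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    service = FakeOcrService(polls_until_done=2)
    print(service.read_sync("EXIT\nGate 4"))             # a sign: answer in one call
    print(read_document(service, ["page 1", "page 2"]))  # a report: submit, then poll
```

The polling loop is the part worth remembering: a large job is accepted first and collected later, which is why the asynchronous style suits books and long reports.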

Module 2: Intelligent Document Processing (IDP)

Explain how IDP serves as the "next-level cousin" to OCR.
Identify the ability of AI to recognize document structure, such as tables, key-value pairs, and entities.
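
To make the difference between raw OCR output and structured output concrete, the sketch below contrasts a flat string of words with a dictionary of key-value pairs. It is a deliberately naive illustration: the `extract_key_values` helper and the sample invoice lines are invented for this example, and splitting on a colon is only a stand-in for the trained models a real IDP service would use to recognize layout and fields.

```python
def extract_key_values(ocr_lines):
    """Naively turn 'Key: Value' lines into a dictionary (illustration only)."""
    pairs = {}
    for line in ocr_lines:
        key, separator, value = line.partition(":")
        # Keep the line only if it really looks like 'key: value'.
        if separator and key.strip() and value.strip():
            pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical OCR output for a scanned invoice, one entry per recognized line.
ocr_lines = [
    "Contoso Ltd.",
    "Invoice Number: INV-1001",
    "Total: 1250.00",
]

# Plain OCR view: one string of words with no notion of fields.
raw_text = " ".join(ocr_lines)

# Document processing view: named fields that another system can consume directly.
structured = extract_key_values(ocr_lines)

if __name__ == "__main__":
    print(raw_text)
    print(structured)
```

The structured result can be loaded straight into a database or ERP system, whereas the raw string would still need a person or another program to find the total.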

Module 3: Azure Document Intelligence

Recognize the capabilities of the Document Intelligence Studio.
Analyze how the service handles specific document types like invoices, receipts, and social security cards.

Module 4: Workload Differentiation

Identify when a workload is specifically a "document processing" task versus a "natural language processing" task.
Understand that document processing is designed for high-volume data extraction rather than content generation.
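
As a study aid, the differentiation rules in this module can be written as a small decision function. This is a simplified rule of thumb under stated assumptions, not an official classification: the `classify_workload` function, its parameters, and the returned labels are hypothetical, and real scenarios may combine several workloads.

```python
def classify_workload(source, wants_structured_fields=False, creates_new_content=False):
    """Return a workload label for a scenario (simplified study-aid heuristic)."""
    if creates_new_content:
        # Producing new text or images is generation, not extraction.
        return "generative AI"
    if source == "text":
        # Understanding language that is already available as text.
        return "natural language processing"
    if source in ("document", "image"):
        if wants_structured_fields:
            # Named fields, tables, or key-value pairs pulled from forms and invoices.
            return "document processing"
        # Just reading the words that appear in a picture.
        return "computer vision (OCR)"
    raise ValueError(f"unknown source: {source!r}")


if __name__ == "__main__":
    print(classify_workload("document", wants_structured_fields=True))
    print(classify_workload("image"))
    print(classify_workload("text"))
    print(classify_workload("text", creates_new_content=True))
```

Checking `creates_new_content` first mirrors the curriculum's main contrast: extraction workloads never generate content, so that question settles the category before the input type matters.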

Visual Overview

[Figure: The Document Processing Pipeline]

[Figure: Foundational Architecture]

Success Metrics

To demonstrate mastery of this curriculum, the learner must be able to:

Select the correct tool: Given a scenario (e.g., "Extracting total cost from 5,000 invoices"), choose Document Intelligence over general-purpose Computer Vision.
Distinguish API Types: Identify when to use an asynchronous API (large-scale reports/books) versus a synchronous API (immediate labels/signs).
Define Output: Correctly identify that Document Processing results in structured data (keys and values) rather than just a string of words.
Differentiate Workloads: Correctly answer practice questions that contrast Document Intelligence with Generative AI or Content Moderation.

Real-World Application

Document processing workloads are critical in modernizing legacy industries. Examples include:

Financial Services: Automating invoice entry into ERP systems to reduce manual typing errors.
Government/Healthcare: Digitizing identification documents (Social Security cards, Driver's Licenses) and extracting specific fields for verification.
Publishing: Converting high volumes of scanned books or archival reports into searchable, structured databases.

Important: Document Intelligence is designed to process and extract, whereas Generative AI is designed to create. Knowing this distinction is key to passing the AI-900 exam.