Curriculum Overview: Identifying Classification Machine Learning Scenarios
Identify classification machine learning scenarios
This document outlines the educational path for mastering classification within the Microsoft Azure AI Fundamentals (AI-900) framework. Classification is a supervised learning technique used to predict discrete categories or classes for a given input.
Prerequisites
Before engaging with classification scenarios, learners should have a foundational understanding of the following:
- Basic AI Concepts: Understanding the difference between Artificial Intelligence and Machine Learning.
- Supervised Learning Fundamentals: Knowledge of how models learn from labeled data.
- Dataset Terminology: Familiarity with Features (input variables) and Labels (the target outcome).
- Training vs. Validation: Understanding that models are trained on one subset of data and evaluated on a separate, held-out subset to estimate how well they generalize to unseen data.
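The dataset terminology and the training/validation split above can be illustrated with a minimal sketch. The toy rows, the feature meanings (transaction amount, hour of day), and the helper name `split_rows` are hypothetical choices made for illustration only; they are not part of the AI-900 material or any Azure API.

```python
import random

# Hypothetical toy dataset: each row pairs a list of Features with a Label.
# Features: [transaction_amount, hour_of_day]; Label: the category to predict.
rows = [
    ([12.50, 14], "Legitimate"),
    ([980.00, 3], "Fraudulent"),
    ([45.10, 10], "Legitimate"),
    ([1500.00, 2], "Fraudulent"),
    ([8.99, 19], "Legitimate"),
    ([23.40, 12], "Legitimate"),
    ([720.00, 4], "Fraudulent"),
    ([60.00, 16], "Legitimate"),
]


def split_rows(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (training_rows, validation_rows)."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


train_rows, validation_rows = split_rows(rows)
print(len(train_rows), "training rows,", len(validation_rows), "validation rows")
```

The model only ever sees `train_rows` while learning; `validation_rows` is held back so that its labels can be compared against the model's predictions afterwards.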
Module Breakdown
| Module | Focus Area | Difficulty |
|---|---|---|
| 1. The Classification Core | Defining discrete vs. continuous outcomes; Binary vs. Multiclass. | Beginner |
| 2. Scenarios & Use Cases | Identifying real-world applications (Fraud, Spam, Medical). | Intermediate |
| 3. Common Algorithms | Logistic Regression, Decision Trees, and the One-vs-Rest (OVR) strategy. | Intermediate |
| 4. Azure Machine Learning | Using Automated ML (AutoML) for classification tasks. | Advanced |
Learning Objectives per Module
By the end of this curriculum, learners will be able to:
Module 1: The Classification Core
- Distinguish between Classification (predicting a category) and Regression (predicting a numeric value).
- Identify Binary Classification scenarios (two possible outcomes, e.g., Yes/No).
- Identify Multiclass Classification scenarios (three or more possible outcomes).
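The three distinctions in Module 1 can be sketched as a small heuristic that inspects a column of label values. The function name `identify_task` and the rule "float labels mean regression" are simplifying assumptions for illustration; real datasets sometimes encode categories as numbers, so the business question should always be the final guide.

```python
def identify_task(labels):
    """Guess the ML task from a list of label values (a rough heuristic)."""
    if not labels:
        raise ValueError("labels must not be empty")
    distinct = set(labels)
    # Assumption: continuous numeric targets are stored as floats.
    if all(isinstance(value, float) for value in distinct):
        return "regression"
    if len(distinct) == 2:
        return "binary classification"
    if len(distinct) >= 3:
        return "multiclass classification"
    raise ValueError("need at least two distinct classes")


print(identify_task(["Yes", "No", "Yes", "No"]))
print(identify_task(["High Spender", "Occasional", "Window Shopper"]))
print(identify_task([199.99, 250.0, 310.5]))
```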
Module 2: Scenarios & Use Cases
- Recognize classification in financial services (e.g., differentiating between fraudulent and genuine transactions).
- Identify document processing tasks, such as classifying emails as "Spam" or "Not Spam".
Module 3: Common Algorithms
- Understand that Logistic Regression predicts the probability that an input belongs to one of two classes.
- Explain the One-vs-Rest (OVR) approach for handling multiclass problems using binary classifiers.
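The two ideas in Module 3 can be combined in one minimal sketch: a hand-written logistic regression that outputs a probability, wrapped in a One-vs-Rest loop that trains one "this class vs. everything else" classifier per class and picks the class whose classifier is most confident. The training data, the feature meanings (visits per month, average basket value, both in arbitrary scaled units), and all function names are hypothetical. The training loop uses simple stochastic gradient descent, which is an assumption made for brevity and not necessarily what any particular library does.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1 (numerically stable)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(model, x):
    """Probability that x belongs to the positive class of a binary model."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_binary(X, y, epochs=1000, lr=0.1):
    """Fit logistic regression with stochastic gradient descent; y holds 0 or 1."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = predict_proba((weights, bias), x) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def train_one_vs_rest(X, labels):
    """Train one binary classifier per class: that class (1) vs. the rest (0)."""
    return {
        cls: train_binary(X, [1 if label == cls else 0 for label in labels])
        for cls in sorted(set(labels))
    }


def predict_one_vs_rest(models, x):
    """Return the class whose binary classifier reports the highest probability."""
    return max(models, key=lambda cls: predict_proba(models[cls], x))


# Hypothetical features: [visits_per_month, average_basket_value] (scaled units).
X = [[0.0, 0.0], [0.5, 0.3], [0.2, 0.6],
     [4.0, 0.0], [4.2, 0.5], [3.8, 0.3],
     [2.0, 4.0], [2.3, 4.2], [1.8, 3.7]]
labels = ["Occasional"] * 3 + ["Window Shopper"] * 3 + ["High Spender"] * 3

models = train_one_vs_rest(X, labels)
print(predict_one_vs_rest(models, [2.1, 3.9]))
```

Each entry in `models` is an ordinary binary classifier; the multiclass behavior comes entirely from comparing their probabilities in `predict_one_vs_rest`.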
Module 4: Azure Machine Learning
- Describe how Automated Machine Learning (AutoML) identifies the best classification algorithm by iterating through multiple models.
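The "iterate through multiple models and keep the best" idea can be sketched conceptually as follows. This is not the Azure Machine Learning SDK or its AutoML API: the candidate "models" are hypothetical fixed rules standing in for trained algorithms, the validation rows are invented, and a real AutoML run would also vary preprocessing and hyperparameters and typically supports several metrics.

```python
def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for features, label in rows if model(features) == label)
    return correct / len(rows)


# Hypothetical candidates: each maps features to a predicted label.
# Features are [transaction_amount, hour_of_day].
candidates = {
    "always_legitimate": lambda f: "Legitimate",
    "amount_over_500": lambda f: "Fraudulent" if f[0] > 500 else "Legitimate",
    "night_time": lambda f: "Fraudulent" if f[1] < 6 else "Legitimate",
}

# Hypothetical held-out validation data.
validation_rows = [
    ([20.0, 13], "Legitimate"),
    ([750.0, 2], "Fraudulent"),
    ([640.0, 15], "Fraudulent"),
    ([35.0, 4], "Legitimate"),
    ([15.0, 20], "Legitimate"),
    ([900.0, 23], "Fraudulent"),
]


def select_best(candidates, rows):
    """Score every candidate on the validation rows and return the top one."""
    scores = {name: accuracy(model, rows) for name, model in candidates.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


best_name, scores = select_best(candidates, validation_rows)
print("Best candidate:", best_name)
print("All scores:", scores)
```

The key point is that selection is driven by a metric measured on held-out data, not by how well each candidate fits the data it was built from.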
Visual Anchors
Classification Decision Path
Decision Boundary Visualization
In classification, the model attempts to find a "boundary" that separates different classes in the feature space.
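For a linear model, one common form of such a boundary is the set of points where a weighted sum of the features plus a bias equals zero; points on one side receive one class and points on the other side receive the other. The weights, bias, and class names below are hypothetical values chosen so that the boundary is the line where the two features are equal.

```python
def side_of_boundary(weights, bias, point):
    """Classify a point by the sign of its score relative to a linear boundary."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    # By convention in this sketch, points exactly on the boundary go to Class B.
    return "Class A" if score > 0 else "Class B"


# Hypothetical boundary: 1.0 * x1 - 1.0 * x2 + 0.0 = 0, i.e. the line x1 = x2.
weights, bias = [1.0, -1.0], 0.0

print(side_of_boundary(weights, bias, [3.0, 1.0]))
print(side_of_boundary(weights, bias, [1.0, 3.0]))
```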
Success Metrics
To demonstrate mastery of this topic, the learner must be able to:
- Correctly Categorize Tasks: Given a business problem (e.g., "Will this customer churn?"), identify it as a classification task.
- Algorithm Selection: Choose Logistic Regression when a probability for a binary outcome is required.
- Approach Validation: Explain how One-vs-Rest extends binary classifiers to a three-category problem by training one "this class vs. all others" classifier per category.
- AutoML Proficiency: Explain that AutoML automates algorithm selection and hyperparameter tuning for supervised tasks such as classification.
Real-World Application
Classification is the backbone of automated decision-making in various industries:
- Financial Services: Banks use classification to flag transactions. Features might include transaction amount, location, and time; the Label is "Fraudulent" or "Legitimate."
- Manufacturing: Predicting if a machine part is "Functional" or "Failing" (Binary).
- Healthcare: Analyzing medical images to classify a tumor as "Benign" or "Malignant" (Binary).
- Retail: Customer segmentation where users are classified into groups like "High Spender," "Occasional," or "Window Shopper" (Multiclass).