
☁️ AWS
This comprehensive AWS Certified Machine Learning Engineer - Associate (MLA-C01) hive provides study notes, practice tests, flashcards, and hands-on labs, all supported by a personal AI tutor, to help you master the certification.
160 AI-generated study notes covering the full AWS Certified Machine Learning Engineer - Associate (MLA-C01) curriculum.
Amazon SageMaker AI built-in algorithms and when to apply them
925 words
Analyze model performance
845 words
Analyze model performance
1,145 words
Applying best practices to enable maintainable, scalable, and cost-effective ML solutions (for example, automatic scaling on SageMaker AI endpoints, dynamically adding Spot Instances, using Amazon EC2 instances, using Lambda behind the endpoints)
890 words
Applying continuous deployment flow structures to invoke pipelines (for example, Gitflow, GitHub Flow)
920 words
Assessing available data and problem complexity to determine the feasibility of an ML solution
945 words
Assessing tradeoffs between model performance, training time, and cost
925 words
Automating the provisioning of compute resources, including communication between stacks (for example, by using CloudFormation, AWS CDK)
925 words
Automation and integration of data ingestion with orchestration services
875 words
AWS deployment services (for example, Amazon SageMaker AI)
925 words
AWS storage options, including use cases and tradeoffs
920 words
Benefits of regularization techniques (for example, dropout, weight decay, L1 and L2)
945 words
Building and integrating mechanisms to retrain models
945 words
Building and maintaining containers (for example, Amazon Elastic Container Registry [Amazon ECR], Amazon EKS, Amazon ECS, by using bring your own container [BYOC] with SageMaker AI)
890 words
Building VPCs, subnets, and security groups to securely isolate ML systems
920 words
Capabilities and appropriate uses of ML algorithms to solve business problems
890 words
Capabilities and quotas for AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy
890 words
Capabilities of cost analysis tools (for example, AWS Cost Explorer, AWS Billing and Cost Management, AWS Trusted Advisor)
1,085 words
Choose a modeling approach
820 words
Choose a modeling approach
895 words
Choosing appropriate data formats (for example, Parquet, JSON, CSV, ORC) based on data access patterns
924 words
Choosing built-in algorithms, foundation models, and solution templates (for example, in SageMaker JumpStart and Amazon Bedrock)
895 words
Choosing model deployment strategies (for example, real time, batch)
920 words
Choosing specific metrics for auto scaling (for example, model latency, CPU utilization, invocations per instance)
875 words
Choosing the appropriate compute environment for training and inference based on requirements (for example, GPU or CPU specifications, processor family, networking bandwidth)
850 words
CI/CD principles and how they fit into ML workflows
980 words
Combining multiple trained models to improve performance (for example, ensembling, stacking, boosting)
1,050 words
Comparing and selecting appropriate ML models or algorithms to solve specific problems
1,150 words
Configuring and troubleshooting CodeBuild, CodeDeploy, and CodePipeline, including stages
945 words
Configuring and using tools to troubleshoot and analyze resources (for example, CloudWatch Logs, CloudWatch alarms)
1,050 words
Configuring data to load into the model training resource (for example, Amazon EFS, Amazon FSx)
948 words
Configuring IAM policies and roles for users and applications that interact with ML systems
985 words
Configuring least privilege access to ML artifacts
948 words
Configuring SageMaker AI endpoints within the VPC network
1,050 words
Configuring training and inference jobs (for example, by using Amazon EventBridge rules, SageMaker Pipelines, CodePipeline)
1,050 words
Containerization concepts and AWS container services
925 words
Controls for network access to ML resources
895 words
Convergence issues
1,050 words
Cost tracking and allocation techniques (for example, resource tagging)
920 words
Create and script infrastructure based on existing architecture and requirements
865 words
Create and script infrastructure based on existing architecture and requirements
920 words
Creating and managing features by using AWS tools (for example, SageMaker Feature Store)
945 words
Creating automated tests in CI/CD pipelines (for example, integration tests, unit tests, end-to-end tests)
875 words
Creating CloudTrail trails
925 words
Data annotation and labeling services that create high-quality labeled datasets
945 words
Data classification, anonymization, and masking
890 words
Data cleaning and transformation techniques (for example, detecting and treating outliers, imputing missing data, combining, deduplication)
1,055 words
Data formats and ingestion mechanisms (for example, validated and non-validated formats, Apache Parquet, JSON, CSV, Apache ORC, Apache Avro, RecordIO)
1,085 words
Deploying and hosting models by using the SageMaker AI SDK
940 words
Deployment best practices (for example, versioning, rollback strategies)
1,050 words
Showing 50 of 160 study notes.
Try 5 sample questions from a bank of 725.
Q1. An ML engineer is building a loan approval model and suspects that the historical training dataset exhibits **selection bias** because applicants from a specific demographic group are significantly underrepresented compared to the general population. Which approach using Amazon SageMaker Clarify would best allow the engineer to identify this bias before training and mitigate its impact?
Correct: A
Q2. An ML engineer is tasked with building a generative AI solution that requires fine-tuning an open-source foundation model on a proprietary dataset. The project requirements specify that the team must have full control over the underlying compute instances for hosting the model to meet specific latency and cost-optimization targets. Furthermore, the model must be deployed as a dedicated endpoint within a private Virtual Private Cloud (VPC). Which service is the most appropriate choice for these requirements?
Correct: B
Q3. A Machine Learning Engineer implements a cost-tracking strategy to attribute Amazon SageMaker expenses to different business units within the organization, then generates a categorized spending report. Which combination of actions was required to produce this categorized report in AWS Cost Explorer?
Correct: A
Q4. An ML engineer is training a deep neural network and decides to decrease the **batch size** from 1,024 to 32. Which of the following best explains the impact of this change on the gradient descent optimization process?
Correct: B
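The intuition behind Q4 can be checked numerically: the smaller the mini-batch, the noisier each gradient estimate becomes. A minimal sketch using synthetic 1-D regression data and only the standard library (all names and numbers here are illustrative, not from the exam):

```python
import random
import statistics

random.seed(0)

# Synthetic data from y = 3x + noise; loss per point is (w*x - y)^2.
data = [(x, 3 * x + random.gauss(0, 1))
        for x in (random.uniform(-1, 1) for _ in range(4096))]

def batch_grad(batch, w=0.0):
    # Mean gradient of squared error with respect to w over one mini-batch.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def grad_noise(batch_size, trials=200):
    # Spread of the mini-batch gradient estimate across random batches.
    grads = [batch_grad(random.sample(data, batch_size)) for _ in range(trials)]
    return statistics.stdev(grads)

noise_small = grad_noise(32)
noise_large = grad_noise(1024)
print(noise_small, noise_large)
```

Because the variance of a mean shrinks with sample size, the batch-of-32 estimate fluctuates far more than the batch-of-1,024 estimate; that extra gradient noise is exactly the tradeoff the question is probing.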
Q5. When comparing validated data formats (such as Apache Avro or Apache Parquet) to non-validated data formats (such as CSV or JSON) for data ingestion, which statement best explains the primary advantage of using a validated format?
Correct: B
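The distinction in Q5 is easy to demonstrate: CSV carries no schema, so every value comes back as a string and malformed rows are accepted silently, whereas a schema-validated format (Avro, Parquet) rejects nonconforming records at write time. A stdlib-only sketch, with a tiny hand-rolled checker standing in for real Avro/Parquet schema enforcement (the `SCHEMA` dict and `validate` helper are illustrative, not a library API):

```python
import csv
import io

# Intended schema for each record: {"id": int, "score": float}.
rows = [{"id": 1, "score": 0.92}, {"id": "oops", "score": "high"}]

# CSV writes both rows without complaint -- no schema is enforced --
# and every field reads back as a string.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "score"])
writer.writeheader()
writer.writerows(rows)
buf.seek(0)
read_back = list(csv.DictReader(buf))
print(read_back[0])  # types are lost: both values are strings now

# A validated format checks each record against its declared schema
# before writing; this checker mimics that behavior.
SCHEMA = {"id": int, "score": float}

def validate(record, schema=SCHEMA):
    return all(isinstance(record[k], t) for k, t in schema.items())

print(validate(rows[0]))  # conforming record -- would be written
print(validate(rows[1]))  # malformed record -- rejected before ingestion
```

Catching the bad record at ingestion time, rather than discovering stringly-typed or corrupt data during training, is the primary advantage the question points at.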
Want more? Clone this hive to access all 725 questions, timed exams, and AI tutoring. Start studying →
725 flashcard decks for spaced-repetition study.
Sample:
Which AWS services are used for **batch** versus **real-time** data ingestion?
Sample:
**CSV vs. JSON**
Sample:
**Amazon Simple Storage Service (S3)**
Sample:
**Amazon Kinesis Data Streams**
Sample:
**Amazon S3 Transfer Acceleration**
Sample:
**Amazon S3 (Simple Storage Service)**
Clone this hive to get full access to all 725 practice questions, 11 timed mock exams, study notes, flashcards, and a personal AI tutor — completely free.
Start Studying — Free