Course Overview

In this course, you will gain theoretical and practical knowledge of Apache Spark’s architecture and its application to machine learning workloads within Databricks. You will learn when to use Spark for data preparation, model training, and deployment, while also gaining hands-on experience with Spark ML and pandas APIs on Spark.

The course also introduces advanced concepts such as hyperparameter tuning and scaling Optuna with Spark. It builds on features and concepts from the associate course, such as MLflow and Unity Catalog, for comprehensive model packaging and governance.

What skills are covered

  • Machine Learning Development with Spark
  • Distributed Model Tuning on Databricks
  • Deploying Machine Learning Models with Spark
  • Pandas on Spark

Who should attend this course

  • Everyone who is interested

Course Curriculum

What are the Prerequisites

The content was developed for participants with the following skills, knowledge, and abilities:

  • A beginner-level understanding of Python.
  • A basic understanding of data science and machine learning (DS/ML) concepts (e.g. classification and regression models), common model metrics (e.g. F1-score), and Python libraries (e.g. scikit-learn and XGBoost).

Course Modules

Training Options

Intake: Available Upon Request
Duration: 1 Day
Guaranteed: TBC
Modality: VILT
Price:

RM4,275.00

RM3,375.00

Exam:

Exam & Certification

Databricks Certified Machine Learning Professional exam.

The Databricks Certified Machine Learning Professional certification exam assesses an individual’s ability to design, implement, and manage enterprise-scale machine learning solutions using advanced Databricks platform capabilities. This includes proficiency in building scalable ML pipelines with Spark ML, implementing distributed training and hyperparameter tuning, leveraging advanced MLflow features, and utilizing Feature Store concepts for automated feature pipelines.

The certification exam evaluates expertise in MLOps practices, including testing strategies, environment management with Databricks Asset Bundles, automated retraining workflows, and monitoring using Lakehouse Monitoring for drift detection.

Additionally, test-takers are assessed on their ability to implement deployment strategies, custom model serving, and model rollout management. Individuals who pass this certification exam can be expected to perform advanced machine learning engineering tasks at enterprise scale, implementing production-ready ML systems with comprehensive monitoring, testing, and deployment practices using the full feature set of Databricks.

This exam covers:

  1. Model Development – 44%
  2. MLOps – 44%
  3. Model Deployment – 12%
