Tutorials · Databricks · July 19, 2022

Scalable XGBoost on GPU Clusters

Description

XGBoost is a popular open-source implementation of gradient-boosted tree algorithms. In this talk, we walk through some of the new features in XGBoost that help us train better models, and explain how to scale the pipeline up to larger datasets with GPU clusters. Training gradient boosting models becomes more challenging as data grows in size and complexity. The latest XGBoost introduces categorical data support, which lets data scientists work with non-numerical data without encoding it first. It can also train multi-output models to handle datasets with non-exclusive class labels and multi-target regression. XGBoost has also introduced a new AUC implementation that supports more model types and features a robust approximation in distributed environments. The latest XGBoost has significantly improved its built-in GPU support for scalability and performance. Data loading and processing have been made more memory efficient, enabling users to handle larger datasets. GPU-based model training is over 2x faster than in past versions. The performance improvement has also been extended to model explanation. XGBoost added GPU-based SHAP value computation…
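As a rough illustration of how the features named in the description fit together, here is a minimal Python sketch. It is not taken from the talk. The helper names (`build_params`, `train_and_explain`) and the parameter values are hypothetical. The sketch assumes an XGBoost 1.x release from around the time of the talk, where GPU training is selected with `tree_method="gpu_hist"` (2.0 and later use `tree_method="hist"` with `device="cuda"` instead), and it assumes the installed version can compute SHAP contributions for the data you pass in, which may not hold for categorical columns in every release.

```python
from typing import Any, Dict


def build_params(use_gpu: bool = True) -> Dict[str, Any]:
    """Return illustrative training parameters for a binary classifier."""
    return {
        "objective": "binary:logistic",
        # AUC as the evaluation metric, as mentioned in the description.
        "eval_metric": "auc",
        # Assumption: XGBoost 1.x naming. In 2.0+ use
        # tree_method="hist" together with device="cuda".
        "tree_method": "gpu_hist" if use_gpu else "hist",
        "max_depth": 6,
    }


def train_and_explain(df, label: str, use_gpu: bool = True, rounds: int = 50):
    """Train on a pandas DataFrame and return the booster and SHAP values."""
    # Imported lazily so build_params() works without xgboost installed.
    import xgboost as xgb

    features = df.drop(columns=[label])
    target = df[label]
    # enable_categorical=True accepts pandas "category" columns directly,
    # so no one-hot or ordinal encoding step is needed beforehand.
    dtrain = xgb.DMatrix(features, label=target, enable_categorical=True)
    booster = xgb.train(build_params(use_gpu), dtrain, num_boost_round=rounds)
    # pred_contribs=True returns one SHAP value per feature plus a bias column.
    shap_values = booster.predict(dtrain, pred_contribs=True)
    return booster, shap_values
```

A call such as `train_and_explain(df, "clicked")` would expect `df` to be a pandas DataFrame whose non-numerical columns have dtype `category`; the column name `clicked` is only a placeholder.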

Description from YouTube. Full content on the video page.
