Skip to content
brickster.ai
All videos
newsDatabricks·September 2, 2021

Modernize Your Analytics Workloads for Apache Spark 3.0 and Beyond

Description

Apache Spark 3.0 has been out for almost a year, and it’s a safe bet that you’re running at least some production workloads against it today. However, many production Spark jobs may have evolved over the better part of a decade, and your code, configuration, and architecture may not be taking full advantage of all that Spark 3 has to offer. In this talk, we’ll discuss changes you might need to make to legacy applications in order to make the most of Apache Spark 3.0. You’ll learn some common sources of technical debt in mature Apache Spark applications and how to pay them down, when to replace hand-tuned configurations with Adaptive Query Execution, how to ensure that your queries can take advantage of columnar processing, including execution on GPUs, and how your Spark analytics workloads can directly incorporate accelerated ML training. We’ll provide several concrete examples taken from an end-to-end analytics application addressing customer churn modeling, recent experience modernizing Apache Spark applications, and lessons learned while maintaining a library of Apache Spark extensions across three major versions of Apache Spark. Connect with us: Website: https://databricks.c

Description from YouTube. Full content on the video page.

More from Databricks