brickster.ai
News · Databricks · September 1, 2021

Let’s Dumb-Proof Data Pipelines

Description

Developing and deploying data pipelines in production is easy. Maintaining them is hard, because most often the engineer or team that operates and maintains the pipelines in production is not the one that built them. If your data pipelines are not parameterized and configurable, you need to recompile your source code and go through your release process even for simple configuration changes. But making your data pipelines configurable is not enough: bad user input can cause many classes of issues, such as data loss, data corruption, and data correctness problems. In this talk, you’ll learn techniques to make your data pipelines dumb-proof:

1. Why do you need to make your data pipelines configurable?
2. How do you seamlessly promote your data pipelines from one environment to another without making any source code changes?
3. How do you reconfigure your data pipelines in production without recompiling the ETL source code?
4. What are the pros and cons of using Databricks Notebook widgets to configure your data pipelines?
5. How do you externalize configurations from your ETL source code, and how do you read and parse configuration files?
6. Finally, how do you take it to the next level?
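
To make the idea of externalized, validated configuration concrete, here is a minimal sketch in Python. It is not taken from the talk: the key names (`source_path`, `target_table`, `write_mode`, `max_retries`), the allowed write modes, and the helper `parse_config` are all hypothetical and chosen only for illustration. It shows a pipeline reading its settings from JSON text instead of hard-coding them, and rejecting bad input before any data is touched.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for illustration only; a real pipeline defines its own keys.
REQUIRED_KEYS = {"source_path", "target_table", "write_mode"}
OPTIONAL_KEYS = {"max_retries"}
ALLOWED_WRITE_MODES = {"append", "overwrite"}


@dataclass(frozen=True)
class PipelineConfig:
    source_path: str
    target_table: str
    write_mode: str
    max_retries: int = 3


def parse_config(text: str) -> PipelineConfig:
    """Parse JSON config text and fail fast on bad user input."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("config must be a JSON object")

    missing = REQUIRED_KEYS - raw.keys()
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")

    # Reject unknown keys so that a typo is not silently ignored.
    unknown = raw.keys() - REQUIRED_KEYS - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")

    for key in ("source_path", "target_table"):
        if not isinstance(raw[key], str) or not raw[key].strip():
            raise ValueError(f"{key} must be a non-empty string")

    if raw["write_mode"] not in ALLOWED_WRITE_MODES:
        raise ValueError(
            f"write_mode must be one of {sorted(ALLOWED_WRITE_MODES)}"
        )

    retries = raw.get("max_retries", 3)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(retries, bool) or not isinstance(retries, int) or retries < 0:
        raise ValueError("max_retries must be a non-negative integer")

    return PipelineConfig(
        source_path=raw["source_path"],
        target_table=raw["target_table"],
        write_mode=raw["write_mode"],
        max_retries=retries,
    )
```

With a layout like this, promoting a pipeline from one environment to another would mean pointing it at a different configuration file rather than changing or recompiling the ETL code, and a mistyped value such as `"write_mode": "truncate"` raises an error at startup instead of corrupting a table.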

