Skip to content
brickster.ai
All videos
newsDatabricks·July 27, 2023

Managing Data Encryption in Apache Spark™

Description

Sensitive data sets can be encrypted directly by new Apache Spark™ versions (3.2 and higher). Setting several configuration parameters and DataFrame options will trigger the Apache Parquet modular encryption mechanism that protects select columns with column-specific keys. The upcoming Spark 3.4 version will also support uniform encryption, where all DataFrame columns are encrypted with the same key. Spark data encryption is already leveraged by a number of companies to protect personal or business confidential data in their production environments. The main integration effort is focused on key access control and on building a Spark/Parquet plug-in code that can interact with company’s key management service (KMS). In this session, we will briefly cover the basics of Spark/Parquet encryption usage, and dive into the details of encryption key management that will help in integrating this Spark data protection mechanism in your deployment. You will learn how to run a HelloWorld encryption sample, and how to extend it into a real world production code integrated with your organization’s KMS and access control policies. We will talk about the standard envelope encryption approach to

Description from YouTube. Full content on the video page.

More from Databricks