Latest from the Databricks world.
Recent uploads from the Databricks team and a curated set of community creators.
Last week
17 videos
Events: Databricks News: CLI v 1.0.0, AI-tools, Docker, DABs UI sync, mutators
The video demonstrates new Databricks features, including the GA release of CLI 1.0.0, UI sync for DABs, Python mutators for bundle extension, and new Docker image options for custom runtimes. It also covers serverless pipeline orchestration, enhanced autoscaling for Lakebase and apps, serverless interactive execution timeout, and auto-scoping for access tokens.
News: The Hidden Logic: How AI Transforms Your Data 🧐
AI models implicitly convert string-based categorical data, like sentiment (positive, negative, mixed), into numerical representations. This conversion is essential for performing mathematical operations, such as calculating an average sentiment.
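The conversion described above can be sketched in a few lines. This is a minimal illustration, not the model's actual mapping: the `SENTIMENT_SCORES` values and the `average_sentiment` helper are assumptions chosen for the example.

```python
from statistics import mean

# Hypothetical encoding: these scores are an illustrative assumption,
# not the mapping any particular AI model uses internally.
SENTIMENT_SCORES = {"negative": -1.0, "mixed": 0.0, "positive": 1.0}

def average_sentiment(labels):
    """Map each string label to its numeric score and return the mean."""
    scores = [SENTIMENT_SCORES[label.lower()] for label in labels]
    return mean(scores)

reviews = ["positive", "negative", "mixed", "positive"]
print(average_sentiment(reviews))  # 0.25
```

Once the labels are numbers, an average is well defined; on the raw strings it is not.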
News: AI-Powered Data Cleaning in Databricks! 📊🤖
Databricks demonstrates using an AI assistant to clean data by giving it an image of the desired output. The AI transforms the existing data to match the structure and content shown in the attached image.
Tutorials: Is Your Azure Databricks Storage Exposed? (Enable Firewall now)
The video demonstrates how to enable firewall support for an Azure Databricks workspace storage account, preventing public network access. It walks through creating private endpoints, an access connector, and then executing a PowerShell command to configure the firewall and network security perimeter.
News: Databricks: Future of Storage Security Revealed!
Databricks is onboarding existing workspace storage accounts with enabled firewalls to Network Security Perimeter (NSP). This allows users of Databricks serverless to leverage enhanced storage security.
Tutorials: Import Local Files to Databricks Easily! ✨
Databricks Lakeflow Designer now lets users import local files by dragging and dropping them onto the canvas. This feature simplifies bringing personal datasets into Databricks for analysis, addressing the common need to use data not yet stored in the platform.
Tutorials: Pro Tip: Add Multiple Tables Fast! 🚀
Users can quickly add multiple tables to a canvas by dragging them directly from the Catalog Explorer left panel. This method streamlines the process of adding several tables from the same schema or catalog, avoiding the need to create individual source nodes.
Tutorials: Building Real AI Agents (Fast!) | Microsoft Agent Framework Foundations | Part 2
The video demonstrates building AI agents using the Microsoft Agent Framework, covering basic agent setup, tool integration for external data, and managing conversation context and personalized interactions. It highlights the framework's simplified development, built-in telemetry, and modular design for creating robust AI agents.
Tutorials: Stop Leaving Your Azure Storage Open to the Public!
The video demonstrates how to enable firewall support for an Azure Databricks workspace storage account, preventing public network access. It walks through creating private endpoints, an access connector, and using a PowerShell command to configure the firewall.
News: Deploying Azure Databricks with Terraform? Watch this first!
This video demonstrates how to deploy an Azure Databricks workspace using Terraform by cloning a provided script, configuring variables, and executing Terraform commands. It walks through setting up prerequisites, authenticating Azure CLI, and populating a Terraform variables file to successfully provision the workspace.
News: Python Based Time Series Analytics on Databricks
Databricks partnered with AVL to create Impulse, an open-source Python framework for time series analytics on petabyte-scale automotive sensor data. Impulse standardizes raw sensor data into a silver layer data model, allowing engineers to query vast measurement data efficiently within the Databricks Lakehouse.
News: Databricks on Databricks: How Marketers Use Data 3x More with Genie, an AI Analytics Assistant
Databricks built "Marge," an AI analytics assistant powered by their Genie platform, to help its marketing team access and utilize data more efficiently. Marge provides conversational analytics by unifying marketing data in a lakehouse and offering governed, trusted insights in seconds, significantly reducing reliance on manual analyst reports.
News: Databricks Lakehouse for Automotive Data: How AVL Modernizes Vehicle Testing
AVL uses Databricks Lakehouse for Automotive Data to modernize vehicle testing by consolidating diverse, siloed data into a single platform. This enables engineers to efficiently analyze petabytes of data, accelerate development, and leverage AI for better, safer vehicles.
News: Easy Migration from Postgres to Databricks Lakebase
The video demonstrates a tool for migrating existing PostgreSQL databases to Databricks Lakebase, highlighting potential compatibility issues like session state, extensions, and authentication that require architectural adjustments. It shows how to validate a PostgreSQL database for Lakebase compatibility and then perform a migration using a CLI tool, emphasizing the speed and ease of the process for straightforward databases.
News: How LLMs Understand your Prompts: Tokenization & Embeddings | Chapter 05
The video explains how Large Language Models (LLMs) understand text by converting it into numerical representations through tokenization and embeddings. It demonstrates how text is broken into tokens, assigned unique IDs, and then transformed into dense vectors (embeddings) that capture semantic meaning and positional information for LLM processing.
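The three steps named above (split into tokens, assign IDs, look up dense vectors with positional information) can be sketched as a toy pipeline. Everything here is an assumption for illustration: production LLMs use learned subword tokenizers and trained embedding tables, whereas this sketch splits on whitespace, fills the table with random numbers, and adds a simplistic position offset.

```python
import random

def build_vocab(texts):
    """Assign a unique integer ID to each distinct whitespace-separated token."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def tokenize(text, vocab):
    """Convert text into the list of token IDs defined by the vocabulary."""
    return [vocab[token] for token in text.lower().split()]

def make_embedding_table(vocab_size, dim, seed=0):
    """Toy embedding table: random values stand in for learned weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(token_ids, table):
    """Look up each token's vector and add a toy position-dependent offset."""
    dim = len(table[0])
    return [
        [table[token_id][d] + 0.01 * position for d in range(dim)]
        for position, token_id in enumerate(token_ids)
    ]

vocab = build_vocab(["the cat sat on the mat"])
ids = tokenize("the mat", vocab)
vectors = embed(ids, make_embedding_table(len(vocab), dim=4))
print(ids)           # [0, 4]
print(len(vectors))  # 2 vectors, one per token
```

Because of the position offset, the same token produces a slightly different vector at each position, which is the role positional information plays in the real models.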
News: Anthropic's SpaceX Deal, ClawPilot, and Databricks Agent-centric Cert | AI Newsround - May 2026
Anthropic signed a deal with SpaceX for AI supercomputing infrastructure, signaling the importance of compute supply in AI development. Google and Microsoft launched personal AI agents, Gemini Spark and Microsoft Scout, emphasizing ecosystem integration, trust, and governance.
Week of Jun 1
6 videos
News: Is This the Future of Enterprise AI? | Microsoft Agent Framework Foundations | Part 1
The Microsoft Agent Framework, now at version 1, unifies Semantic Kernel and AutoGen into a single framework for enterprise AI solutions. It offers features like long-term memory, built-in guardrails, observability via OpenTelemetry, and integrated Azure Identity for secure and efficient agent development.
Tutorials: Trace Any AI Agent with OTel, MLflow, and Unity Catalog
Databricks now allows sending OpenTelemetry traces from any AI agent to Unity Catalog, enabling end-to-end observability and governance within the Databricks Lakehouse. This integration facilitates cost-effective trace storage, offline analytics, production monitoring, and continuous agent evaluation using MLflow.
Tutorials: Safe AI-Driven Development with Lakebase Branches
Databricks Lakebase branches enable instant, cost-efficient database branching using copy-on-write, allowing developers to test features in isolated environments without affecting production data. The video demonstrates creating and managing these branches via the Lakebase console and Databricks CLI, and shows how to integrate them into an agentic development workflow for safe AI-driven development.
News: Beyond the Alert Queue: Modern AML Operations with Multi-Agent AI on Databricks
Databricks demonstrates a multi-agent AI solution for Anti-Money Laundering (AML) operations, significantly reducing false positives and accelerating investigation cycles from hours to minutes. The platform unifies siloed systems, employs specialized AI agents for analysis and recommendations, and offers AI-assisted SAR generation and executive-level reporting with natural language chat.
News: When to choose CPU vs GPU: Databricks AI Runtime Explained
CPUs are best for data work like ETL, feature engineering, SQL, and classical machine learning, while GPUs are designed for deep learning workloads such as fine-tuning LLMs and training neural networks. Databricks AI Runtime simplifies GPU usage by providing serverless NVIDIA GPUs, removing the need for manual infrastructure setup and letting users move between CPU for data prep and GPU for model training within the Databricks environment.
Tutorials: How Large Language Models (LLMs) Work - Full Explanation | Chapter 04
Large Language Models (LLMs) are text-based neural networks trained on massive data to predict the next word (token), operating through tokenization, vector embeddings, and a transformer architecture. LLMs undergo pre-training, supervised fine-tuning, and reinforcement learning from human feedback to become helpful, safe, and aligned, with concepts like context length, knowledge cut-off, and hallucination defining their capabilities and limitations.
Week of May 25
4 videos
Tutorials: The New Databricks Lakeflow Designer Is a Game Changer!
Databricks Lakeflow Designer is a visual data preparation tool that allows users to create, add, and transform data using a no-code drag-and-drop UI or AI-powered Genie Code. The video demonstrates how to import data from various sources, profile data, perform complex transformations like data type conversions and sentiment analysis, and then deploy the resulting production-ready PySpark code for scheduling or integration into existing pipelines.
News: Terraform AWS Databricks Deployment Guide!
The video demonstrates how to deploy an AWS Databricks workspace using a provided Terraform script. It covers prerequisites, AWS and Databricks authentication, variable configuration, and executing the Terraform commands to create the workspace.
Tutorials: Secure Serverless: Azure Private Link Service Direct Connect
The video demonstrates how to set up Azure Private Link Service Direct Connect to enable secure, private connectivity from Databricks serverless compute to any private IP address, such as an on-premises database. It details the architecture, prerequisites, and a step-by-step demo of configuring the Private Link Service and a Databricks Network Connectivity Configuration (NCC) to connect to a MySQL instance.
Tutorials: The Future of Finance Operations Starts Here
The video demonstrates how Databricks' financial lakehouse solution addresses common finance data challenges like fragmentation and slow analysis. It showcases features like Unity Catalog for data governance, Lakeflow for pipeline management, and Genie Spaces for natural language querying of financial data.
Week of May 18
9 videos
News: How Neural Network works | Weights and Bias #dataengineering #neuralnetworks #genai
A neuron in a neural network processes its input signals by assigning each one a weight that reflects its importance (e.g., monthly income has a high positive weight, outstanding debts a negative weight). The weighted inputs are summed with a bias, and the result is passed through an activation function to produce an output decision.
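The computation described above fits in a few lines. This is a minimal sketch: the feature values, weights, and bias are hypothetical numbers picked to mirror the income and debt example, and sigmoid is one common choice of activation function, not necessarily the one used in the video.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical, pre-scaled features: [monthly income, outstanding debts].
# Income gets a positive weight, debts a negative one (illustrative values).
output = neuron(inputs=[0.8, 0.5], weights=[2.0, -1.5], bias=-0.2)
print(output)  # a value between 0 and 1, read here as an approval score
```

Raising the income input pushes the output toward 1, while raising the debt input pushes it toward 0, which is what the signs of the weights encode.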
Tutorials: Building Trustworthy, High-Quality AI Agents with MLflow
Databricks' MLflow platform helps developers build trustworthy, high-quality AI agents by providing tools for end-to-end observability, evaluation, prompt management, and AI gateway governance. It demonstrates how MLflow facilitates tracing, expert feedback collection, automated issue detection with LLM judges, prompt optimization, and continuous monitoring throughout the agent development lifecycle.
Tutorials: Building Enterprise-Ready Agents using Agent Bricks
Databricks Agent Bricks is a unified platform designed to help enterprises build and manage AI agents, addressing challenges like low-quality reasoning on proprietary data, lack of governance, and fragmented toolchains. It demonstrates how to create knowledge assistants for unstructured data and AI Genies for structured data, integrating with Unity Catalog for governance and MLflow for observability and evaluation.
Tutorials: Neural Networks Explained - How They Work & Are Trained | Chapter 03
This video explains how artificial neural networks (ANNs) work, detailing the components of a neuron (inputs, weights, bias, activation function) and how they form layers in a network. It also covers the training process, including forward propagation, loss calculation, and backpropagation using gradient descent to adjust weights and biases.
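The training loop outlined above (forward propagation, loss calculation, gradient-based weight updates) can be sketched for the smallest possible case. The assumptions are loud ones: this trains a single sigmoid neuron rather than a multi-layer network, so backpropagation reduces to one chain-rule step; the AND-gate data, learning rate, and epoch count are illustrative choices, not taken from the video.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Forward propagation for one neuron: weighted sum, bias, activation."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)

def cross_entropy(pred, target, eps=1e-12):
    """Binary cross-entropy loss, clamped to avoid log(0)."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

def train(samples, lr=0.5, epochs=2000):
    """Stochastic gradient descent on a single sigmoid neuron."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            pred = predict(x, w, b)          # forward pass
            # For sigmoid + cross-entropy, dLoss/dz simplifies to (pred - target).
            grad_z = pred - target
            # Chain rule: dz/dw_i = x_i and dz/db = 1.
            w = [wi - lr * grad_z * xi for wi, xi in zip(w, x)]
            b -= lr * grad_z
    return w, b

# Illustrative dataset: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(AND_GATE)
mean_loss = sum(cross_entropy(predict(x, w, b), t) for x, t in AND_GATE) / len(AND_GATE)
print([round(predict(x, w, b)) for x, _ in AND_GATE], mean_loss)
```

In a network with hidden layers, the same `grad_z` signal would be propagated backward through each layer in turn; the update rule for every weight keeps the same form.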
News: Apache Iceberg V3 on Databricks: From Ingestion to Analytics
The video demonstrates Apache Iceberg v3 on Databricks, showcasing how its new variant column type natively handles semi-structured data and how row-level concurrency enables simultaneous data ingestion and corrections. It also highlights cross-platform data accessibility from open-source Spark via the Iceberg REST catalog, ensuring no vendor lock-in.
News: Databricks Genie for Marketing
Databricks' AI/BI Genie allows non-technical marketers to converse with their Customer 360 data using natural language, enabling quick insights into marketing performance and campaign optimization. It helps identify issues like audience saturation and recommends budget reallocation by analyzing data and providing reasoning for its suggestions.
Community: How I Mastered System Design Interviews
This video teaches a six-step framework for mastering data engineering system design interviews, covering requirements gathering, pipeline design, data modeling, storage and file formats, data quality and observability, and pipeline resilience. It demonstrates how to apply this framework with practical examples and back-of-the-envelope calculations to justify design choices.
Tutorials: AI Agents That Remember: Building Stateful Systems with Lakebase
AI agents require four types of memory (working, episodic, entity, procedural) to be truly intelligent and stateful, which traditional databases struggle to provide. Databricks Lakebase, built on Postgres, offers a unified OLTP and OLAP solution with features like serverless auto-scaling and Git-style branching to manage these complex memory needs for AI agents.
Events: Databricks News: Lakeflow Designer, UV package manager, DABs templates, Genie scheduled tasks
Databricks introduces Lakeflow Designer for visual data preparation, though its generated code is messy; a workaround uses Genie to convert the visual workflow into clean PySpark/SQL notebooks. The UV package manager significantly speeds up package installations on Databricks serverless runtimes, and DABs templates allow for standardized, customizable Databricks Asset Bundles.
Week of May 11
12 videos
News: Govern MCP servers in Databricks #databricks #mcp #aigovernance
Databricks Unity AI Gateway now governs MCP servers, centralizing their management alongside built-in foundation models and LLMs. This integration allows for easier governance and orchestration of various AI components and agents within Databricks.
News: How Suntory Turns Data into Faster Decisions with Databricks
Suntory uses Databricks to integrate diverse datasets, including internal sales, macroeconomic factors, and consumer behavior, into "Project Brain" for faster decision-making and product launches. The company also implements an all-employee upskilling program, "Manabi no Michi," to empower its workforce to leverage AI for improved performance and efficiency.
News: AIA Group x Databricks: Turning Regulated Data into Real-Time Intelligence
AIA Group leverages Databricks to manage regulated data across 18 markets, addressing challenges like data residency and varying tech maturity with features like Unity Catalog for governance. The platform enables real-time intelligence for investment decisions, fraud detection, and personalized agent coaching, with future plans for conversational analytics and autonomous AI.
Tutorials: Connect Google Sheets to Databricks
The Databricks Google Sheets add-in allows users to explore, import, and refresh governed data from the Databricks Lakehouse directly within Google Sheets. It demonstrates how to browse Unity Catalog, select tables or metric views, apply filters, schedule data refreshes, and use direct SQL queries with parameters.
News: No More Table Locks for Multi Statement Transactions #databricks #dataengineering #sql
Databricks now supports multi-table transactions, allowing changes to multiple tables within a single atomic transaction that rolls back all changes if any part fails. This feature, managed by Unity Catalog, prevents table locking during updates and supports up to 100 tables per transaction using a simple "BEGIN ATOMIC...END" syntax.
News: May 2026 Databricks Updates: No Code ETL, New GPUs and Death of the Dashboard
Databricks announced several updates including AI Prep Search for document chunking and vector database preparation, SQL vector functions for embedding mathematics, and the general availability of multi-table transactions. They also introduced Lakeflow Designer for visual, no-code data pipeline creation and updated their serverless GPU offerings to include H100s.
News: AI for Data Intelligence Demo: Real-time fraud Detection with Databricks
Databricks demonstrates a real-time fraud detection solution for identifying mule accounts in banking, leveraging a unified data architecture, advanced AI/ML, and graph analytics to uncover complex fraud networks. The solution provides investigators with a single pane of glass application and AI-powered querying (Genie) to analyze risk scores, transaction patterns, and shared device access for efficient fraud investigation and reporting.
Tutorials: How to use Meta Conversions API on Databricks to activate first-party data
The Databricks Meta Conversions API app enables users to send conversion events from the Databricks Lakehouse directly to Meta Ads Manager. It provides a guided setup to connect Databricks to Meta using a pixel ID and access token, allowing for quick testing with sample data, deploying customizable notebooks, or setting up automated jobs for continuous data flow.
Tutorials: Making AI Feel Personal: User-Delegated Actions in MCP Agent Systems
The video demonstrates how to build an AI agent in Databricks that provides personalized responses by integrating user-delegated actions through Model Context Protocol (MCP) servers. It walks through setting up Unity Catalog functions, external MCP tools like web search, and custom MCP servers to access internal APIs, all while maintaining user context for relevant information retrieval.
Tutorials: How to Build an AI Security Governance Hub with Agent Bricks
Databricks Agent Bricks enables building an AI Security Governance Hub by transforming static security playbooks into adaptive multi-agent systems. The video demonstrates combining a knowledge assistant for unstructured documents and a Genie space for structured data into a supervisor agent, then details how to tune and monitor these agents for improved performance and data privacy.
News: Data + AI Executive Series: Fast 5 — Scaling Real-Time Ops with Databricks at Aer Lingus
Aer Lingus uses Databricks to scale real-time operations, particularly for making critical decisions in their operation control center regarding flight delays and cancellations. They are also exploring using "Agentic" to automate business case creation and review, aiming for a single, governed platform for reusable agents.
News: Data + Semantic Context = AI Ready | How TK Elevator Built It on Databricks
TK Elevator built an AI-ready data platform on Databricks Lakehouse, centralizing fragmented elevator data at scale. This platform integrates semantic context and expert knowledge, using Unity Catalog for governance and a medallion architecture to prepare data for AI applications.
