Latest from the Databricks world.
Recent uploads from the Databricks team and a curated set of community creators. Filter by what you actually want to see.
Last week
17 videos
Events: Databricks News: CLI v 1.0.0, AI-tools, Docker, DABs UI sync, mutators
The video demonstrates new Databricks features, including the GA release of CLI 1.0.0, UI sync for DABs, Python mutators for bundle extension, and new Docker image options for custom runtimes. It also covers serverless pipeline orchestration, enhanced autoscaling for Lakebase and apps, serverless interactive execution timeout, and auto-scoping for access tokens.
News: The Hidden Logic: How AI Transforms Your Data 🧐
AI models implicitly convert string-based categorical data, like sentiment (positive, negative, mixed), into numerical representations. This conversion is essential for performing mathematical operations, such as calculating an average sentiment.
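To make the conversion concrete, here is a minimal sketch of turning sentiment labels into numbers so that an average can be computed. The numeric scale (-1, 0, 1) and the function name are assumptions chosen for illustration; the video does not specify which mapping a model uses internally.

```python
# Hypothetical mapping: this numeric scale is an illustrative assumption,
# not the encoding any particular model uses.
SENTIMENT_SCORES = {"negative": -1.0, "mixed": 0.0, "positive": 1.0}


def average_sentiment(labels):
    """Map each string label to its score and return the mean score."""
    if not labels:
        raise ValueError("need at least one label")
    scores = [SENTIMENT_SCORES[label.strip().lower()] for label in labels]
    return sum(scores) / len(scores)


reviews = ["positive", "negative", "mixed", "positive"]
print(average_sentiment(reviews))  # 0.25
```

Making the mapping explicit like this keeps the choice of scale visible and easy to change, which an implicit conversion does not.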
News: AI-Powered Data Cleaning in Databricks! 📊🤖
Databricks demonstrates cleaning data with an AI assistant by giving it an image of the desired output. The assistant transforms the existing data to match the structure and content shown in the attached image.
Tutorials: Is Your Azure Databricks Storage Exposed? (Enable Firewall now)
The video demonstrates how to enable firewall support for an Azure Databricks workspace storage account, preventing public network access. It walks through creating private endpoints, an access connector, and then executing a PowerShell command to configure the firewall and network security perimeter.
News: Databricks: Future of Storage Security Revealed!
Databricks is onboarding existing workspace storage accounts with enabled firewalls to Network Security Perimeter (NSP). This allows users of Databricks serverless to leverage enhanced storage security.
Tutorials: Import Local Files to Databricks Easily! ✨
Databricks Lakeflow Designer now allows users to import local files by dragging and dropping them onto the canvas. This feature simplifies bringing personal datasets into Databricks for analysis, addressing the common need to use data not yet stored in the platform.
Tutorials: Pro Tip: Add Multiple Tables Fast! 🚀
Users can quickly add multiple tables to a canvas by dragging them directly from the Catalog Explorer left panel. This method streamlines the process of adding several tables from the same schema or catalog, avoiding the need to create individual source nodes.
Tutorials: Building Real AI Agents (Fast!) | Microsoft Agent Framework Foundations | Part 2
The video demonstrates building AI agents using the Microsoft Agent Framework, covering basic agent setup, tool integration for external data, and managing conversation context and personalized interactions. It highlights the framework's simplified development, built-in telemetry, and modular design for creating robust AI agents.
Tutorials: Stop Leaving Your Azure Storage Open to the Public!
The video demonstrates how to enable firewall support for an Azure Databricks workspace storage account, preventing public network access. It walks through creating private endpoints, an access connector, and using a PowerShell command to configure the firewall.
News: Deploying Azure Databricks with Terraform? Watch this first!
This video demonstrates how to deploy an Azure Databricks workspace using Terraform by cloning a provided script, configuring variables, and executing Terraform commands. It walks through setting up prerequisites, authenticating Azure CLI, and populating a Terraform variables file to successfully provision the workspace.
News: Python Based Time Series Analytics on Databricks
Databricks partnered with AVL to create Impulse, an open-source Python framework for time series analytics on petabyte-scale automotive sensor data. Impulse standardizes raw sensor data into a silver layer data model, allowing engineers to query vast measurement data efficiently within the Databricks Lakehouse.
News: Databricks on Databricks: How Marketers Use Data 3x More with Genie, an AI Analytics Assistant
Databricks built "Marge," an AI analytics assistant powered by their Genie platform, to help its marketing team access and utilize data more efficiently. Marge provides conversational analytics by unifying marketing data in a lakehouse and offering governed, trusted insights in seconds, significantly reducing reliance on manual analyst reports.
News: Databricks Lakehouse for Automotive Data: How AVL Modernizes Vehicle Testing
AVL uses Databricks Lakehouse for Automotive Data to modernize vehicle testing by consolidating diverse, siloed data into a single platform. This enables engineers to efficiently analyze petabytes of data, accelerate development, and leverage AI for better, safer vehicles.
News: Easy Migration from Postgres to Databricks Lakebase
The video demonstrates a tool for migrating existing PostgreSQL databases to Databricks Lakebase, highlighting potential compatibility issues like session state, extensions, and authentication that require architectural adjustments. It shows how to validate a PostgreSQL database for Lakebase compatibility and then perform a migration using a CLI tool, emphasizing the speed and ease of the process for straightforward databases.
News: How LLMs Understand your Prompts: Tokenization & Embeddings | Chapter 05
The video explains how Large Language Models (LLMs) understand text by converting it into numerical representations through tokenization and embeddings. It demonstrates how text is broken into tokens, assigned unique IDs, and then transformed into dense vectors (embeddings) that capture semantic meaning and positional information for LLM processing.
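The pipeline described above (text to tokens, tokens to IDs, IDs to vectors with positional information) can be sketched in a few lines. This is a toy under stated assumptions: it splits on whitespace, whereas production LLMs typically use subword tokenizers, and it uses a random embedding table where a real model would use learned weights. All names here (`build_vocab`, `tokenize`, `embed`) are hypothetical.

```python
import math
import random


def build_vocab(corpus):
    """Assign a unique integer ID to each distinct whitespace-separated token."""
    vocab = {}
    for text in corpus:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def tokenize(text, vocab):
    """Convert text into a list of token IDs (unknown tokens raise KeyError)."""
    return [vocab[token] for token in text.lower().split()]


def positional_encoding(position, dim):
    """Sinusoidal position vector: sine on even indices, cosine on odd ones."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def embed(token_ids, table):
    """Look up each token's vector and add its positional encoding."""
    dim = len(table[0])
    return [
        [e + p for e, p in zip(table[tid], positional_encoding(pos, dim))]
        for pos, tid in enumerate(token_ids)
    ]


rng = random.Random(0)  # fixed seed so the toy table is reproducible
vocab = build_vocab(["the cat sat", "the dog sat"])
DIM = 4
# Random stand-in for a learned embedding table: one row per vocabulary entry.
table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in vocab]

ids = tokenize("the dog sat", vocab)
vectors = embed(ids, table)
print(ids)  # [0, 3, 2]
```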
News: Anthropic's SpaceX Deal, ClawPilot, and Databricks Agent-centric Cert | AI Newsround - May 2026
Anthropic signed a deal with SpaceX for AI supercomputing infrastructure, signaling the importance of compute supply in AI development. Google and Microsoft launched personal AI agents, Gemini Spark and Microsoft Scout, emphasizing ecosystem integration, trust, and governance.
Week of Jun 1
6 videos
News: Is This the Future of Enterprise AI? | Microsoft Agent Framework Foundations | Part 1
The Microsoft Agent Framework, now in version one, unifies Semantic Kernel and AutoGen into a single robust framework for enterprise AI solutions. It offers features like long-term memory, built-in guardrails, observability via OpenTelemetry, and integrated Azure Identity for secure and efficient agent development.
Tutorials: Trace Any AI Agent with OTel, MLflow, and Unity Catalog
Databricks now allows sending OpenTelemetry traces from any AI agent to Unity Catalog, enabling end-to-end observability and governance within the Databricks Lakehouse. This integration facilitates cost-effective trace storage, offline analytics, production monitoring, and continuous agent evaluation using MLflow.
Tutorials: Safe AI-Driven Development with Lakebase Branches
Databricks Lakebase branches enable instant, cost-efficient database branching using copy-on-write, allowing developers to test features in isolated environments without affecting production data. The video demonstrates creating and managing these branches via the Lakebase console and Databricks CLI, and shows how to integrate them into an agentic development workflow for safe AI-driven development.
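The copy-on-write idea behind cheap branching can be illustrated with a toy in-memory model. This is only a conceptual sketch, not how Lakebase is implemented: the `Branch` class and its methods are hypothetical, and a real system would also pin a branch to a point in time, whereas in this toy later writes to the parent remain visible to the child unless the child has overridden them.

```python
class Branch:
    """Toy copy-on-write branch: stores only the keys written on this branch."""

    def __init__(self, parent=None):
        self.parent = parent
        self.pages = {}  # local writes only; everything else is read from the parent

    def read(self, key):
        """Walk up the parent chain until some branch holds the key."""
        branch = self
        while branch is not None:
            if key in branch.pages:
                return branch.pages[key]
            branch = branch.parent
        raise KeyError(key)

    def write(self, key, value):
        """Write locally; the parent's data is never modified."""
        self.pages[key] = value

    def branch(self):
        """Creating a branch copies no data, only a pointer to the parent."""
        return Branch(parent=self)


main = Branch()
main.write("orders/1", "pending")
main.write("orders/2", "pending")

dev = main.branch()             # instant: nothing is copied
dev.write("orders/1", "shipped")

print(main.read("orders/1"))    # pending
print(dev.read("orders/1"))     # shipped
print(dev.read("orders/2"))     # pending (read through to main)
```

Because the branch stores only what changed, its cost grows with the size of the changes rather than the size of the database.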
News: Beyond the Alert Queue: Modern AML Operations with Multi-Agent AI on Databricks
Databricks demonstrates a multi-agent AI solution for Anti-Money Laundering (AML) operations, significantly reducing false positives and accelerating investigation cycles from hours to minutes. The platform unifies siloed systems, employs specialized AI agents for analysis and recommendations, and offers AI-assisted SAR generation and executive-level reporting with natural language chat.
News: When to choose CPU vs GPU: Databricks AI Runtime Explained
CPUs are best for data work like ETL, feature engineering, SQL, and classical machine learning, while GPUs are designed for deep learning workloads such as fine-tuning LLMs and training neural networks. Databricks AI Runtime simplifies GPU usage by providing serverless Nvidia GPUs, removing the need for manual infrastructure setup and allowing seamless transitions between CPU for data prep and GPU for model training within the Databricks environment.
Tutorials: How Large Language Models (LLMs) Work - Full Explanation | Chapter 04
Large Language Models (LLMs) are text-based neural networks trained on massive data to predict the next word (token), operating through tokenization, vector embeddings, and a transformer architecture. LLMs undergo pre-training, supervised fine-tuning, and reinforcement learning from human feedback to become helpful, safe, and aligned, with concepts like context length, knowledge cut-off, and hallucination defining their capabilities and limitations.
Week of May 25
4 videos
Tutorials: The New Databricks Lakeflow Designer Is a Game Changer!
Databricks Lakeflow Designer is a visual data preparation tool that allows users to create, add, and transform data using a no-code drag-and-drop UI or AI-powered Genie Code. The video demonstrates how to import data from various sources, profile data, perform complex transformations like data type conversions and sentiment analysis, and then deploy the resulting production-ready PySpark code for scheduling or integration into existing pipelines.
News: Terraform AWS Databricks Deployment Guide!
The video demonstrates how to deploy an AWS Databricks workspace using a provided Terraform script. It covers prerequisites, AWS and Databricks authentication, variable configuration, and executing the Terraform commands to create the workspace.
Tutorials: Secure Serverless: Azure Private Link Service Direct Connect
The video demonstrates how to set up Azure Private Link Service Direct Connect to enable secure, private connectivity from Databricks serverless compute to any private IP address, such as an on-premises database. It details the architecture, prerequisites, and a step-by-step demo of configuring the Private Link Service and a Databricks Network Connectivity Configuration (NCC) to connect to a MySQL instance.
Tutorials: The Future of Finance Operations Starts Here
The video demonstrates how Databricks' financial lakehouse solution addresses common finance data challenges like fragmentation and slow analysis. It showcases features like Unity Catalog for data governance, Lakeflow for pipeline management, and Genie Spaces for natural language querying of financial data.
Week of May 18
9 videos
News: How Neural Network works | Weights and Bias #dataengineering #neuralnetworks #genai
Each neuron in a neural network assigns a weight to every input signal to reflect that input's importance (e.g., monthly income gets a high positive weight, outstanding debts a negative weight). The weighted inputs are summed with a bias, and the result is passed through an activation function to produce an output decision.
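The computation described above, a weighted sum plus bias passed through an activation function, fits in a few lines. The weights, bias, and input values below are hand-picked assumptions for illustration, and sigmoid is just one common choice of activation; the video does not state specific numbers.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, squashed to (0, 1) by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked parameters (not learned from data):
# income pushes the score up, outstanding debt pushes it down.
weights = [0.8, -0.6]   # [monthly income, outstanding debts]
bias = -0.1

applicant = [1.0, 0.5]  # inputs assumed to be pre-scaled to a small range
score = neuron(applicant, weights, bias)

# Same income, more debt: the negative weight lowers the score.
riskier = neuron([1.0, 2.0], weights, bias)

print(round(score, 4), round(riskier, 4))
```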
Tutorials: Building Trustworthy, High-Quality AI Agents with MLflow
Databricks' MLflow platform helps developers build trustworthy, high-quality AI agents by providing tools for end-to-end observability, evaluation, prompt management, and AI gateway governance. It demonstrates how MLflow facilitates tracing, expert feedback collection, automated issue detection with LLM judges, prompt optimization, and continuous monitoring throughout the agent development lifecycle.
Tutorials: Building Enterprise-Ready Agents using Agent Bricks
Databricks Agent Bricks is a unified platform designed to help enterprises build and manage AI agents, addressing challenges like low-quality reasoning on proprietary data, lack of governance, and fragmented toolchains. It demonstrates how to create knowledge assistants for unstructured data and AI Genies for structured data, integrating with Unity Catalog for governance and MLflow for observability and evaluation.
Tutorials: Neural Networks Explained - How They Work & Are Trained | Chapter 03
This video explains how artificial neural networks (ANNs) work, detailing the components of a neuron (inputs, weights, bias, activation function) and how they form layers in a network. It also covers the training process, including forward propagation, loss calculation, and backpropagation using gradient descent to adjust weights and biases.
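The training loop outlined above (forward pass, loss calculation, gradients, parameter update) can be shown on the smallest possible network: a single sigmoid neuron learning the logical OR function. This is a sketch under assumptions not taken from the video: the dataset, learning rate, and epoch count are arbitrary, and with only one neuron the backward pass reduces to a single application of the chain rule rather than full multi-layer backpropagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w, b):
    """Forward propagation for one neuron: weighted sum, bias, activation."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)


def mean_loss(data, w, b):
    """Average binary cross-entropy over the dataset."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in data:
        p = forward(x, w, b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(data, lr=1.0, epochs=1000):
    """Full-batch gradient descent on weights and bias."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in data:
            # For sigmoid + cross-entropy, dLoss/dz simplifies to (prediction - label).
            error = forward(x, w, b) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Step against the average gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

loss_before = mean_loss(OR_DATA, [0.0, 0.0], 0.0)
w, b = train(OR_DATA)
loss_after = mean_loss(OR_DATA, w, b)
predictions = [round(forward(x, w, b)) for x, _ in OR_DATA]
print(predictions)  # [0, 1, 1, 1]
```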
News: Apache Iceberg V3 on Databricks: From Ingestion to Analytics
The video demonstrates Apache Iceberg v3 on Databricks, showcasing how its new variant column type natively handles semi-structured data and how row-level concurrency enables simultaneous data ingestion and corrections. It also highlights cross-platform data accessibility from open-source Spark via the Iceberg REST catalog, ensuring no vendor lock-in.
News: Databricks Genie for Marketing
Databricks' AI/BI Genie allows non-technical marketers to converse with their Customer 360 data using natural language, enabling quick insights into marketing performance and campaign optimization. It helps identify issues like audience saturation and recommends budget reallocation by analyzing data and providing reasoning for its suggestions.
Community: How I Mastered System Design Interviews
This video teaches a six-step framework for mastering data engineering system design interviews, covering requirements gathering, pipeline design, data modeling, storage and file formats, data quality and observability, and pipeline resilience. It demonstrates how to apply this framework with practical examples and back-of-the-envelope calculations to justify design choices.
Tutorials: AI Agents That Remember: Building Stateful Systems with Lakebase
AI agents require four types of memory (working, episodic, entity, procedural) to be truly intelligent and stateful, which traditional databases struggle to provide. Databricks Lakebase, built on Postgres, offers a unified OLTP and OLAP solution with features like serverless auto-scaling and Git-style branching to manage these complex memory needs for AI agents.
Events: Databricks News: Lakeflow Designer, UV package manager, DABs templates, Genie scheduled tasks
Databricks introduces Lakeflow Designer for visual data preparation, though its generated code is messy; a workaround uses Genie to convert the visual workflow into clean PySpark/SQL notebooks. The UV package manager significantly speeds up package installations on Databricks serverless runtimes, and DABs templates allow for standardized, customizable Databricks Asset Bundles.
Week of May 11
15 videos
News: Govern MCP servers in Databricks #databricks #mcp #aigovernance
Databricks Unity AI Gateway now governs MCP servers, centralizing their management alongside built-in foundation models and LLMs. This integration allows for easier governance and orchestration of various AI components and agents within Databricks.
News: How Suntory Turns Data into Faster Decisions with Databricks
Suntory uses Databricks to integrate diverse datasets, including internal sales, macroeconomic factors, and consumer behavior, into "Project Brain" for faster decision-making and product launches. The company also implements an all-employee upskilling program, "Manabi no Michi," to empower its workforce to leverage AI for improved performance and efficiency.
News: AIA Group x Databricks: Turning Regulated Data into Real-Time Intelligence
AIA Group leverages Databricks to manage regulated data across 18 markets, addressing challenges like data residency and varying tech maturity with features like Unity Catalog for governance. The platform enables real-time intelligence for investment decisions, fraud detection, and personalized agent coaching, with future plans for conversational analytics and autonomous AI.
Tutorials: Connect Google Sheets to Databricks
The Databricks Google Sheets add-in allows users to explore, import, and refresh governed data from the Databricks Lakehouse directly within Google Sheets. It demonstrates how to browse Unity Catalog, select tables or metric views, apply filters, schedule data refreshes, and use direct SQL queries with parameters.
News: No More Table Locks for Multi Statement Transactions #databricks #dataengineering #sql
Databricks now supports multi-table transactions, allowing changes to multiple tables within a single atomic transaction that rolls back all changes if any part fails. This feature, managed by Unity Catalog, prevents table locking during updates and supports up to 100 tables per transaction using a simple "BEGIN ATOMIC...END" syntax.
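The all-or-nothing behavior described above can be demonstrated locally with Python's built-in `sqlite3` module as a stand-in. This sketch does not use the Databricks syntax mentioned in the video and makes no claim about how Unity Catalog coordinates the commit; the tables, the `transfer` helper, and the `CHECK` constraint are hypothetical and exist only to force a failure partway through a multi-table change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    CREATE TABLE transfers (
        id INTEGER PRIMARY KEY,
        src INTEGER, dst INTEGER, amount INTEGER
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")


def transfer(conn, src, dst, amount):
    """Change two tables atomically: either every statement applies or none does."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute(
                "INSERT INTO transfers (src, dst, amount) VALUES (?, ?, ?)",
                (src, dst, amount),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the earlier INSERT and UPDATE were undone too.
        return False


ok = transfer(conn, 1, 2, 30)        # succeeds
rejected = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it rolls back

balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
transfer_count = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
print(ok, rejected, balances, transfer_count)  # True False [(70,), (80,)] 1
```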
News: May 2026 Databricks Updates: No Code ETL, New GPUs and Death of the Dashboard
Databricks announced several updates including AI Prep Search for document chunking and vector database preparation, SQL vector functions for embedding mathematics, and the general availability of multi-table transactions. They also introduced Lakeflow Designer for visual, no-code data pipeline creation and updated their serverless GPU offerings to include H100s.
News: AI for Data Intelligence Demo: Real-time fraud Detection with Databricks
Databricks demonstrates a real-time fraud detection solution for identifying mule accounts in banking, leveraging a unified data architecture, advanced AI/ML, and graph analytics to uncover complex fraud networks. The solution provides investigators with a single pane of glass application and AI-powered querying (Genie) to analyze risk scores, transaction patterns, and shared device access for efficient fraud investigation and reporting.
Tutorials: How to use Meta Conversions API on Databricks to activate first-party data
The Databricks Meta Conversions API app enables users to send conversion events from the Databricks Lakehouse directly to Meta Ads Manager. It provides a guided setup to connect Databricks to Meta using a pixel ID and access token, allowing for quick testing with sample data, deploying customizable notebooks, or setting up automated jobs for continuous data flow.
Tutorials: Making AI Feel Personal: User-Delegated Actions in MCP Agent Systems
The video demonstrates how to build an AI agent in Databricks that provides personalized responses by integrating user-delegated actions through Model Context Protocol (MCP) servers. It walks through setting up Unity Catalog functions, external MCP tools like web search, and custom MCP servers to access internal APIs, all while maintaining user context for relevant information retrieval.
Tutorials: How to Build an AI Security Governance Hub with Agent Bricks
Databricks Agent Bricks enables building an AI Security Governance Hub by transforming static security playbooks into adaptive multi-agent systems. The video demonstrates combining a knowledge assistant for unstructured documents and a Genie space for structured data into a supervisor agent, then details how to tune and monitor these agents for improved performance and data privacy.
News: Data + AI Executive Series: Fast 5 — Scaling Real-Time Ops with Databricks at Aer Lingus
Aer Lingus uses Databricks to scale real-time operations, particularly for making critical decisions in their operation control center regarding flight delays and cancellations. They are also exploring using "Agentic" to automate business case creation and review, aiming for a single, governed platform for reusable agents.
News: Data + Semantic Context = AI Ready | How TK Elevator Built It on Databricks
TK Elevator built an AI-ready data platform on Databricks Lakehouse, centralizing fragmented elevator data at scale. This platform integrates semantic context and expert knowledge, using Unity Catalog for governance and a medallion architecture to prepare data for AI applications.
Events: Building Trustworthy, High-Quality AI Agents with MLflow
MLflow provides a comprehensive platform for building, evaluating, and deploying high-quality AI agents, offering tools for observability, automated evaluation, prompt optimization, and production monitoring. It enables developers to streamline the agent development lifecycle, from prototyping and testing with human and AI judges to fixing issues and ensuring reliable, governed deployment.
News: Evaluating AI in Production: A Practical Guide
The video provides a practical guide to evaluating AI in production, emphasizing that evaluation is a continuous process, not a one-time task. It details common evaluation processes, including developing hypotheses, gathering improvement signals, defining success criteria, and utilizing various scoring methods like code-based, LLM-as-judge, and human review.
News: Enhancing your Skills with Databricks Genie Code
Databricks Genie Code is an agentic coding system that allows users to build custom "skills" using markdown files, enabling it to generate code and perform tasks according to specific in-house standards and conventions. These skills provide context-on-demand, ensuring repeatable and consistent output for various engineering tasks like schema documentation or metric view creation.
Week of May 4
6 videos
News: 2026 & Beyond: Agentic Future in Finance
Databricks emphasizes that an "agentic future" in finance requires organizations to leverage their unique, proprietary data to provide context to AI models, which is the true competitive advantage. The video demonstrates how Databricks' platform centralizes and governs enterprise data, enabling AI agents to make informed, secure, and differentiated business decisions.
Releases: Introducing Databricks Document Intelligence
Databricks Document Intelligence is a new solution for extracting, processing, and analyzing unstructured data from documents using large language models. It offers a unified platform for document processing, including data extraction, summarization, and question answering, with a focus on accuracy and scalability.
News: Databricks Genie, Unity AI Gateway, Project Glasswing, and Model Mania | AI Newsround - April 2026
Databricks Genie is now the business user home screen for Databricks, offering a unified chat interface, external knowledge store connections, and a mobile app. The Unity AI Gateway, integrated with Unity Catalog, provides comprehensive governance for agentic AI, including permissions, auditing, and policy controls for models and tools.
News: Databricks in 3 minutes. The unified data and AI platform, explained.
Databricks unifies diverse data sources into a single data lake, providing a governed platform for analytics and AI. It offers capabilities like fine-grained access control, natural language querying with AI, and company-wide intelligent agents.
Tutorials: Machine Learning Explained - END to END | Chapter 02
The video explains core machine learning concepts, including supervised, unsupervised, and reinforcement learning, along with the workflow for building and evaluating models. It details classification and regression models, their applications, and essential data preparation techniques like feature engineering and handling the curse of dimensionality.
News: Databricks News: watermark-based incremental ingestion, MCP in AI gateway, Genie, Vector Search
Databricks now offers watermark-based incremental ingestion from SQL databases without change data feed, allowing for efficient data updates and soft deletion handling. The AI Gateway supports custom MCP servers, enabling integration with external APIs like GitHub for enhanced AI application development.
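The general watermark pattern can be sketched in plain Python: remember the highest modification timestamp seen so far, and on each run pull only rows modified after it. This illustrates the technique in the abstract and is not the Databricks connector's implementation; the column names (`updated_at`, `is_deleted`) and both helper functions are hypothetical.

```python
from datetime import datetime


def incremental_pull(source_rows, watermark):
    """Return rows modified after the watermark, plus the advanced watermark."""
    changed = [row for row in source_rows if row["updated_at"] > watermark]
    new_watermark = max((row["updated_at"] for row in changed), default=watermark)
    return changed, new_watermark


def apply_changes(target, changed):
    """Upsert by primary key; soft-deleted rows arrive as ordinary updates."""
    for row in changed:
        target[row["id"]] = dict(row)


source = [
    {"id": 1, "name": "alpha", "is_deleted": False, "updated_at": datetime(2026, 1, 1)},
    {"id": 2, "name": "beta", "is_deleted": False, "updated_at": datetime(2026, 1, 2)},
]
target, watermark = {}, datetime.min

# First run: everything is newer than the initial watermark.
first, watermark = incremental_pull(source, watermark)
apply_changes(target, first)

# Between runs: row 2 is soft-deleted and row 3 is inserted.
source[1] = {"id": 2, "name": "beta", "is_deleted": True, "updated_at": datetime(2026, 1, 3)}
source.append({"id": 3, "name": "gamma", "is_deleted": False, "updated_at": datetime(2026, 1, 4)})

# Second run: only the two changed rows are read; row 1 is skipped.
second, watermark = incremental_pull(source, watermark)
apply_changes(target, second)

print(len(first), len(second), target[2]["is_deleted"])  # 2 2 True
```

Two properties of this pattern are worth noting. A soft delete is visible because it updates the row's timestamp, whereas a hard delete leaves nothing to pull. The strict `>` comparison can also miss a row that is committed late with a timestamp equal to the current watermark, so practical implementations often re-read a small overlap window.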
Week of Apr 27
12 videos
News: Easy hack to optimize Scala and Java in Databricks
Databricks now supports running Java and Scala on Serverless Jobs using JAR files, so existing workloads do not have to be rewritten in another language. Users build a JAR against the matching Databricks versions, add it as a job task, configure the main class and compute, and then run it.
Tutorials: Step-by-Step: Using the Databricks Excel Add-in to Analyze Governed Lakehouse Data
Tutorials: How To Build Data Apps with Databricks, Power Apps, and Power Automate
The video demonstrates how to connect Power Apps, Power Automate, and Databricks to build data-driven applications. It shows how to add a Power Automate flow to a Power App and trigger a Databricks job using a button within the app.
News: Zerobus Ingest, Lakebase and Databricks Apps in Action: Data Streaming with Databricks
The video demonstrates a real-time IoT data streaming application built with Zerobus for ingestion, Lakebase for low-latency serving, and Databricks Apps for the front and back ends. This architecture processes thousands of concurrent IoT events from mobile phone sensors globally without using Kafka or traditional complex pipelines.
News: Talkdesk Powers AI-Driven CX with Databricks on AWS
Talkdesk uses Databricks on AWS as a unified data platform to power its AI-driven customer experience (CX) platform, which automates and accelerates customer interactions. Databricks centralizes data storage, provides consistent data modeling, and unifies data processing pipelines, enabling Talkdesk to manage both unstructured and structured data in Iceberg format and leverage generative AI capabilities.
Tutorials: How To Connect Power Apps to Databricks for Secure, Zero‑Copy Data Access
The video demonstrates how to connect Microsoft Power Apps to Azure Databricks for secure, zero-copy data access. It shows how to create a connection, load data into a Power App, and perform create, read, update, and delete operations directly on Databricks data, with auditing capabilities.
News: From AI to Agents | Fundamentals of AI | ML | DL | LLM & GenAI | Chapter 01
The video explains the fundamental concepts of AI, ML, DL, LLMs, and GenAI, illustrating their hierarchical relationship as subsets of each other. It also defines what models are (mathematical formulas trained on data) and how agents combine LLMs with tools and optional memory to perform autonomous tasks.
Tutorials: Apache Spark Streaming Real-Time Mode - Latency Demo
The video demonstrates how to deploy and run Apache Spark Streaming in Real-Time Mode (RTM) using a declarative automation bundle. It shows that RTM significantly reduces P50 and P95 latencies compared to microbatch mode, achieving 26ms and 50ms respectively in a simplified setup without an external messaging bus.
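For readers unfamiliar with the P50 and P95 figures quoted above, here is a minimal sketch of how such percentiles can be computed from raw latency samples using the nearest-rank method. The sample values are made up for illustration and are not measurements from the video; other percentile definitions (for example, with interpolation) give slightly different numbers.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Made-up end-to-end latencies in milliseconds, including one slow outlier.
latencies_ms = [18, 22, 25, 26, 27, 30, 34, 41, 48, 95]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(p50, p95)  # 27 95
```

P50 describes the typical request, while P95 exposes the slow tail that an average would hide, which is why streaming latency is usually reported with both.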
Tutorials: Air Traffic Control with Apache Spark Structured Streaming Real-Time Mode
The video demonstrates building a real-time air traffic control application using Apache Spark Structured Streaming Real-Time Mode, Lakehouse, and Databricks Apps. This system processes live flight telemetry, detects congestion, and generates alerts with sub-second end-to-end latency, all within a single Databricks platform.
Releases: Step-by-Step: Connecting Databricks to Excel Using the Databricks Excel Add-In
The Databricks Excel add-in provides governed access to Databricks lakehouse data directly within Excel, enabling business users to query data without SQL. The video demonstrates how to self-service install the add-in by editing and uploading its manifest XML file into Excel web.
Tutorials: Lakebase and PG Vector: Vector Search of the Future?
The video demonstrates how to implement vector search using Lakebase and pgvector within Databricks, focusing on two patterns: Lakebase native and reverse ETL from the lakehouse. It walks through setting up a maintenance co-pilot application that uses pgvector for semantic search, joins, and filtering on maintenance logs, showcasing the process from data embedding to app deployment and job scheduling for continuous updates.
News: Lovable now integrates with Databricks
Lovable now integrates with Databricks, allowing users to build data applications and tools using plain English prompts to access and write data to their Databricks Lakehouse. This connector enables rapid development of dashboards and applications while maintaining data governance and controlled access to specific catalogs, schemas, and tables.
Week of Apr 20
12 videos
Releases: How OpenAI and Databricks are working together
Databricks and OpenAI are partnering to help enterprises deploy and adopt AI, with Databricks focusing on secure data access and management for AI applications through products like Genie and AI Gateway. The video highlights GPT 5.5's enhanced planning capabilities and its leading performance in office knowledge work benchmarks, demonstrating its impact beyond coding to automate internal business processes.
News: Making AI understand your data - part 2 #databricks #data #ai
Databricks metric views allow for advanced data definitions using joins, including nested joins with runtime 17.1+, and complex calculations with windowing for time-based analysis. Materialization can precompute popular metric views with incremental updates, and semantics can be added for non-technical users using runtime 17.2+.
News: How Techcombank Scales AI Banking to 16M Customers with Databricks
Techcombank uses Databricks to power its AI banking platform, serving 16.2 million customers and processing 8 billion daily transactions with a 12,000-plus feature store. This enables the bank to make data-driven decisions, automate lead allocation with over 8,000 features, and achieve a 3x conversion uplift, improving both productivity and customer experience.
News: Are You Drowning in a Sea of Data Requests? #DataAnalytics #Help
The video uses a restaurant metaphor to explain why Business Intelligence (BI) teams become overloaded. It likens IT to kitchen staff, data to ingredients, analysts to waiters, and the business to customers, highlighting the bottleneck created when too many customer requests overwhelm the limited number of analysts.
News: Git-Style Database Branching (But Actually Fast) #database #lakebase
Lakebase enables Git-style database branching by creating metadata-only branches instead of full data copies. This allows users to create dev, QA, and prod branches that point to the main branch without duplicating the entire dataset.
Community: From Notebook to Production: MLOps Quickstart
The video demonstrates how to apply MLOps best practices on Databricks using a quickstart repository, covering data ingestion, feature preprocessing, model training, deployment, and inference. It showcases Databricks tools like MLflow and Unity Catalog for managing the ML lifecycle, including version control, experiment tracking, model governance, and automated deployment across development and production environments.
Tutorials: Governed Tags & Data Classification in Databricks | ABAC Foundations
Databricks now offers governed tags and automated data classification to identify sensitive information like PII. This enables Attribute-Based Access Control (ABAC) policies for masking or hiding data based on user roles, without altering query patterns.
News: GenAI - For Data Engineers Agenda & Introduction | LLM & Agentic AI | LangChain & LangGraph | Claude
This video introduces a new course, "GenAI for Data Engineers," designed to teach data engineers how to leverage generative AI, LLMs, and agentic AI. The course covers basics of LLMs, building agents with LangChain and LangGraph, using Claude Code, and applying agentic AI within Databricks and data engineering workflows.
Tutorials: Reverse ETL: Exposing Gold Layer Data to Lakebase!
Reverse ETL allows exposing gold layer tables from a medallion architecture to Lakebase. This enables applications to read and write to these exposed tables, such as a dim customer table.
Tutorials: Real-Time ML Lookups: Lakebase for Zero Latency!
Lakebase enables real-time ML lookups by syncing data from Delta tables, offering a low-latency alternative to querying large gold tables directly. This reverse ETL process allows ML models to access necessary data quickly for real-time predictions.
News: Databricks AI Dev Toolkit: 10x Your Development
The Databricks AI Dev Toolkit is a repository created by the field engineering team to enable MCP tools and skills for building on Databricks. It can be attached to a coding agent to accelerate development on Databricks tenfold.
News: How Agentic AI is Rewriting Healthcare | NVIDIA x Databricks
Agentic AI is profoundly changing healthcare by automating administrative tasks for professionals and accelerating scientific research, such as drug discovery. Databricks and NVIDIA are collaborating to build an AI-ready data layer and open-source platforms to unlock insights from digitized medical data, enabling these agentic systems.
Week of Apr 13
7 videos
NewsZerobus Ingest and Lakebase in Action: Data Streaming with Databricks
The video demonstrates a real-time IoT data streaming application built with Zerobus for ingestion, Lakebase for low-latency serving, and Databricks Apps for the front end and back end, without relying on Kafka. It showcases how thousands of concurrent IoT events from mobile phone sensors worldwide are ingested, processed, and visualized on a map, with traces served by Lakebase for fast access.
NewsMaking AI understand your data - part 1 #ai #data #texttosql #code #vibecoding
Databricks metric views help AI understand data by defining official sources and business logic, preventing the inconsistent results that direct queries can produce. The video demonstrates creating a metric view in Unity Catalog, which can then be used with SQL or AI text-to-SQL tools for consistent data analysis.
TutorialsEnable Storage Firewall in Databricks - Security Tutorial
This video demonstrates how to enable firewall support for an Azure Databricks workspace storage account to restrict public network access. It outlines prerequisites, guides through creating private endpoints, verifying network connectivity configurations, and finally executing a PowerShell command to enable the storage firewall.
NewsDatabricks AI Dev Toolkit: Empowering Workspace Users
The Databricks AI Dev Toolkit gives workspace users, even those unfamiliar with IDEs, access to AI tools via a Databricks app serving an MCP server. It supercharges the Genie Code agent with MCP tools to automate resource creation.
NewsAsk Genie Anywhere | Bring AI/BI Genie to Microsoft Teams & M365 Copilot via Copilot Studio
Databricks' AI/BI Genie, a data analyst agent, now integrates natively with Microsoft Copilot Studio, allowing organizations to embed Genie into Microsoft Teams, M365 Copilot, and SharePoint. This enables users to ask data questions and receive insights directly within their collaboration tools, without leaving their workflow.
News10 Data Warehouse Migration Myths Blocking AI-readiness
The video debunks three myths about data warehouse migration to Databricks: the need for a massive new team, migrations being a sunk cost, and projects always blowing past deadlines. It explains that modern lakehouse architecture empowers existing teams, consolidating initiatives removes complexity, and a phased approach delivers value quickly.
NewsDatabricks Apps vs Model Serving: Authentication, Cost, and Performance Compared
Databricks Apps are now the recommended first choice for deploying agents due to their flexibility in handling full-stack applications with multiple components, offering faster iteration and local testing compared to Model Serving. Model Serving remains suitable for use cases prioritizing high QPS, governance features like AI Gateway, inference tables, and guardrails, or when scaling to zero is acceptable for cost optimization.
Week of Apr 6
23 videos
NewsGainwell Transforms Health Data with Databricks on AWS
Gainwell Technologies uses Databricks on AWS to modernize Medicaid and public health programs, enabling rapid data analysis and improved team collaboration. The platform helps drive health outcomes and lower care costs by using AI to quickly process medical records for tasks like prior authorizations, reducing review times from 45 minutes to under 10 minutes.
EventsStrategic App Expansion and the Power of Proprietary Data | Ali Ghodsi at HumanX
Databricks plans to strategically expand its SaaS application offerings, focusing on areas where proprietary data, security, and governance create a strong competitive moat. The company will prioritize applications that leverage its expertise in massive data processing.
EventsHow Databricks Manages Enterprise Data and AI | Ali Ghodsi at HumanX
Databricks centralizes an organization's data from various systems into a Lakehouse, securing it and setting access rules. This consolidated and secured data then feeds into AI agents, models, and analytics for business forecasting and insights.
EventsSolving the AI Reliability Gap | Ali Ghodsi at HumanX
AI agents currently struggle with end-to-end tasks due to a lack of context, not intelligence. Addressing this reliability gap requires capturing context and changing organizational processes, a multi-year effort that Databricks is focused on.
EventsThree Things Required for Deeper Insights from AI | Ali Ghodsi at HumanX
Databricks enables deeper AI insights by combining agents and AI with a robust database and an analytics platform. This approach allows enterprises to leverage their proprietary data for predictive analytics beyond what traditional SaaS applications offer.
EventsAI Productivity and the PC Revolution Analogy | Ali Ghodsi at HumanX
AI offers 20-30% immediate productivity gains, especially in coding, but its full potential is hindered by a lack of context. Achieving greater automation requires re-engineering entire enterprise processes, similar to how early PC users initially treated them as typewriters before fully integrating them.
EventsHow Databricks Genie is Transforming Data Analysis in Minutes | Ali Ghodsi at HumanX
Databricks Genie allows scientists to quickly query complex data, like adverse effects in obesity studies, receiving accurate, referenced answers in minutes instead of months. Businesses like easyJet use Genie to build agents that combine real-time data on seat availability, competitive pricing, and demand to dynamically set prices, a process that previously took months.
EventsHow Novo Nordisk Uses Databricks Genie for Research | Ali Ghodsi at HumanX
Novo Nordisk utilizes Databricks Genie to enable its scientists to query data warehouses and databases. This allows researchers to ask complex questions about studies, such as adverse effects, and receive accurate, statistically referenced answers.
EventsAI Cut Exploit Time to 1.3 Days | Ali Ghodsi at RSAC 2026
AI has drastically reduced the mean time to exploit a vulnerability from over two years in 2018 to an average of 1.3 days in 2026. This acceleration, particularly since ChatGPT's release, indicates AI's role in rapidly weaponizing CVEs.
EventsManaging 32,000 Weekly Security Alerts | Ali Ghodsi at RSAC
Weekly security alerts for a reasonably sized organization are projected to increase from 7,500 in 2020 to 32,000 in 2026, requiring over 400 full-time staff to manually process. This demonstrates the unsustainability of current manual security alert management and the urgent need for automated solutions.
EventsThe Case for Open Data Architecture | Ali Ghodsi at RSAC 2026
The video advocates for an open data architecture where organizations store their data in open formats on data lakes, preferably in the cloud, to avoid vendor lock-in and control costs. This approach allows for using various tools to access and manage data, with federation technology enabling access to data in proprietary systems during a gradual migration.
EventsThe Limits of Human-Led Security Operations | Ali Ghodsi at RSAC
Current Security Information and Event Management (SIEM) systems are limited by data ingestion pricing models, leading to incomplete data capture and a lack of long-term historical analysis. Furthermore, detection, investigation, and threat hunting within these systems are largely manual, leaving security operations teams overwhelmed and able to detect only a fraction of potential threats.
EventsWhy Legacy SIEM Models Are Struggling | Ali Ghodsi at RSAC 2026
Legacy SIEM models struggle against AI-driven agent swarms because they rely on incomplete data, human SOC teams, and proprietary silos. This approach is unsustainable, leading to the prediction that AI will replace SIEM this year.
NewsDatabricks News: AUTO CDC, Workspace skills, Ask Genie, and Type widening
Databricks introduces Auto CDC for efficient change data feed processing, notebook and governed tags for better organization, and workspace skills for Ask Genie to customize its responses. Databricks also adds type widening for streaming tables, allowing data types to automatically adjust to larger incoming values.
EventsThe 1.3 Day Exploit: How AI is Accelerating Cyber Threats | Ali Ghodsi at RSAC 2026
AI has drastically reduced the mean time to exploit vulnerabilities from over two years in 2018 to an average of 1.3 days by 2026. This acceleration, particularly since ChatGPT's release, indicates AI is now automating cyber threat exploitation.
NewsWhy Manual Security Operations are Failing in 2026 | Ali Ghodsi at RSAC
Manual security operations are failing because the volume of weekly security alerts is projected to increase from 7,500 in 2020 to 32,000 in 2026, requiring an unsustainable 400+ full-time employees for an average organization. This exponential growth in alerts makes it impossible for human teams to process and respond effectively.
TutorialsLakebase - OLTP Workloads on Databricks!
Lakebase is a fully managed, serverless PostgreSQL offering from Databricks that decouples compute and storage, enabling independent scaling, auto-scaling to zero, and deep integration with the Databricks Lakehouse. It supports reverse ETL to bring data from the Lakehouse into Lakebase for OLTP applications and forward ETL to sync transactional data back to the Lakehouse for analytics.
TutorialsHow to Get AI Dev Tools Running in Databricks Today #tutorial #AI #coding
The video demonstrates how to enable the Databricks AI Dev Toolkit within the Databricks workspace. It addresses the challenge of setting up these AI development tools for users who prefer the Databricks workspace over a local IDE.
ReleasesDatabricks Genie Code, Carl, Bull**** Bench & more! | AI Newsround - March '26 | Advancing Analytics
The video discusses Databricks' new AI tools: Genie Code for autonomous data work, and Carl for faster, cost-efficient enterprise knowledge agents using custom reinforcement learning. It also covers Bull**** Bench V2, which evaluates AI models' ability to detect and push back on nonsense, along with updates to various models like Qwen 3.5, Gemini 3.1 Flash-Lite, and OpenAI's GPT-5.3 Instant, 5.4, Mini, and Nano, highlighting their focus on agent capabilities and cost-efficiency.
TutorialsEasily create metric tracking tables using Spark Declarative Pipelines in Databricks
The video demonstrates how to create metric tracking tables in Databricks using Spark Declarative Pipelines. It shows how to use the create_auto_cdc_from_snapshot_flow function to automatically track changes in a materialized view over time, enabling historical analysis for dashboards.
NewsStop Guessing Table Health — Let These Dashboards Tell You
Databricks offers two dashboards for monitoring table health and access: the Table Access Advisor and the Table Health Advisor. These dashboards provide insights into table ownership, read/write patterns, staleness, optimization status, and underlying file structures, helping users identify ghost tables and ensure best practices.
TutorialsFrom Excel to AI Agents: The Evolution of BI Explained
The video explains the evolution of Business Intelligence (BI) through four phases, from IT-centric to analyst-driven, then semantic layers, and finally to a future where AI agents are primary BI users. It demonstrates how Databricks' BI stack, including Dashboards, Genie (natural language interface), Metric Views (semantic layer), and Databricks One (serving layer), addresses these evolving needs by providing a unified, open, and AI-ready platform.
TutorialsHow to Sync Lakebase Tables to Delta with Lakehouse Sync
Databricks demonstrates how to sync Lakebase PostgreSQL tables to Delta tables within a Databricks Lakehouse using the Lakehouse Sync feature. This process enables analytical workloads on data originating from Lakebase applications by leveraging Delta and Spark.
Week of Mar 30
8 videos
TutorialsYour Delta Tables Deserve a Postgres Home
Databricks demonstrates syncing Delta tables from Unity Catalog to a Postgres database within Lakebase, enabling OLTP-style quick lookups for applications. Users can configure continuous, on-demand snapshot, or triggered sync modes, defining primary keys and grouping tables into pipelines for efficient data transfer.
ReleasesDeploy Azure Databricks in 5 Minutes — VNET Injection + NAT Gateway
The video demonstrates how to deploy an Azure Databricks workspace with VNET injection and NAT Gateway in Azure. It walks through creating the necessary virtual network and subnets, then configuring the Databricks workspace to use them for secure outbound connectivity.
NewsNever Build a Dashboard by Hand Again
The Databricks Assistant, now called Genie Code, can automatically generate multi-page dashboards from a blank canvas using natural language prompts. Users define a metric view as the data source and then describe the desired dashboard pages, visuals, and themes, with Genie Code planning and executing the build.
NewsLakebase: Postgres That Actually Likes Your Lakehouse
Lakebase is a new Databricks offering that provides a fully managed, autoscaling PostgreSQL database designed to bridge the gap between analytical and transactional workloads in a lakehouse architecture. It features bidirectional data streaming between Delta tables and PostgreSQL, database branching for isolated development, and Unity Catalog governance.
NewsSee Databricks Assistant Build a Metric View in 90 Seconds
The video demonstrates how Databricks Assistant can build a metric view in 90 seconds by generating YAML code for joins, dimensions, and measures from a natural language prompt. This metric view, a miniature semantic model, centralizes business logic and is queryable via SQL by various tools and agents.
Tutorials54 Zerobus Ingest Lakeflow Standard Connector | Ingest Streaming data directly into Delta Table
The video demonstrates how to use Databricks Zerobus Ingest, a push-based API, to stream data such as IoT, event, and telemetry data directly into Unity Catalog Delta tables. It highlights how Zerobus Ingest simplifies streaming ingestion by eliminating the need for intermediate message buses and the infrastructure they require.
NewsDatabricks News: Excel add-in, Metrics Views UI, and Quality Monitoring
Databricks announced Lakewatch for cybersecurity, new dynamic dropdown filters in the SQL editor, and improved quality monitoring with null value scanning and automated alerts. The video also demonstrates a new UI for defining metric views, an Excel add-in for data preview and import, and the ability to publish dashboards as public web pages.
NewsMaster Dimensional Modeling Lesson 03 - Understand the ETL Pipeline
The video explains the typical stages of a data warehouse ETL pipeline, including pre-staging (raw data), staging (cleaned data), operational data store (snapshot), and data mart (star schema). It also details the benefits of having multiple stages, such as easier debugging, data recovery, and auditability, and how this maps to the Medallion Architecture (Bronze, Silver, Gold).
Week of Mar 23
1 video
TutorialsDatabricks AI Dev Kit: Install for Copilot + VS Code
The video demonstrates how to install the Databricks AI Dev Kit for Visual Studio Code with GitHub Copilot on Windows, guiding users through the installation script, profile configuration, and skill selection. It then shows how to enable the Databricks tools in Copilot chat and tests its functionality by generating code and executing SQL queries against a Databricks workspace.
