Unity Catalog
Recent items mentioning Unity Catalog across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
Recent developments highlight Unity Catalog's expanding role in data governance and operational efficiency. It now underpins Databricks Genie ZeroOps for proactive issue detection and resolution via lineage [2], and provides attribute-based access control (ABAC) updates for fine-grained security [4]. Additionally, Unity Catalog is crucial for modernizing analytics, as seen in its use for governing student records at the English Office for Students [5] and scaling time-series analysis with Impulse for AVL [7].
Generated daily from the 10 most recent items mentioning Unity Catalog. Click any [N] to jump to the source.
Tutorials: Mastering Joins In Apache Spark: Complete Deep Dive
The video provides a deep dive into four Apache Spark physical join strategies: Sort Merge Join, Broadcast Hash Join, Shuffle Hash Join, and Broadcast Nested Loop Join. For each join, it explains the conditions for Spark's selection, visualizes its step-by-step internal mechanics, and demonstrates its appearance in Spark's physical plan and UI.
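To make two of these strategies concrete, here is a minimal pure-Python sketch. It is not Spark's implementation. `broadcast_hash_join` builds a hash table from the smaller side and probes it with the larger side, while `sort_merge_join` sorts both sides by key and advances two cursors. The function names and the `(key, value)` tuple row layout are assumptions chosen for illustration.

```python
from collections import defaultdict


def broadcast_hash_join(small, large):
    """Inner join: hash the small side, then probe it with each large-side row."""
    table = defaultdict(list)
    for key, value in small:
        table[key].append(value)
    # Each output row is (key, large_value, small_value).
    return [(key, lv, sv) for key, lv in large for sv in table.get(key, [])]


def sort_merge_join(left, right):
    """Inner join: sort both sides by key, then merge with two cursors."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[r][1]) for r in range(j, j_end))
                i += 1
            j = j_end
    return out


orders = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
customers = [(1, "x"), (2, "y")]
hash_result = broadcast_hash_join(customers, orders)
merge_result = sort_merge_join(orders, customers)
```

Both functions return the same set of matched rows for an equi-join. They differ in cost profile: the hash variant needs the small side to fit in memory, while the merge variant pays for sorting both inputs.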
Events: Databricks Genie ZeroOps | A Quick Overview
Databricks Genie ZeroOps is an autonomous background agent that monitors data pipelines, jobs, tables, and ML models to detect and fix issues proactively. It identifies root causes using Unity Catalog lineage, generates a fix, and tests it in a secure sandbox before user approval.
Managed vs External Storage in Unity Catalog on Azure: Where Should Your Data Live?
Five Unity Catalog ABAC Updates Worth Paying Attention To
How the English Office for Students leverages Databricks to enhance higher education standards and drive better student outcomes
The English Office for Students modernized its analytics environment on Databricks to manage millions of student records and support data-informed higher education regulation. By unifying structured, qualitative, and near-live data on a governed platform with Unity Catalog and AI capabilities, they accelerated analysis, improved collaboration, and enabled faster, more trusted decision support.
Data Dictionary and Unity Catalog: Best Practices in a Medallion Architecture
From test bench to lakehouse: how AVL modernizes measurement data analytics with Impulse
AVL modernized their measurement data analytics with Impulse, an open-source Databricks Labs framework for sensor data analysis. Impulse on Databricks scales time-series analytics to hundreds of terabytes, cutting analysis time from days to minutes while ensuring reproducibility, shareability, and Unity Catalog governance.
Row and column level security on unity catalog in different approaches
What if the answer was already in your data?
Kythera Labs' AI agents, built on Databricks, now provide health system leaders with governed, trustworthy answers to strategic questions from 339 billion claims. A Louisiana health system saw 150% more visibility into patient encounters and $3.8M in estimated annualized value in 10 days.
Databricks positioned highest in execution and furthest in vision for the second consecutive year in Gartner Magic Quadrant
Databricks is recognized as a Leader in the 2026 Gartner Magic Quadrant for AI Platforms for Data Science and Machine Learning, positioned highest in execution and furthest in vision for the second consecutive year. This reflects the market shift towards deploying agentic applications that reason on governed data, enabled by Databricks' unified data, AI, and governance platform with Unity Catalog and Unity AI Gateway.
python-v1.6.1: Column Mapping write support
This release adds support for writing Delta tables with column mapping enabled. It also introduces a new API for stats-free append writes and allows switching nanosecond timestamps at runtime in Python.
Unity Catalog External Location with Amazon S3 Access Points, session policy behavior and workarounds
Genesis Workbench: A blueprint for industry AI in life sciences, powered by Databricks and NVIDIA
Genesis Workbench is a new Databricks blueprint integrating NVIDIA BioNeMo and Parabricks into a secure, no-code environment for end-to-end drug discovery. It centralizes data and eliminates external API dependencies, streamlining research from hypothesis to therapeutic candidate with Unity Catalog governance.
News: AI-Ready Data on Databricks: How TK Elevator Uses Context and Meaning to Make AI Agents Work
TK Elevator uses Databricks' Unity Catalog to create "AI-ready data" by harmonizing data from over 100 disparate systems, enabling a common language for their AI agents. This foundation supports predictive maintenance for elevators, empowering 25,000 service technicians with tailored support and voice debriefing capabilities.
News: Unity Catalog Fine-Grained Access Controls on External Engines
Unity Catalog enables fine-grained access controls (FGAC) defined once to be enforced consistently across Databricks and external engines like Apache Spark. External engines can also create and write to UC-managed tables, benefiting from centralized governance, automatic optimization, and transactional safety.
Unity Catalog not enabled - personal Microsoft account cannot access Account Console on Azure Databr
Releases: Announcing CustomerLake: the agentic CDP embedded in Databricks
Databricks announces CustomerLake, an agentic Customer Data Platform embedded within the Databricks platform. It offers a business-first UI, leverages Unity Catalog for governance, and uses agents for autonomous, personalized customer experiences.
Releases: Announcing Genie App Builder: vibe coding that understands your business #databricks
Databricks announces Genie App Builder, a new "vibe coding" tool designed to understand a business's full context, including data, deployment environment, and policies. It leverages Genie ontology for semantic understanding, is 100% Databricks managed for security, and builds production-quality apps using the App Kit SDK.
Delta Lake 4.3.0
Databricks practitioners can now integrate Spark with the Unity Catalog Delta REST API for managed Delta tables and selectively replace data using new `replaceOn` and `replaceUsing` DataFrame APIs. UniForm for Iceberg conversion is now atomic and incremental, and Delta Sharing supports streaming and Change Data Feed for shared tables.
UnityCatalog 0.5.0
This release introduces a new UC Delta API for managing Delta tables via REST, enabling various engines to use Unity Catalog as a Delta-native catalog. The UC Spark connector now has separate artifacts for Spark 4.0.x and 4.1.x compatibility, and its credential-scoped file system is enabled by default.
Events: Announcing Lakehouse//RT | Databricks Co-founder and Chief Architect Reynold Xin at Data + AI Summit
Lakehouse//RT is a new Databricks warehouse product, powered by the Raiden engine, designed for read-only workloads with millisecond performance and massive concurrency. It allows users to query Delta or Iceberg data directly on their data lake, governed by Unity Catalog, potentially collapsing separate serving stacks into a single platform.
SSH connection error messages are improved with server logs, and the SSH server startup timeout for GPU accelerators is increased. Authentication fallback to default profiles is fixed, and bundle variable references now support Unicode letters.
Databricks Iceberg Support Has a Catch. It's Called Unity Catalog
Your data is clean. But who's accessing it, and how? Governing your Lakehouse with Unity Catalog
What is data pipeline architecture?
Data pipeline architecture separates ingestion, transformation, storage, and serving into distinct layers, with ELT largely replacing ETL as the dominant approach. Databricks unifies batch and streaming pipelines on a single platform (Lakeflow + Delta Lake + Unity Catalog), eliminating duplicate infrastructure and governance gaps.
Announcing Apps on Databricks Marketplace
Databricks Marketplace now supports apps, allowing you to instantly access and deploy powerful data and AI applications directly within your secure, governed Unity Catalog environment. This enables providers to securely distribute and monetize proprietary data apps to thousands of Databricks customers.
What’s new with Unity Catalog at Data + AI Summit 2026
Unity AI Gateway now extends Unity Catalog's runtime governance to AI agents, models, and tools, allowing you to govern agent actions, not just data access. Glossary and Domains provide a shared, governed source of business context for both people and agents, while a single catalog and policy set ensure consistent governance across all clouds and regions.
AI governance at Data + AI Summit 2026: What’s new with Unity AI Gateway
Unity AI Gateway now offers unified cost management, governance for AI assets and interactions, and monitoring for AI activity at scale. These new capabilities, announced at Data + AI Summit 2026, extend Unity Catalog to models, agents, and services, while providing spend visibility, runtime controls, and security features.
Lakeflow: A new era of agentic data engineering
Lakeflow unifies ingestion, transformation, and orchestration under Unity Catalog, providing a single source of trusted, real-time context for agentic AI. It offers high-performance ingestion from 100+ sources, real-time streaming, visual pipeline building with Lakeflow Designer, and AI-powered authoring and operations with Genie Code and Genie ZeroOps.
Accelerate search queries with full-text search indexes on Databricks
Databricks now offers full-text search indexes in Beta, accelerating substring and keyword queries on Unity Catalog managed tables with a simple SQL statement and no application changes. Production teams are seeing over 100x speedups on petabyte-scale tables, unlocking new use cases.
Databricks announces 2026 global partner awards
Databricks announced its 2026 global partner awards, recognizing over 60 Consulting and System Integrator and ISV partners including Accenture, Deloitte, Anthropic, and NVIDIA. This year's awards emphasize AI transformation, lakehouse modernization, Unity Catalog governance, and agentic AI at enterprise scale.
News: Genie Spaces or Genie Code? Databricks AI Explained
Databricks offers two AI assistants: Genie Spaces for business users to get data answers via natural language queries and multi-task investigations, and Genie Code for technical professionals to build dashboards, write code, generate pipelines, and debug GenAI apps. Spaces is for asking questions, while Code is for building and developing within the Databricks platform.
Tutorials: Databricks Unity Catalog: The Safe Way to Govern AI
Databricks Unity Catalog provides a single governance layer for data and AI assets, enabling discovery, classification, protection, and certification of data. The video demonstrates how to use Unity Catalog for context-aware search, automated lineage tracking, tagging sensitive data with governed and non-governed tags, and applying column masking for data protection.
Talk to all your data, wherever it lives
Lakehouse Federation is now available, allowing you to query data across all sources without migration delays. Unity Catalog serves as the single source of truth for both federated and managed data, enabling secure AI workloads and natural language querying.
Unlocking semantics for AI: How Mercedes-Benz Korea built trusted “Talk to Data” at scale
Mercedes-Benz Korea built a trusted "Talk to Data" solution at scale by making 500+ KPI definitions available in an AI-ready semantic layer on Unity Catalog metric views, accelerating the transition with an automated DAX-to-Metric-View transpiler. This governed semantic layer supports both existing BI and new "Talk to Data" experiences, with Genie and Agent Bricks providing consistent answers and shaping a playbook for persona-based AI agents across markets.
How ERGO Hestia reduced time-to-market with Lakebase and Mosaic AI Model Serving
ERGO Hestia modernized its real-time pricing engine with Databricks Lakebase and Mosaic AI Model Serving, reducing time-to-market by unifying data, features, and decisions for millisecond pricing. This eliminated extraction overhead and fragmented governance from their previous multi-hop architecture, enabling faster model deployment and instant market response.
News: Deploying Azure Databricks with Terraform? Watch this first!
This video demonstrates how to deploy an Azure Databricks workspace using Terraform by cloning a provided script, configuring variables, and executing Terraform commands. It walks through setting up prerequisites, authenticating Azure CLI, and populating a Terraform variables file to successfully provision the workspace.
How Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude
Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude, converting 700-page FDA manuals into real-time answers for frontline staff using Foundation Model APIs and cutting compliance report compilation from two weeks to under two minutes. The solution, a native Databricks App with Lakebase Postgres and Unity Catalog, unifies nine siloed data sources and employs a multi-agent orchestration framework with Judge LLMs and MLflow tracing for personalized, continuously refined intelligence.
Announcing the Public Preview of Custom URLs
Databricks accounts can now use a single, branded custom URL like mycompany.databricks.com, replacing individual workspace URLs. This simplifies login and navigation across multiple workspaces, enabling account-wide features like Genie and Unity Catalog lineage.
News: Databricks on Databricks: How Marketers Use Data 3x More with Genie, an AI Analytics Assistant
Databricks built "Marge," an AI analytics assistant powered by their Genie platform, to help its marketing team access and utilize data more efficiently. Marge provides conversational analytics by unifying marketing data in a lakehouse and offering governed, trusted insights in seconds, significantly reducing reliance on manual analyst reports.
Transforming solar and wind maintenance reports with Genie and AI agents
Plenitude now converts unstructured solar and wind maintenance PDFs into a unified, queryable data model using Databricks Genie and AI agents. This enables natural-language querying and visualizations across plants, accelerating multi-plant analysis and laying the groundwork for predictive maintenance.
Solving the "Untitled" Lineage Mystery in Unity Catalog
Release: v2.11.0 (#1902)
This release introduces a Unity Catalog explorer and a workspace filesystem explorer, enhancing navigation and management within Databricks. It also adds support for SPOG host URLs.
Tutorials: Trace Any AI Agent with OTel, MLflow, and Unity Catalog
Databricks now allows sending OpenTelemetry traces from any AI agent to Unity Catalog, enabling end-to-end observability and governance within the Databricks Lakehouse. This integration facilitates cost-effective trace storage, offline analytics, production monitoring, and continuous agent evaluation using MLflow.
Bring Databricks into Kiro IDE with the AI Dev Kit Power
The Databricks AI Dev Kit Power now offers a one-click setup to integrate Kiro IDE with the full Databricks platform, providing AI-assisted development grounded in your workspace's Unity Catalog metadata. This new path, alongside a lighter PAT-based option, ensures your AI assistant writes SQL with actual columns and respects all row, column, and tag-based grants.
Databricks Introduces Column Popularity in Unity Catalog: A Smarter Way to Understand Data Usage
Beyond parsing X12: Closing the gap for revenue cycle workflows in healthcare
Healthcare billers now have an operational workbench built on Unity Catalog gold views, providing a purpose-built UI with a denials queue, remittance drawer, and timely-filing age alerts directly on their fully parsed 835/834/837 EDI data. This solution integrates GenAI via Databricks Foundation Model APIs to auto-draft appeal letters, moving billers beyond manual spreadsheet and SQL work to review and approve instead of writing from scratch.
Introducing Cross-Engine ABAC
Unity Catalog now enforces attribute-based access controls (ABAC) on external engines, allowing you to define tag-based row filters and column masks once for enforcement from any engine. This centralized governance at the catalog layer, built on Iceberg REST Catalog scan APIs, ensures policies are enforced before data reaches the engine.
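The idea of defining a tag-based mask once and applying it wherever data is read can be sketched in a few lines of plain Python. This is a conceptual illustration only, not Unity Catalog's policy syntax or API: `MaskPolicy`, `apply_policies`, the `pii` tag, and the group names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskPolicy:
    """Hypothetical policy: mask columns carrying `tag` unless the user is in `exempt_group`."""
    tag: str
    exempt_group: str


def apply_policies(row, column_tags, user_groups, policies):
    """Return a copy of `row` with tagged columns masked for non-exempt users."""
    masked = dict(row)
    for policy in policies:
        if policy.exempt_group in user_groups:
            continue  # exempt users see the raw values
        for column, tags in column_tags.items():
            if policy.tag in tags and column in masked:
                masked[column] = "***"
    return masked


row = {"name": "Ada", "ssn": "123-45-6789"}
column_tags = {"ssn": {"pii"}}
policies = [MaskPolicy(tag="pii", exempt_group="compliance")]

analyst_view = apply_policies(row, column_tags, {"analysts"}, policies)
compliance_view = apply_policies(row, column_tags, {"compliance"}, policies)
```

Because the policy is keyed on the tag rather than on a specific column, tagging a new column is enough to bring it under the same rule.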
Governance RiskOps Agent for Unity Catalog
Power BI - DBX Unity catalog Lineage Syncer
Unity Catalog and the next era of Apache Iceberg
Sync Tables: Unity Catalog to Lakebase - Materialized Views triggered mode
Advancing Apache Iceberg on Databricks: Iceberg v3 GA, Open Sharing, and Unified Governance
Unity Catalog now offers GA support for Managed Iceberg, Iceberg v3, and Foreign Iceberg, making it the most comprehensive and production-ready Apache Iceberg catalog with open APIs, catalog federation, and secure sharing. Future versions of Iceberg and Delta will converge on a unified metadata structure, eliminating the tradeoff between interoperability and performance.
Tutorials: The New Databricks Lakeflow Designer Is a Game Changer!
Databricks Lakeflow Designer is a visual data preparation tool that allows users to create, add, and transform data using a no-code drag-and-drop UI or AI-powered Genie Code. The video demonstrates how to import data from various sources, profile data, perform complex transformations like data type conversions and sentiment analysis, and then deploy the resulting production-ready PySpark code for scheduling or integration into existing pipelines.
Define KPIs Once with Unity Catalog Business Semantics
BI Serving Pointers; Maximizing for Performance and TCO
Databricks now offers Unity Catalog Metric Views for a headless semantic layer, enabling governed business metrics across all BI tools and AI agents. Maximize performance and TCO by structuring your physical layer with star schemas, liquid clustering, and Predictive Optimization, and leverage aggregate-aware materialization for OLAP-style performance.
Snowflake to Databricks Migration in 12 weeks and cut cost per run by ~77%. AMA.
Lovelytics wrapped up a Snowflake-to-Databricks migration: 847 dbt models, 35 Info Mart tables, ~77% lower cost per run on a 2XL warehouse.

**TL;DR What helped:**

* Treated the migration as engineering, not translation. Each dbt model was tested in isolation, not just row counts vs Snowflake.
* Routing macro to resolve cross-layer references at runtime, so the same codebase could read from Snowflake, federated Snowflake, and Unity Catalog without forking logic.
* Dual model trees in one repo, which let the migration stay in lockstep with live Snowflake changes.
* Script-generated wave selectors enabled parallel builds while preserving dependency order.
* Used reference-slice validation subsets vs. waiting on full mart refreshes.

**TL;DR Cost reduction:**

* Reworked joins to use narrow staging dimensions instead of wide marts where possible.
* Added incremental predicates to reduce MERGE target scans.
* Split wide models into parallel sub-models where the dependency graph allowed it.
* Copied static reference data into Delta instead of repeatedly reading it through federation, where predicate pushdown is poor.

Happy to go into the gotchas: HASH() not being portable, Snowflake MERGE tolerating duplicate keys that Delta doesn't, NULL ordering, and timestamp handling. AMA

[Full Blog Post](https://community.databricks.com/t5/technical-blog/partner-blog-847-models-12-weeks-77-less-inside-r1-s-snowflake/ba-p/157284)
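One gotcha named in the post, a MERGE source with duplicate keys, is commonly handled by deduplicating the source before merging. The sketch below shows that idea in plain Python rather than SQL. The `id` and `updated_at` field names and the keep-the-latest rule are assumptions for illustration, not details taken from the post.

```python
def dedupe_latest(rows, key="id", order_by="updated_at"):
    """Keep one row per key: the one with the greatest `order_by` value."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


def merge(target, source, key="id"):
    """Upsert deduplicated source rows into a target keyed by `key`."""
    merged = {row[key]: row for row in target}
    for row in dedupe_latest(source, key=key):
        merged[row[key]] = row
    return sorted(merged.values(), key=lambda row: row[key])


target = [{"id": 1, "updated_at": 1, "v": "old"}]
source = [
    {"id": 1, "updated_at": 2, "v": "a"},
    {"id": 1, "updated_at": 3, "v": "b"},  # duplicate key; latest wins
    {"id": 2, "updated_at": 1, "v": "c"},
]
result = merge(target, source)
```

Choosing a deterministic winner per key is the design point here: without it, the outcome of the merge would depend on source row order.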
Announcing Lakebase Change Data Feed (CDF)
Lakebase Change Data Feed (CDF) is now in Public Preview, eliminating pipeline sprawl from operational databases by exposing every table's changes through Unity Catalog Managed Tables. This enables native CDC governed end-to-end without sidecar infrastructure, allowing operational data to function as the native Bronze layer in the medallion architecture.
Document Intelligence on Databricks
80% of enterprise data is locked inside PDFs, scans, emails, and contracts, and most teams still treat it as someone else's problem. Document Intelligence on Databricks changes that. One SQL function (`ai_parse_document`), governed by Unity Catalog, integrated with Lakeflow for ingestion, Agent Bricks for structured extraction, and Vector Search for RAG. No stitched-together OCR vendors, no brittle Python glue, no separate platform to govern. [Archika Dogra](https://www.linkedin.com/in/archikadogra/) and I put together a walkthrough showing how it works end-to-end, from a folder of raw PDFs to queryable Delta tables and downstream agents. ▶️ [https://youtu.be/sdG73gI143c](https://youtu.be/sdG73gI143c) Curious to hear what use cases you're tackling: invoices, contracts, claims, technical docs? Drop them in the comments.
