News from the Databricks ecosystem.
Posts from databricks.com, MLflow, and dbt Labs — three trusted sources covering the platform, the open-source projects around it, and the data tooling layer most teams pair with it. Summarized for scanning.
This week
1 article
Skip the learning curve: rethinking data migration for real outcomes
Rethink your data migration strategy to accelerate value delivery and reduce costs. Learn how a parallel approach, AI-driven automation, and progressive decommissioning can overcome common migration challenges.
Last week
27 articles
Introducing Omnigent: A Meta-Harness to Combine, Control and Share Your Agents
Omnigent, an open source meta-harness, is now available to combine, control, and share your AI agents across various models and interfaces. It enables building agent teams, controlling them with policies, and sharing live sessions with teammates.
From Wall Street to Data Platforms
Databricks values deep industry expertise, as shown by Kim Hatton’s transition from finance to helping financial institutions solve modern data challenges. Our collaborative environment encourages employees to grow beyond their core roles and contribute to industry innovation, building practical tools that turn complex data tasks into streamlined successes.
Enabling Evolutionary Database Development: Database branching with Lakebase, the conclusion
Lakebase now supports database branching, enabling evolutionary database development. This concludes the series on Lakebase's operationalization of evolutionary database design.
Talk to all your data, wherever it lives
Lakehouse Federation is now available, allowing you to query data across all sources without migration delays. Unity Catalog serves as the single source of truth for both federated and managed data, enabling secure AI workloads and natural language querying.
What is customer segmentation?
Customer segmentation combines multiple types and methods, from rule-based to AI/ML-driven models, but its success hinges on unifying fragmented customer data into a governed Customer 360. Databricks' CustomerLake, an Agentic CDP, builds segments directly on governed data with AI-driven identity resolution and natural-language audience creation, eliminating data copies and extra vendors.
Unlocking semantics for AI: How Mercedes-Benz Korea built trusted “Talk to Data” at scale
Mercedes-Benz Korea built a trusted "Talk to Data" solution at scale by making 500+ KPI definitions available in an AI-ready semantic layer on Unity Catalog metric views, accelerating the transition with an automated DAX-to-Metric-View transpiler. This governed semantic layer supports both existing BI and new "Talk to Data" experiences, with Genie and Agent Bricks providing consistent answers and shaping a playbook for persona-based AI agents across markets.
Forward Deployed Engineering: Delivering Business Outcomes with AI
Databricks is launching its Forward Deployed Engineering (FDE) organization to accelerate customer business outcomes with AI, pairing the Lakehouse platform with embedded, engineering-led delivery. This new approach moves beyond migration and pipeline building to solve business problems with production AI agents, as demonstrated by customers like Fox, JPMC, and Qualcomm.
Ingesting the Milky Way: Petabyte-Scale with Zerobus Ingest
Zerobus Ingest, a new serverless streaming API, enables instant deployment of petabyte-scale data pipelines on Databricks without manual infrastructure management. Its dynamic partitioning architecture automatically scales compute and sustains over 12 GB/s throughput to a single table, efficiently handling unpredictable data volumes.
How ERGO Hestia reduced time-to-market with Lakebase and Mosaic AI Model Serving
ERGO Hestia modernized its real-time pricing engine with Databricks Lakebase and Mosaic AI Model Serving, reducing time-to-market by unifying data, features, and decisions for millisecond pricing. This eliminated extraction overhead and fragmented governance from their previous multi-hop architecture, enabling faster model deployment and instant market response.
Welcoming the first cohort of Databricks student fellows
Databricks launched its inaugural Student Fellows cohort, selecting a diverse group of students to bridge academic theory and real-world data and AI practice. These fellows will host workshops, hackathons, and mentorship programs at their universities, with five standout individuals from top schools already making significant contributions.
Geospatial Unbounded: Spatial SQL GA with AI/BI Maps, Delta Sharing, and Iceberg v3
Spatial SQL is now Generally Available on Databricks, bringing native geospatial data types, 90+ ST_* functions, and AI/BI Dashboards that render maps natively. This release also includes major performance improvements, open lakehouse support via Delta Sharing and Iceberg v3, and Apache Spark 4.2 compatibility for geo columns.
Azure Databricks at Data + AI Summit 2026 featuring Industry Leaders and Partners
Azure Databricks at Data + AI Summit 2026 featured new joint product announcements and integrations, alongside key sessions on zero-copy federated analytics and ecosystem co-engineering. Learn how joint customers are modernizing data estates, scaling AI, and unlocking business value with Azure Databricks.
Empower your healthcare agents with ready-to-use MCP on Databricks Marketplace
Databricks Marketplace now offers ready-to-use biomedical and clinical Model Context Protocol (MCP) servers from partners like Climb and Atropos Health, empowering healthcare agents. Easily build and deploy bespoke agents to production, leveraging a securely governed, centralized MCP Catalog that also supports your own custom MCP servers or data.
How Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude
Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude, converting 700-page FDA manuals into real-time answers for frontline staff using Foundation Model APIs and cutting compliance report compilation from two weeks to under two minutes. The solution, a native Databricks App with Lakebase Postgres and Unity Catalog, unifies nine siloed data sources and employs a multi-agent orchestration framework with Judge LLMs and MLflow tracing for personalized, continuously refined intelligence.
Stop building data products. Start building data services.
The one-product-per-use-case model breaks down under acquisition-driven growth and agentic consumption; a services layer is more adaptable to what comes next. Moving data mastering and quality checks closer to ingestion makes integration cycles measured in weeks possible, reducing insight lag.
Scaling AI Through Data Fluency
Aer Lingus built a solid data foundation with governance and quality, treating data literacy as a core business skill with a custom curriculum. This enabled real-time insights for optimizing flight loads, pricing, and operations decision-making.
AWS and Databricks at Data + AI Summit 2026: Accelerating real-world AI innovation
AWS and Databricks are accelerating real-world AI innovation, from Mastercard experimentation to production-scale AI, as showcased at Data + AI Summit 2026. Explore breakout sessions, industry forums, and hands-on demos covering agentic AI, governance, open data architectures, and multi-engine interoperability with Amazon Bedrock and Kiro.
Announcing the Public Preview of Custom URLs
Databricks accounts can now use a single, branded custom URL like mycompany.databricks.com, replacing individual workspace URLs. This simplifies login and navigation across multiple workspaces, enabling account-wide features like Genie and Unity Catalog lineage.
AI Serving Platform That Adapts to Your Model
Databricks now offers a fully managed AI serving platform that automatically adapts to your model's resource needs, from scikit-learn to 70B LLMs, without manual configuration. This results in up to 90% lower infrastructure costs and <10ms p99 latency overhead for customers migrating from self-managed stacks.
Announcing the Databricks storage ecosystem: Governing the enterprise data estate, wherever it lives
The Databricks Storage Ecosystem now natively connects hybrid and on-premises storage platforms to Databricks via OpenSharing, enabling centralized data governance and GenAI scaling across your entire hybrid infrastructure. Run Databricks Serverless Compute, Genie, and LLMs directly on your on-premises datasets with a zero-copy architecture, instantly turning isolated data into active, AI-ready assets.
Modern BSA/AML compliance on Databricks
Databricks now offers a unified, AI agent and ML-augmented experience for BSA/AML compliance, consolidating siloed systems and accelerating SAR report building. AML teams can expect 8-10x faster case processing, a 75% reduction in false positives, and $50-150 million in annual cost savings.
Claude Fable 5 is now available on Databricks, fully governed through Unity AI Gateway
Claude Fable 5 is now available on Databricks, accessible through Unity AI Gateway for centralized governance, cost controls, and observability. This Anthropic model offers state-of-the-art performance across enterprise workflow automation, agentic search, data reasoning, and multimodal document understanding.
Announcing the winners of the 2026 Databricks Customer Awards
The 2026 Databricks Customer Awards winners have been announced, recognizing 10 customers for excellence, innovation, transformation, social impact, and leadership. These winners span diverse industries and regions, showcasing how they leverage Databricks to solve complex data and AI challenges.
Announcing the 2026 Databricks Customer Awards Industry winners
The 2026 Databricks Customer Awards Industry winners have been announced, recognizing ten organizations across diverse sectors like financial services, healthcare, and manufacturing. These winners showcase compelling data and AI stories, demonstrating how they've leveraged Databricks to solve complex challenges and achieve measurable results.
Transforming solar and wind maintenance reports with Genie and AI agents
Plenitude now converts unstructured solar and wind maintenance PDFs into a unified, queryable data model using Databricks Genie and AI agents. This enables natural-language querying and visualizations across plants, accelerating multi-plant analysis and laying the groundwork for predictive maintenance.
Your AI isn't broken. Your data model is.
The gap between successful AI proofs of concept and failed production deployments stems from the data model, not the AI model.
Enterprise Data Strategy Roadmap for Business Outcomes
* A robust enterprise data strategy connects organizational data assets to specific business objectives through governance, architecture, and analytics frameworks that scale with evolving business needs.
* Effective data governance, data quality management, and master data manage...
Week of Jun 1
21 articles
Enabling Evolutionary Database Development: database branching with Lakebase, continued
This series revisits the methodology of Evolutionary Database Design, twenty years...
Data + AI Summit 2026: Insider’s Guide for Financial Services Leaders
Data + AI Summit 2026 offers a financial services executive guide to key banking, insurance, payments, and capital markets sessions. Learn how leading organizations like Morgan Stanley and JPMorganChase are approaching AI transformation, responsible AI, and operational modernization, with practical strategies for maximizing summit value.
Your guide to the Telecommunications Industry Experience at Data and AI Summit 2026
The Data + AI Summit 2026 will feature a Telecommunications Industry Experience, showcasing how global carriers are leveraging data and AI to address customer experience, network operations, and fraud. Attendees will gain insights into AI agents for autonomous networks, churn prevention, and Genie-powered conversational intelligence for frontline teams.
3x Faster Search: Parallel Test-Time Scaling with Instructed-Retriever-1
Instructed-Retriever-1 now delivers 3x faster search for Agent Bricks Knowledge Assistant. This parallel test-time scaling update also improves quality for Databricks practitioners.
Apache Spark Real-Time Mode for Gaming: A Better Way to Do Real-Time Sessionization
Apache Spark Real-Time Mode now enables real-time gaming sessionization for millions of active device sessions, replacing custom applications with sub-second precision for both input processing and timer-driven output. Learn how transformWithState timers power proactive, timer-driven heartbeats, generating output on a schedule independent of incoming data.
Bring Databricks into Kiro IDE with the AI Dev Kit Power
The Databricks AI Dev Kit Power now offers a one-click setup to integrate Kiro IDE with the full Databricks platform, providing AI-assisted development grounded in your workspace's Unity Catalog metadata. This new path, alongside a lighter PAT-based option, ensures your AI assistant writes SQL with actual columns and respects all row, column, and tag-based grants.
Building a data stack for trusted AI
Databricks now offers a data stack for trusted AI, providing governed, consistent, and contextual data. Learn how to build it without tying yourself down.
Scaling Enterprise Conversational Intelligence: Cross-industry Technology and Functional Solutions Powered by Databricks Genie
Databricks Genie now powers cross-industry conversational intelligence solutions from leading partners, offering ready-to-deploy offerings for sales, marketing, HR, finance, and other enterprise functions. These innovative solutions accelerate AI transformation by addressing technology and function-specific use cases across the enterprise.
Beyond parsing X12: Closing the gap for revenue cycle workflows in healthcare
Healthcare billers now have an operational workbench built on Unity Catalog gold views, providing a purpose-built UI with a denials queue, remittance drawer, and timely-filing age alerts directly on their fully parsed 835/834/837 EDI data. This solution integrates GenAI via Databricks Foundation Model APIs to auto-draft appeal letters, moving billers beyond manual spreadsheet and SQL work to review and approve instead of writing from scratch.
dbt Labs Named Snowflake Data Integration Product Partner of the Year
dbt Labs was named Snowflake Data Integration Product Partner of the Year. This post details dbt Labs' two Snowflake Partner honors, including the CoCo Adoption Award.
Agentic BI: A Practical Guide for BI Teams and Business Users
Agentic BI, which embeds autonomous AI agents into analytics workflows, automates data prep, query execution, and insight delivery to replace static dashboards and address dissatisfaction with current insight generation. A governed semantic layer is critical for trustworthy agentic analytics, and adoption can be incremental, starting with a pilot and expanding based on documented outcomes.
Data Science vs Data Analytics: Compare Careers, Skills, and Degrees
Data analytics explains what already happened using SQL and Power BI, while data science builds ML models to automate future decisions. Choosing between them depends on your appetite for technical depth, comfort with unstructured data, and preference for stakeholder communication vs. system deployment.
AI in Defense: How Artificial Intelligence Is Reshaping National Security
AI is rapidly reshaping national security as nations accelerate military AI development, creating a global race with strategic consequences. Responsible AI governance, model validation, and human oversight are essential safeguards as defense organizations deploy autonomous systems and machine learning in combat operations.
Data Governance Architecture: A Complete Blueprint for Modern Organizations
This blueprint details a complete data governance architecture, outlining the policies, roles, and technologies needed to manage data assets. It emphasizes a modern strategy combining automated lineage, RBAC, and federated models to ensure data quality and regulatory compliance at scale.
Query Tags: The Context Your Warehouse Queries Have Been Missing
Databricks SQL warehouses now support query tags, enabling cost attribution by team or project and automatic tagging for dbt, PowerBI, and Tableau. Tag queries from any source, including the SQL Editor, Notebooks, Dashboards, APIs, connectors, and drivers.
Introducing Cross-Engine ABAC
Unity Catalog now enforces attribute-based access controls (ABAC) on external engines, allowing you to define tag-based row filters and column masks once for enforcement from any engine. This centralized governance at the catalog layer, built on Iceberg REST Catalog scan APIs, ensures policies are enforced before data reaches the engine.
Personalizing Genie Code with instructions, skills, memory, and MCP
Genie Code now personalizes to your conventions with Instructions, Skills, and MCP Servers, allowing reuse of team workflows, internal docs, and external tools without repeated pasting. Leverage personal skills for individual work, workspace skills for shared team workflows, and admin-approved MCP servers for scalable external context in agent mode.
Debunking 8 data layout myths: why Liquid Clustering outperforms partitioning
Liquid Clustering is the data layout for open table formats that outperforms partitioning, and this post debunks 8 common myths keeping teams tied to partitioning. Customers using Liquid Clustering report dramatic improvements in query latency, write throughput, storage efficiency, and data freshness, with the largest gains compounding at petabyte scale.
What we announced at Snowflake Summit and why it matters
dbt State, dbt Wizard, dbt Core v2.0, and the Fivetran merger
Fivetran + dbt Labs Complete Merger to Create the Data Infrastructure for Trusted AI Agents
Fivetran and dbt Labs have completed their merger, creating a unified company focused on building the data infrastructure for trusted AI agents. This new entity aims to provide the foundational data layer necessary for the agentic AI era.
Week of May 25
14 articles
Enabling Evolutionary Database Development: database branching with Lakebase
Why this series exists: The methodology described in Evolutionary Database Design and...
AI Doesn't Scale Until You Stop Calling It Innovation
Schneider Electric solutions leveraging Databricks can reduce energy costs by up to 20 percent, demonstrating that scaling AI requires focusing on business value and customer need over technology selection. The fastest-scaling companies combine domain expertise with AI knowledge through dedicated, end-to-end teams.
Databricks at SIGMOD 2026
Spark Declarative Pipelines (SDP) are simplifying complex ETL and streaming workloads, pioneering the next generation of data engineering. Get a deep dive into Enzyme, our incremental view maintenance engine, which won an honorable mention at SIGMOD.
Winning under CMS TEAM: Building the learning health system to realize success in VBC today and tomorrow
Databricks helps healthcare providers succeed under the mandatory CMS TEAM program by building an AI-enabled data foundation for proactive, data-driven intervention. This enables a unified view across clinical and claims data, embedding predictive insights into care workflows to reduce SNF costs by 15% and readmissions by 12%.
How enterprise leaders are scaling AI agents across their organization
Databricks practitioners can learn five key practices for scaling agentic AI responsibly across enterprise core workflows like HR, finance, and fraud detection. This post helps leaders deliver rapid gains from AI agents while maintaining governance, trust, and cost control.
Advancing Apache Iceberg on Databricks: Iceberg v3 GA, Open Sharing, and Unified Governance
Unity Catalog now offers GA support for Managed Iceberg, Iceberg v3, and Foreign Iceberg, making it the most comprehensive and production-ready Apache Iceberg catalog with open APIs, catalog federation, and secure sharing. Future versions of Iceberg and Delta will converge on a unified metadata structure, eliminating the tradeoff between interoperability and performance.
Reliable LLM Inference at Scale
Databricks now offers model units, a VM-like abstraction for allocating and scaling GPU resources per customer, enabling cost-aware load balancing and autoscaling that saved over 80% in GPU costs. Runtime reliability mechanisms like black-box health checks and multimodal bottleneck profiling further improve throughput and recover from silent failures automatically.
BI Serving Pointers; Maximizing for Performance and TCO
Databricks now offers Unity Catalog Metric Views for a headless semantic layer, enabling governed business metrics across all BI tools and AI agents. Maximize performance and TCO by structuring your physical layer with star schemas, liquid clustering, and Predictive Optimization, and leverage aggregate-aware materialization for OLAP-style performance.
Introducing Always-On pricing: automatic savings for Databricks Lakebase
Databricks Lakebase now offers Always-On pricing, providing serverless flexibility with a 25% lower price on baseline capacity for established production workloads. Activate with a single toggle to disable scale-to-zero and set an autoscaling range, then after 24 hours of continuous use, baseline capacity bills at the Always-On rate while spikes bill at standard Autoscaling rates.
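To make the billing split above concrete, here is a minimal arithmetic sketch. It assumes the 24-hour qualifying period has already passed and that baseline capacity is billed every hour because scale-to-zero is disabled. The function name, the capacity units, and the rate of 1.0 are hypothetical and are not taken from the Lakebase price list.

```python
from typing import Sequence


def always_on_bill(
    hourly_usage: Sequence[float],
    baseline: float,
    standard_rate: float,
    discount: float = 0.25,
) -> float:
    """Estimate a bill where baseline capacity gets a discount and spikes do not.

    hourly_usage  -- capacity units used in each hour (hypothetical units)
    baseline      -- lower bound of the autoscaling range, billed every hour
    standard_rate -- price per capacity unit per hour at the standard rate
    discount      -- fraction taken off the baseline rate (25% per the post)
    """
    baseline_rate = standard_rate * (1 - discount)
    total = 0.0
    for used in hourly_usage:
        # Anything above the baseline is treated as a spike at the standard rate.
        spike = max(used - baseline, 0.0)
        total += baseline * baseline_rate + spike * standard_rate
    return total


# Three hours: two at baseline, one spiking to 6 units.
example_total = always_on_bill([2, 2, 6], baseline=2, standard_rate=1.0)
```

In this example the baseline contributes 3 x 2 x 0.75 = 4.5 and the spike contributes 4 x 1.0 = 4.0, so the total is 8.5 in whatever currency the rate is quoted.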
How the lakebase architecture stays resilient to cloud failures
Lakebase's architecture is built for resilience to cloud failures, not patched for it, by using stateless Postgres compute on zone-redundant storage and separating hot-path control-plane operations. This approach, validated through chaos testing and per-database availability tracking, addresses the unique reliability demands of agent workloads that start tens of millions of databases daily.
Announcing Lakebase Change Data Feed (CDF)
Lakebase Change Data Feed (CDF) is now in Public Preview, eliminating pipeline sprawl from operational databases by exposing every table's changes through Unity Catalog Managed Tables. This enables native CDC governed end-to-end without sidecar infrastructure, allowing operational data to function as the native Bronze layer in the medallion architecture.
Building a FHIR-native health data platform on Databricks Lakebase
Health Samurai's Aidbox now runs natively on Databricks Lakebase, providing a FHIR-native health data platform that standardizes clinical data at ingestion and makes it instantly available for Spark, ML, and AI. This architecture inherently delivers compliance with CMS-0057 and ONC mandates, eliminating the need for separate compliance workstreams.
AI readiness in telecommunications
Telco AI initiatives stall at production scale due to data debt, not model quality; Databricks Unity Catalog provides the semantic layer and governance needed to bridge this gap. It unifies disparate systems via Lakehouse Federation, offering AI agents rich context and enabling end-to-end governance for regulatory compliance and accurate operational tasks.
Week of May 18
37 articles
Pharma launch analytics: How to compress the first 90 days and win the three years that follow
Databricks Genie for Commercial Launch Intelligence helps pharma companies compress 90 days of launch analytics into immediate insights. This enables commercial leaders to quickly interrogate launch data, make weekly decisions, and drive long-term growth.
Scaling for MHHS: how Octopus Energy achieved a 50x cost reduction in margin data engineering
Octopus Energy achieved a 50x cost reduction in their margin data engineering pipelines by re-architecting on Databricks for UK MHHS regulation. They leveraged Delta Lake Change Data Feed and Databricks Serverless to process 48x more data at a fraction of the original cost, improving freshness from weekly to daily.
Accelerating LLM Inference with Prompt Caching for Open‑Source Models on Databricks
Databricks now supports prompt caching for open-source models across all workloads, automatically accelerating LLM inference by reusing repeated prompt prefixes. This feature boosts throughput by 2.5x and reduces P50 latency by 3x for models like GPT-OSS, with no setup required.
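The idea of reusing repeated prompt prefixes can be illustrated with a toy sketch. This is not the Databricks implementation: the `PrefixCache` class is hypothetical, and it only counts tokens, where a real serving stack would reuse cached model state. It shows why a second prompt that shares a long prefix needs far less new work.

```python
from typing import List, Set, Tuple


class PrefixCache:
    """Toy cache that remembers which token prefixes were already processed."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, ...]] = set()

    def process(self, tokens: List[str]) -> Tuple[int, int]:
        """Return (reused, computed) token counts for one prompt."""
        reused = 0
        # Look for the longest prefix of this prompt that is already cached.
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._seen:
                reused = n
                break
        # Record every prefix so later prompts can reuse them.
        for n in range(1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:n]))
        return reused, len(tokens) - reused


cache = PrefixCache()
system_prompt = ["You", "are", "a", "helpful", "assistant", "."]
first = cache.process(system_prompt + ["Summarize", "Q1"])
second = cache.process(system_prompt + ["Summarize", "Q2"])
```

Here the first prompt is computed in full, while the second reuses the seven shared leading tokens and only computes the one that differs.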
Observability for any agent, anywhere: Production-ready tracing with OpenTelemetry & Unity Catalog on Databricks
Databricks now supports writing OpenTelemetry traces directly to Unity Catalog tables via a fully managed, serverless ingestion path. This enables governed, analytics-ready observability data with long-term retention and unified evaluation workflows, without operating OTel infrastructure.
How World Bank Group uses Databricks to eradicate poverty through shared knowledge
The World Bank Group built a unified data and AI platform on Databricks, leveraging Unity Catalog, Volumes, Genie, and AI Gateway to connect structured operational data with unstructured documents. This eliminated manual research bottlenecks, enabled natural language access to trusted insights, and now supports millions of document downloads monthly to accelerate global knowledge sharing for poverty reduction.
How Databricks Genie democratizes data access in financial services
Databricks Genie now democratizes data access for financial services business leaders by enabling natural language querying of governed data. This closes the "Last Mile of Data Democratization" gap by removing the need for SQL skills or BI tool training.
Using observability data to prevent incidents
Databricks Genie enables natural language querying of telemetry data, allowing leaders to proactively identify risks and shift from reactive firefighting to reliability intelligence. This helps engineering teams move beyond optimizing for response time (MTTR) to prevent incidents by surfacing upstream reliability risks that impact revenue, roadmap velocity, and customer trust.
How security teams can report cyber risk to boards
Databricks Genie now enables real-time, data-driven cyber risk quantification, linking security posture to business impact for better governance. This helps security teams translate technical signals into financial risk insights, providing boards with unified visibility beyond fragmented legacy tools.
Transforming industries with conversational AI: Partner solutions built on Databricks Genie
Databricks Genie now powers innovative, industry-specific conversational AI solutions from leading consulting and SI partners. These ready-to-deploy offerings accelerate enterprise AI transformation across financial services, healthcare, retail, and other key sectors.
From emissions reporting to decarbonization decisions
Databricks Genie for Decarbonization Intelligence is now available to help sustainability leaders move beyond backward-looking emissions reporting. It enables natural language querying across operational and emissions data for instant answers, transforming sustainability from compliance to competitive advantage.
You’ve built the media products, now make them personalized
Databricks Genie for Digital Product Intelligence is here to help media companies personalize their products and close the "Digital Product Intelligence Gap." Learn how to use real-time behavioral data to drive 4x engagement lifts and empower product teams with instant answers to complex questions.
From "What Happened?" to "What Will Happen?"
Conversational BI now delivers predictive answers in seconds, not days, by fusing Genie for dynamic feature engineering with TabPFN for zero-training prediction, orchestrated by Agent Bricks. This self-assembling pipeline eliminates data science bottlenecks for business users, providing a governed experience backed by Unity Catalog and MLflow.
Are you ready for the dbt Fusion engine?
The dbt Fusion engine is here, and this post helps Databricks practitioners assess their readiness for migration. Learn how Brooklyn Data’s Fusion Readiness Assessment can help your team plan a confident transition.
Get dbt certified. Stay certified. Stay ahead.
Learn why dbt certification matters for your career and how to earn it. This post also covers how to stay certified and ahead.
Unlock seamless and cost-effective marketing campaigns with Lakebase
Lakebase Postgres, a serverless OLTP database, now scales to zero between marketing campaign spikes, eliminating underutilized database costs for personalization workloads. Native Synced Tables remove Lakehouse-to-OLTP pipeline burdens, letting marketing teams ship new customer segments to platforms like SAP Engagement Cloud in just a few clicks.
Governing AI agents at scale with Unity Catalog
Unity Catalog now governs AI agents at scale, providing a unified layer for identity-aware access, runtime policies, and full auditability across all agent interactions. This extends data governance to AI systems, improving observability, compliance, and trust for models, servers, and data within the lakehouse.
How telecom CFOs can make smarter network capex decisions with AI
Databricks Genie for Telecom Financial Intelligence helps telecom CFOs make smarter network capex decisions by bridging the "Financial Intelligence Gap" in capital planning. It enables finance leaders to query across financial, network, and commercial data using natural language to instantly probe the actual ROI history of comparable network investments.
How Databricks Genie improves retail personalization
Databricks Genie for Customer Intelligence now enables CX leaders to conversationally query their full customer data environment for instant answers on segment behavior, churn risk, and loyalty program performance. This direct access to insights helps retailers turn data into actionable personalization quickly enough to act on it, addressing a key "data access opportunity."
Databricks for Good and Virtue Foundation: Partnering to Connect Medical Volunteers to Critical Health Services in 72 Countries
Databricks for Good partnered with Virtue Foundation to use AI for matching medical volunteers to critical health services in 72 countries. This collaboration provides updated core datasets in an accessible format, enabling better clinician skill-to-opportunity matching in developing countries.
AI-ready data in practice: What dbt Semantic Layer and dbt's MCP server and agent skills do for your team
dbt's Semantic Layer, MCP server, and agent skills now provide AI with essential business context. This enables your team to move beyond just clean data to truly AI-ready data in practice.
Automate Data & KPI Monitoring with SQL Alerts
In many organizations, data monitoring is still a manual, repetitive routine: open...
How to Build Real-Time Fraud Detection using Spark Real-Time Mode and Lakebase
Build real-time fraud detection with sub-second intervention using Spark Real-Time Mode and Lakebase. This unified platform processes high-throughput data streams, executes low-latency ML models, and serves explainable fraud scores to reduce detection lag and operational complexity.
How Databricks Genie improves supply chain visibility with real-time AI analytics
Databricks Genie now offers a conversational AI layer for real-time production intelligence, giving manufacturing VPs of Operations direct access to unified operational data. This eliminates data access bottlenecks, allowing leaders to ask context-aware questions and accelerate decision-making without complex analyst requests or BI tool training.
A CFO’s guide to managing value-based care financial performance
Databricks Genie for Value-Based Financial Intelligence helps CFOs bridge the VBC Financial Intelligence Gap by enabling conversational querying of integrated clinical-financial data. This allows healthcare finance leaders to quickly monitor patient utilization, track per-member-per-month costs, and identify clinical outliers essential for managing value-based care financial performance.
Stop rogue AI: How Unity Catalog secures your agent actions
The risks of agentic AI are no longer theoretical. Agents connected to external tools...
Why AI Security Infrastructure is Now a CMO Priority
Databricks launched Lakewatch, an open, agentic SIEM built on the Lakehouse, to bring security detection to where enterprise data already lives. CMOs must now prioritize AI security infrastructure and engage in enterprise data and platform decisions to scale AI safely, as marketing teams are increasingly targeted by agentic cyberattacks.
Databricks context engineer associate: the industry’s first certification for reliable AI agent systems
Databricks launched the industry's first certification for reliable AI agent systems: the Databricks Context Engineer Associate. This new certification validates skills in designing, managing, and governing AI context effectively.
What's shipped in dbt — May 2026
May 2026 brings a roundup of dbt shipments since January, covering agents, Fusion, security, developer experience, dbt Core, and more. This post details all the product changes relevant to your Databricks workflows.
Introducing AI spend controls with Unity AI Gateway
Unity AI Gateway now includes AI Spend Controls, offering proactive budget alerts across users, workspaces, and accounts to monitor and contain AI costs. These controls integrate with Unity Catalog system tables and Databricks budgets for unified governance of AI usage, cost visibility, and operational accountability.
How to safeguard AI workloads with Unity AI Gateway Guardrails
Unity AI Gateway now ships with pre-built and custom guardrails to protect sensitive information and ensure secure, compliant AI outputs. These guardrails are integrated with the Databricks lakehouse for simplified observability, monitoring, and evaluation of your AI workloads.
What’s new in Unity AI Gateway: service policies, guardrails, observability, and cost controls for AI agents and MCPs
Unity AI Gateway now includes service policies, guardrails, observability, and cost controls for AI agents and MCPs. This enables Databricks practitioners to govern model calls and agent actions with consistent policies, full visibility, and enforceable cost controls for production AI.
How Deutsche Börse built a generative AI tool to tackle the large-scale migration of Zeppelin notebooks to Databricks
Deutsche Börse built a Databricks App to automate the migration of Zeppelin notebooks, reducing redevelopment time from hours to minutes. This solution tackles a large-scale migration challenge by handling structural conversion and generating AI-assisted prompts for reconstructing notebook logic.
Announcing the Databricks analytics engineer learning pathway
A new learning pathway for Databricks SQL practitioners is now available on Databricks Academy, covering skills to use the full SQL ETL toolkit for data modeling, pipelines, semantic layers, and conversational agents. Courses are offered in self-paced and instructor-led formats, and are included with any active Databricks Learning Subscription.
AI-assisted analytics engineering: Docusign’s framework for scaling dbt unit testing
Docusign reduced dbt unit test authoring from 5 hours to 30 minutes. Learn their AI-assisted framework for scaling dbt unit testing.
How NASDAQ built a governed intelligence layer with dbt and Databricks
NASDAQ built a governed intelligence layer using dbt and Databricks to process up to a trillion messages daily across 26 business lines. Learn why they chose this combination for their data architecture.
Ship smarter agents in production with dbt Agent Skills
Just like humans, autonomous agents need faster feedback loops. Here’s how to develop them.
The question your commercial data should already be able to answer
Databricks and Veeva now embed Genie AI agents and AI/BI dashboards directly into Veeva Vault CRM, enabling life sciences commercial teams to get real-time answers to their questions without leaving their workflow. This unified Databricks lakehouse with Unity Catalog delivers governed commercial data to every persona, from sales reps to MSLs, in the format and depth their role requires.
Week of May 11
17 articles
PipelineIQ: Forward‑Looking Sales Intelligence That Drives Action
PipelineIQ is a new AI solution that provides prescriptive "Next Best Actions" for sales reps and managers, shifting focus from retrospective forecasting to immediate, forward-looking guidance. It's built to work with imperfect CRM data, extracting signals like champion strength and procurement stalls to deliver clear outcomes: Walk, Pivot, or Accelerate for every opportunity.
Backstage with Lakebase, part 2
Lakebase enables running production OLTP applications like Backstage on a serverless Postgres surface within Databricks, offering 1-second database branching and sub-4-second point-in-time recovery for schema migrations. Unity Catalog unifies governance for operational databases, providing single SQL query auditing, automatic row-level security propagation to branches, and zero-ETL cost attribution for FinOps.
Expanded interoperability with Unity Catalog Open APIs
Unity Catalog Open APIs now offer expanded interoperability, with external access to UC managed Delta tables in Beta and credential vending generally available with M2M OAuth support. External engines like Apache Spark, Flink, and DuckDB can now create, read, and write to UC managed Delta tables, leveraging Delta Lake's new catalog commits feature for safe concurrent writes and auditability.
From manual to autonomous: how AI agents are transforming electric grid operations
AI agents are transforming electric grid operations from manual to autonomous control, addressing unprecedented demand and complexity. Hawaiian Electric saw a 60X reduction in document query times, from five minutes to five seconds, in just two weeks with early AI agent deployment.
Data quality is the AI strategy
Your AI strategy hinges on data quality, starting with fixes to transactional systems. Organizations prioritizing value creation with unified data will benefit most, as tools and models constantly evolve.
The Rosetta stone of CPS: Claroty’s AI-powered library
Claroty's AI-powered CPS Library, built on Databricks Custom Agents and Delta Lake, automates entity resolution for 17M+ industrial and healthcare assets, solving the asset identity crisis where 88% of CPS devices lack exact product codes. This multi-agent AI system improves vulnerability attribution accuracy by over 25% and provides new security recommendations for over 56% of analyzed devices.
Clinical operations intelligence belongs on the Lakehouse
The Site Feasibility Workbench, an open-source Databricks App, now enables clinical trial site selection entirely within the Databricks workspace, eliminating external API calls and synchronization pipelines. This solution addresses the architectural challenge of disconnected clinical operations data, improving enrollment target attainment with TA-segmented LightGBM models and auditable SHAP-driven explanations.
ABAC row filtering and column masking policies, governed tags, and data classification are now generally available in Unity Catalog
ABAC row filtering and column masking policies, governed tags, and data classification are now generally available in Unity Catalog. These capabilities unify data governance, eliminating manual security and ensuring consistent, real-time protection across your data estate.
The Rise of Sports Intelligence: How the Lakehouse Turns Tracking Data into Competitive Advantage
Pro teams now leverage the Lakehouse to transform exploding tracking and biomechanical data into sports intelligence, driving real-time decisions on the court, in training, and in the front office. The Databricks Data Intelligence Platform acts as the governed "sports brain," unifying diverse data with Lakeflow, Unity Catalog, ML, and AI Search to power proactive injury management, coaching insights, and next-gen fan experiences.
How CFOs in consulting can recover margin with Databricks
CFOs in consulting can recover margin by unifying finance data and automating workflows on Databricks. Early adopters are seeing tens of millions in DSO optimized, 80% faster reporting, and 50%+ faster close cycles.
Announcing Native Lakehouse Sync
Native Lakehouse Sync (Public Preview) now automatically replicates Lakebase Postgres data into Unity Catalog managed tables, eliminating pipelines and external compute. This enables live ML features, operational data as the Bronze layer with full SCD Type 2 history, and built-in audit capture, all with zero Postgres performance impact and no added cost.
Announcing Databricks Student Fellows
The Databricks Student Fellows Program is now live, offering students an exclusive opportunity to develop in-demand data, AI, and computer science skills. Fellows will gain hands-on experience with the Databricks platform, earn certifications, build projects, and unlock career pathways.
The Convergence of Open Table Formats and Open Catalogs: Catalog Commits is Generally Available
Catalog Commits take a big step forward in unifying the lakehouse by aligning Delta...
Faster Queries and New Capabilities with the Open-Source Databricks JDBC Driver
The new open-source Databricks JDBC driver delivers up to 30% faster large result retrieval and adds support for multi-statement transactions, stored procedures, Arrow compatibility, and Unity Catalog metric views. This fully owned, open-source driver enables faster fixes, external contributions, and tighter platform integration.
Unlocking the Archives: Turning Unstructured Documents into a Searchable Database for Groundwater Discovery
MapAid partnered with Databricks for Good to transform 700 scanned hydrogeological documents into a searchable database using multimodal AI. This serverless pipeline classifies documents and extracts water-related information, enabling researchers to quickly find historical studies and well records for improved groundwater prediction.
Predictive Quality Starts Where Defect Detection Stops
Databricks Genie now enables quality leaders to interrogate their full operational dataset using natural language, synthesizing data from inspection and supplier lots in a single query. This eliminates data latency from fragmented systems, shifting quality from reactive documentation to predictive intervention for reduced scrap and improved margins.
Retail markdown optimization: from reactive markdowns to proactive
Databricks Genie for Merchandise Intelligence now enables retail CMOs to move from reactive to proactive markdown optimization. It provides instant, natural language access to synthesize critical data like trends, inventory, and pricing, allowing for earlier trend detection and improved margins.
Week of May 4
25 articles
How Superhuman and Databricks built a 200K QPS inference platform together
Superhuman migrated their 200K QPS custom LLM inference to Databricks FMAPI Provisioned Throughput, achieving sub-second P99 latency and offloading infrastructure management. Joint engineering delivered 60% per-GPU throughput gains and reduced serving costs through FP8 quantization and Hopper architecture optimizations.
Using MemAlign to Improve Evaluation of Traditional Machine Learning in Genie Code
MemAlign, an open-source MLflow framework, significantly improved the evaluation of traditional machine learning in Genie Code by reducing LLM judge error by 74-89% on key dimensions. This alignment was achieved with ~50 labeled examples, demonstrating the importance of both semantic and episodic memory for closing the gap between LLM judges and human experts.
Addressing HR's widening capacity gap with AI
HR's capacity gap is costing millions, but AI adoption can help. Databricks' lakehouse architecture paired with MathCo's NucliOS platform provides the secure, scalable foundation for phased AI transformation, from data consolidation to AI-powered workflows.
MCP Marketplace Brings Real-Time Intelligence to Agentic Applications
MCP Marketplace now provides real-time external intelligence for agentic applications, with partners like You.com and Moody's offering governed data. Lakebase and Genie enable end-to-end workflows, allowing agents to maintain context and surface decisions to business users for review.
Pushing the Frontier for Data Agents with Genie
Genie is Databricks’ state-of-the-art data agent designed for answering complex questions...
Energy trading analytics in a real-time market
Databricks Genie now provides energy traders and portfolio managers instant, conversational access to critical trading data, eliminating the structural revenue problem caused by analytical lags in a real-time market. This enables optimal decision-making in a highly volatile, data-intensive environment with 15-minute price changes.
Operating room utilization is hiding in your scheduling data
Databricks Genie for Surgical Operations Intelligence is now available to help healthcare operations leaders close the "Operational Intelligence Gap" by enabling real-time, conversational querying of surgical scheduling data. This allows for immediate intervention on OR utilization, shifting capacity management from reactive to proactive and addressing forgone revenue and unmet patient needs.
Why telecom churn prediction misses the intervention window
Databricks Genie for Retention Intelligence helps telecom leaders act on early churn signals by providing real-time, natural language query access to customer data. This enables timely interventions, addressing the "Velocity Problem" where traditional churn models often identify at-risk customers too late.
Growth Analytics Is What Comes After Growth Hacking
Databricks AI/BI Genie resolves the fragmented data architecture bottleneck for growth analytics, enabling leaders to conversationally interrogate unified acquisition, behavioral, and revenue data. This provides a structural competitive advantage through faster spend reallocation and learning, moving beyond growth hacking to data-driven precision.
Real-world evidence for medical affairs: who can actually use it?
Databricks Genie for Medical Affairs Intelligence helps close the RWE Fluency Gap by enabling conversational querying of complex scientific questions against RWE data. Medical Affairs leaders can now generate insights in seconds, a task that previously took data scientists days.
Wealth advisor productivity starts with the client conversation
Databricks Genie for Wealth Management Intelligence is now available, enabling wealth advisors to conversationally query client data for instant, contextual answers and remove information preparation burdens. This allows advisors to shift focus from information logistics to genuine insight during client portfolio reviews, improving productivity and the quality of client conversations.
How lakebase architecture delivers 5x faster Postgres writes
Lakebase architecture now delivers up to 5x faster Postgres write throughput for OLTP workloads by offloading crash-recovery tasks to distributed storage. This change reduces WAL traffic by 94% and improves read tail latency by 2x without compromising durability.
Why Talent Transformation Is the Missing Focus of Enterprise AI
Databricks Academy Pro is now available, offering unlimited self-paced, live, and hands-on training to accelerate enterprise AI adoption. Continuous workforce upskilling is critical for competitive advantage, and this new offering supports that journey.
Public health intelligence shouldn't require a data scientist
Databricks Genie now enables natural language access across public health datasets, delivering rapid, governed insights to improve decision-making and resource allocation. This means public health intelligence no longer requires a data scientist, accelerating real-time outbreak response and intervention.
Mean time to detect is a data access problem
Databricks Genie within Lakewatch enables natural language, agent-driven investigations to accelerate detection and response. This addresses the core issue of cross-system data integration that limits security analysts' investigation speed and effectiveness.
First-party audience data is the ad sales relationship now
Media companies can now leverage Databricks Genie for Ad Revenue Intelligence to close the "Insight Gap" and conversationally query their first-party data for audience insights. This enables VPs of Advertising Revenue to quickly understand audience composition and demonstrate post-campaign performance, crucial for winning RFPs against platforms with richer data.
Rethinking Distributed Systems for Serverless Performance and Reliability
Databricks' serverless compute required rethinking distributed systems to eliminate user-managed infrastructure and improve stability. Architectural innovations like separating applications from compute and intelligent workload routing deliver more stable, predictable, and cost-efficient performance.
The dbt Developer Agent is now in Preview: the coding agent for analytics engineering
The dbt Developer Agent is now in Preview, offering a coding agent for analytics engineering. It's grounded in your dbt project to help you ship faster without breaking downstream.
From Black Box to Observability: Tracing OpenClaw with MLflow
MLflow Tracing now provides full observability for OpenClaw agents, moving them from black box to transparent. Learn how to quickly set up tracing to understand why your agent makes specific decisions, rather than just seeing the output.
The AI Scaling Gap Hiding in Digital Native Companies
Digital natives lead on AI ambition but lag traditional industries in scaling it, according to new Economist data. Learn what's behind this AI scaling gap and why it matters.
10 trillion samples a day: Scaling beyond traditional monitoring infra at Databricks
Databricks now processes 10 trillion samples daily, scaling beyond traditional monitoring infrastructure by rearchitecting TSDB and aggregation layers with customized open-source solutions. A novel Lakehouse-based platform, Hydra, provides rich debugging capabilities for high-cardinality metrics at 50x lower storage cost.
AI success starts with clean data, not just better models
Databricks and Kraken are key to solving the foundational data challenge, providing unified, well-documented data essential for AI success. Organizations that pair this unified data with deep business context and a data-literate culture are pulling ahead, making data a business asset rather than just an IT platform.
How nOps Rebuilt Their Cloud Optimization Platform on Databricks Lakebase, and Why Other ISVs Should Too
nOps, a Databricks Built On partner managing over $4 billion in annual cloud spend,...
Peril Predicts: Precision Payouts for a Volatile World
Databricks now helps insurers operationalize parametric insurance workflows, enabling faster catastrophe payouts using objective event triggers. The Geospatial Lakehouse facilitates ingesting catastrophe data and analyzing exposure at scale, essential for reducing basis risk and defining accurate payout triggers with geospatial analytics and catastrophe modeling.
The foundation of AI scalability: one team, one platform, one operating model
Databricks now offers a unified platform and operating model to scale AI, addressing fragmentation with reusable accelerators and shared governance. This enables business teams to move 10x faster by fostering a culture of continuous learning and experimentation.
Week of Apr 27
34 articles
The Federal Data Paradox: Rich in Data, Poor in Access
Databricks Genie enables federal agencies to overcome siloed, legacy data infrastructure by providing a natural language interface for governed, real-time data access. This empowers mission experts to make faster, evidence-based decisions without requiring a technologist for every routine task.
Driving Budapest Forward: How BKK Uses Databricks to Transform City Mobility
BKK now unifies city mobility data from buses, metros, trams, and shared vehicles on Databricks, creating a single view for analysts and decision-makers. This Lakehouse powers minute-level tracking, route performance analysis, and predictive modeling for operational efficiency and proactive urban planning.
LLM Vs AI: A Practical Guide to Differences, Use Cases, and Tools
LLMs are a subset of AI, and this guide clarifies their practical differences, use cases, and tools. Understand how LLMs fit into the broader AI landscape and what that means for your Databricks workflows.
Model Risk Governance Is Not the Same as Risk Intelligence
Databricks AI/BI Genie for Enterprise Risk Intelligence now enables conversational interrogation of governed risk data, providing instant, accurate answers for real-time risk management. This closes the intelligence gap for CROs who previously navigated complex model outputs and data systems to get specific answers on credit concentration or stress test sensitivity.
Generative AI for Business: A Complete Strategy and Implementation Guide
Generative AI is projected to add $2.6–4.4 trillion in annual economic value, with 75% concentrated in customer operations, marketing, software engineering, and R&D. Achieve durable business value by inventorying proprietary data, prioritizing high-impact pilots, and embedding AI into existing workflows with cross-functional teams and responsible AI practices like RAG.
Data Science vs Data Engineering: Choosing Analysis or Infrastructure
BI reporting bridges raw data and operational teams by collecting, analyzing, and presenting data in structured formats. Effective BI relies on clean, integrated data flowing through ETL pipelines into a central repository, supporting both managed and ad hoc reporting.
AI Applications: Tools, Use Cases, and Platforms
AI applications span four capability tiers, each with distinct data requirements and evaluation frameworks, and enterprise deployments often stall due to inadequate data infrastructure. Production-grade model development, from prompt engineering to pretraining, is increasingly accessible with open-source LLMs, but requires pre-built governance and monitoring infrastructure for successful deployment at scale.
MLOps vs DevOps: A Practical Guide for Data Scientists and IT Teams
MLOps extends DevOps by governing code, datasets, and model artifacts, adding Continuous Training pipelines to automatically retrain models when data drift exceeds thresholds. This guide details a three-layer model for successful MLOps, leveraging DevOps CI/CD, ML orchestrators, and unified monitoring to close the feedback loop.
Top Data Warehouse Tools For Modern Data Analytics
The lakehouse architecture is the modern standard for teams needing both analytics and AI, offering ACID-compliant reliability and open storage formats for SQL, streaming, ML, and AI on a single governed data foundation. Evaluate data warehouse tools across six dimensions, including query performance, scalability, and total cost of ownership, to avoid the hidden costs of maintaining separate systems.
Unlocking SAP Business Context in Databricks with Semantic Metadata Delta Sharing
SAP Business Data Cloud now automatically syncs semantic metadata, including descriptions and key relationships, into Unity Catalog, making SAP data instantly AI-ready and more discoverable. SAP PersonalData governance tags are also automatically available in Unity Catalog, enabling fine-grained access controls with ABAC.
The marketing activation gap has a fix: Databricks and Stitch partner to turn data infrastructure into marketing performance
Databricks and Stitch now partner to bridge the marketing activation gap, turning your data infrastructure into tangible marketing performance. Learn how Stitch connects your Databricks environment to marketing operations, with real-world examples from QSR, retail, and healthcare brands.
Alert Fatigue Is a Business Risk
Lakewatch and Databricks Genie unify data for agentic, machine-speed threat detection, triage, and response, directly addressing the business risk of alert fatigue in SOCs. This new approach helps overcome fragmented telemetry and legacy SIEM architectures that create signal-to-noise challenges and limit effective threat detection.
Backstage with Lakebase
For thirty years, the operational database and the analytical database have been...
Shipping Faster isn’t Learning Faster
Databricks AI/BI Genie for Product Intelligence ships to resolve the architectural bottleneck of slow data velocity in product organizations. VPs of Product can now instantly query complex behavioral data with conversational AI, eliminating weeks-long waits for analyst support and specialized skills.
Why Your OEE Dashboard Is Lying to You
Your OEE dashboard is likely misrepresenting operational reality because critical production data remains siloed and inaccessible. Databricks Genie provides a conversational AI layer over your unified data platform, making operational systems instantly answerable in natural language to empower real-time decision-making.
The Turbine That Tried to Tell You It Was Failing
Databricks Genie now offers a conversational AI layer for real-time operational metrics like OEE, directly accessible to VPs of Operations. This eliminates data access bottlenecks, enabling faster decision-making by shifting from reactive reports to proactive intelligence from SCADA and MES logs.
Predicting Readmissions Isn't Enough. Acting in Time Is.
Databricks Genie for Clinical Outcomes Intelligence now enables CMOs to conversationally query patient and outcomes data in natural language, providing immediate, governed insights to prevent predicted readmissions. This directly addresses the gap between readmission risk prediction and timely intervention by eliminating data request delays and matching clinical decision velocity.
Clinical Trials Run Longer Than They Have To. That's a Patient Problem
Databricks Genie for Clinical Trial Intelligence now enables clinical operations VPs to interrogate full trial data in natural language for instant answers. This allows earlier intervention on site performance issues, shortening trial timelines and improving patient access to treatment.
Network Quality Is a Revenue Problem, Not a Technical One
Databricks Genie for Network-Commercial Intelligence bridges the gap between network performance and commercial data, enabling Chief Network Officers to prioritize network quality decisions based on commercial impact. This helps reframe network quality as a revenue problem, not just a technical one, by connecting operational telemetry to commercial context like SLA exposure and churn propensity.
Approximate Answers, Exact Decisions: New Sketch Functions for Analytics
Databricks now offers new sketch functions for approximate answers to analytics questions, including KLL quantile sketches for percentiles, Theta and Tuple sketches for audience overlap, and approximate top-K functions for real-time trending. These functions enable faster, more memory-efficient computations over massive datasets, with mergeable sketches for incremental updates and combined counting and aggregation.
Companies Winning with AI Built the Data Layer First
Companies winning with AI, like Trinity improving delivery by 15% and ETA models by 50%, built a unified, governed, and accessible data layer first. Consolidating fragmented systems into a single architecture enables real-time AI, faster decisions, and lower costs.
Rethinking SQL ETL for modern data platforms
Databricks now offers a unified platform for all SQL ETL, removing coordination overhead and letting teams ship faster on one governed system. This approach addresses the fragmented SQL ETL that drives hidden cost, brittle pipelines, and slow incident resolution across warehouses, orchestrators, and tools.
Stripe data now available on Databricks via Databricks Marketplace
Stripe data is now available on Databricks Marketplace, enabling you to activate a Stripe data pipeline with Delta Sharing in minutes and instantly power AI applications. Share Stripe payment and business data directly into Unity Catalog to create a single source of truth and query live payment data for models, agents, and Genie workspaces.
Databricks and Stripe Projects: Infrastructure Built for Agents
Stripe Projects, a new agent-first CLI, now lets AI coding agents discover, provision, and pay for Neon Postgres databases directly with no human-in-the-loop. Databricks is a launch partner, enabling agents to spin up production-ready Postgres databases in seconds, backed by Lakebase's serverless architecture.
Agents are ready but your architecture probably isn't
Agentic systems are failing in production due to siloed data, poor governance, and analytics-focused infrastructure. CDOs and CTOs need a transactional database built for agents and a clear vision for success.
Interoperability Between Unity Catalog and Google BigQuery via Catalog Federation
Google Cloud now supports catalog federation to Unity Catalog, enabling BigQuery users to read tables in Unity Catalog without duplication. Unity Catalog also supports catalog federation to Google Cloud's Lakehouse, allowing it to read Iceberg tables written from BigQuery and other engines.
Built In, Not Bolted On: What AI-Native Actually Means in Cybersecurity
Databricks now offers a truly AI-native cybersecurity solution, architected with intelligence at its core and leveraging proprietary telemetry for a defensible advantage. This approach prioritizes defining shared outcomes for cross-functional alignment, rather than simply selecting shared tools.
Operationalizing AI for public sector fraud prevention
Databricks now offers a scalable approach to AI-powered fraud prevention for public sector agencies. Learn how clean data, automation, and real-time insights can be integrated into daily workflows to improve decision-making.
From months to minutes: Building real-time clinical data pipelines with natural language
Databricks and Redox now enable real-time clinical data pipelines from EHRs to Unity Catalog with natural language prompts, reducing integration time from months to minutes. This partnership allows AI outputs to be written back into the EHR in real time, transforming Databricks into an operational layer for point-of-care interventions.
Agentic Data Engineering with Genie Code and Lakeflow
Genie Code, an autonomous AI partner for data engineers, is now integrated directly into Lakeflow. Data engineers can leverage Genie Code within Lakeflow's Pipeline Editor and Jobs for the full data engineering lifecycle, from development and orchestration to monitoring and debugging.
Securely send first-party conversion signals with Snapchat Conversions API on Databricks Marketplace
Snapchat Conversions API is now on Databricks Marketplace, letting you send web, app, and offline events from your Lakehouse directly to Snapchat. Improve Event Match Quality and campaign measurement by deploying a pre-built notebook for governed, server-side event delivery.
See What Your AI Sees: Multimodal Tracing for Images, Audio, and Files
Databricks now supports multimodal tracing for images, audio, and files, allowing you to visualize and interact with these artifacts directly within your traces instead of opaque base64 strings. This enhancement improves debugging for GenAI agents, reduces storage costs, and speeds up trace queries by avoiding direct storage of large multimedia strings.
How leading tech companies are killing the builder’s tax with Lakebase
Databricks Lakebase is helping leading tech companies eliminate ETL and reverse ETL by unifying operational and analytical data. This enables real-time intelligence for apps and AI systems, operating directly on fresh, low-latency data.
Inside one of the first production deployments of Lakebase: LangGuard's agentic workflow governance engine
LangGuard's agentic workflow governance engine, one of the first production deployments of Lakebase, extends Unity Catalog and AI Gateway with runtime enforcement for autonomous AI agents. Lakebase provides the elastic, low-latency operational data layer for LangGuard's GRAIL™ data fabric, enabling real-time policy evaluation without impacting agent performance.
Week of Apr 20
18 articles
The next generation of Databricks Genie
Genie now answers questions beyond Genie Spaces, connecting to external knowledge stores like Google Drive and SharePoint. This next generation of Genie, previously Databricks One, is available on web and native mobile apps.
Model Risk Management in 2026: A Banker’s Guide to the Revised Interagency Guidance
The new interagency guidance for model risk management, effective April 17, 2026, shifts to a risk-based, principles-driven framework, demanding a unified, governed lifecycle for all models, including GenAI, with evidence of governance generated automatically. Databricks offers a reference architecture to meet these revised expectations, integrating model development and validation into a single, auditable process.
OpenAI GPT-5.5 + Codex, now available and fully-governed in Databricks
GPT-5.5 and Codex are now natively available in Databricks, fully governed by Unity AI Gateway for permissions, cost controls, guardrails, and observability. This enables agent building with GPT-5.5 and natural language querying of enterprise data via Genie.
How Obie cut compute costs by 30%, reclaimed engineering hours, and built stronger governance
dbt Labs shipped the dbt Fusion engine and state-aware orchestration for dbt. Learn how Obie used them to cut compute costs by 30%, reclaim engineering hours, and build stronger governance.
Operational databases: How they work and when to use them
Databricks is introducing the "Lakebase," a new open architecture combining transactional database speed with data lake flexibility and economics, designed to overcome the limitations of traditional operational databases for modern unstructured data and AI workloads. This allows for real-time processing and concurrent transactions directly on the data lake, eliminating slow ETL pipelines and supporting diverse data types.
AI observability for production: Seeing Inside Your Multi-Agent System with MLflow
MLflow now offers enhanced AI observability for multi-agent systems, providing crucial visibility into their internal workings. This helps practitioners prevent unintended actions like data purges or sensitive information leaks in production.
Databricks partners with OpenAI on GPT-5.5
GPT-5.5 and Codex are coming soon to Databricks, governed by Unity AI Gateway; the models cut OfficeQA Pro errors nearly in half. This partnership with OpenAI brings advanced models directly to Databricks users.
Announcing the Public Preview of Lakeflow Designer
Lakeflow Designer is now in Public Preview, offering a visual, no-code, AI-native interface for data preparation and analysis directly within Databricks. It leverages Unity Catalog for governance and generates production-ready code, providing step-by-step data previews for easier review of AI-generated transformations.
Are LLM agents good at join order optimization?
LLM agents can improve Databricks join order optimization, achieving 1.3x latency reduction in 80% of cases by reasoning through runtime statistics. This prototype demonstrates LLM agents' potential to act as data-driven DBAs, addressing cardinality misestimation challenges in complex SQL queries.
How conversational analytics removes the BI bottleneck
Databricks Genie and Lakebase are transforming BI by enabling conversational analytics with enterprise context, providing actionable insights beyond traditional dashboards. Operationalizing trusted AI-powered analytics, built on robust governance and semantic layers, is now crucial to avoid a competitive gap.
How to transform document activation workflows with Genie and Agent Bricks
Databricks shipped a solution combining AI/BI Genie, Agent Bricks, and Unity Catalog to automate document activation workflows. This enables multi-agent orchestration for extracting, processing, and activating data from diverse documents, improving efficiency and governance.
Using dbt with Databricks: Architecture decisions that determine success
Databricks users who skip dbt incur compounding costs. A solution architect explains key architecture decisions and when to act to ensure success.
Beyond the spreadsheet: how Databricks is delivering the modern CFO in Financial Services
Databricks now offers a unified architecture for Financial Services CFOs, integrating real-time data, AI modeling, and governance to eliminate data fragmentation and slow reporting. This enables a shift from reactive reporting to strategic finance, with benefits like drastically reduced regulatory reporting times and AI-powered natural language querying of complex financial data.
AI App Development: Guide To Building AI-Powered Apps
Databricks Apps and Lakebase are purpose-built platforms that streamline AI app development by eliminating infrastructure, authentication, and data synchronization overhead. A structured process covering model strategy, prompt design, agent orchestration, and data prep, combined with rigorous quality gates, ensures production-grade AI applications.
Structuring AI Evaluation and Observability with MLflow: From Development to Production
MLflow now offers enhanced tools for structuring AI evaluation and observability, including new APIs and UI features for logging LLM calls, prompts, responses, and metrics. This enables practitioners to systematically track, compare, and analyze model performance and behavior across development and production, facilitating iterative improvement and robust monitoring.
dbt Labs Wins a 2026 Google Cloud Partner of the Year Award
dbt Labs won a 2026 Google Cloud Partner of the Year Award, recognized for empowering thousands of Google BigQuery users to deliver trusted analytics and AI at scale.
Why metric definitions matter for reliable AI agents
Learn how dbt's semantic foundation enables reliable, governed agentic analytics.
Enforce Content Policies at the Gateway with AI Gateway Guardrails
MLflow AI Gateway now supports configurable guardrails, using LLM judges to block or sanitize harmful content, PII, and custom policy violations. Enforce content policies at the gateway before requests reach your users or models.
Week of Apr 13
5 articles
Meet Antigravity: Google’s agentic IDE enters the dbt orbit
Antigravity, Google's new agentic IDE, now integrates with dbt. This pairing promises to significantly improve developer productivity, potentially giving you your weekends back.
Exploring dbt and Google with AI agents
Learn how to build your first dbt agent by plugging AI into a dbt project. This practical guide explores what happens when AI agents interact with dbt and Google.
Tableau and dbt MCPs together
Tableau and dbt MCPs can now be configured together in a single file. Learn how this pairing unlocks impact analysis, metric reconciliation, and more.
New dbt Labs Report Finds AI-driven Acceleration is Outpacing Trust and Governance
A new dbt Labs report finds AI is accelerating data workflows, but governance and trust aren't keeping pace. The press release details the findings.
Week of Apr 6
3 articles
Mistakes I made as the head of analytics (and what I’d do differently now)
A former head of analytics on the six mistakes he made with dbt, and what he'd do differently now.
Tired of Reviewing Traces? Meet Automatic Issue Detection for Your Agent
Automatic issue detection for your AI agent is now available, eliminating the need for manual trace reviews. This new feature helps you act on your observability data, improving the user experience beyond just recording logs, metrics, and traces.
How to Prevent Runaway Agent Costs with MLflow AI Gateway
MLflow AI Gateway now helps prevent runaway agent costs by providing visibility into which part of your agent is driving up costs. This allows you to identify and address cost drivers before investing in the wrong optimizations.
Week of Mar 30
3 articles
Operationalize analytics agents: dbt AI updates + Mammoth’s AE agent in action
Databricks now supports operationalizing analytics agents with dbt AI updates and Mammoth’s AE agent. Learn how to build context for LLMs using dbt and MCP servers.
Week of Mar 23
3 articles
Introducing the dbt Community Champions Program
Building the future of analytics engineering, together.
Harness Your OpenHands Agent with AI Observability and Governance
MLflow now supports tracing, evaluating, and governing OpenHands agents, capturing every step of their autonomous operations. This enables practitioners to monitor agent actions, assess output quality, and manage LLM costs effectively.
Week of Mar 16
8 articles
Tracking and Debugging AI Safety Evaluations with Inspect AI and MLflow
Inspect AI evaluations now integrate with MLflow for experiment tracking and execution tracing via the inspect-mlflow package. This enables practitioners to track and debug AI safety evaluations using familiar MLflow tools.
MLflow Workspaces: Shared Deployment Without Separate Servers
MLflow Workspaces are now available, enabling shared MLflow deployments across multiple teams by adding a logical organization and permission layer. This allows teams to scope experiments, models, traces, prompts, AI Gateway resources, and artifacts within their own workspace.
Types of data transformations for machine learning
Databricks practitioners can explore key data transformation types for machine learning, including cleaning, scaling, feature engineering, and validation. This Pulse post details these transformations to help optimize ML workflows.
What are the most common data pipeline architecture patterns?
Databricks practitioners can explore common data pipeline architecture patterns, including ETL, ELT, batch, streaming, and semantic layers. This post details the most prevalent patterns to help you understand their applications and differences.
Your Agents Need an AI Platform
MLflow 2.12 ships with new features for building and managing AI agents, including enhanced logging for agent traces, evaluation tools, and versioning capabilities. Leverage MLflow as your unified platform for developing, deploying, and governing reliable AI agents in production.
Control LLM Spend with AI Gateway Budget Alerts and Limits
AI Gateway now supports budget policies to control LLM spend with alerts and request limits. Set spending thresholds, receive webhook alerts, and automatically reject requests when budgets are exceeded.
Week of Mar 9
7 articles
The Iceberg ecosystem today
Iceberg is production-ready, and this post details what Databricks practitioners can realistically expect when running on top of it today. Anders Swanson explains the current state of the Iceberg ecosystem.
Why metadata management is critical for modern data teams
Metadata management improves discovery, governance, performance, and trust in modern data systems.
Why ETL is still essential for modern data pipelines
ETL remains essential for modern data pipelines because it consolidates fragmented data, enforces quality, and satisfies compliance requirements.
How a global investment firm reduced runtimes by 30–40% with the dbt Fusion engine
The dbt Fusion engine and State-Aware Orchestration helped a global investment firm reduce runtimes by 30–40% in 3 months. Learn how NBIM achieved these gains without heavy optimization efforts.
Effective strategies to enhance data quality management
Improve data quality with testing, metrics, automation, and a scalable governance framework.
Week of Mar 2
4 articles
Effective strategies to improve data quality across your organization
Databricks practitioners can improve data quality with proven strategies for testing, governance, and scalable analytics workflows. Learn how to implement these effective strategies across your organization.
How AI improves data lineage at scale
Discover how AI accelerates data lineage with automated docs, testing, and scalable governance.
Week of Feb 23
6 articles
Ship LLM Agents Faster with Coding Assistants and MLflow Skills
MLflow now provides coding assistants with the required feedback loop to build better LLM agents. Trace, analyze, fix, validate, and repeat to ship LLM agents faster.
Deterministic Safety Checks in MLflow with Guardrails AI
MLflow evaluation pipelines now support fast, deterministic safety validation with Guardrails AI scorers. This enables adding safety checks without requiring an LLM.
Enterprise-Scale MLflow Operations and Security Practices at LY Corporation
How LY Corporation Uses MLflow: An Overview
Deploy MLflow Models to Serverless GPUs with Modal
MLflow models can now be deployed to Modal's serverless GPU infrastructure. This enables auto-scaling and streaming predictions for your MLflow models.
Multi-turn Evaluation & Simulation: Enhancing AI Observability with MLflow for Chatbots
MLflow 3.10 now supports multi-turn evaluation and conversation simulation, enabling scoring of full conversations and reproducible testing of agent changes. This helps catch failures that only emerge across multiple turns, improving chatbot observability.
Introducing MLflow AI Gateway: Governed, Observable Access to LLMs
MLflow AI Gateway provides a single, secure endpoint for all LLM providers, complete with usage tracking and native tracing. This new feature offers governed, observable access to LLMs for Databricks practitioners.
Week of Feb 9
1 article
5 Tips to Get More Out of Your Claude Code with MLflow
MLflow now offers an MCP server, CLIs, and Skills to extend Claude Code, enabling you to trace tokens and monitor tool usage. These five tips will help you transform your Claude coding agent into a transparent and controllable workflow.
Week of Feb 2
1 article
MemAlign: Building Better LLM Judges From Human Feedback With Scalable Memory
MemAlign, a new framework for aligning LLMs with human feedback, is now available, offering competitive or better quality than state-of-the-art prompt optimizers at significantly lower cost and latency. It achieves this through a lightweight dual-memory system, making it a valuable tool for building better LLM judges.