MLflow
Recent items mentioning MLflow across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
MLflow continues to enhance its GenAI capabilities, with the 3.14.0 release introducing agent onboarding, durable tracing for Claude Code, review queues for traces, and a new LLM Playground for prompt iteration [2]. Observability for AI agents is a key focus, with Omnigent now providing automatic multi-layer observability via MLflow Tracing [1], and Databricks enabling OpenTelemetry traces from any AI agent to Unity Catalog for end-to-end observability and governance using MLflow [7]. Additionally, MLflow now offers Role-Based Access Control (RBAC) to manage LLM teams by defining reusable roles and enforcing fine-grained permissions across prompts, experiments, and AI Gateway resources [4].
Generated daily from the 7 most recent items mentioning MLflow. Click any [N] to jump to the source.
Multi-Harness AI Agents Need Multi-Layer Observability: Omnigent in MLflow
Omnigent unifies multi-harness agent orchestration and now delivers automatic observability across every agent with MLflow Tracing, requiring no code changes. This post details how Omnigent in MLflow provides multi-layer observability for multi-harness AI agents.
MLflow 3.14.0 introduces new GenAI features including agent onboarding, durable tracing for Claude Code, review queues for traces, and a revamped evaluation dataset UI. It also includes a new LLM Playground for prompt iteration and changes the default serialization format for several MLflow model flavors to `skops` or `pt2`.
News: Genie Spaces or Genie Code? Databricks AI Explained
Databricks offers two AI assistants: Genie Spaces for business users to get data answers via natural language queries and multi-task investigations, and Genie Code for technical professionals to build dashboards, write code, generate pipelines, and debug GenAI apps. Spaces is for asking questions, while Code is for building and developing within the Databricks platform.
How to Manage your LLM Teams using MLflow's Role-Based Access Control
MLflow's new Role-Based Access Control (RBAC) helps LLM teams define reusable roles, isolate workspaces, and enforce fine-grained permissions across prompts, experiments, and AI Gateway resources. Learn how to manage your LLM teams using these new MLflow RBAC capabilities.
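The idea of reusable roles with fine-grained permissions can be illustrated with a minimal sketch. Everything here is hypothetical (the `Role`, `User`, and `is_allowed` names, the resource types, and the action strings); it is not MLflow's RBAC API, only one way to picture how a role bundles permissions that are then shared across users.

```python
from dataclasses import dataclass, field


# Hypothetical model for illustration only; not MLflow's RBAC API.
@dataclass(frozen=True)
class Role:
    name: str
    # Set of (resource_type, action) pairs, e.g. ("prompt", "read").
    grants: frozenset


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)


def is_allowed(user, resource_type, action):
    """Return True if any of the user's roles grants the action."""
    return any((resource_type, action) in role.grants for role in user.roles)


# Roles are defined once and reused across users.
PROMPT_EDITOR = Role("prompt-editor", frozenset({("prompt", "read"), ("prompt", "edit")}))
EXPERIMENT_VIEWER = Role("experiment-viewer", frozenset({("experiment", "read")}))

alice = User("alice", [PROMPT_EDITOR, EXPERIMENT_VIEWER])
bob = User("bob", [EXPERIMENT_VIEWER])

print(is_allowed(alice, "prompt", "edit"))
print(is_allowed(bob, "prompt", "edit"))
```

In this sketch access is deny-by-default: a user with no role granting a given pair, such as `("gateway_endpoint", "use")`, is refused.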
Events: Databricks News: CLI v 1.0.0, AI-tools, databricks Docker, DABs UI sync, mutators
The video demonstrates new Databricks features, including the GA release of CLI 1.0.0, UI sync for DABs, Python mutators for bundle extension, and new Docker image options for custom runtimes. It also covers serverless pipeline orchestration, enhanced autoscaling for Lakebase and apps, serverless interactive execution timeout, and auto-scoping for access tokens.
How Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude
Ecolab rebuilt retail intelligence on Databricks and Anthropic Claude, converting 700-page FDA manuals into real-time answers for frontline staff using Foundation Model APIs and cutting compliance report compilation from two weeks to under two minutes. The solution, a native Databricks App with Lakebase Postgres and Unity Catalog, unifies nine siloed data sources and employs a multi-agent orchestration framework with Judge LLMs and MLflow tracing for personalized, continuously refined intelligence.
Tutorials: Trace Any AI Agent with OTel, MLflow, and Unity Catalog
Databricks now allows sending OpenTelemetry traces from any AI agent to Unity Catalog, enabling end-to-end observability and governance within the Databricks Lakehouse. This integration facilitates cost-effective trace storage, offline analytics, production monitoring, and continuous agent evaluation using MLflow.
News: Banks' Secret Weapon Against Money Laundering: Multi-Agent AI
Databricks demonstrates a multi-agent AI solution for Anti-Money Laundering (AML) operations, significantly reducing false positives and accelerating investigation cycles from hours to minutes. The platform unifies siloed systems, employs specialized AI agents for analysis and recommendations, and offers AI-assisted SAR generation and executive-level reporting with natural language chat.
MLflow 3.13.0 introduces a new Role-Based Access Control system with an Admin UI for managing users and permissions, alongside trace retention and auto-archival to object storage. This release also includes one-click observability for coding agents, new engines for MLflow Assistant, and an official Helm chart for Kubernetes deployments.
Multi-Agent Supervisor for Hybrid Retrieval with Agent Bricks and MLflow
Context Engineer Associate Beta Exam + free attempt at DAIS
Context engineering is quickly becoming one of the key skills for building reliable AI agent systems. Databricks has just introduced the **Databricks Context Engineer Associate** **Exam**, focused on designing, assembling, and governing the information AI agents receive at inference time - including prompts, retrieval systems, memory, tools, governance, and evaluation. The exam is currently available as a **live beta at Data + AI Summit 2026**, and Databricks states that **one free onsite exam attempt will be offered during Summit**. Walk-ins only, one per attendee. Great opportunity for anyone working with GenAI, AI agents, Vector Search, Unity Catalog, MLflow, MCP, or Lakebase. [https://www.databricks.com/learn/certification/context-engineer-associate](https://www.databricks.com/learn/certification/context-engineer-associate)
Route Claude Code Through MLflow AI Gateway
MLflow AI Gateway now supports routing Claude Code, providing full observability, budget controls, and guardrails for all your coding agent sessions. This integration requires no changes to your existing Claude Code usage.
Tutorials: Building Trustworthy, High-Quality AI Agents with MLflow
Databricks' MLflow platform helps developers build trustworthy, high-quality AI agents by providing tools for end-to-end observability, evaluation, prompt management, and AI gateway governance. It demonstrates how MLflow facilitates tracing, expert feedback collection, automated issue detection with LLM judges, prompt optimization, and continuous monitoring throughout the agent development lifecycle.
Tutorials: Building Enterprise-Ready Agents using Agent Bricks
Databricks Agent Bricks is a unified platform designed to help enterprises build and manage AI agents, addressing challenges like low-quality reasoning on proprietary data, lack of governance, and fragmented toolchains. It demonstrates how to create knowledge assistants for unstructured data and AI Genies for structured data, integrating with Unity Catalog for governance and MLflow for observability and evaluation.
MLflow now features a major overhaul of Role-Based Access Control with a new Admin UI and unified permission APIs. It also introduces end-to-end trace archival, Helm charts for Kubernetes deployment, and a new API for stress-testing GenAI agents.
From "What Happened?" to "What Will Happen?"
Conversational BI now delivers predictive answers in seconds, not days, by fusing Genie for dynamic feature engineering with TabPFN for zero-training prediction, orchestrated by Agent Bricks. This self-assembling pipeline eliminates data science bottlenecks for business users, providing a governed experience backed by Unity Catalog and MLflow.
Events: Databricks News: Lakeflow Designer, UV package manager, DABs templates, Genie scheduled tasks
Databricks introduces Lakeflow Designer for visual data preparation, though its generated code is messy; a workaround uses Genie to convert the visual workflow into clean PySpark/SQL notebooks. The UV package manager significantly speeds up package installations on Databricks serverless runtimes, and DABs templates allow for standardized, customizable Databricks Asset Bundles.
Tutorials: How to Build an AI Security Governance Hub with Agent Bricks
Databricks Agent Bricks enables building an AI Security Governance Hub by transforming static security playbooks into adaptive multi-agent systems. The video demonstrates combining a knowledge assistant for unstructured documents and a Genie space for structured data into a supervisor agent, then details how to tune and monitor these agents for improved performance and data privacy.
Events: Building Trustworthy, High-Quality AI Agents with MLflow
MLflow provides a comprehensive platform for building, evaluating, and deploying high-quality AI agents, offering tools for observability, automated evaluation, prompt optimization, and production monitoring. It enables developers to streamline the agent development lifecycle, from prototyping and testing with human and AI judges to fixing issues and ensuring reliable, governed deployment.
Using MemAlign to Improve Evaluation of Traditional Machine Learning in Genie Code
MemAlign, an open-source MLflow framework, significantly improved the evaluation of traditional machine learning in Genie Code by reducing LLM judge error by 74-89% on key dimensions. This alignment was achieved with ~50 labeled examples, demonstrating the importance of both semantic and episodic memory for closing the gap between LLM judges and human experts.
From Black Box to Observability: Tracing OpenClaw with MLflow
MLflow Tracing now provides full observability for OpenClaw agents, moving them from black box to transparent. Learn how to quickly set up tracing to understand why your agent makes specific decisions, rather than just seeing the output.
MLflow 3.12.0 introduces multimodal tracing, allowing storage and rich rendering of PDFs, audio, and images as artifact attachments in tracing spans. It also adds AI Gateway guardrails to prevent unsafe model inputs/outputs and extends coding agent tracing support to Codex, Gemini, and Qwen.
News: Databricks News: watermark-based incremental ingestion, MCP in AI gateway, Genie, Vector Search
Databricks now offers watermark-based incremental ingestion from SQL databases without change data feed, allowing for efficient data updates and soft deletion handling. The AI Gateway supports custom MCP servers, enabling integration with external APIs like GitHub for enhanced AI application development.
MLflow 3.12.0rc0 introduces enhanced AI agent development features, including automatic tracing for more AI coding assistants and OpenClaw, along with new AI Gateway guardrails for safety checks. It also adds multimodal trace attachments for viewing images, audio, and files in the UI, and a new `mlflow.diffusers` flavor for saving and serving diffusion models.
AI observability for production: Seeing Inside Your Multi-Agent System with MLflow
MLflow now offers enhanced AI observability for multi-agent systems, providing crucial visibility into their internal workings. This helps practitioners prevent unintended actions like data purges or sensitive information leaks in production.
Community: From Notebook to Production: MLOps Quickstart
The video demonstrates how to apply MLOps best practices on Databricks using a quickstart repository, covering data ingestion, feature preprocessing, model training, deployment, and inference. It showcases Databricks tools like MLflow and Unity Catalog for managing the ML lifecycle, including version control, experiment tracking, model governance, and automated deployment across development and production environments.
Structuring AI Evaluation and Observability with MLflow: From Development to Production
MLflow now offers enhanced tools for structuring AI evaluation and observability, including new APIs and UI features for logging LLM calls, prompts, responses, and metrics. This enables practitioners to systematically track, compare, and analyze model performance and behavior across development and production, facilitating iterative improvement and robust monitoring.
Enforce Content Policies at the Gateway with AI Gateway Guardrails
MLflow AI Gateway now supports configurable guardrails, using LLM judges to block or sanitize harmful content, PII, and custom policy violations. Enforce content policies at the gateway before requests reach your users or models.
News: Databricks Apps vs Model Serving: Authentication, Cost, and Performance Compared
Databricks Apps are now the recommended first choice for deploying agents due to their flexibility in handling full-stack applications with multiple components, offering faster iteration and local testing compared to Model Serving. Model Serving remains suitable for use cases prioritizing high QPS, governance features like AI Gateway, inference tables, and guardrails, or when scaling to zero is acceptable for cost optimization.
TypeScript SDK 0.2.0 RC1
Release candidate for `@mlflow/vercel` TypeScript package with version 0.2.0: https://github.com/mlflow/mlflow/pull/22105
News: Databricks News: AUTO CDC, Workspace skills, Ask Genie, and Type widening
Databricks introduces Auto CDC for efficient change data feed processing, notebook and govern tags for better organization, and workspace skills for Ask Genie to customize its responses. Databricks also adds type widening for streaming tables, allowing data types to automatically adjust to larger incoming values.
How to Prevent Runaway Agent Costs with MLflow AI Gateway
MLflow AI Gateway now helps prevent runaway agent costs by providing visibility into which part of your agent is driving up costs. This allows you to identify and address cost drivers before investing in the wrong optimizations.
MLflow 3.11.1 introduces AI-powered issue detection for agent traces, budget alerts and limits for AI Gateway spending, and a new interactive graph view for visualizing trace hierarchies. It also enhances security with pickle-free model serialization and improves dependency management with native UV support.
News: Databricks News: Excel add-in, Metrics Views UI, and Quality Monitoring
Databricks announced Lake Watch for cybersecurity, new dynamic dropdown filters in SQL editor, and improved quality monitoring with null value scanning and automated alerts. The video also demonstrates a new UI for defining metric views, an Excel add-in for data preview and import, and the ability to publish dashboards as public web pages.
Harness Your OpenHands Agent with AI Observability and Governance
MLflow now supports tracing, evaluating, and governing OpenHands agents, capturing every step of their autonomous operations. This enables practitioners to monitor agent actions, assess output quality, and manage LLM costs effectively.
News: Databricks News: Free Tier, Multi-statement transactions, Declarative Automation Bundles, Genie Code
Databricks now offers a free tier for Lakeflow Connect, providing 100 DBUs per day per workspace, and has introduced multi-statement transactions in Unity Catalog that ensure atomicity with rollback capabilities. The platform also announced a Databricks One mobile app, a new AI runtime with pre-installed tools for GPU use cases, and enhanced Genie Code that understands project structure for automated development tasks. Additionally, Databricks Asset Bundles are now called Declarative Automation Bundles and use a faster direct engine, and a new 5X-Large SQL warehouse is available for processing terabytes of data.
Testing and Refining Claude Code Skills with MLflow
MLflow tracing and LLM judges can now test Claude Code skills. This enables a self-improvement loop where Claude Code refines its own abilities.
Tracking and Debugging AI Safety Evaluations with Inspect AI and MLflow
Inspect AI evaluations now integrate with MLflow for experiment tracking and execution tracing via the inspect-mlflow package. This enables practitioners to track and debug AI safety evaluations using familiar MLflow tools.
MLflow Workspaces: Shared Deployment Without Separate Servers
MLflow Workspaces are now available, enabling shared MLflow deployments across multiple teams by adding a logical organization and permission layer. This allows teams to scope experiments, models, traces, prompts, AI Gateway resources, and artifacts within their own workspace.
Your Agents Need an AI Platform
MLflow now ships with new features for building and managing AI agents, including enhanced logging for agent traces, evaluation tools, and versioning capabilities. Leverage MLflow as your unified platform for developing, deploying, and governing reliable AI agents in production.
Control LLM Spend with AI Gateway Budget Alerts and Limits
AI Gateway now supports budget policies to control LLM spend with alerts and request limits. Set spending thresholds, receive webhook alerts, and automatically reject requests when budgets are exceeded.
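The alert-then-reject behaviour described above can be pictured with a small sketch. This is not the AI Gateway's configuration or API: the `BudgetPolicy` class, its field names, and the 80% alert threshold are assumptions made for illustration, and a plain list stands in for the webhook delivery.

```python
from dataclasses import dataclass, field


# Hypothetical policy object for illustration; not the AI Gateway API.
@dataclass
class BudgetPolicy:
    limit_usd: float
    alert_fraction: float = 0.8  # assumed threshold for the alert
    spent_usd: float = 0.0
    alerts: list = field(default_factory=list)  # stands in for webhook calls

    def record(self, cost_usd):
        """Charge a request, or reject it if it would exceed the limit."""
        if self.spent_usd + cost_usd > self.limit_usd:
            return False  # request rejected, nothing is charged
        self.spent_usd += cost_usd
        # Fire the alert once, the first time spend crosses the threshold.
        if self.spent_usd >= self.alert_fraction * self.limit_usd and not self.alerts:
            self.alerts.append(
                f"spend {self.spent_usd:.2f} of {self.limit_usd:.2f} USD"
            )
        return True


policy = BudgetPolicy(limit_usd=10.0)
results = [policy.record(4.0) for _ in range(3)]
print(results)
print(policy.alerts)
```

With a 10 USD limit and three 4 USD requests, the first two are admitted, the second triggers the alert at 8 USD, and the third is rejected because it would bring spend to 12 USD.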
This release introduces AI-powered issue identification for agent traces, budget alerts and limits for AI Gateway spending, and an interactive graph view for trace hierarchies. It also includes native OpenTelemetry GenAI convention support, Opencode tracing integration, UV package manager support, and pickle-free model serialization options for enhanced security.
News: Databricks News: unit testing, OneLake federation, scoped access tokens
Databricks now allows creating Unity Catalog domains for business users, running JAR tasks on serverless compute, and federating OneLake data directly into Databricks. The platform also introduces in-workspace Python unit testing, new data connectors like HubSpot and TikTok Ads, and scoped personal access tokens for enhanced security.
This release adds a "try-it" page for Gateway usage examples and filters gateway experiments from the experiment list in the UI. It also fixes numerous UI issues, artifact download problems, and tracing errors, including issues with model copying across workspaces.
Agent Trace Evaluation with TruLens Scorers in MLflow
Evaluate agent traces with TruLens GPA framework through mlflow.genai.evaluate(). Score agent plans, tool calls, and reasoning directly within MLflow.
Benchmark Your Way to Better RAG and Agents: Tuning Vector Search with MLflow
High-level summary: problems, approaches, and takeaways for better RAG with MLflow
News: Databricks News: Catalog and External locations in DABS, Schema Evolution, File Events, Queries Tags
Databricks Runtime 18.1 introduces schema evolution for inserts, managed file events for Autoloader, and a simplified `TABLE` syntax for querying. The video also demonstrates new features like the AI Gateway for LLM governance, query tags for tracking, and the GA release of the supervisor agent.
Ship LLM Agents Faster with Coding Assistants and MLflow Skills
MLflow now provides coding assistants with the required feedback loop to build better LLM agents. Trace, analyze, fix, validate, and repeat to ship LLM agents faster.
Deterministic Safety Checks in MLflow with Guardrails AI
MLflow evaluation pipelines now support fast, deterministic safety validation with Guardrails AI scorers. This enables adding safety checks without requiring an LLM.
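To make "deterministic safety validation without an LLM" concrete, here is a minimal rule-based check. It uses only Python's `re` module and does not use the Guardrails AI package or MLflow's scorer interface; the `pii_check` function and the two patterns are hypothetical and deliberately simplistic.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def pii_check(text):
    """Return (passed, findings); findings names each pattern that matched."""
    findings = []
    if EMAIL.search(text):
        findings.append("email")
    if SSN.search(text):
        findings.append("ssn")
    return (not findings, findings)


print(pii_check("The report is ready."))
print(pii_check("Contact me at jane@example.com"))
```

Because the check is a pure function of the text, the same input always produces the same verdict, which is the property that distinguishes it from an LLM judge.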
Enterprise-Scale MLflow Operations and Security Practices at LY Corporation
How LY Corporation Uses MLflow: An Overview
Deploy MLflow Models to Serverless GPUs with Modal
MLflow models can now be deployed to Modal's serverless GPU infrastructure. This enables auto-scaling and streaming predictions for your MLflow models.
Multi-turn Evaluation & Simulation: Enhancing AI Observability with MLflow for Chatbots
MLflow 3.10 now supports multi-turn evaluation and conversation simulation, enabling scoring of full conversations and reproducible testing of agent changes. This helps catch failures that only emerge across multiple turns, improving chatbot observability.
Introducing MLflow AI Gateway: Governed, Observable Access to LLMs
MLflow AI Gateway provides a single, secure endpoint for all LLM providers, complete with usage tracking and native tracing. This new feature offers governed, observable access to LLMs for Databricks practitioners.
MLflow 3.10.0 introduces multi-workspace support for organizing experiments and models, alongside new GenAI features like multi-turn evaluation, LLM cost tracking, and AI Gateway usage analytics. The UI has been redesigned for improved navigation, and in-UI trace evaluation is now available.
MLflow now supports multi-workspace environments for organizing experiments and resources, alongside a new top-level navigation split for GenAI and Classical ML workflows. Key new features include multi-turn conversation simulation, automatic LLM trace cost tracking, AI Gateway usage analytics, and a CLI command to generate a demo environment.
News: Databricks Breaking News: 2026 Week 6: 2 February 2026 to 8 February 2026
Databricks introduces agentic data quality monitoring with anomaly detection, LLM judge UI builder for MLflow, and new SQL warehouse features including a default option and activity details. The platform also enhances its assistant to connect with MCP servers, improves Google Sheets integration with pivot table functionality, and adds direct Git deployment and tagging for Databricks apps.
5 Tips to Get More Out of Your Claude Code with MLflow
MLflow now offers an MCP server, CLIs, and Skills to extend Claude Code, enabling you to trace tokens and monitor tool usage. These five tips will help you transform your Claude coding agent into a transparent and controllable workflow.
News: Databricks Breaking News: 2026 Week 5: 26 January 2026 to 1 February 2026
Databricks now allows triggering materialized views or streaming tables on update, automatically detecting source changes and refreshing the pipeline. MLflow traces can now be stored in Unity Catalog using OpenTelemetry, providing a centralized logging system for experiment data.
v.3.9.0
MLflow 3.9.0 introduces an in-product MLflow Assistant chatbot and a Trace Overview Dashboard for GenAI experiments, enhancing debugging and performance insights. The AI Gateway is revamped for direct tracking server integration, alongside new LLM judge features for online monitoring and custom prompt building.