News from the Databricks ecosystem.
Posts from databricks.com, MLflow, and dbt Labs — three trusted sources covering the platform, the open-source projects around it, and the data tooling layer most teams pair with it. Summarized for scanning.
Week of May 4
1 article
From Black Box to Observability: Tracing OpenClaw with MLflow
MLflow Tracing now provides full observability for OpenClaw agents, turning them from a black box into a transparent system. Learn how to quickly set up tracing to understand why your agent makes specific decisions, rather than just seeing the output.
Week of Apr 27
1 article
See What Your AI Sees: Multimodal Tracing for Images, Audio, and Files
Databricks now supports multimodal tracing for images, audio, and files, allowing you to view and interact with these artifacts directly within your traces rather than reading opaque base64 strings. This enhancement improves debugging for GenAI agents, reduces storage costs, and speeds up trace queries by avoiding direct storage of large multimedia strings.
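The storage saving described above comes from keeping large payloads out of the trace record itself. A minimal sketch of that idea follows, assuming a content-addressed store; `ATTACHMENTS`, `externalize`, and the reference fields are hypothetical names for illustration and are not MLflow's or Databricks' API.

```python
import base64
import hashlib

# Hypothetical in-memory attachment store; a real system would use blob storage.
ATTACHMENTS: dict[str, bytes] = {}


def externalize(payload_b64: str, mime_type: str) -> dict:
    """Store the decoded bytes out of line and return a small reference."""
    raw = base64.b64decode(payload_b64)
    digest = hashlib.sha256(raw).hexdigest()
    ATTACHMENTS[digest] = raw  # identical payloads are stored only once
    return {"attachment_id": digest, "mime_type": mime_type, "size_bytes": len(raw)}


# A fake image payload standing in for real multimodal input.
image_bytes = b"\x89PNG" + bytes(2000)
inline = base64.b64encode(image_bytes).decode("ascii")

# The span keeps a short reference instead of the full base64 string.
span_inputs = {
    "prompt": "Describe this image",
    "image": externalize(inline, "image/png"),
}
```

The span record stays small regardless of attachment size, which is the property that makes trace queries cheaper in this sketch.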
Week of Apr 20
3 articles
AI observability for production: Seeing Inside Your Multi-Agent System with MLflow
MLflow now offers enhanced AI observability for multi-agent systems, providing crucial visibility into their internal workings. This helps practitioners prevent unintended actions like data purges or sensitive information leaks in production.
Structuring AI Evaluation and Observability with MLflow: From Development to Production
MLflow now offers enhanced tools for structuring AI evaluation and observability, including new APIs and UI features for logging LLM calls, prompts, responses, and metrics. This enables practitioners to systematically track, compare, and analyze model performance and behavior across development and production, facilitating iterative improvement and robust monitoring.
Enforce Content Policies at the Gateway with AI Gateway Guardrails
MLflow AI Gateway now supports configurable guardrails, using LLM judges to block or sanitize harmful content, PII, and custom policy violations. Enforce content policies at the gateway before requests reach your users or models.
Week of Apr 6
2 articles
Tired of Reviewing Traces? Meet Automatic Issue Detection for Your Agent
Automatic issue detection for your AI agent is now available, eliminating the need for manual trace reviews. The feature helps you act on your observability data to improve the user experience, rather than just recording logs, metrics, and traces.
How to Prevent Runaway Agent Costs with MLflow AI Gateway
MLflow AI Gateway now helps prevent runaway agent costs by providing visibility into which part of your agent is driving up costs. This allows you to identify and address cost drivers before investing in the wrong optimizations.
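Attributing cost to the part of an agent that incurs it can be sketched as a simple aggregation over per-call usage records. The example below is illustrative only: the record fields, component names, model names, and prices are all hypothetical, and it does not use the MLflow AI Gateway API.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens.
PRICE_PER_1K = {"large-model": 0.01, "small-model": 0.001}

# Hypothetical usage records, one per LLM call made by the agent.
spans = [
    {"component": "planner", "model": "large-model", "tokens": 12000},
    {"component": "retriever", "model": "small-model", "tokens": 3000},
    {"component": "planner", "model": "large-model", "tokens": 8000},
    {"component": "summarizer", "model": "small-model", "tokens": 5000},
]


def cost_by_component(records):
    """Sum estimated cost per component, most expensive first."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["component"]] += rec["tokens"] / 1000 * PRICE_PER_1K[rec["model"]]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


costs = cost_by_component(spans)
for component, usd in costs.items():
    print(f"{component}: ${usd:.3f}")
```

Sorting by cost puts the dominant driver first, so optimization effort can go to that component before any others.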
Week of Mar 23
2 articles
Harness Your OpenHands Agent with AI Observability and Governance
MLflow now supports tracing, evaluating, and governing OpenHands agents, capturing every step of their autonomous operations. This enables practitioners to monitor agent actions, assess output quality, and manage LLM costs effectively.
Week of Mar 16
4 articles
Tracking and Debugging AI Safety Evaluations with Inspect AI and MLflow
Inspect AI evaluations now integrate with MLflow for experiment tracking and execution tracing via the inspect-mlflow package. This enables practitioners to track and debug AI safety evaluations using familiar MLflow tools.
MLflow Workspaces: Shared Deployment Without Separate Servers
MLflow Workspaces are now available, enabling shared MLflow deployments across multiple teams by adding a logical organization and permission layer. This allows teams to scope experiments, models, traces, prompts, AI Gateway resources, and artifacts within their own workspace.
Your Agents Need an AI Platform
MLflow ships with new features for building and managing AI agents, including enhanced logging for agent traces, evaluation tools, and versioning capabilities. Use MLflow as your unified platform for developing, deploying, and governing reliable AI agents in production.
Week of Mar 2
2 articles
Week of Feb 23
6 articles
Ship LLM Agents Faster with Coding Assistants and MLflow Skills
MLflow now provides coding assistants with the required feedback loop to build better LLM agents. Trace, analyze, fix, validate, and repeat to ship LLM agents faster.
Deterministic Safety Checks in MLflow with Guardrails AI
MLflow evaluation pipelines now support fast, deterministic safety validation with Guardrails AI scorers. This enables adding safety checks without requiring an LLM.
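A deterministic check of this kind is rule-based: the same input always produces the same verdict, and no model call is needed. A minimal sketch using regular expressions follows; the `pii_check` function and its patterns are hypothetical illustrations, not the Guardrails AI or MLflow scorer API.

```python
import re

# Simple illustrative patterns; production validators are more thorough.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def pii_check(text: str) -> dict:
    """Return a pass/fail verdict plus the kinds of PII that were matched."""
    findings = []
    if EMAIL.search(text):
        findings.append("email")
    if US_SSN.search(text):
        findings.append("ssn")
    return {"passed": not findings, "findings": findings}


clean = pii_check("The weather is nice today.")
flagged = pii_check("Contact me at jane@example.com")
```

Because the check is a pure function of the text, it can run on every output in an evaluation set at negligible cost.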
Enterprise-Scale MLflow Operations and Security Practices at LY Corporation
How LY Corporation Uses MLflow: An Overview
Deploy MLflow Models to Serverless GPUs with Modal
MLflow models can now be deployed to Modal's serverless GPU infrastructure. This enables auto-scaling and streaming predictions for your MLflow models.
Multi-turn Evaluation & Simulation: Enhancing AI Observability with MLflow for Chatbots
MLflow 3.10 now supports multi-turn evaluation and conversation simulation, enabling scoring of full conversations and reproducible testing of agent changes. This helps catch failures that only emerge across multiple turns, improving chatbot observability.
Introducing MLflow AI Gateway: Governed, Observable Access to LLMs
MLflow AI Gateway provides a single, secure endpoint for all LLM providers, complete with usage tracking and native tracing. This new feature offers governed, observable access to LLMs for Databricks practitioners.
Week of Feb 9
1 article
5 Tips to Get More Out of Your Claude Code with MLflow
MLflow now offers an MCP server, CLIs, and Skills to extend Claude Code, enabling you to trace tokens and monitor tool usage. These five tips will help you transform your Claude coding agent into a transparent and controllable workflow.
Week of Feb 2
1 article
MemAlign: Building Better LLM Judges From Human Feedback With Scalable Memory
MemAlign, a new framework for aligning LLMs with human feedback, is now available, offering competitive or better quality than state-of-the-art prompt optimizers at significantly lower cost and latency. It achieves this through a lightweight dual-memory system, making it a valuable tool for building better LLM judges.