03 From Brickster.ai: We mapped 84 startups in the Databricks ecosystem
01
🤖 The MLflow story
MLflow now routes your coding agent
The tool you used to log model metrics is now the control plane for AI agents.
The tool your data scientists use to log training runs can now babysit your coding agent. That sounds like a category error, but it shipped in May: MLflow can route Claude Code sessions through its AI Gateway, with full tracing, budget caps, and guardrails on every call. Claude Code itself needs no changes. You stand up the gateway once, set two environment variables, and MLflow records every call.
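The two environment variables are not named here. As a minimal sketch, assume they are Claude Code's `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` settings. The gateway URL, the token placeholder, and the helper name `claude_code_env` are all hypothetical, chosen only for illustration:

```python
import os
from typing import Mapping, Optional


def claude_code_env(
    gateway_url: str,
    token: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of the environment with the two gateway variables set.

    Assumption: Claude Code reads ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN.
    The newsletter does not say which two variables the setup uses.
    """
    if not gateway_url.startswith(("http://", "https://")):
        raise ValueError("gateway_url must be an http(s) URL")
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = gateway_url.rstrip("/")
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env


# Hypothetical endpoint and placeholder token; substitute your own gateway's values.
env = claude_code_env(
    "http://localhost:5000/gateway/",
    "YOUR_GATEWAY_TOKEN",
    base_env={},
)
```

Passing `env` to something like `subprocess.run(["claude"], env=env)` would start a session with those settings, assuming the `claude` CLI is on the PATH. Building a copy rather than mutating `os.environ` keeps the routing scoped to the one process you launch.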
It's a bigger shift than the release note lets on. The tool that started life logging machine-learning experiments now does the unglamorous work that keeps agents safe in production: tracing multi-agent systems, catching runaway costs before they land, scoring output with LLM-as-judge evaluations. Same project, a completely different job.
The strategic read: as agents move from demo to production, somebody has to own the boring layer that makes them safe to run. What did they cost? Why did they do that? Can I prove it to an auditor? Databricks is betting MLflow is where teams answer those questions, the same way it became the default answer for "which model version is in production." The agent era needs a flight recorder. MLflow is quietly applying for the job.
🔄 Same project, new job
2018: experiment tracker
"Log my training runs"
MLflow tracked classical ML:
Params, metrics, artifacts
Run comparison UI
Model packaging
Model Registry (2019)
2026: agent control plane
"Govern my agents"
MLflow now governs agents:
AI Gateway routing (Claude Code)
Multi-agent tracing
Cost guardrails + budgets
LLM-as-judge evaluation
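To make "cost guardrails + budgets" concrete, here is a toy sketch of the idea: refuse a call before it runs if it would push spend past a cap. This is not MLflow's API. The `BudgetGuard` class, the `BudgetExceeded` exception, and the dollar figures are hypothetical, used only to show the shape of the check:

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a call would push total spend past the cap."""


@dataclass
class BudgetGuard:
    """Hypothetical budget cap; illustrates the concept, not MLflow's implementation."""

    cap_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> float:
        """Record a call's cost and return the remaining budget.

        The call is rejected, and nothing is recorded, if it would exceed the cap.
        """
        if cost_usd < 0:
            raise ValueError("cost_usd must be non-negative")
        if self.spent_usd + cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"call of ${cost_usd:.2f} would exceed cap of ${self.cap_usd:.2f}"
            )
        self.spent_usd += cost_usd
        return self.cap_usd - self.spent_usd


guard = BudgetGuard(cap_usd=1.00)
remaining = guard.charge(0.75)  # accepted: 0.25 left under the cap

try:
    guard.charge(0.50)  # would bring the total to 1.25, over the cap
    blocked = False
except BudgetExceeded:
    blocked = True
```

The point of checking before recording is that a rejected call leaves the running total unchanged, which is what "catching runaway costs before they land" implies.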
📖 How it got here
The logging library that became an industry standard
None of this was the plan. When Databricks open-sourced MLflow on June 5, 2018, the ambition stopped at one unglamorous problem: machine learning had become a swamp of untracked experiments, and nobody could remember which run produced which number. The first alpha shipped three pieces and called it done: Tracking, Projects, and Models.
It caught on fast. MLflow cleared half a million monthly downloads within its first year, and the Model Registry followed in late 2019. Then in 2020 Databricks did the unusual thing and gave it away, donating MLflow to the Linux Foundation so it couldn't be read as one vendor's lock-in. That move is the whole story in miniature: win the standard, don't own the cage.
Seven years on, MLflow pulls more than 30 million downloads a month and ships in everything from Hugging Face to, yes, Snowflake. The bet paid off in adoption. Whether it pays off in revenue is the more interesting question, and 2026's pivot to agents is Databricks' answer.
🚀 MLflow, seven years
2018
The alpha
Open-sourced June 5 with three pieces: Tracking, Projects, Models. A standard way to log ML experiments.
2019
1.0, Registry, and scale
MLflow 1.0 lands in June, with monthly downloads past 500K one year in. The Model Registry arrives in October.
2020
Linux Foundation
Donated to the Linux Foundation on June 25 to become a vendor-neutral standard, with monthly downloads past 2.5M.
2022
The standard
MLflow 2.0. Embedded across the ecosystem, from Hugging Face to competing platforms like Snowflake.
2024
GenAI turn
Tracing, prompt management, and LLM evaluation arrive. MLflow starts treating GenAI as a first-class workload.
2026
Agent control plane
MLflow 3.x. AI Gateway routing for Claude Code, plus multi-agent tracing, cost guardrails, and LLM-as-judge evaluation. A TypeScript SDK lands for non-Python agents.
💡 The standard, not the cage
The adoption is not in question. The strategy underneath it is the interesting part.
🎁 The open-source play
Databricks built MLflow, then handed it to the Linux Foundation. Why give away the thing everyone uses? Because owning the standard is worth more than owning the tool. When MLflow is the default way the whole industry tracks models, Databricks gets two things money can't easily buy: its own design choices steer where the workflow goes next, and the managed version on Databricks stays the smoothest place to run it.
It's the same move Google made with Kubernetes: open the core, sell the best hosted version of it. The 2026 agent pivot runs that playbook again. Make MLflow the neutral standard for agent observability, and the most frictionless place to do it at scale happens to be the Data Intelligence Platform.
2018
First release
30M+
Monthly downloads
3.x
Now: agents
🔭 What this means now
If you already use MLflow for models, you have a head start on agent governance you may not realize. The same logging, comparison, and registry habits map onto tracing agent runs, comparing prompt versions, and gating which agent reaches production. The tool you know grew into the problem you now have.
The open question is whether the agent era stays as winnable for MLflow as the model era was. LangSmith, Arize, and a dozen others want the same observability seat. MLflow's edge is the install base and the neutrality. Seven years of being the place teams already log things is hard to dislodge. If you haven't decided how your agents get monitored, it's worth deciding before a vendor decides for you.
02
📊 This week, in brief
Three other threads ran through the brickster.ai archive this week, across releases, news, videos, and community Q&A. Full breakdown and item-level reading over at brickster.ai/digest.
Open Formats
Iceberg goes fully GA in Unity Catalog
The week's other big story. Managed Iceberg, Iceberg v3, and Foreign Iceberg all hit GA, with the Iceberg REST Catalog API for any engine. The roadmap note matters more than the launch: Iceberg v4 and Delta 5.0 are slated to converge on one metadata structure.
Databricks' managed Postgres added database branching for evolutionary schema development, and the community got a candid engineering talk on how it survives cloud failures. Plus Unity Catalog to Lakebase sync tables in triggered mode. The OLTP side of the platform is filling in fast.
The new Lakeflow Designer brings a no-code drag-and-drop UI with Genie-powered code generation to pipeline building, aimed at analysts who don't write Spark. Databricks SQL also shipped engine upgrades worth a read if you run heavy DBSQL workloads.
84 companies building on, for, and around Databricks. One filterable page.
There was no good single map of the companies in the Databricks orbit, so we built one. The new /startups page catalogs 84 companies in three buckets: built for Databricks (tools sold into customers), built on Databricks (apps shipping their workloads on the platform), and ecosystem-adjacent (lakehouse-native with a real integration). Filter by category, subcategory, or funding stage.
The pattern that jumps out: 10 of the 84 don't exist independently anymore. Databricks bought them. MosaicML became Mosaic AI, Tabular brought the Iceberg creators in-house, Neon now powers Lakebase. The map of who's in the ecosystem doubles as a map of where the product is heading. Every entry cites the public source for its Databricks tie.