Model Serving
Recent items mentioning Model Serving across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
Databricks Model Serving has seen significant enhancements, including new capabilities for real-time ML with streaming features and high-QPS serving [4], and expanded collaboration with NVIDIA for an end-to-end AI platform that includes Model Serving enhancements [3]. Customers like ERGO Hestia are leveraging Mosaic AI Model Serving to reduce time-to-market for real-time pricing engines [5], while the community is actively discussing best practices for scaling, cost, and latency in production [1][7] and support for external APIs like Azure OpenAI [6]. The Databricks CLI also received fixes for persistent drift on model serving endpoints [2].
Generated daily from the 7 most recent items mentioning Model Serving. Click any [N] to jump to the source.
Databricks AI Model Serving in production: scaling, cost, and latency lessons
The `ssh connect` command now supports a `--base-environment` flag for custom serverless session environments. Bundles received fixes for persistent drift in model serving endpoints and spurious updates with `apply_policy_default_values` on job tasks.
Databricks and NVIDIA: Building for the Agentic Era
Databricks and NVIDIA are expanding their collaboration to deliver an end-to-end AI platform, accelerating model training, inference, and agentic AI development on governed enterprise data. This includes multinode training in AI Runtime, GPU support in Databricks Free Edition, Model Serving enhancements, and support for NVIDIA Agent Toolkit and industry-specific AI frameworks.
What’s New in the AI Platform: Agents for ML Engineering, Our Deep Learning Platform, and New Capabilities for Real-Time ML
Databricks shipped Genie Code, a coding agent for ML engineering, and AI Runtime, a serverless GPU platform for deep learning. Power real-time ML at scale with new Feature Store and Model Serving capabilities, including streaming features and high-QPS serving.
How ERGO Hestia reduced time-to-market with Lakebase and Mosaic AI Model Serving
ERGO Hestia modernized its real-time pricing engine with Databricks Lakebase and Mosaic AI Model Serving, reducing time-to-market by unifying data, features, and decisions for millisecond pricing. This eliminated extraction overhead and fragmented governance from their previous multi-hop architecture, enabling faster model deployment and instant market response.
Azure OpenAI v1 API support for External Model Serving / Mosaic AI Gateway?
Databricks Model Serving: The Complete Guide to Production ML at Scale
Banks' Secret Weapon Against Money Laundering: Multi-Agent AI
Databricks demonstrates a multi-agent AI solution for Anti-Money Laundering (AML) operations, significantly reducing false positives and accelerating investigation cycles from hours to minutes. The platform unifies siloed systems, employs specialized AI agents for analysis and recommendations, and offers AI-assisted SAR generation and executive-level reporting with natural language chat.
Databricks Apps vs Model Serving: Authentication, Cost, and Performance Compared
Databricks Apps are now the recommended first choice for deploying agents due to their flexibility in handling full-stack applications with multiple components, offering faster iteration and local testing compared to Model Serving. Model Serving remains suitable for use cases prioritizing high QPS, governance features like AI Gateway, inference tables, and guardrails, or when scaling to zero is acceptable for cost optimization.
MLflow 3.11.1 introduces AI-powered issue detection for agent traces, budget alerts and limits for AI Gateway spending, and a new interactive graph view for visualizing trace hierarchies. It also enhances security with pickle-free model serialization and improves dependency management with native UV support.