Cost Optimization
Recent items mentioning Cost Optimization across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
Recent discussions highlight significant cost savings when migrating to or optimizing on Databricks. One user reported a 77% cost reduction per run after migrating from Snowflake to Databricks in 12 weeks [1], while Octopus Energy achieved a 50x cost reduction in their data engineering pipelines by leveraging Databricks Serverless and Delta Lake Change Data Feed [2]. For those looking to optimize, two recent ebooks offer guidance on Databricks cost optimization [3][4].
Generated daily from the 4 most recent items mentioning Cost Optimization. Click any [N] to jump to the source.
Snowflake to Databricks Migration in 12 weeks and cut cost per run by ~77%. AMA.
Lovelytics wrapped up a Snowflake-to-Databricks migration: 847 dbt models, 35 Info Mart tables, ~77% lower cost per run on a 2XL warehouse.

**TL;DR What helped:**

* Treated the migration as engineering, not translation. Each dbt model was tested in isolation rather than relying only on row-count comparisons against Snowflake.
* Routing macro to resolve cross-layer references at runtime, so the same codebase could read from Snowflake, federated Snowflake, and Unity Catalog without forking logic.
* Dual model trees in one repo, which let the migration stay in lockstep with live Snowflake changes.
* Script-generated wave selectors enabled parallel builds while preserving dependency order.
* Used reference-slice validation subsets instead of waiting on full mart refreshes.

**TL;DR Cost reduction:**

* Reworked joins to use narrow staging dimensions instead of wide marts where possible.
* Added incremental predicates to reduce MERGE target scans.
* Split wide models into parallel sub-models where the dependency graph allowed it.
* Copied static reference data into Delta instead of repeatedly reading it through federation, where predicate pushdown is poor.

Happy to go into the gotchas: HASH() not being portable, Snowflake MERGE tolerating duplicate keys that Delta doesn't, NULL ordering, and timestamp handling. AMA

[Full Blog Post](https://community.databricks.com/t5/technical-blog/partner-blog-847-models-12-weeks-77-less-inside-r1-s-snowflake/ba-p/157284)
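The post does not show how the wave selectors were generated, so the following is only a minimal sketch of the general idea, assuming the model dependency graph is available as a plain dictionary. The model names and the helper functions `build_waves` and `wave_selector` are hypothetical and are not taken from the migration itself. Models in the same wave have no dependencies on each other, so each wave could be built in parallel while waves run in order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_waves(dependencies):
    """Group models into waves; each wave depends only on earlier waves.

    `dependencies` maps a model name to the set of models it reads from.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # Every model whose upstream models are already marked done.
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


def wave_selector(wave):
    """Join model names into a space-separated dbt selection (a union)."""
    return " ".join(wave)


# Hypothetical dependency graph, for illustration only.
deps = {
    "stg_customers": set(),
    "stg_orders": set(),
    "dim_customers": {"stg_customers"},
    "fct_orders": {"stg_orders", "dim_customers"},
    "mart_revenue": {"fct_orders"},
}

waves = build_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: dbt run --select {wave_selector(wave)}")
```

Layering the graph this way trades some scheduling flexibility for simplicity: a model waits for its whole preceding wave rather than only its own upstream models, but each wave becomes a single selector that is easy to script and retry.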
Scaling for MHHS: how Octopus Energy achieved a 50x cost reduction in margin data engineering
Octopus Energy achieved a 50x cost reduction in their margin data engineering pipelines by re-architecting on Databricks for UK MHHS regulation. They leveraged Delta Lake Change Data Feed and Databricks Serverless to process 48x more data at a fraction of the original cost, improving freshness from weekly to daily.
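The summary does not describe the pipeline itself, so what follows is only an illustrative sketch of why Change Data Feed can cut processing cost: a job reads just the rows that changed since the last processed table version instead of rescanning the full table. The functions `read_changes` and `apply_changes`, the `id` and `kwh` columns, and the sample rows are hypothetical. The reader options and the `_change_type` and `_commit_version` columns are the ones Delta Lake exposes when the table property `delta.enableChangeDataFeed` is set to `true`.

```python
def read_changes(spark, table_name, starting_version):
    """Return a DataFrame of row-level changes from starting_version onward.

    Assumes `spark` is an active SparkSession and that the table has
    Change Data Feed enabled (delta.enableChangeDataFeed = true).
    """
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", starting_version)
        .table(table_name)
    )


def apply_changes(state, changes, key="id"):
    """Apply change rows (plain dicts) to an in-memory keyed snapshot.

    A pure-Python stand-in for a downstream MERGE, to show the semantics
    of the change types without needing a cluster.
    """
    for row in sorted(changes, key=lambda r: r["_commit_version"]):
        change_type = row["_change_type"]
        if change_type in ("insert", "update_postimage"):
            # Keep data columns only; drop the CDF metadata columns.
            state[row[key]] = {
                k: v for k, v in row.items() if not k.startswith("_")
            }
        elif change_type == "delete":
            state.pop(row[key], None)
        # update_preimage rows hold the old values and are skipped here.
    return state


# Hypothetical sample data, for illustration only.
state = {1: {"id": 1, "kwh": 10.0}}
changes = [
    {"id": 1, "kwh": 10.0, "_change_type": "update_preimage", "_commit_version": 5},
    {"id": 1, "kwh": 12.5, "_change_type": "update_postimage", "_commit_version": 5},
    {"id": 2, "kwh": 3.0, "_change_type": "insert", "_commit_version": 5},
    {"id": 2, "kwh": 3.0, "_change_type": "delete", "_commit_version": 6},
]
state = apply_changes(state, changes)
```

In a real job the rows would come from `read_changes` and be merged into a Delta target rather than a dictionary; the in-memory version only makes the incremental logic easy to follow.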
Databricks Apps vs Model Serving: Authentication, Cost, and Performance Compared
Databricks Apps are now the recommended first choice for deploying agents due to their flexibility in handling full-stack applications with multiple components, offering faster iteration and local testing compared to Model Serving. Model Serving remains suitable for use cases prioritizing high QPS, governance features like AI Gateway, inference tables, and guardrails, or when scaling to zero is acceptable for cost optimization.
GPU Accelerated Spark Connect
This video demonstrates how to accelerate Spark Connect using GPUs for both Spark SQL and ML workloads. It details the architecture, deployment, and benchmark results showing significant speedups and cost savings compared to CPU-only execution.


