brickster.ai
Projects

Trending GitHub projects.

Repos tagged topic:databricks — open-source tools, integrations, and accelerators built on or around Databricks. Excludes the official repos already covered by Releases.

dbeaver/dbeaver · 50.5k

dbeaver

Free universal database tool and SQL client

ai, database, databricks, db2
Java · 4.2k · pushed today
getredash/redash · 28.6k

redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

analytics, athena, bi, bigquery
Python · 4.6k · pushed yesterday
cube-js/cube · 20.2k

cube

📊 Cube Core is an open-source semantic layer for AI, BI, and embedded analytics

agentic-analytics, agents, ai, analytics
Rust · 2.0k · pushed today
Tencent/APIJSON · 18.4k

APIJSON

🏆 Real-time, no-code, full-featured, and secure ORM 🚀 that provides backend APIs and docs without any coding and lets the frontend (client) customize the data and structure of the returned JSON 🏆

baas, clickhouse, crud, databricks
Java · 2.3k · pushed 3d ago
tobymao/sqlglot · 9.3k

sqlglot

Python SQL Parser and Transpiler

bigquery, clickhouse, databricks, duckdb
Python · 1.2k · pushed today
growthbook/growthbook · 7.9k

growthbook

Open Source Feature Flags, Experimentation, and Product Analytics

ab-testing, abtest, abtesting, analytics
TypeScript · 763 · pushed today
microsoft/SynapseML · 5.2k

SynapseML

Simple and Distributed Machine Learning

ai, apache-spark, azure, big-data
Scala · 861 · pushed 2d ago
dotnet/spark · 2.1k

spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

analytics, apache-spark, azure, bigdata
C# · 329 · pushed 1mo ago
Multiwoven/multiwoven · 1.7k

multiwoven

🔥🔥🔥 Open-source Reverse ETL, an alternative to Hightouch and Census.

bigquery, cdp, customer-data-platform, data-activation
Ruby · 91 · pushed 4d ago
databricks-solutions/ai-dev-kit · 1.7k

ai-dev-kit

Databricks Toolkit for Coding Agents provided by Field Engineering

agents, claude, cursor, databricks
Python · 359 · pushed yesterday
getnao/nao · 1.3k

nao

👾 nao is an open source analytics agent. (1) Create context with nao-core cli, (2) deploy nao chat interface for everyone

agentic-analytics, analytics, analytics-engineering, bigquery
TypeScript · 173 · pushed 2d ago
zinggAI/zingg · 1.2k

zingg

Scalable master data management, identity resolution, entity resolution, and deduplication using ML

cdp, customer-data-platform, data-science, databricks
Java · 168 · pushed yesterday
databricks/mlops-stacks · 686

mlops-stacks

This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.

databricks, machine-learning, mlops
Python · 256 · pushed 2d ago
AltimateAI/altimate-code · 674

altimate-code

Open-source agentic data engineering harness for dbt, SQL, and cloud warehouses. 100+ tools, 10 warehouses, AI-powered.

agent, agentic-data-engineering, ai, analytics-engineering
TypeScript · 87 · pushed today
DataflareApp/dataflare · 575

dataflare

Simple, easy-to-use database manager

bigquery, clickhouse, cloudflare-d1, cloudflare-r2
TypeScript · 36 · pushed today
databrickslabs/dbldatagen · 476

dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines

data-generation, databricks, datagen, datageneration
Python · 97 · pushed 2d ago
Fast-Editor/Lynkr · 470

Lynkr

Streamline your workflow with Lynkr, a CLI tool that acts as an HTTP proxy for efficient code interactions using Claude Code CLI.

agents, ai, claude, claudecode
JavaScript · 47 · pushed today
dataflint/spark · 466

spark

Drop-in replacement for Apache Spark UI

apache-spark, big-data, data-pipeline, data-pipelines
TypeScript · 55 · pushed 1w ago
databrickslabs/dbx · 462

dbx

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

ci, cicd, databricks, databricks-api
Python · 129 · pushed 2mo ago
databrickslabs/dqx · 422

dqx

Databricks framework to validate Data Quality of pySpark DataFrames and Tables

data-profiling, data-quality, data-quality-monitoring, databricks
Python · 120 · pushed today
DataflareApp/Dataflare · 383

Dataflare

Fast. Simple. Database Manager.

bigquery, clickhouse, cloudflare-d1, cloudflare-r2
17 · pushed 1mo ago
DataWithBaraa/databricks_bootcamp_2026 · 343

databricks_bootcamp_2026

End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.

ai, apache-spark, data-analytics, data-engineering
Jupyter Notebook · 169 · pushed 4mo ago
databricks/terraform-databricks-examples · 330

terraform-databricks-examples

Examples of using Terraform to deploy Databricks resources

aws, azure, databricks, databricks-module
HCL · 221 · pushed 2w ago
adidas/lakehouse-engine · 288

lakehouse-engine

The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.

big-data, configuration-driven, data-engineering, data-quality
Python · 50 · pushed 1w ago
databrickslabs/dlt-meta · 263

dlt-meta

Metadata driven Spark Declarative Pipelines framework for bronze/silver pipelines

databricks, dlt, lakeflow-declarative-pipelines, meta-programming
Python · 127 · pushed today
databricks/databricks-sql-python · 231

databricks-sql-python

Databricks SQL Connector for Python

databricks, dwh, python3, sql
Python · 147 · pushed 3d ago
OWOX/owox-data-marts · 222

owox-data-marts

Open-Source Self-Service Analytics Platform

analytics, athena, bigquery, dashboard
TypeScript · 32 · pushed yesterday
CartoDB/analytics-toolbox-core · 210

analytics-toolbox-core

A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities

analytics-toolbox, bigquery, carto, databricks
JavaScript · 44 · pushed 2d ago
jrlasak/databricks-code-practice · 209

databricks-code-practice

Practice Databricks coding skills with hands-on exercises. Import into Databricks Free Edition, write code, run assertions, check pass/fail. Covers Delta Lake, Spark SQL, PySpark, Auto Loader, medallion architecture, window functions, and more.

auto-loader, coding-practice, data-engineering, databricks
Python · 118 · pushed 1w ago
buremba/universql · 205

universql

Pushdown compute from Snowflake to DuckDB running on your infrastructure

databricks, dbt, duckdb, proxy-server
Jupyter Notebook · 7 · pushed 7mo ago
aloneguid/stowage · 190

stowage

Bloat-free, no BS cloud storage SDK.

aws-s3, azure-storage, databricks, gcp-storage
C# · 22 · pushed 3mo ago
databricks-solutions/databricks-apps-cookbook · 175

databricks-apps-cookbook

Ready-to-use code snippets for building interactive Databricks Apps.

databricks, databricks-apps, web-application
Python · 115 · pushed 1mo ago
lamastex/scalable-data-science · 168

scalable-data-science

Scalable Data Science: course sets in big data using Apache Spark on Databricks, with their mathematical, statistical, and computational foundations using SageMath.

apache-spark, data-science, databricks, scala
HTML · 93 · pushed 9mo ago
aehrc/VariantSpark · 147

VariantSpark

machine learning for genomic variants

association-studies, aws, bioinformatics, databricks
JavaScript · 48 · pushed 4d ago