brickster.ai
Projects

Trending GitHub projects.

Repos tagged topic:databricks — open-source tools, integrations, and accelerators built on or around Databricks. Excludes the official repos already covered by Releases.

getredash/redash · 28.5k

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

analytics, athena, bi, bigquery
Python · 4.6k · pushed 1w ago

cube-js/cube · 19.9k

📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics

agentic-analytics, agents, ai, analytics
Rust · 2.0k · pushed today

Tencent/APIJSON · 18.4k

🏆 Real-time, no-code, full-featured and secure ORM 🚀 providing APIs and docs without any backend coding; the frontend (client) can customize the data and structure of the returned JSON 🏆

baas, clickhouse, crud, databricks
Java · 2.3k · pushed 1w ago

tobymao/sqlglot · 9.2k

Python SQL Parser and Transpiler

bigquery, clickhouse, databricks, duckdb
Python · 1.1k · pushed today

growthbook/growthbook · 7.7k

Open Source Feature Flags, Experimentation, and Product Analytics

ab-testing, abtest, abtesting, analytics
TypeScript · 735 · pushed today

microsoft/SynapseML · 5.2k

Simple and Distributed Machine Learning

ai, apache-spark, azure, big-data
Scala · 861 · pushed 5d ago

dotnet/spark · 2.1k

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

analytics, apache-spark, azure, bigdata
C# · 329 · pushed 1mo ago

Multiwoven/multiwoven · 1.7k

🔥🔥🔥 Open-source reverse ETL - an alternative to Hightouch and Census.

bigquery, cdp, customer-data-platform, data-activation
Ruby · 86 · pushed yesterday

databricks-solutions/ai-dev-kit · 1.3k

Databricks Toolkit for Coding Agents provided by Field Engineering

agents, claude, cursor, databricks
Python · 281 · pushed today

zinggAI/zingg · 1.2k

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

analytics, cdp, customer-data-platform, data-science
Java · 161 · pushed yesterday

getnao/nao · 1.1k

👾 nao is an open source analytics agent. (1) Create context with nao-core cli, (2) deploy nao chat interface for everyone

agentic-analytics, analytics, analytics-engineering, bigquery
TypeScript · 129 · pushed today

databricks/mlops-stacks · 673

This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.

databricks, machine-learning, mlops
Python · 254 · pushed 3mo ago

AltimateAI/altimate-code · 558

Open-source agentic data engineering harness for dbt, SQL, and cloud warehouses. 100+ tools, 10 warehouses, AI-powered.

agent, agentic-data-engineering, ai, analytics-engineering
TypeScript · 51 · pushed today

databrickslabs/dbldatagen · 459

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines

data-generation, databricks, datagen, datageneration
Python · 92 · pushed yesterday

databrickslabs/dbx · 455

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

ci, cicd, databricks, databricks-api
Python · 129 · pushed 1mo ago

dataflint/spark · 447

Drop-in replacement for Apache Spark UI

apache-spark, big-data, data-pipeline, data-pipelines
TypeScript · 52 · pushed today

Fast-Editor/Lynkr · 411

Streamline your workflow with Lynkr, a CLI tool that acts as an HTTP proxy for efficient code interactions using Claude Code CLI.

agents, ai, claude, claudecode
JavaScript · 43 · pushed today

databrickslabs/dqx · 404

Databricks framework to validate Data Quality of PySpark DataFrames and Tables

data-profiling, data-quality, data-quality-monitoring, databricks
Python · 111 · pushed today

DataflareApp/Dataflare · 383

Fast. Simple. Database Manager.

bigquery, clickhouse, cloudflare-d1, cloudflare-r2
17 · pushed 1w ago

databricks/terraform-databricks-examples · 329

Examples of using Terraform to deploy Databricks resources

aws, azure, databricks, databricks-module
HCL · 216 · pushed 1mo ago

DataWithBaraa/databricks_bootcamp_2026 · 318

End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.

ai, apache-spark, data-analytics, data-engineering
Jupyter Notebook · 160 · pushed 3mo ago

adidas/lakehouse-engine · 288

The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.

big-data, configuration-driven, data-engineering, data-quality
Python · 49 · pushed 1mo ago

databrickslabs/dlt-meta · 255

Metadata driven Spark Declarative Pipelines framework for bronze/silver pipelines

databricks, dlt, lakeflow-declarative-pipelines, meta-programming
Python · 117 · pushed yesterday

databricks/databricks-sql-python · 227

Databricks SQL Connector for Python

databricks, dwh, python3, sql
Python · 143 · pushed today

OWOX/owox-data-marts · 221

Open-Source Self-Service Analytics Platform

analytics, athena, bigquery, dashboard
TypeScript · 30 · pushed today

CartoDB/analytics-toolbox-core · 207

A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities

analytics-toolbox, bigquery, carto, databricks
JavaScript · 43 · pushed today

buremba/universql · 202

Pushdown compute from Snowflake to DuckDB running on your infrastructure

databricks, dbt, duckdb, proxy-server
Jupyter Notebook · 7 · pushed 6mo ago

aloneguid/stowage · 191

Bloat-free, no BS cloud storage SDK.

aws-s3, azure-storage, databricks, gcp-storage
C# · 20 · pushed 1mo ago

jrlasak/databricks-code-practice · 170

Practice Databricks coding skills with hands-on exercises. Import into Databricks Free Edition, write code, run assertions, check pass/fail. Covers Delta Lake, Spark SQL, PySpark, Auto Loader, medallion architecture, window functions, and more.

auto-loader, coding-practice, data-engineering, databricks
Python · 98 · pushed 1w ago

lamastex/scalable-data-science · 168

Scalable Data Science: course sets in big data using Apache Spark on Databricks, and their mathematical, statistical and computational foundations using SageMath.

apache-spark, data-science, databricks, scala
HTML · 93 · pushed 8mo ago

databricks-solutions/databricks-apps-cookbook · 165

Ready-to-use code snippets for building interactive Databricks Apps.

databricks, databricks-apps, web-application
Python · 109 · pushed 2d ago

aehrc/VariantSpark · 147

machine learning for genomic variants

association-studies, aws, bioinformatics, databricks
JavaScript · 48 · pushed today