Structured Streaming
Recent items mentioning Structured Streaming across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
Recent items point to the growing maturity of Real-Time Mode (RTM) in Apache Spark Structured Streaming, with significant latency reductions compared to microbatch mode. RTM achieves P50 and P95 latencies of 26ms and 50ms respectively in a simplified setup without an external messaging bus [2], and enables sub-second end-to-end latency for complex applications such as real-time air traffic control [3]. The community is actively exploring RTM for millisecond streaming pipelines with Kafka [1].
Generated daily from the 3 most recent items mentioning Structured Streaming.
Delta Lake 4.3.0
Databricks practitioners can now integrate Spark with the Unity Catalog Delta REST API for managed Delta tables and selectively replace data using new `replaceOn` and `replaceUsing` DataFrame APIs. UniForm for Iceberg conversion is now atomic and incremental, and Delta Sharing supports streaming and Change Data Feed for shared tables.
Building a Spark Streaming Real-Time Mode (RTM) Pipeline — Millisecond Streaming with Kafka
I recently built a fully working real-time transaction enrichment pipeline using PySpark RTM paired with Kafka, achieving end-to-end latency in the milliseconds. The article covers:
- Real-Time Mode (RTM) fundamentals
- Kafka integration with Spark Structured Streaming
- Millisecond-latency pipeline architecture
- Real-time transaction enrichment patterns
Blog: https://blog.devgenius.io/building-a-spark-streaming-real-time-mode-rtm-pipeline-millisecond-streaming-with-kafka-dda74e9ef284
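The linked article's code is not reproduced on this page, so the following is only a rough sketch of what a transaction enrichment step might look like in plain Python, independent of Spark or Kafka. The `Transaction` fields, the `ACCOUNTS` lookup table, and the `enrich` function are all hypothetical names chosen for illustration; in a real pipeline the same logic would typically be expressed as a join or map inside the streaming query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical event shape; a real topic would define its own schema.
    txn_id: str
    account_id: str
    amount: float


# Hypothetical reference data keyed by account id (assumed to fit in memory).
ACCOUNTS = {
    "acct-1": {"tier": "gold", "country": "DE"},
    "acct-2": {"tier": "basic", "country": "US"},
}


def enrich(txn, accounts):
    """Attach account attributes to a transaction, defaulting to 'unknown'."""
    profile = accounts.get(txn.account_id, {})
    return {
        "txn_id": txn.txn_id,
        "account_id": txn.account_id,
        "amount": txn.amount,
        "tier": profile.get("tier", "unknown"),
        "country": profile.get("country", "unknown"),
    }


known = enrich(Transaction("t-1", "acct-1", 42.0), ACCOUNTS)
missing = enrich(Transaction("t-2", "acct-9", 7.5), ACCOUNTS)
```

Defaulting to `"unknown"` rather than dropping the event keeps unmatched transactions visible downstream, which is one common design choice; filtering or routing them to a separate sink would be equally valid.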
Tutorials: Apache Spark Streaming Real-Time Mode - Latency Demo
The video demonstrates how to deploy and run Apache Spark Streaming in Real-Time Mode (RTM) using a declarative automation bundle. It shows that RTM significantly reduces P50 and P95 latencies compared to microbatch mode, achieving 26ms and 50ms respectively in a simplified setup without an external messaging bus.
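For readers unfamiliar with the P50 and P95 figures quoted above, here is a minimal sketch of how such percentiles can be computed from a list of per-record latencies using the nearest-rank method. The sample values are invented for illustration and are not taken from the video, and the video may well use a different percentile definition.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical end-to-end latencies in milliseconds (made-up data).
latencies_ms = [8, 11, 13, 15, 17, 19, 23, 30, 41, 77]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

With these ten samples the median lands on the fifth-smallest value and P95 on the largest, which shows why tail percentiles are far more sensitive to a single slow record than the median is.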
Tutorials: Air Traffic Control with Apache Spark Structured Streaming Real-Time Mode
The video demonstrates building a real-time air traffic control application using Apache Spark Structured Streaming Real-Time Mode, Lakehouse, and Databricks Apps. This system processes live flight telemetry, detects congestion, and generates alerts with sub-second end-to-end latency, all within a single Databricks platform.
Tutorials: Unlock Your Use Cases: A Deep Dive on Structured Streaming’s New TransformWithState API
News: Crypto at Scale: Building a High-Performance Platform for Real-Time Blockchain Data
News: Supercharging Sales Intelligence: Processing Billions of Events via Structured Streaming
Tutorials: Real-Time Mode Technical Deep Dive: How We Built Sub-300 Millisecond Streaming Into Apache Spark™
Releases: Introducing Simplified State Tracking in Apache Spark™ Structured Streaming
Releases: Nebula: The Journey of Scaling Instacart’s Data Pipelines with Apache Spark™ and Lakehouse
News: US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight
News: High Volume Intelligent Streaming with Sub-Minute SLA for Near Real-Time Data Replication
News: How We Made a Unified Talent Solution Using Databricks Machine Learning, Fine-Tuned LLM & Dolly 2.0
Community