Road to a Robust Data Lake: Utilizing Delta Lake & Databricks to Map 150 Million Miles of Roads
Description
In the past, stream processing over data lakes required a lot of development efforts from data engineering teams, as Itai has shown in his talk at Spark+AI Summit 2019 (https://tinyurl.com/2s3az5td). Today, with Delta Lake and Databricks Auto Loader, this becomes a few minutes' work! Not only that, it unlocks a new set of ways to efficiently leverage your data. Nexar, a leading provider of dynamic mapping solutions, utilizes Delta Lake and advanced features such as Auto Loader to map 150 million miles of roads a month and provide meaningful insights to cities, mobility companies, driving apps, and insurers. Nexar’s growing dataset contains trillions of images that are used to build and maintain a digital twin of the world. Nexar uses state-of-the-art technologies to detect road furniture (like road signs and traffic lights), surface markings, and road works. In this talk, we will describe how you can efficiently ingest, process, and maintain a robust Data Lake, whether you’re a mapping solutions provider, a media measurement company, or a social media network. Topics include: * Incremental & efficient streaming over cloud storage such as S3 * Storage optimizations using Delta Lak…
Description from YouTube. Full content on the video page.
More from Databricks
NewsGovern MCP servers in Databricks #databricks #mcp #aigovernance
Databricks Unity AI Gateway now governs MCP servers, centralizing their management alongside built-in foundation models and LLMs. This integration allows for easier governance and orchestration of various AI components and agents within Databricks.
NewsHow Suntory Turns Data into Faster Decisions with Databricks
Suntory uses Databricks to integrate diverse datasets, including internal sales, macroeconomic factors, and consumer behavior, into "Project Brain" for faster decision-making and product launches. The company also implements an all-employee upskilling program, "Manabi no Michi," to empower its workforce to leverage AI for improved performance and efficiency.
NewsAIA Group x Databricks: Turning Regulated Data into Real-Time Intelligence
AIA Group leverages Databricks to manage regulated data across 18 markets, addressing challenges like data residency and varying tech maturity with features like Unity Catalog for governance. The platform enables real-time intelligence for investment decisions, fraud detection, and personalized agent coaching, with future plans for conversational analytics and autonomous AI.
TutorialsConnect Google Sheets to Databricks
The Databricks Google Sheets add-in allows users to explore, import, and refresh governed data from the Databricks Lakehouse directly within Google Sheets. It demonstrates how to browse Unity Catalog, select tables or metric views, apply filters, schedule data refreshes, and use direct SQL queries with parameters.
NewsNo More Table Locks for Multi Statement Transactions #databricks #dataengineering #sql
Databricks now supports multi-table transactions, allowing changes to multiple tables within a single atomic transaction that rolls back all changes if any part fails. This feature, managed by Unity Catalog, prevents table locking during updates and supports up to 100 tables per transaction using a simple "BEGIN ATOMIC...END" syntax.
NewsMay 2026 Databricks Updates: No Code ETL, New GPUs and Death of the Dashboard
Databricks announced several updates including AI Prep Search for document chunking and vector database preparation, SQL vector functions for embedding mathematics, and the general availability of multi-table transactions. They also introduced Lakeflow Designer for visual, no-code data pipeline creation and updated their serverless GPU offerings to include H100s.