brickster.ai

Delta Sharing

Recent items mentioning Delta Sharing across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.

37 recent items · 8 releases · 2 news · 25 videos · 2 community threads

What's happening in Delta Sharing
AI synthesis · updated 2d ago

Delta Sharing continues to expand its reach: Stripe data is now available on Databricks Marketplace, enabling instant activation of Stripe data pipelines for AI applications [4]. SAP Business Data Cloud also automatically syncs semantic metadata and governance tags into Unity Catalog via Delta Sharing, making SAP data AI-ready and improving discoverability and access control [3]. Delta Sharing also appears as a topic on the Databricks Data Engineer Associate (DEA) exam, although the sources give little detail on how it is tested [1][2].

Generated daily from the 4 most recent items mentioning Delta Sharing. Click any [N] to jump to the source.

Reddit · General

[Passed] Databricks DEA Exam today

https://preview.redd.it/z6mcmrgvmjyg1.png?width=474&format=png&auto=webp&s=28e010f62635d49af3a815998011125d8f2cfa0f

Just walked out of the exam and I’m glad to say I passed. I was sweating a bit because the exam content changes on the 4th, so I really didn't want to fail and have to deal with a new syllabus.

I've had Databricks at work since late 2023. I’ve been using it because, well, it’s there, but I was mostly just "vibe coding", picking up some Python and Spark here and there without any real depth. I ran jobs using whatever cluster settings the company gave me without actually knowing what they meant.

If you’ve never touched Databricks, this exam is going to be a pain. Even if you’re good at coding, the internal components and the way everything fits together are hard to grasp just by reading. You really need to get your hands dirty in the workspace to get a "feel" for it.

**Study Routine**

I started with the Databricks Academy stuff, but since I’m juggling work and a toddler, I could only study on weekends. This was a disaster because by the next Saturday, I’d already forgotten what I learned the week before. One month before the exam, I ditched the theory and just hammered Mock Exams.

* Udemy is your friend: I bought practice exams from Derar and Santosh.
* I snagged them at a discounted price. Just wait for the sale if you are not in a hurry.

Personally, Santosh’s exams felt closer to the real thing. I saw maybe 5-6 questions that were almost word-for-word. Derar is also solid; honestly, just solve as many problems as possible.

Since my study time was limited, I focused on reviewing the questions I got wrong. I realized pretty early that Productionizing Data Pipelines was my weak spot. I didn't try to become an expert in it. I just aimed for a 60% "pass" in that section and doubled down on the areas I was actually good at. Don't completely ignore your weak areas though. If you bomb one section too hard, a couple of silly mistakes in other sections will kill your score.

**What's on the exam**

The questions are mostly scenario-based. You have to read the prompts carefully. Some things I remember:

* Auto Loader: This came up a lot.
* DLT (now called Lakeflow Spark Declarative Pipelines): You should understand what it actually does.
* Unity Catalog: Permissions (granting minimum access) and the actual SQL code for it.
* Delta Sharing: Knowing the difference between sharing with Databricks vs. non-Databricks users.
* Egress Costs: How to avoid them in cross-cloud sharing (Cloudflare R2 was the answer for one).
* SQL Warehouses: Classic vs. Pro vs. Serverless. Know when to use which.
* DABs (Databricks Asset Bundles): I got at least 3 questions on this. Don't skip it.
* Medallion Architecture: It’s not just "what is Bronze/Silver/Gold." They’ll give you a scenario and ask which layer the data should go to next.

Also, those "select two" questions are the absolute worst: super confusing.

I know the syllabus is changing on the 4th, so I’m not sure how much of this will still apply. But honestly, if you have some background and get familiar with the core concepts, it’s a very doable exam. I’ve learned a lot through this process. Good luck to everyone preparing!
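
For readers wondering what "granting minimum access" in Unity Catalog looks like in SQL, here is a minimal sketch rather than anything taken from the exam. It builds the three GRANT statements a read-only principal typically needs on a single schema. The catalog `main`, the schema `sales`, the group `data_analysts`, and the helper `least_privilege_grants` are all hypothetical names chosen for illustration.

```python
def least_privilege_grants(catalog: str, schema: str, principal: str) -> list[str]:
    """Build GRANT statements giving a principal read-only access to one schema.

    All names passed in are illustrative; nothing here talks to a workspace.
    """
    target = f"`{principal}`"  # backticks quote the principal name in SQL
    return [
        # Needed before anything inside the catalog can be referenced.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {target}",
        # Needed before anything inside the schema can be referenced.
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {target}",
        # Read access to the tables in the schema; no CREATE TABLE is granted.
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO {target}",
    ]


# Hypothetical catalog, schema, and group names.
statements = least_privilege_grants("main", "sales", "data_analysts")
for statement in statements:
    print(statement + ";")
```

The point of the sketch is what is left out: because no CREATE TABLE or MODIFY privilege is granted, the group can query the schema but cannot add or change tables in it.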

64Significant_Pace3612w ago
Reddit · Discussion

Here are 5 topics that showed up much more than I expected in my DEA exam

I took the Databricks Data Engineer Associate exam recently and wanted to share what actually came up because it was quite different from what I spent most of my time studying. I went in thinking Delta Lake theory and platform architecture would be the big topics. They weren't. The exam is way more practical than I expected.

**The first thing** that caught me off guard was how heavily they test Auto Loader. Not just the basics but real scenarios. One question described a pipeline receiving 50,000 new files per day and asked which ingestion method to use and why. You need to understand when Auto Loader makes sense versus COPY INTO, how schema evolution works with mergeSchema, and the difference between directory listing and file notification mode. I probably got six or seven questions just on this one topic.

**The second thing** was lazy evaluation. I knew the concept but I wasn't prepared for how they test it. They give you a block of code with four or five DataFrame transformations and ask what happens when you run the cell. The answer is nothing happens because there is no action at the end. But the way they frame the questions makes you second-guess yourself if you only memorized the definition without really understanding it.

**Third** was Lakeflow expectations. The old name was Delta Live Tables but they use Lakeflow in the exam now. You need to know the three expectation types and when to use each one. They gave me a scenario where the pipeline should log bad records but never drop them and I had to pick the right expectation decorator. Also know the difference between streaming tables and materialized views because that came up more than once.

**Fourth** was Unity Catalog permissions. Not just the three-level naming pattern but actual grant scenarios. Something like: a data analyst needs to read tables in the sales schema but should not be able to create new tables, and you have to pick the correct grant statement. I got at least three or four questions like this.

**Fifth** was MERGE INTO. They really love this command. Upsert scenarios, deduplication, slowly changing dimensions. If you cannot write a MERGE statement from memory with the WHEN MATCHED and WHEN NOT MATCHED clauses, you should spend an hour practicing just that before you sit for the exam.

What also surprised me was what was not heavily tested. Cluster configuration was maybe one question. The architecture diagrams with control plane and data plane were one or two questions at most. Delta Sharing was one question. Spark internals like shuffle details were barely mentioned.

The biggest thing I wish I had done differently is spend less time reading documentation and more time actually running code. When you have actually executed a MERGE INTO on a real table and seen the results, the exam question feels like something you have done before instead of something you read about once. I used Databricks Free Edition for all my practice and it was more than enough.

Hope this helps someone who is preparing right now. Feel free to ask anything about the exam in the comments and I will try to answer.
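
To make the WHEN MATCHED / WHEN NOT MATCHED upsert pattern concrete, here is a rough sketch, not an exam answer. `MERGE_SQL` shows the general shape of a Delta Lake upsert statement, and `merge_upsert` is a hypothetical pure-Python stand-in that mimics those two clauses on lists of dicts so the behaviour can be checked without a cluster. The table names `customers` and `customer_updates` and the column names are assumptions made for illustration.

```python
# General shape of an upsert; table and column names are illustrative.
MERGE_SQL = """
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""


def merge_upsert(target: list[dict], source: list[dict], key: str) -> list[dict]:
    """Emulate the two clauses above on in-memory rows (not a Delta API).

    Simplification: a real MERGE rejects several source rows matching the
    same target row; here the last matching source row simply wins.
    """
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in merged:
            merged[row[key]].update(row)   # WHEN MATCHED THEN UPDATE SET *
        else:
            merged[row[key]] = dict(row)   # WHEN NOT MATCHED THEN INSERT *
    return list(merged.values())


target_rows = [
    {"customer_id": 1, "city": "Oslo"},
    {"customer_id": 2, "city": "Lima"},
]
source_rows = [
    {"customer_id": 2, "city": "Quito"},  # existing key, so it is updated
    {"customer_id": 3, "city": "Pune"},   # new key, so it is inserted
]
result = merge_upsert(target_rows, source_rows, "customer_id")
print(result)
```

Customer 1 is untouched, customer 2 has its city overwritten, and customer 3 is appended, which is the behaviour the upsert questions described above rely on.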

318InevitableClassic2612w ago