Introducing DBRX Open LLM - Data Engineering San Diego (May 2024)
Description
A special event presented by Data Engineering San Diego, Databricks User Group, and San Diego Software Engineers.
Presentation: Introducing DBRX - Open LLM by Databricks
By: Vitaliy Chiley, Head of LLM Pretraining for Mosaic at Databricks
DBRX is an open-source LLM by Databricks which, when released recently, outperformed established open-source models on a set of standard benchmarks. Join us to learn firsthand how the Mosaic Research team built DBRX and why it matters. This talk will cover the architecture, model evaluation, and how you can try it out.
Description from YouTube. Full content on the video page.
More from Dustin Vannoy
Tutorials | Databricks AI Dev Kit: Install for Copilot + VS Code
The video demonstrates how to install the Databricks AI Dev Kit for Visual Studio Code with GitHub Copilot on Windows, guiding users through the installation script, profile configuration, and skill selection. It then shows how to enable the Databricks tools in Copilot chat and tests them by generating code and executing SQL queries against a Databricks workspace.
Tutorials | Databricks AI Dev Kit Demo - Install, DataGen, SDP, Dashboard
The video shows how to install the Databricks AI Dev Kit on a Mac, then uses it to generate synthetic data, create serverless Spark Declarative Pipelines (SDP) for a medallion architecture, and build a Databricks dashboard from the generated data. It highlights how the AI Dev Kit uses skills and an MCP server to automate these development tasks.
Releases | Introducing Databricks AI Dev Kit - Skills, MCP server, Builder App
The Databricks AI Dev Kit provides agent skills, an MCP server, and a Builder App to enhance AI-driven development on Databricks. It allows users to integrate AI coding tools with Databricks best practices, extending LLM capabilities through specialized functions and offering a chat-based interface for building applications.
News | AI-Driven Development
AI-driven development is a workflow where AI is the primary engine for generating, validating, and maintaining code, shifting the developer's role to directing the AI. Key concepts include the context window (the amount of text an AI model can consider), tokens (processing units for text), and tool use (AI invoking external functions).
News | Claude Code: 5 Essentials for Data Engineering
The video introduces five essential concepts for using Claude Code in data engineering: the CLAUDE.md file for core project information, skills for packaging expertise, commands for predefined prompts, sub-agents for focused tasks, and Model Context Protocol (MCP) for standardized tool interaction. These components help manage context and memory for effective AI-enhanced development.
Tutorials | Databricks + Cursor IDE: Step-by-Step AI Coding Tutorial
The video demonstrates using Cursor IDE for AI-enhanced Databricks development, focusing on setting up Databricks Connect and leveraging Cursor rules and context for efficient code generation and testing. It shows how to structure projects, write Python and PySpark code, and create unit tests, highlighting the importance of providing clear instructions to the AI agent.