AI Agents

Recent items mentioning AI Agents across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.

60 recent items · 20 news · 33 videos · 7 community threads

What's happening in AI Agents
AI synthesis · updated 10h ago

Databricks has significantly advanced its AI agent capabilities, notably with the open-sourcing of Omnigent, a meta-harness for orchestrating and observing multi-harness AI agents with automatic MLflow Tracing [1][3][7]. This aligns with the expanded Genie family, which now includes Genie Agents, and the Unity AI Gateway for enhanced governance [8]. Databricks also highlighted the use of agentic coding for stress-testing GPU reliability and recognized VisionHeight for its agentic threat intelligence in the Built-On Databricks Startup Challenge [4][5].

Generated daily from the 10 most recent items mentioning AI Agents. Click any [N] to jump to the source.

HackerNews

Launch HN: Parsewise (YC P25) – Reason Across Documents with an API

Hi all, it’s Greg and Max, founders of Parsewise here (https://www.parsewise.ai/api). Parsewise transforms a bucket of unstructured data into schema-compliant data, retaining lineage for values resolved across documents. Imagine giving Claude a bunch of files and asking for a CSV or JSON output. If you have tried this, you know both the system limitations (number of files, type of inputs, cost, latency) and the human-facing challenge of having no way to validate the results quickly. We solve both. We help tech teams simplify their unstructured data ETL, and loop in business experts for the definitions and for instant validation.

Here is a video with a few use cases: https://www.youtube.com/watch?v=dbRllnnh47w

Parsewise in the words of someone coming to us: “I need to extract information from insurance policy PDFs, phone calls that have been transcribed, emails, etc. I am NOT looking for something that would just extract data point by data point, page by page into a structured well-defined schema but more something more agentic that can understand that information might be across documents and that it should reason over what to extract.”

We started the company based on a decade of experience (and pain) in complex data transformation and data analysis / synthesis. Greg built classical ETL and implemented AI workflows at Palantir. At Bain, Max did highly complex data analysis in the financial sector, similar to many of our customers.

Parsewise works by taking in a bucket of data (think hundreds or thousands of PDFs, Excel files, etc.), and outputting schema-compliant data where every single value is traceable down to word-level citations across multiple documents in the bucket. We provide API customers with ways to show the lineage in their own applications, or they can use our platform for internal operations. At the core of the data processing we have self-improving agent definitions. They define the acceptable sources, the logic for resolving or combining values, and the rule for highlighting uncertainty to the end user.

The underlying tech is model and cloud agnostic and can be deployed in private networks. We have seen the best results with Gemini models for visual reasoning, achieving SOTA (beating Claude Fable) on the strongest grounded reasoning benchmark we have found (Databricks OfficeQA).

Notably, we focused more on the “human harness” rather than the model harness, leaning into the actual friction we saw in uptake, which is around verifiability. That means optimizing the time and clicks required to trust the outcomes.

We use vLLMs for parsing, and then we use small models for efficient large scale exhaustive search. Unlike RAG, we do not sample; instead, we exhaustively find all relevant values for a given query. We use larger models for decision making around resolutions and flagging inconsistencies to users. This exhaustiveness and explicit value sourcing is unique to our platform, and it goes beyond the first step of data parsing that many existing providers cover.

We would love to welcome builders and tinkerers to try Parsewise on your complex document challenges. We have a ton of ideas on how we can expand the product and make it better, but would appreciate feedback and ideas from the community!

--- top comments ---

[whinvik] Document parsing is top of my mind lately because in some of the areas we work on the bottleneck is starting to become being able to query documents the same way one queries an api. I keep thinking the most obvious analogue is we need some way to represent documents the same way we can represent structured data in parquet. Parquet allows easy range-based queries and there is so much tooling built around Arrow. But for documents I keep hitting a wall to figure out what the right abstractions are. Parquet allows filterable metadata. But what such metadata is there for documents? Then there is the arbitrariness of chunking, vectorization. If we could just do this in a […truncated]

5654 · gergelycsegzi · 2d ago
Databricks Community · Learning Events

Agents at Work: Shipping Agentic Apps at Scale | Virtual Event

0 · 0 · 1w ago
Databricks Community · Generative AI

Accessing Document Presented in Demo in Get Started with AI Agents on Databricks Course

0 · 0 · 1w ago
Databricks Community · MVP Articles

Databricks Introduces Omnigent: A New Meta-Harness for Building and Managing AI Agents

0 · 0 · 1w ago
Databricks Community · Generative AI

Does "move fast and break things" ruin AI agents?

0 · 0 · 2w ago
HackerNews

Databricks Launches LTAP: A Unified OLAP/OLTP Data Architecture

--- top comments ---

[epistasis]
> The New Data Foundation for the Agentic Era
Look, this announcement seemed exciting, but I'm significantly less excited when I come across a completely unrelated tie-in to AI. It breaks the illusion, and I'm reminded that it's just another PR announcement, and this is probably not going to impact my life at all in any way ever. So I'm off to the next article instead of reading any more.

[mohsinimam] Curious what the final format of the data in LTAP storage is. Is it columnar? If so, what happens to OLTP performance? The blog and all the info speak to OLAP performance, but what about your app?

[mathisd]
> No performance tradeoffs, for any workload: Transactional workloads run in standard Postgres with full ACID semantics. Analytical workloads run across the full Lakehouse at any scale and concurrency. Each scales independently, and because there's no data movement between systems, operational and analytical results are always in sync — with no copies or shadow infrastructure.
How can there be no performance trade-off if storage is handled by Postgres and there is no data movement to convert it to columnar? This deserves a technical explanation because this seems impossible.

[geophph] Lakebase + Lakehouse = Lake

[drchaim] No benchmarks, no pricing, no examples..

3814 · thehaikuza · 2w ago
Databricks Community · Community Articles

The Comparison: Why the Alternatives Fall Short for Databricks-Native Agentic Systems

0 · 0 · 2w ago