01 AI Search. The rename you can ignore, and the keyword + hybrid search you can't.
02 This week, in brief. Power Platform GA, query-history PII redaction, federation comment sync, and Omnigent hits Beta.
03 From Brickster.ai: We just crossed 1,000 monthly active users. Thank you.
01
🔎 AI Search
RETRIEVAL
Turns out, not everything needs to be a vector.
Vector Search is now AI Search, and it finally does keyword and hybrid retrieval natively, with a keyword-only option that needs no embeddings at all.
In a quiet release note dated June 1, two weeks before Summit, Databricks slipped in a change that's bigger than its low-key rollout suggests: Databricks Vector Search has been renamed to Databricks AI Search, and alongside the rename you can now create a full-text search index with no embeddings at all.
The rename is cosmetic. The capability is not. For two years, every team building RAG and agent retrieval on Databricks has run into the same wall: pure vector similarity is great at "what is this about" and terrible at "find me the exact SKU, error code, or customer ID." The standard workaround was to bolt on a separate search engine (OpenSearch, Elasticsearch) and stitch the results back together yourself.
AI Search collapses that workaround into the platform. Hybrid indexes combine semantic and keyword search on the same index, and Full-Text indexes drop embeddings entirely for pure keyword retrieval. Both are governed by Unity Catalog and run on infrastructure you already pay for.
🔥 The problem it solves
The "semantic search can't find an order number" problem
Vector search is built on similarity. Ask it for documents about a refund policy and it shines. Ask it to retrieve the chunk containing ORD-99-A14, a part number, or a person's exact name, and it routinely misses, because identifiers carry almost no semantic signal. The embedding for "ORD-99-A14" looks a lot like the embedding for "ORD-99-A15."
In practice, the most reliable retrieval pipelines have always been hybrid: run keyword (BM25-style) and semantic search in parallel, then merge the rankings. That's well-established as a best practice, but on Databricks it used to mean operating a second search system outside Unity Catalog, with its own sync jobs, its own access model, and its own bill.
AI Search's core move: keyword and hybrid retrieval become first-class capabilities, synced from your Delta tables and governed exactly like the rest of your lakehouse.
⚙️ Three ways to retrieve, one product
Type 1 · Semantic
🧠 Vector: "what is this about"
The original capability, unchanged. Embeddings power semantic similarity for conceptual, fuzzy, natural-language queries. Still the right default when meaning matters more than exact terms.
Type 2 · Best of both
🔀 Hybrid: semantic + keyword on one index
A search mode on a vector index that runs both vector and keyword retrieval and merges the results for you (reciprocal rank fusion). This is what most production RAG and agent systems should reach for by default: concept matching and exact-term recall, without running two systems.
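For readers who have not met reciprocal rank fusion before, here is a minimal sketch of the idea in plain Python. It is not Databricks' implementation: the function name, the sample document IDs, and the constant k=60 (the value commonly used since the original RRF paper) are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one fused ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword retriever and a vector retriever.
keyword_hits = ["doc_order", "doc_refund", "doc_shipping"]
vector_hits = ["doc_refund", "doc_returns", "doc_order"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why the fusion step needs only ranks and never has to compare raw keyword scores with raw similarity scores.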
Type 3 · New · Beta
🔤 Full-Text: keyword-only, zero embeddings
A dedicated full-text search index is a Delta Sync Index created without any embedding columns: pure keyword search for exact terms, identifiers, and codes. The practical implications:
No embedding model, no embedding cost: nothing to generate, host, or re-embed when data changes
Lower latency & spend for workloads that never needed semantics in the first place
Available on storage-optimized endpoints using triggered sync mode
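To make "keyword-only, zero embeddings" concrete, here is a toy inverted index in plain Python. It is only a sketch of the general technique, not how AI Search is built; the tokenizer, the function names, and the sample chunks are assumptions for illustration.

```python
import re
from collections import defaultdict

# Assumed tokenizer: keeps hyphenated identifiers such as ORD-99-A14 whole.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def tokenize(text):
    return [token.lower() for token in TOKEN.findall(text)]


def build_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every query token (AND semantics)."""
    token_sets = [index.get(token, set()) for token in tokenize(query)]
    if not token_sets:
        return set()
    return set.intersection(*token_sets)


# Hypothetical chunks; in practice these would come from a Delta table.
docs = {
    "c1": "Order ORD-99-A14 was refunded on request.",
    "c2": "Order ORD-99-A15 shipped late.",
    "c3": "Refund policy: 30 days.",
}
index = build_index(docs)
print(search(index, "ORD-99-A14"))
```

The lookup for ORD-99-A14 matches one chunk and never confuses it with ORD-99-A15, and no model is involved at index time or query time.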
🧭 Which index should you pick?
Pick Vector
Q&A over docs, conceptual lookup, "find me similar," natural-language intent.
Pick Hybrid
Production RAG & agents, mixed queries, support search, anything customer-facing. The safe default.
Pick Full-Text
Log/code/ID lookup, exact-match filters, cost-sensitive keyword search with no semantic need.
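One rough way to apply the guide above is to look at a sample of real user queries and count how many contain identifier-like tokens. The sketch below is a heuristic under stated assumptions, not an official sizing tool: the "contains a digit plus a letter or separator" rule, the function names, and the sample queries are all hypothetical.

```python
def looks_like_identifier(token):
    # Assumed rule: a digit plus a letter or separator, e.g. ORD-99-A14 or E4012.
    # It has false positives (for example "30-day"), so treat it as a first pass.
    has_digit = any(ch.isdigit() for ch in token)
    has_letter_or_separator = any(ch.isalpha() or ch in "-_" for ch in token)
    return has_digit and has_letter_or_separator


def has_identifier(query):
    tokens = (token.strip(".,;:?!\"'()") for token in query.split())
    return any(looks_like_identifier(token) for token in tokens)


def suggest_index(sample_queries):
    """Suggest 'vector', 'full-text', or 'hybrid' from a sample of queries."""
    if not sample_queries:
        raise ValueError("need at least one sample query")
    flagged = sum(has_identifier(query) for query in sample_queries)
    if flagged == 0:
        return "vector"      # purely conceptual queries
    if flagged == len(sample_queries):
        return "full-text"   # every query is an exact lookup
    return "hybrid"          # mixed traffic: the safe default


print(suggest_index(["how do refunds work?", "status of ORD-99-A14"]))
```

Mixed traffic lands on hybrid, which matches the "safe default" advice; only a query log that is entirely lookups or entirely conceptual tips the suggestion to one of the other two.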
3 · Retrieval modes
0 · Embeddings Needed
Beta · Full-Text Status
🔭 What this means for your retrieval stack
If you're running RAG or agents on Databricks, three things are worth doing this week:
1. Audit your "missing chunk" complaints. If users report the assistant can't find exact IDs, codes, or names, that's a keyword problem a hybrid index likely fixes, no prompt engineering required.
2. Reconsider your external search engine. If a separate OpenSearch/Elasticsearch cluster exists only to add keyword recall, AI Search may let you retire it, and pull retrieval back under Unity Catalog governance.
3. Plan the SDK migration. Existing Vector Search indexes and queries keep working, but the Python package was renamed from databricks.vector_search to databricks-ai-search. The old name still imports as a deprecated shim that warns on import, so pin the new package and plan the move before the shim goes away.
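For the migration item, a small check in CI can tell you which jobs still import a deprecated module. The sketch below uses only the Python standard library. It assumes the shim raises a DeprecationWarning at import time, which the release note does not specify, and the helper name is hypothetical.

```python
import importlib
import warnings


def deprecation_warnings_on_import(module_name):
    """Import a module and return the DeprecationWarning messages it emitted.

    Only a first import runs the module body, so call this in a fresh
    process; an already-imported module returns an empty list.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones Python hides by default.
        warnings.simplefilter("always")
        importlib.import_module(module_name)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
```

Calling deprecation_warnings_on_import("databricks.vector_search") in an environment where the old package is installed should then return a non-empty list if the shim warns as described, and an empty list once the code has moved to the new package.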
02
This week, in brief
The first full week after Data + AI Summit 2026 brought a steady stream of GA promotions and governance tweaks. The four worth knowing:
AGENTS
Omnigent graduates to Beta
The coding-agent meta-harness we covered in Issue #6 moved from Alpha to Beta, wrapping Claude Code, Codex and others with shared sessions, contextual policies, mobile access, and deployment infra.
GOVERNANCE
Query history now redacts SQL text by default
Rolling out from June 22, the statement_text column in system.query.history returns <Redacted> for anyone who isn't an account admin or in the new databricks_pii_access group. Check your monitoring jobs.
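A quick way to check a monitoring job is to count how many rows it reads come back redacted. The sketch below works on plain dictionaries, for example rows collected from a query against system.query.history. The statement_text column and the <Redacted> value come from the release note; the function name and sample rows are assumptions.

```python
REDACTED = "<Redacted>"  # placeholder value named in the release note


def redaction_report(rows):
    """Summarise how many rows have a redacted statement_text."""
    total = len(rows)
    redacted = sum(1 for row in rows if row.get("statement_text") == REDACTED)
    share = redacted / total if total else 0.0
    return {"total": total, "redacted": redacted, "share": share}


# Hypothetical rows as a monitoring job without PII access might see them.
sample_rows = [
    {"statement_id": "s1", "statement_text": "<Redacted>"},
    {"statement_id": "s2", "statement_text": "<Redacted>"},
    {"statement_id": "s3", "statement_text": "SELECT 1"},
    {"statement_id": "s4", "statement_text": "<Redacted>"},
]
report = redaction_report(sample_rows)
print(report)
```

If the redacted share jumps after the rollout date, the job's identity probably needs the new access group, or the job needs to stop parsing SQL text.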
INTEGRATION
Power Platform connector is now GA
The Databricks connector for Microsoft Power Platform reached general availability. Power Apps and Power Automate can hit governed lakehouse data without a custom middle tier.
UNITY CATALOG
Federation syncs foreign-table comments (Beta)
Lakehouse Federation can now pull table and column comments from supported external sources into foreign tables in Unity Catalog, for better discovery and lineage of data you don't physically own.
03
🧱 From Brickster.ai
One thing from our side this week: brickster.ai just crossed 1,000 monthly active users, across more than 40 countries. Thank you for reading, sharing, and telling us what to build next.
🤖 Migrating to AI Search?
The Brickster Assistant searches our full archive and answers with citations.