Databricks SDK
Recent items mentioning Databricks SDK across the Databricks ecosystem — releases, news, videos, and community Q&A. Updated hourly.
The Databricks SDK for Go has been updated to v0.134.0, adding support for new fields in job pipeline refresh selections (full, flow, reset checkpoint) and operational email custom recipients [1]. Community discussions on Reddit also covered how long cached authentication persists in the Python `databricks-sdk` [2] and a recovery script for Lakeflow SDP pipelines encountering DIFFERENT_DELTA_TABLE_READ_BY_STREAMING_SOURCE errors [3].
Generated daily from the 3 most recent items mentioning Databricks SDK.
Databricks SDK for Go now supports new fields for configuring pipeline refresh selections within jobs, including full, flow, and reset checkpoint options. Additionally, new fields for operational email custom recipients have been added to settings.
This release introduces new API methods for managing workspace assignments at both account and workspace levels. It also adds support for Python operator tasks and disabling tasks in Databricks Jobs, along with new fields for ML features and pipeline connector options.
The `vector_search_endpoints` configuration and commands now use `target_qps` instead of `min_qps`, requiring updates to `databricks.yml` and CLI invocations. Authentication commands received several usability improvements, including better keyring handling, clearer profile selection, and improved host input.
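Before updating a bundle for the `min_qps` to `target_qps` change, it can help to locate every place the old key still appears. The following is a minimal sketch that walks an already-parsed configuration (for example, the result of loading `databricks.yml` with a YAML parser) and reports the path to each `min_qps` key. The nested layout of `config` and the helper name `find_key_paths` are hypothetical and only for illustration; the real bundle schema may differ.

```python
from typing import Any, Iterator


def find_key_paths(node: Any, key: str, path: tuple = ()) -> Iterator[tuple]:
    """Yield the path to every occurrence of `key` in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,)
            # Keep descending: the key may also appear deeper in the tree.
            yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_key_paths(item, key, path + (i,))


# Hypothetical parsed bundle config; the structure is an assumption.
config = {
    "resources": {
        "vector_search_endpoints": {
            "my_endpoint": {"name": "my-endpoint", "min_qps": 5},
        }
    }
}

stale = list(find_key_paths(config, "min_qps"))
for p in stale:
    print("needs review:", ".".join(str(part) for part in p))
```

The sketch only reports locations rather than renaming the key, since the release note does not say whether the old and new values are interchangeable.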
Databricks-sdk
When using databricks-sdk in Python, the first time I create the workspace client it shows a pop-up for user login, but on subsequent runs it reports that authentication succeeded, even after I delete the local token cache (.databricks/token-cache.json). Any insights into how long the authentication stays valid, or where else the cached credentials can be deleted so that the login pop-up appears again?
This release adds support for Confluence, Meta Marketing, Jira, and Zendesk as ingestion sources for pipelines, along with new Vector Search endpoint scaling options and permission management. Several fields in SupervisorAgent, Knowledge Assistant Examples, and Tools are no longer required, and Vector Search endpoint `MinQps` fields have been removed.
If your Lakeflow SDP pipeline broke with DIFFERENT_DELTA_TABLE_READ_BY_STREAMING_SOURCE, here's a recovery script
I ran into this recently and wanted to share. A Delta table I was streaming from got dropped and recreated by an upstream team. Same name, same schema, but the new table has a fresh internal ID. Spark Structured Streaming checkpoints bind to that ID, so the next pipeline run errors with:

`[DIFFERENT_DELTA_TABLE_READ_BY_STREAMING_SOURCE] The streaming query was reading from an unexpected Delta table...`

In open-source Spark you'd delete the checkpoint directory. Lakeflow SDP manages those paths internally, so that's not an option. The fix is the Pipelines API parameter `reset_checkpoint_selection` (added in `databricks-sdk` 0.100): pass a list of FQN flow names and start an update that clears only those checkpoints. Bronze/Silver/Gold targets stay untouched.

I packaged the recovery as a sub-template in my Databricks bundle template repo. One CLI call ships the script (with a `--dry-run` flag), a workspace notebook variant, and a README:

`databricks bundle init https://github.com/vmariiechko/databricks-bundle-template --template-dir assets/sdp-checkpoint-recovery`

It also includes a fallback for environments where you can't pip-upgrade the SDK (for me that was the Databricks serverless runtime, which bundles its own SDK).

Repo: https://github.com/vmariiechko/databricks-bundle-template/tree/main/assets/sdp-checkpoint-recovery

Two gotchas worth knowing:
- Flow names must be three-part Unity Catalog FQNs (`catalog.schema.table`), or you hit `IllegalArgumentException`.
- Resetting checkpoints triggers a pipeline update; the API has no "reset only" mode. If you want the pipeline stopped afterwards, cancel from the UI as soon as the call returns.

Happy to answer questions or hear how you have handled this situation. P.S. Feel free to submit issues or PRs.
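The recovery call described in the post can be sketched roughly as follows. This is not the script from the linked repo; it is a minimal illustration, assuming that the client's `pipelines.start_update` accepts a `reset_checkpoint_selection` keyword as the post describes. The helper names `validate_flow_names` and `reset_checkpoints`, the dry-run return shape, and the simple three-part name check (which does not handle backtick-quoted identifiers) are all hypothetical.

```python
import re
from typing import Any, Iterable, List

# Simplified check for catalog.schema.table; quoted identifiers are not handled.
_THREE_PART = re.compile(r"^[^.\s]+\.[^.\s]+\.[^.\s]+$")


def validate_flow_names(flow_names: Iterable[str]) -> List[str]:
    """Reject names that are not three-part FQNs before calling the API."""
    names = list(flow_names)
    bad = [n for n in names if not _THREE_PART.match(n)]
    if bad:
        raise ValueError(f"expected catalog.schema.table, got: {bad}")
    return names


def reset_checkpoints(client: Any, pipeline_id: str,
                      flow_names: Iterable[str], dry_run: bool = True):
    """Start an update that resets checkpoints for the selected flows only."""
    names = validate_flow_names(flow_names)
    if dry_run:
        # Show what would be sent without touching the pipeline.
        return {"pipeline_id": pipeline_id, "reset_checkpoint_selection": names}
    # Assumption: `client` is a databricks-sdk WorkspaceClient new enough to
    # accept this keyword (the post cites databricks-sdk 0.100).
    return client.pipelines.start_update(
        pipeline_id=pipeline_id,
        reset_checkpoint_selection=names,
    )


# Placeholder pipeline ID and flow name, for illustration only.
plan = reset_checkpoints(None, "my-pipeline-id", ["main.bronze.events"])
print(plan)
```

Validating the names locally gives a readable error before the request is sent. As the post notes, a real (non-dry-run) call starts a pipeline update.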
This release introduces a new disaster recovery package and adds methods for managing knowledge assistant examples. It also includes breaking changes related to the `supervisoragents.Connection` and `supervisoragents.Tool` fields.
Unified host detection is now automatic, removing the `Experimental_IsUnifiedHost` field and enabling a single configuration profile for both account and workspace operations. The file-based OAuth token cache has been removed, defaulting to an in-memory cache, and new API methods were added for Temporary Volume Credentials and Knowledge Assistants.
The SDK now automatically detects AI coding agents and appends agent information to HTTP request headers, while also removing the unused `experimentalIsUnifiedHost` field from `DatabricksConfig`. A bug fix addresses `X-Databricks-Org-Id` header issues for `SharesExtImpl.list()` on SPOG hosts, and several API changes introduce new services like `secretsUc()` and `supervisorAgents()`, along with breaking changes to method paths for various update operations.
The CLI now supports a --limit flag for paginated list commands and caches host metadata lookups for faster repeated invocations. Bundles gain support for Vector Search Endpoints and prompt before destroying Lakebase resources.
This release adds new workspace-level services for supervisor agents and Unity Catalog secrets, along with an update method for tokens. Several API methods for data classification, environments, knowledge assistants, Postgres, and warehouses have breaking changes due to path modifications.
This release drops support for Python 3.8 and 3.9, requiring Python 3.10 or newer. It introduces automatic unified host detection for account and workspace operations, along with new API methods for catalog, Postgres, apps, Genie, pipelines, and Vector Search services.
This release requires Go 1.24 or higher and introduces new features like a host metadata resolver hook and a limit iterator for lazy iteration. It also includes numerous bug fixes for token acquisition and caching issues across various authentication methods, alongside several API additions and some breaking changes.
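The idea behind a limit iterator for lazy iteration can be illustrated in Python (the release itself concerns the Go SDK, and this is not its API). The sketch below wraps a lazily produced sequence so that only the requested number of items is ever pulled from the underlying source. The names `limited` and `fake_listing` are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def limited(items: Iterable[T], limit: int) -> Iterator[T]:
    """Lazily yield at most `limit` items without consuming the rest."""
    return islice(items, limit)


fetched: List[int] = []


def fake_listing() -> Iterator[int]:
    """Stand-in for a paginated API listing; records every item it produces."""
    n = 0
    while True:
        fetched.append(n)
        yield n
        n += 1


first_three = list(limited(fake_listing(), 3))
print(first_three, "items produced by the source:", len(fetched))
```

Because the wrapper stops pulling from the source once the limit is reached, an unbounded or very large listing does not have to be fetched in full.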
The SDK now automatically detects AI coding agents and appends `agent/<name>` to HTTP request headers. New `DisableGovTagCreation` fields were added to `settings.RestrictWorkspaceAdminsMessage` and `settingsv2.RestrictWorkspaceAdminsMessage`.
OAuth token refreshing is now proactive for tokens expiring within five minutes, improving token validity for callers. The `dashboards.GenieSpace` struct gains a new `ParentPath` field.
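The proactive refresh rule can be expressed as a simple expiry check. This is a minimal sketch of the idea, not the SDK's implementation: a token is treated as due for refresh when it expires within a five-minute buffer. The function name `needs_refresh` and the constant `REFRESH_BUFFER` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Buffer taken from the release note: refresh tokens expiring within 5 minutes.
REFRESH_BUFFER = timedelta(minutes=5)


def needs_refresh(expiry: datetime, now: Optional[datetime] = None,
                  buffer: timedelta = REFRESH_BUFFER) -> bool:
    """Return True if the token is expired or expires within `buffer`."""
    now = now or datetime.now(timezone.utc)
    return expiry - now <= buffer


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(needs_refresh(now + timedelta(minutes=4), now))   # inside the buffer
print(needs_refresh(now + timedelta(minutes=10), now))  # still comfortably valid
```

Refreshing ahead of expiry means a caller is less likely to receive a token that lapses in the middle of a request.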
This release introduces a new `environments` service for managing Databricks environments programmatically. It also adds a `parent_path` field to GenieSpace dashboards and a `can_create_app` permission level.
The Databricks SDK for Go now supports specifying a `default_profile` within the `[__settings__]` section of your `.databrickscfg` file. This allows for easier configuration management when working with multiple profiles.
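To make the setting concrete, the sketch below parses a sample `.databrickscfg` with Python's standard `configparser` and picks a profile name. Only the `[__settings__]` section and the `default_profile` key come from the release note; the sample hosts, the `staging` profile, the helper name `resolve_profile`, and the precedence order (explicit profile, then `default_profile`, then `DEFAULT`) are assumptions for illustration, not the SDK's documented behavior.

```python
import configparser
from typing import Optional

# Hypothetical file contents; hosts and profile names are placeholders.
SAMPLE = """\
[__settings__]
default_profile = staging

[DEFAULT]
host = https://default.example.com

[staging]
host = https://staging.example.com
"""


def resolve_profile(cfg_text: str, explicit: Optional[str] = None) -> str:
    """Pick a profile: explicit choice, then default_profile, then DEFAULT."""
    if explicit:
        return explicit
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if parser.has_section("__settings__"):
        return parser.get("__settings__", "default_profile", fallback="DEFAULT")
    return "DEFAULT"


print(resolve_profile(SAMPLE))          # uses the [__settings__] default
print(resolve_profile(SAMPLE, "prod"))  # an explicit choice takes priority
```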
Release: v2.10.6 (#1858)
This release fixes an issue where the VS Code extension would not correctly handle 404 errors from the Databricks SDK. This improves stability when interacting with Databricks resources that might not exist.
This release adds new fields for job alert configurations across various job-related structs, including `RunOutput`, `RunTask`, `SubmitTask`, and `Task`. It also introduces a new `environments` package and service for managing workspace environments, alongside a `CanCreateApp` permission level for IAM.
This release adds new fields and methods across several services, including updates for ML features, pipelines, and Postgres roles. It also introduces breaking changes by making previously required fields optional in ML-related DeltaTableSource, Feature, Function, and KafkaSource configurations.
Databricks SDK for Python now supports new fields for defining ingestion pipelines, including connector type, data staging options, and detailed ingestion source information. External function requests in model serving can now specify a sub-domain.
This release introduces new methods for the Genie workspace service and an update_role method for the Postgres service. Several fields across ML and other services are now optional, including breaking changes for `entity_columns`, `inputs`, `function_type`, `entity_column_identifiers`, and `timeseries_column_identifier` in various ML-related services.
Databricks SDK for Java now allows fine-grained control over HTTP request timeouts through a new `withRequestConfig` method on `CommonsHttpClient.Builder`. This enables practitioners to configure specific timeout settings for their API calls.
Databricks CLI authentication now correctly errors on token scope mismatches, prompting re-authentication instead of silently using incorrect permissions. New `dataclassification` and `knowledgeassistants` services and corresponding workspace-level APIs have been added.
v.3.9.0
MLflow 3.9.0 introduces an in-product MLflow Assistant chatbot and a Trace Overview Dashboard for GenAI experiments, enhancing debugging and performance insights. The AI Gateway is revamped for direct tracking server integration, alongside new LLM judge features for online monitoring and custom prompt building.
The workflow assessment functionality now includes an experimental task that analyzes recently executed workflows for migration problems, providing recommendations and documentation links. UCX documentation has been significantly enhanced with a revamped main page, a new Getting Started section, and updated contribution guidelines.
UCX assessment reports will no longer include UCX jobs, and the `JobsCrawler` now correctly handles integer job IDs. Dashboard names can now include special characters like spaces and brackets again.
UCX now supports Databricks Runtime 16+ for Hive Metastore table conversions and introduces a new `query_statement_disposition` option for SQL backend exports to handle large workspaces. Pipeline and dashboard migration workflows have been enhanced with new filtering options and improved progress tracking, including daily scheduled migration progress updates.