Fixes #23568: Implement Configurable DB Query Retry Mechanism #27733
mohitjeswani01 wants to merge 5 commits into open-metadata:main
…ilures (open-metadata#23568) Added @db_retry decorator with configurable exponential backoff and SQLSTATE detection. Applied to base classes (CommonDbSource, SqlColumnHandlerMixin) and dialect-specific overrides (Postgres, Greenplum).
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Implements a centralized, configurable retry mechanism for transient database query failures during ingestion to reduce incomplete metadata capture due to statement timeouts and similar transient DB errors.
Changes:
- Added `queryRetryConfig` to the database ingestion pipeline JSON schema.
- Introduced `metadata.utils.db_retry` with transient error detection + exponential backoff w/ jitter.
- Applied `@db_retry` to selected idempotent query methods across common and Postgres/Greenplum sources, and added unit tests.
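
To make the backoff behaviour concrete, here is a minimal sketch of a retry decorator with exponential backoff and full jitter. It is not the code from this PR: the name `retry_with_backoff` and the parameters `max_retries`, `base_delay`, `max_delay`, `is_transient`, and `sleep` are all assumptions chosen for illustration.

```python
import functools
import random
import time


def retry_with_backoff(max_retries=3, base_delay=1.0, max_delay=30.0,
                       is_transient=lambda exc: True, sleep=time.sleep):
    """Hypothetical decorator: retry transient failures with exponential backoff.

    All parameter names are illustrative, not the PR's actual API.
    """
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            for retry in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on non-transient errors.
                    if retry == max_retries or not is_transient(exc):
                        raise
                    # Exponential growth, capped at max_delay.
                    cap = min(max_delay, base_delay * (2 ** retry))
                    # Full jitter: wait a random time in [0, cap].
                    sleep(random.uniform(0, cap))
        return wrapper
    return decorator
```

The `sleep` parameter is injectable only so that the delays can be observed in tests without actually waiting.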
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| openmetadata-spec/src/main/resources/json/schema/metadataIngestion/databaseServiceMetadataPipeline.json | Adds queryRetryConfig schema definition and exposes it on the pipeline config. |
| ingestion/src/metadata/utils/db_retry.py | New retry decorator implementation (transient detection + backoff/jitter + logging). |
| ingestion/src/metadata/ingestion/source/database/sql_column_handler.py | Wraps column fetch/sample query helpers with @db_retry. |
| ingestion/src/metadata/ingestion/source/database/common_db_source.py | Applies @db_retry to key schema/table metadata queries. |
| ingestion/src/metadata/ingestion/source/database/multi_db_source.py | Adds retry to the raw DB-name query executor. |
| ingestion/src/metadata/ingestion/source/database/postgres/metadata.py | Adds retry to Postgres-specific metadata SELECT queries. |
| ingestion/src/metadata/ingestion/source/database/greenplum/metadata.py | Adds retry to Greenplum-specific metadata SELECT queries. |
| ingestion/tests/unit/utils/test_db_retry.py | Adds unit tests covering transient detection, config extraction, and backoff behavior. |
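
The transient detection listed for `db_retry.py` in the table above could look roughly like the following sketch. It assumes three checks in order: a SQLSTATE code on the driver error (psycopg2 exposes `pgcode`, psycopg 3 exposes `sqlstate`), the exception class name, and SQLAlchemy's `connection_invalidated` flag on a wrapping `DBAPIError`. The function name `is_transient`, the two constant sets, and the exact ordering are assumptions, not the PR's implementation.

```python
# Illustrative values only: 57014 (query_canceled) and QueryCanceled are the
# cases named in this PR; a real list would likely be longer.
TRANSIENT_SQLSTATES = {"57014"}
TRANSIENT_EXCEPTION_NAMES = {"QueryCanceled"}


def is_transient(exc):
    """Hypothetical three-layer check for retryable database errors."""
    # SQLAlchemy wraps driver errors and exposes the original as `.orig`.
    orig = getattr(exc, "orig", None) or exc

    # Layer 1: an explicit SQLSTATE, when present, decides the outcome.
    sqlstate = getattr(orig, "pgcode", None) or getattr(orig, "sqlstate", None)
    if sqlstate is not None:
        return sqlstate in TRANSIENT_SQLSTATES

    # Layer 2: fall back to the driver exception's class name.
    if type(orig).__name__ in TRANSIENT_EXCEPTION_NAMES:
        return True

    # Layer 3: a connection invalidated by the DBAPI is treated as transient.
    return bool(getattr(exc, "connection_invalidated", False))
```

Letting SQLSTATE take priority means a syntax error such as `42601` is never retried, even if a later layer would have matched.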
…onfig clamping
- Add inspect.isgeneratorfunction() to handle generator methods (gitar-bot)
- Materialize generators via list() inside retry loop to catch mid-iteration errors
- Add _sanitize_exc() to prevent SQL/parameter leaks in logs (Copilot)
- Add _normalize_config() to clamp invalid retry values (Copilot)
- Swap decorator order on get_schema_definition (Copilot)
- Remove @db_retry from non-idempotent _get_stored_procedures_internal (gitar-bot)
- Fix log wording from 'attempt' to 'retry' (Copilot)
- Commit generated Pydantic models from make generate (gitar-bot)
- 50 unit tests passing (up from 34)
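
The generator fix in this commit can be illustrated with a small sketch. A plain `try/except` around a generator call catches nothing, because the body only runs during iteration; materializing with `list()` inside the retry loop moves mid-iteration failures into the guarded region. The decorator name `retry_generators` and its `max_retries` parameter are hypothetical, and backoff and transient checks are left out to keep the example focused.

```python
import functools
import inspect


def retry_generators(max_retries=2):
    """Hypothetical decorator showing generator-aware retries only."""
    def decorator(func):
        if not inspect.isgeneratorfunction(func):
            return func  # this sketch only handles the generator case

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for retry in range(max_retries + 1):
                try:
                    # Exhaust the generator here so errors raised while
                    # iterating are caught by this try block.
                    items = list(func(*args, **kwargs))
                    break
                except Exception:
                    if retry == max_retries:
                        raise
            # Re-yield the buffered rows so callers still see a generator.
            yield from items
        return wrapper
    return decorator
```

The trade-off is memory: the full result is buffered before the first row is yielded, and partial results from a failed attempt are discarded rather than emitted twice.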
- Emit logger.warning when queryRetryConfig values are clamped to safe ranges (gitar-bot)
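
A sketch of what clamping with a warning might look like is shown below. The field names `maxRetries` and `initialDelaySeconds`, the ranges in `SAFE_RANGES`, and the function name `clamp_config` are placeholders invented for this example; the actual `queryRetryConfig` fields and limits are defined by the PR's schema and are not reproduced here.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical field names and bounds, for illustration only.
SAFE_RANGES = {
    "maxRetries": (0, 10),
    "initialDelaySeconds": (0.0, 60.0),
}


def clamp_config(config):
    """Return a copy of config with out-of-range values clamped and logged."""
    normalized = dict(config)
    for key, (low, high) in SAFE_RANGES.items():
        if key not in normalized:
            continue
        value = normalized[key]
        clamped = max(low, min(high, value))
        if clamped != value:
            # Surface the adjustment instead of changing the value silently.
            logger.warning(
                "queryRetryConfig.%s=%r is outside [%r, %r]; using %r",
                key, value, low, high, clamped,
            )
            normalized[key] = clamped
    return normalized
```

Logging at warning level keeps a misconfigured pipeline running while still telling the operator that the effective value differs from the configured one.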
Code Review 🚫 Blocked 3 resolved / 4 findings

Implement configurable DB query retry mechanism with fixes for decorator generator compatibility, non-idempotent procedure handling, and silent configuration clamping. The feature remains inoperable because the generated Python models in databaseServiceMetadataPipeline are not updated.

🚨 Bug: Generated Python models not updated, feature is inoperable
📄 openmetadata-spec/src/main/resources/json/schema/metadataIngestion/databaseServiceMetadataPipeline.json:44-58
📄 openmetadata-spec/src/main/resources/json/schema/metadataIngestion/databaseServiceMetadataPipeline.json:222-226
📄 ingestion/src/metadata/utils/db_retry.py:77-91
The JSON schema in … Since … Per the project's schema-first development guidelines: changes must originate in …

✅ 3 resolved
✅ Bug: db_retry decorator is broken for generator functions
✅ Bug: @db_retry applied to non-idempotent stored procedure query
✅ Quality: _normalize_config silently clamps invalid values without logging

Describe your changes:
Fixes #23568
This contribution is part of the WeMakeDevs X OpenMetadata Hackathon.
I worked on implementing a centralized, configurable database query retry mechanism because transient DBA-enforced statement timeouts (`QueryCanceled` / SQLSTATE 57014) were causing the ingestion framework to swallow `OperationalError` exceptions, resulting in incomplete metadata capture in Postgres and Greenplum environments.

The Implementation:
- Added `queryRetryConfig` to `databaseServiceMetadataPipeline.json`, allowing operators to configure backoff parameters (default: `enabled=False` to prevent breaking existing deployments).
- Introduced `@db_retry` in `metadata.utils.db_retry`, featuring exponential backoff with jitter and a 3-layer transient error detection system (SQLSTATE, Exception Names, DBAPI connection state).
- Applied the decorator to `SELECT` queries across base classes (`CommonDbSourceService`, `SqlColumnHandlerMixin`) and dialect-specific overrides (`Postgres`, `Greenplum`).

How did you test your changes?
- Added unit tests in `test_db_retry.py` verifying SQLSTATE detection priorities, backoff math boundaries, config extraction, and decorator metadata preservation. All 50 tests are passing.
- Ran `make py_format` and achieved a `10.00/10` pylint score on all modified files.

Screenshots:

Here is the successful test run output proving the 50/50 green sweep across the retry utility tests:
Also ran make py_format and make lint on the code

Type of change:
Checklist:
I have read the CONTRIBUTING document.
My PR title is `Fixes #23568: Implement Configurable DB Query Retry Mechanism`
I have commented on my code, particularly in hard-to-understand areas.
For JSON Schema changes: I updated the migration scripts or explained why it is not needed. (Note: Schema additive only, no migration required).
I have added tests around the new logic.
For connector/ingestion changes: I updated the documentation.
The issue properly describes why the new feature is needed, what's the goal, and how we are building it. Any discussion or decision-making process is reflected in the issue.
I have updated the documentation.
I have added tests around the new logic.
Summary by Gitar
- Generator handling via `inspect.isgeneratorfunction` and mid-iteration materialization to ensure robustness during retries.
- Logging in `_normalize_config` to track when configuration values are clamped to safe ranges.
- `_sanitize_exc` to prevent potential SQL or parameter data leaks in logs.

This will update automatically on new commits.