-
-
Notifications
You must be signed in to change notification settings - Fork 36.1k
Add retry_after to UpdateFailed in update coordinator #153550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retry_after to UpdateFailed in update coordinator #153550
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a proof-of-concept backoff strategy mechanism for Home Assistant's DataUpdateCoordinator to handle rate limiting from APIs. The feature allows integrations to specify a delay period when raising UpdateFailed exceptions, enabling the coordinator to defer its next scheduled refresh appropriately.
Key changes:
- Added
retry_afterparameter toUpdateFailedexception for specifying backoff delays - Modified coordinator's scheduling logic to honor retry delays after failed updates
- Added comprehensive test coverage for the new backoff behavior
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| homeassistant/helpers/update_coordinator.py | Core implementation of retry_after mechanism in UpdateFailed exception and DataUpdateCoordinator scheduling logic |
| tests/helpers/test_update_coordinator.py | Comprehensive test cases covering retry_after behavior during setup and normal operation phases |
| homeassistant/components/portainer/coordinator.py | Minor debug log message change from detailed endpoint info to simple "Finished" |
|
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍 |
Co-authored-by: Martin Hjelmare <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Martin Hjelmare <[email protected]>
Co-authored-by: Martin Hjelmare <[email protected]>
|
@MartinHjelmare did I miss a review comment to be worked out? Or can we continue this PR? :) |
|
|
||
| except UpdateFailed as err: | ||
| self.last_exception = err | ||
| # We can only honor a retry_after, after the config entry has been set up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make this limitation clear in the dev blog, I think. We could consider allowing the retry_after parameter to influence the config entry retry interval, but I think that requires a new discussion.
MartinHjelmare
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
Please link a dev docs update and blog post. Please also fill in the PR template checklist. |
|
Thanks @MartinHjelmare! I'll work out a dev blog. :) |
|
Blog and docs PR provided. Marking this as review for review to bring it to attention. |
Breaking change
Proposed change
Proof of Concept for an architectural discussion. Please treat it as such and doesn't require a review at the moment.
Architecture discussion: home-assistant/architecture#1286
The
update_coordinator.pynow includes aretry_afterparameter in theUpdateFailedexception. The coordinator now defers its next scheduled refresh by that many seconds and then resumes the normal cadence.Considerations:
Currently it only accepts only an integer. Open for discussion if this should be a float/timedelta to allow more "jitternish" controlled from the Integration Owners.retry_after. This means, the integrations and the API clients are/need to be able to detect its rate-limited.Effectively meaning integration owners need to do sanitionzed of the Retry-After header (be it an Int or Datetime), or find out in any other way the desired backoff from the API server. The integration owner dictates the backoff period.
retry_afterisn't capped. Perhaps allow a maximum?retry_afteronly works after a successful config setup. Effectively meaning that if the UpdateFailed is called fromasync_config_entry_first_refreshit will still show the default behavior:ConfigEntryNotReadyis actually raised. From a functional view I think this is right, but also from a technical view, the_schedule_refresh()would not hit in this scenario.For this is (mis)use the
raise_on_entry_errorto determine if the coordinator is in a setup phase.Two tests are submitted:
UpdateFailedis triggered viaasync_config_entry_first_refresh.UpdateFailedis raised, without and with theretry_after. In the case of without, it should resolve to the else-statement and show normal reschedule behavior of 10 seconds.In case of the
retry_afterscenario it's tested the 60 seconds reschedule is done, then cleared and upon a newschedule_refreshto demonstrate it defaulted back to the 10 seconds of the mock default. Basically to test the reschedule behavior is being reset properly.Type of change
Additional information
Checklist
ruff format homeassistant tests)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest.requirements_all.txt.Updated by running
python3 -m script.gen_requirements_all.To help with the load of incoming pull requests: