@ljluestc

Title

feat(instanceset): allow updating crashed/unavailable pods while preserving rolling guarantees

Description

  • Problem: InstanceSet (ITS) updated only Ready/Available pods, so a crashed pod (e.g., OOMKilled) could not be recovered by increasing its resources.
  • Solution: Relax per-pod gating so NotReady/Unavailable pods can be updated. Preserve rollout safety by consuming maxUnavailable only when taking a currently-Available pod offline; already-unavailable pods don’t consume it.
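
The accounting described in the Solution bullet can be sketched roughly as follows. This is a minimal illustration, not the code in reconciler_update.go: the `pod` type, the `selectForUpdate` function, and the `budget` parameter (how many currently-Available pods may still be taken offline) are hypothetical names, and how the controller derives that budget from maxUnavailable is not shown here.

```go
package main

import "fmt"

// pod is a hypothetical, simplified view of an instance.
// It is not the controller's real type.
type pod struct {
	Name      string
	Available bool // currently serving traffic
	Outdated  bool // revision differs from the target revision
}

// selectForUpdate returns the names of pods that may be updated in this pass.
// budget is the number of currently-Available pods that may still be taken
// offline (assumed to be derived from maxUnavailable by the caller).
func selectForUpdate(pods []pod, budget int) []string {
	var selected []string
	for _, p := range pods {
		if !p.Outdated {
			continue // already at the target revision
		}
		if p.Available {
			// Taking an Available pod offline consumes the budget.
			if budget <= 0 {
				continue
			}
			budget--
		}
		// An already-unavailable pod reaches this point without
		// consuming any budget.
		selected = append(selected, p.Name)
	}
	return selected
}

func main() {
	pods := []pod{
		{Name: "a", Available: true, Outdated: true},
		{Name: "b", Available: false, Outdated: true}, // crashed, e.g. OOMKilled
		{Name: "c", Available: true, Outdated: true},
		{Name: "d", Available: true, Outdated: false},
	}
	fmt.Println(selectForUpdate(pods, 1)) // prints "[a b]"
}
```

In this sketch the loop keeps scanning after the budget is exhausted, so a crashed pod later in the list can still be picked up. Whether the real reconciler continues or stops at that point is an assumption of the example.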

Changes

  • pkg/controller/instanceset/reconciler_update.go
    • Don’t block updates on NotReady/Unavailable/role-not-ready; keep image-mismatch guard.
    • Track unavailability consumption per pod; only Available→update consumes maxUnavailable.
  • pkg/controller/instanceset/reconciler_update_test.go
    • Add test: updates a crashed/unavailable pod with outdated revision.
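
To illustrate the per-pod gating change listed above, here is a hedged before/after sketch. The `podState` type and both predicates are hypothetical and only mirror the wording of this description: previously NotReady, Unavailable, or role-not-ready pods were blocked, and now only the image-mismatch guard remains. What the image check actually compares is not described in this PR, so it is treated as an opaque boolean.

```go
package main

import "fmt"

// podState is a hypothetical summary of the conditions named in this PR.
type podState struct {
	Ready        bool
	Available    bool
	RoleReady    bool
	ImageMatched bool // result of the image-consistency check (opaque here)
}

// eligibleBefore models the old gating: every condition had to hold.
func eligibleBefore(s podState) bool {
	return s.ImageMatched && s.Ready && s.Available && s.RoleReady
}

// eligibleAfter models the relaxed gating: only the image guard is kept.
func eligibleAfter(s podState) bool {
	return s.ImageMatched
}

func main() {
	// A crashed pod: not ready, not available, role not ready,
	// but its image check passes.
	crashed := podState{ImageMatched: true}
	fmt.Println(eligibleBefore(crashed), eligibleAfter(crashed)) // prints "false true"
}
```

A pod that fails the image check stays blocked under both predicates, which matches the "keep image-mismatch guard" item.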

Risks and mitigations

  • More permissive updates when degraded; still bounded by:
    • Image consistency checks.
    • Rolling quotas (replicas, maxUnavailable).
    • Member update rules for roleful sets.

How to verify

# From repo root
go test ./pkg/controller/instanceset -count=1 -v

…erving rolling guarantees; refine maxUnavailable accounting; add test
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@apecloud-bot
Collaborator

Auto Cherry-pick Instructions

Usage:
  - /nopick: Do not auto cherry-pick when the PR is merged.
  - /pick: release-x.x [release-x.x]: Auto cherry-pick to the specified branch(es) when the PR is merged.

Example:
  - /nopick
  - /pick release-1.0

@github-actions github-actions bot added the size/M Denotes a PR that changes 30-99 lines. label Nov 11, 2025
@apecloud-bot apecloud-bot added the pre-approve Fork PR Pre Approve Test label Nov 11, 2025
