OCPBUGS-75200: Configure kubelet GracefulNodeShutdown (#5708)

saschagrunert wants to merge 1 commit into openshift:main
Conversation
@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is invalid.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.

@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid. 3 validation(s) were run on this bug.

@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid. 3 validation(s) were run on this bug. The bug has been updated to refer to the pull request using the external bug tracker.
saschagrunert force-pushed from ddaebae to 3e7e032
|
@saschagrunert: triggered 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/cfe1d6f0-1303-11f1-83e3-11d85e3b80d8-0

@saschagrunert: triggered 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1c16a8c0-1304-11f1-9ce3-b8000e827635-0

@saschagrunert: triggered 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/759a6550-1320-11f1-828e-df656025dabc-0

@saschagrunert: triggered 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5044fab0-132d-11f1-8357-d7060a5fcd59-0
@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
@saschagrunert: triggered 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/4d6b6c48-1611-11f1-906d-d05b4787918c-0

@saschagrunert: triggered 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/4eef332e-1611-11f1-904a-755af07f4b25-0

@saschagrunert: triggered 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command. See details on https://pr-payload-tests.ci.openshift.org/runs/ci/55c864cc-1611-11f1-88c4-0219bf41a9c9-0
@cheesesashimi @dkhater-redhat PTAL for approval

@ngopalak-redhat @eggfoobar PTAL for review

/lgtm

@haircommander @rphillips PTAL

/hold We hit issues with this in the past, so we may want to make sure those are resolved. I fear they're fundamental to the current state of GracefulNodeShutdown, and we may need to wait for eviction requests to move forward first.
I looked into alternatives to GNS for solving this. It looks like there is not much of an alternative.

GNS was previously turned off due to test failures around networking DaemonSet/static pods. The key question is whether those issues still exist in current kubelet/CRI-O versions. A few things limit the blast radius of this PR:

It would help to understand which specific networking DS/static pod failures were seen before, so we can confirm they're resolved. If those are reproducible with this PR, we'd need to look into them. As a possible follow-up we could also add
Have you checked why these two jobs are failing on GCP?
Enable kubelet's GracefulNodeShutdown by setting shutdownGracePeriod and shutdownGracePeriodCriticalPods in the kubelet configuration templates. Without these settings, kubelet exits immediately on SIGTERM during node reboots without terminating pods, causing kube-apiserver to be SIGKILLed when its graceful shutdown exceeds systemd's 90s timeout.

Values:
- Master/arbiter: 270s total, 240s for critical pods
- Worker: 90s total, 60s for critical pods
- SNO: disabled (MCO skips drain and uses short grace periods)

The 240s critical pod budget provides sufficient headroom above the longest kube-apiserver terminationGracePeriodSeconds (194s on AWS) without requiring platform-specific logic.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
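For illustration, the resulting KubeletConfiguration fields on a master node would look roughly like this (a sketch based on the values above, not the exact rendered MCO template output):

```yaml
# Sketch of the relevant KubeletConfiguration fields for a master/arbiter
# node, using the values from this PR (not the exact MCO-rendered file).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Total budget for graceful node shutdown; regular pods are terminated first.
shutdownGracePeriod: 270s
# Portion of the total budget reserved for system/node-critical pods.
shutdownGracePeriodCriticalPods: 240s
```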
saschagrunert force-pushed from 3e7e032 to 21c5da8
CodeRabbit: No actionable comments were generated in the recent review. Review profile: CHILL.

Walkthrough: This PR adds shutdown grace period configuration to kubelet across master, worker, and arbiter node roles. The settings (shutdownGracePeriod: 270s and shutdownGracePeriodCriticalPods: 240s) are conditionally applied when control plane topology is not SingleReplica. A new test validates these configurations across multiple platform and node role scenarios.
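The topology gating described above could look roughly like this in a Go-templated kubelet config fragment. This is a hypothetical sketch: the field name `.ControlPlaneTopology` and the surrounding template structure are illustrative, not the actual MCO template source.

```
{{ if ne .ControlPlaneTopology "SingleReplica" -}}
shutdownGracePeriod: 270s
shutdownGracePeriodCriticalPods: 240s
{{- end }}
```

On single-node (SNO) clusters the condition is false, so both fields are omitted and GracefulNodeShutdown stays disabled there.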
Estimated code review effort: 2 (Simple), ~15 minutes. All 5 pre-merge checks passed.
@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid. 3 validation(s) were run on this bug.

/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull request has been approved by: ngopalak-redhat, saschagrunert. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
@damdo: This PR was included in a payload test run from openshift/cluster-machine-approver#295

/retest
@saschagrunert: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
- What I did
OpenShift never configures kubelet's shutdownGracePeriod in KubeletConfiguration, making GracefulNodeShutdown a no-op. During MCO-triggered reboots:

- the node is rebooted with systemctl reboot via logind
- kubelet exits immediately on SIGTERM without terminating pods
- systemd's DefaultTimeoutStopSec is 90s, so kube-apiserver is SIGKILLed when its graceful shutdown exceeds that timeout

This is a latent bug exposed by new MCO changes introducing additional master reboots:
- --pod-infra-container-image (all node reboots)
- autoSizingReserved (worker reboots)

Configures shutdownGracePeriod and shutdownGracePeriodCriticalPods in the kubelet templates:

- Master/arbiter: 270s total, 240s for critical pods (covers the longest kube-apiserver terminationGracePeriodSeconds of 194s on AWS with headroom)
- Worker: 90s total, 60s for critical pods
- SNO: disabled
InhibitDelayMaxSecand acquires a delay inhibitor lock so thatsystemctl rebootwaits for graceful pod termination.- How to verify it
- go test ./pkg/controller/kubelet-config/... -run TestShutdownGracePeriod -v
- go test ./pkg/controller/template/... -v
- Check that shutdownGracePeriod is set in the rendered kubelet configuration
- Run the [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended test

- Payload test results
- e2e-aws-ovn: "kubelet terminates kube-apiserver gracefully" + extended both passed
- e2e-gcp-ovn: "kubelet terminates kube-apiserver gracefully" + extended both passed
- e2e-aws-ovn-ocl: oc wait --for=create not supported

- Description for the changelog
Configure kubelet GracefulNodeShutdown with generous grace periods to prevent kube-apiserver from being SIGKILLed during node reboots.