OCPBUGS-62619: Add etcd size limit validation for rendered MachineConfigs#5729

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from dkhater-redhat:fix-etcd-size-limit-validation
Mar 26, 2026

@dkhater-redhat
Contributor

Fixes a bug where MachineConfigPools get stuck in a degraded state with "etcdserver: request is too large" errors when rendered MachineConfigs exceed etcd's 1.5 MiB (1572864-byte) request size limit.

Changes:

  • Add MaxMachineConfigSize constant (1572864 bytes) in constants.go
  • Add ValidateMachineConfigSize() function in helpers.go that:
    • Validates rendered MC size before sending to etcd
    • Returns clear error message with remediation guidance
    • Logs warning when size exceeds 80% of limit
    • Provides debug logging of MC size usage
  • Call validation in render controller before MC create/update

This prevents the operator from attempting to write oversized MCs to etcd, provides early detection with helpful error messages, and avoids wasting retry attempts. The error message specifically mentions large registry mirror configurations (ImageDigestMirrorSet/ICSP) as the primary cause and suggests reducing their size.
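
The check described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the function names `validateSize` and `checkSize`, the use of `json.Marshal` to measure size, and the strict `>` comparisons are all assumptions. Only the 1572864-byte limit, the 80% warning threshold, and the message wording come from this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// MaxMachineConfigSize is the 1572864-byte (1.5 MiB) limit named in the PR description.
const MaxMachineConfigSize = 1572864

// warnPercent is the usage level above which a warning is logged (80% per the PR description).
const warnPercent = 80.0

// validateSize is a hypothetical stand-in for ValidateMachineConfigSize().
// Assumption: size is measured on the JSON-serialized object.
func validateSize(name string, obj interface{}) error {
	data, err := json.Marshal(obj)
	if err != nil {
		return fmt.Errorf("failed to marshal %s: %w", name, err)
	}
	return checkSize(name, len(data))
}

// checkSize rejects sizes above the limit and logs a warning above warnPercent.
func checkSize(name string, size int) error {
	if size > MaxMachineConfigSize {
		return fmt.Errorf("rendered MachineConfig %s is too large (%d bytes, max %d bytes)",
			name, size, MaxMachineConfigSize)
	}
	pct := float64(size) / float64(MaxMachineConfigSize) * 100
	if pct > warnPercent {
		log.Printf("MachineConfig %s is approaching size limit: %d bytes (%.2f%% of %d byte limit)",
			name, size, pct, MaxMachineConfigSize)
	}
	return nil
}

func main() {
	// Sizes taken from the verification logs later in this thread.
	fmt.Println(checkSize("rendered-master-example", 1425381)) // under the limit, logs a warning
	fmt.Println(checkSize("rendered-master-example", 1612325)) // over the limit, returns an error
}
```

Running the check before the create/update call means the oversized object never reaches the API server, so the failure surfaces as a descriptive render error rather than an etcd rejection.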

- What I did

- How to verify it

- Description for the changelog

@openshift-ci bot added the "approved" label (indicates a PR has been approved by an approver from all required OWNERS files) on Mar 3, 2026
@dkhater-redhat dkhater-redhat force-pushed the fix-etcd-size-limit-validation branch from cb489af to 3b4256d Compare March 3, 2026 08:28
@dkhater-redhat dkhater-redhat force-pushed the fix-etcd-size-limit-validation branch from 3b4256d to fdf1f44 Compare March 3, 2026 08:30
@dkhater-redhat dkhater-redhat changed the title Add etcd size limit validation for rendered MachineConfigs OCPBUGS-62619: Add etcd size limit validation for rendered MachineConfigs Mar 3, 2026
@openshift-ci-robot added the "jira/valid-reference" label (indicates that this PR references a valid Jira ticket of any type) and the "jira/invalid-bug" label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Mar 3, 2026
@openshift-ci-robot
Contributor

@dkhater-redhat: This pull request references Jira Issue OCPBUGS-62619, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@dkhater-redhat
Contributor Author

/jira refresh

@openshift-ci-robot added the "jira/valid-bug" label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) and removed the "jira/invalid-bug" label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Mar 3, 2026
@openshift-ci-robot
Contributor

@dkhater-redhat: This pull request references Jira Issue OCPBUGS-62619, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

@openshift-ci openshift-ci bot requested a review from sergiordlr March 3, 2026 08:33
@dkhater-redhat
Contributor Author

/retest-required

Member

@isabella-janssen isabella-janssen left a comment

/lgtm

Looks good, and I especially like the warning at 80% capacity!

@openshift-ci bot added the "lgtm" label (indicates that a PR is ready to be merged) on Mar 3, 2026
@dkhater-redhat
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Mar 4, 2026

@dkhater-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                      Commit    Details   Required   Rerun command
ci/prow/e2e-gcp-op-ocl-part1   fdf1f44   link      false      /test e2e-gcp-op-ocl-part1
ci/prow/e2e-gcp-op-ocl         fdf1f44   link      false      /test e2e-gcp-op-ocl
ci/prow/e2e-gcp-op-ocl-part2   fdf1f44   link      false      /test e2e-gcp-op-ocl-part2

@dkhater-redhat
Contributor Author

/test e2e-gcp-op-part1

@HarshwardhanPatil07 HarshwardhanPatil07 left a comment

Pre Merge Verification

Environment

  • OpenShift 4.21.0-0.nightly-2026-03-22-203205 upgraded to 4.22.0-0-2026-03-24-131506
  • Platform: AWS

Step 1: Verify Cluster Health (4.21 nightly)

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ export KUBECONFIG=/home/harshpat/Downloads/kubeconfig

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-0.nightly-2026-03-22-203205   True        False         174m    Cluster version is 4.21.0-0.nightly-2026-03-22-203205

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-22c218032c7a3d69a4122396bdc1e60d   True      False      False      3              3                   3                     0                      3h19m
worker   rendered-worker-b33cbf0e1944a9cd424b8862c8d0840e   True      False      False      3              3                   3                     0                      3h19m

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get nodes
NAME                                        STATUS   ROLES                  AGE     VERSION
ip-10-0-1-118.us-east-2.compute.internal    Ready    worker                 3h14m   v1.34.5
ip-10-0-18-24.us-east-2.compute.internal    Ready    control-plane,master   3h22m   v1.34.5
ip-10-0-32-37.us-east-2.compute.internal    Ready    control-plane,master   3h21m   v1.34.5
ip-10-0-50-33.us-east-2.compute.internal    Ready    worker                 3h14m   v1.34.5
ip-10-0-82-62.us-east-2.compute.internal    Ready    control-plane,master   3h22m   v1.34.5
ip-10-0-87-200.us-east-2.compute.internal   Ready    worker                 3h14m   v1.34.5

Step 2: Check Baseline Rendered MachineConfig Size

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc rendered-master-22c218032c7a3d69a4122396bdc1e60d -o json | wc -c
198490

Baseline rendered MC is ~198KB, i.e. under the 1.5MB etcd limit.

Step 3: Generate and Apply Large ImageDigestMirrorSet (IDMS) Objects

Created a script to generate 20 IDMS objects, each with 100 mirror entries (3 mirrors per source), to inflate the registries.conf and push the rendered MC close to 1.5MB.

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ bash generate-idms.sh
=== Current rendered MC size (baseline) ===
198490

=== Applying IDMS objects ===
imagedigestmirrorset.config.openshift.io/large-idms-001 created
Applied large-idms-001
imagedigestmirrorset.config.openshift.io/large-idms-002 created
Applied large-idms-002
imagedigestmirrorset.config.openshift.io/large-idms-003 created
Applied large-idms-003
imagedigestmirrorset.config.openshift.io/large-idms-004 created
Applied large-idms-004
imagedigestmirrorset.config.openshift.io/large-idms-005 created
Applied large-idms-005
imagedigestmirrorset.config.openshift.io/large-idms-006 created
Applied large-idms-006
imagedigestmirrorset.config.openshift.io/large-idms-007 created
Applied large-idms-007
imagedigestmirrorset.config.openshift.io/large-idms-008 created
Applied large-idms-008
imagedigestmirrorset.config.openshift.io/large-idms-009 created
Applied large-idms-009
imagedigestmirrorset.config.openshift.io/large-idms-010 created
Applied large-idms-010
imagedigestmirrorset.config.openshift.io/large-idms-011 created
Applied large-idms-011
imagedigestmirrorset.config.openshift.io/large-idms-012 created
Applied large-idms-012
imagedigestmirrorset.config.openshift.io/large-idms-013 created
Applied large-idms-013
imagedigestmirrorset.config.openshift.io/large-idms-014 created
Applied large-idms-014
imagedigestmirrorset.config.openshift.io/large-idms-015 created
Applied large-idms-015
imagedigestmirrorset.config.openshift.io/large-idms-016 created
Applied large-idms-016
imagedigestmirrorset.config.openshift.io/large-idms-017 created
Applied large-idms-017
imagedigestmirrorset.config.openshift.io/large-idms-018 created
Applied large-idms-018
imagedigestmirrorset.config.openshift.io/large-idms-019 created
Applied large-idms-019
imagedigestmirrorset.config.openshift.io/large-idms-020 created
Applied large-idms-020

=== All IDMS objects applied. Waiting 90s for MC regeneration ===

=== Checking IDMS count ===
21

=== Checking 99-master-generated-registries MC size ===
1243163

=== Checking registries.conf content size ===
927657

=== Checking rendered MC size ===
1435305

=== MCP status ===
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-dab8a91ac8cc77665a27a9f4b0dfc6b9   True      False      False      3              3                   3                     0                      3h23m
worker   rendered-worker-65a35e5f9b1d9df882a9fd9968d0b2d1   True      False      False      3              3                   3                     0                      3h23m

After 20 IDMS objects: rendered MC is ~1.4MB, MCP still healthy (just under the limit).

Step 4: Upgrade Cluster to 4.22 Using ClusterBot Image

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc adm upgrade --to-image=registry.build07.ci.openshift.org/ci-ln-qrzrj9t/release:latest --allow-explicit-upgrade --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures. Only use this if you are testing unsigned release images or you are working around a known bug in the cluster-version operator and you have verified the authenticity of the provided image yourself.
Requested update to release image registry.build07.ci.openshift.org/ci-ln-qrzrj9t/release:latest

Monitored upgrade progress:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-0.nightly-2026-03-22-203205   True        True          78s     Working towards 4.22.0-0-2026-03-24-131506-test-ci-ln-qrzrj9t-latest: 123 of 984 done (12% complete), waiting on etcd, kube-apiserver

Waited for upgrade to complete:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get clusterversion
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.22.0-0-2026-03-24-131506-test-ci-ln-qrzrj9t-latest   True        False         38m     Cluster version is 4.22.0-0-2026-03-24-131506-test-ci-ln-qrzrj9t-latest

Step 5: Check Rendered MC Size After Upgrade

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc rendered-master-7c7e3261d5aa100e599e67d4e969cc7e -o json | wc -c
1441689

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc 99-master-generated-registries -o json | wc -c
1250336

Rendered MC at ~1.44MB after upgrade — still under 1.5MB limit by ~131KB. Added 3 more IDMS objects to push it over.
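
The headroom figures quoted in these steps can be reproduced with a few lines of arithmetic. The sketch below uses the byte counts reported in this comment; the helper name `headroom` is hypothetical. Note that `oc get mc -o json | wc -c` only approximates what the controller measures: for the same rendered-master-7c7e3261d5aa100e599e67d4e969cc7e object, `wc -c` shows 1441689 bytes while the controller's warning log reports 1425381 bytes.

```go
package main

import "fmt"

// maxMachineConfigSize is the limit named in the PR description.
const maxMachineConfigSize = 1572864

// headroom returns the bytes remaining under the limit (negative when over)
// and the size as a percentage of the limit.
func headroom(size int) (remaining int, pct float64) {
	remaining = maxMachineConfigSize - size
	pct = float64(size) / float64(maxMachineConfigSize) * 100
	return remaining, pct
}

func main() {
	// Baseline, after 20 IDMS objects, after the upgrade, and the rejected render.
	for _, size := range []int{198490, 1435305, 1441689, 1612325} {
		rem, pct := headroom(size)
		fmt.Printf("%8d bytes: %6.2f%% of limit, %d bytes remaining\n", size, pct, rem)
	}
}
```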

Step 6: Add Additional IDMS Objects to Exceed 1.5MB Limit

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ for i in $(seq 21 23); do
>   cat <<EOF | oc apply -f -
> apiVersion: config.openshift.io/v1
> kind: ImageDigestMirrorSet
> metadata:
>   name: large-idms-$(printf "%03d" $i)
> spec:
>   imageDigestMirrors:
> $(for j in $(seq 1 100); do
> cat <<ENTRY
>   - mirrors:
>     - mirror-registry-${i}-${j}-a.example.com/org${i}/repo${j}
>     - mirror-registry-${i}-${j}-b.example.com/org${i}/repo${j}
>     - mirror-registry-${i}-${j}-c.example.com/org${i}/repo${j}
>     source: source-registry-${i}-${j}.example.com/org${i}/repo${j}
> ENTRY
> done)
> EOF
>   echo "Applied large-idms-$(printf "%03d" $i)"
> done
imagedigestmirrorset.config.openshift.io/large-idms-021 created
Applied large-idms-021
imagedigestmirrorset.config.openshift.io/large-idms-022 created
Applied large-idms-022
imagedigestmirrorset.config.openshift.io/large-idms-023 created
Applied large-idms-023

Step 7: Verify Bug Reproduction — MCP Degraded

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-7e9dd6f2a77167ec04b7a765f755fa88   True      False      True       3              3                   3                     0                      7h32m
worker   rendered-worker-a758bee8d64e5a7312082d23d9a3d2ac   True      False      True       3              3                   3                     0                      7h32m

Both master and worker MCPs are DEGRADED=True.

MCP Master Conditions:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mcp master -o jsonpath='{range .status.conditions[*]}{.type}: {.status} - {.message}{"\n"}{end}'
PinnedImageSetsDegraded: False -
NodeDegraded: False -
RenderDegraded: True - Failed to render configuration for pool master: size validation failed: rendered MachineConfig rendered-master-2577e9f84fb4cf97fc235ba314c538b3 is too large (1612325 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
Degraded: True -
Updated: True - All nodes are updated with MachineConfig rendered-master-7e9dd6f2a77167ec04b7a765f755fa88
Updating: False -

MCP Worker Conditions:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mcp worker -o jsonpath='{range .status.conditions[*]}{.type}: {.status} - {.message}{"\n"}{end}'
PinnedImageSetsDegraded: False -
NodeDegraded: False -
RenderDegraded: True - Failed to render configuration for pool worker: size validation failed: rendered MachineConfig rendered-worker-4f5cc9d23653637a819db8ad423f6b03 is too large (1610456 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
Degraded: True -
Updated: True - All nodes are updated with MachineConfig rendered-worker-a758bee8d64e5a7312082d23d9a3d2ac
Updating: False -

Machine-Config-Controller Logs:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc logs -n openshift-machine-config-operator deployment/machine-config-controller --tail=20 | grep -i "too large\|Error syncing\|Dropping"
E0324 15:16:13.383482       1 render_controller.go:545] Error syncing Generated MCFG: size validation failed: rendered MachineConfig rendered-worker-4f5cc9d23653637a819db8ad423f6b03 is too large (1610456 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
I0324 15:16:13.392159       1 render_controller.go:478] Error syncing machineconfigpool worker: size validation failed: rendered MachineConfig rendered-worker-4f5cc9d23653637a819db8ad423f6b03 is too large (1610456 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
E0324 15:16:28.440763       1 render_controller.go:545] Error syncing Generated MCFG: size validation failed: rendered MachineConfig rendered-master-2577e9f84fb4cf97fc235ba314c538b3 is too large (1612325 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
I0324 15:16:28.450421       1 render_controller.go:478] Error syncing machineconfigpool master: size validation failed: rendered MachineConfig rendered-master-2577e9f84fb4cf97fc235ba314c538b3 is too large (1612325 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)
I0324 15:16:30.903916       1 render_controller.go:484] Dropping machineconfigpool "worker" out of the queue: size validation failed: rendered MachineConfig rendered-worker-4f5cc9d23653637a819db8ad423f6b03 is too large (1610456 bytes, max 1572864 bytes). This will exceed etcd's size limit. Consider reducing the number or size of MachineConfigs, particularly large registry mirror configurations (ImageDigestMirrorSet/ImageContentSourcePolicy)

Rendered MC and Registries Size:

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc $(oc get mcp master -o jsonpath='{.spec.configuration.name}') -o json | wc -c
1566321

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc 99-master-generated-registries -o json | wc -c
1437280

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get mc 99-master-generated-registries -o json | jq -r '.spec.config.storage.files[0].contents.source' | cut -d',' -f2 | base64 -d | wc -c
1067865
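
The jq/cut/base64 pipeline above strips the data URL prefix and counts the decoded bytes of registries.conf. A minimal Go sketch of the same measurement follows, assuming the file source is a base64 data URL of the form `data:<media type>;base64,<payload>`; the helper names `decodedLen` and `encodedLen` are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodedLen returns the byte length of the payload carried by a base64 data URL.
// Everything up to and including the first comma is treated as the prefix.
func decodedLen(dataURL string) (int, error) {
	idx := strings.Index(dataURL, ",")
	if idx < 0 {
		return 0, fmt.Errorf("no comma found in data URL")
	}
	raw, err := base64.StdEncoding.DecodeString(dataURL[idx+1:])
	if err != nil {
		return 0, fmt.Errorf("decoding payload: %w", err)
	}
	return len(raw), nil
}

// encodedLen reports how many base64 characters n raw bytes occupy (with padding).
func encodedLen(n int) int {
	return base64.StdEncoding.EncodedLen(n)
}

func main() {
	n, err := decodedLen("data:text/plain;charset=utf-8;base64,aGVsbG8=")
	if err != nil {
		panic(err)
	}
	fmt.Println("decoded bytes:", n)
	fmt.Println("base64 chars for 1067865 raw bytes:", encodedLen(1067865))
}
```

Base64 expands data by a factor of 4/3, so the 1067865-byte registries.conf occupies 1423820 characters once encoded, which accounts for most of the 1437280-byte 99-master-generated-registries object.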

harshpat@harshpat-thinkpadp1gen4i:~/Downloads$ oc get imagedigestmirrorset
NAME             AGE
large-idms-001   4h
large-idms-002   4h
large-idms-003   4h
large-idms-004   4h
large-idms-005   4h
large-idms-006   4h
large-idms-007   4h
large-idms-008   4h
large-idms-009   4h
large-idms-010   4h
large-idms-011   4h
large-idms-012   4h
large-idms-013   4h
large-idms-014   4h
large-idms-015   4h
large-idms-016   4h
large-idms-017   4h
large-idms-018   4h
large-idms-019   4h
large-idms-020   4h
large-idms-021   23m
large-idms-022   23m
large-idms-023   23m

@openshift-ci
Contributor

openshift-ci bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dkhater-redhat, HarshwardhanPatil07, isabella-janssen

Details Needs approval from an approver in each of these files:
  • OWNERS [dkhater-redhat,isabella-janssen]

@HarshwardhanPatil07

The MCC logs also confirmed the size warning:

W0324 14:35:51.329496       1 helpers.go:501] MachineConfig rendered-master-7c7e3261d5aa100e599e67d4e969cc7e is approaching size limit: 1425381 bytes (90.62% of 1572864 byte limit). Consider reducing MachineConfig size to avoid hitting the limit.

@isabella-janssen
Member

/verified by @HarshwardhanPatil07

See #5729 (review)

@openshift-ci-robot added the "verified" label (signifies that the PR passed pre-merge verification criteria) on Mar 25, 2026
@openshift-ci-robot
Contributor

@isabella-janssen: This PR has been marked as verified by @HarshwardhanPatil07.

@dkhater-redhat
Contributor Author

/restest-required

@dkhater-redhat
Contributor Author

/retest-required

@openshift-merge-bot openshift-merge-bot bot merged commit e814e20 into openshift:main Mar 26, 2026
15 of 18 checks passed
@openshift-ci-robot
Contributor

@dkhater-redhat: Jira Issue Verification Checks: Jira Issue OCPBUGS-62619
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-62619 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
