MCO-1972: Removes OSImageURLConfig from the build controller #5424

Open
cheesesashimi wants to merge 6 commits into openshift:main from cheesesashimi:zzlotnik/osimageurl-redux

Conversation

@cheesesashimi
Member

@cheesesashimi cheesesashimi commented Nov 18, 2025

- What I did

This decouples the Build Controller from OSImageURLConfig and makes the OSImageURL and BaseOSExtensionsImage fields on the rendered MachineConfig the source of truth for the base OS and extensions images to use for Image-Mode OpenShift. The idea is that if a different OS image is selected on a per-pool basis (e.g., one is RHEL9 and one is RHEL10 for dual-streams), then the Build Controller should use the appropriate source of truth for the appropriate pool.

However, if one sets both the OSImageStream name on the MachineConfigPool and OSImageURL on a MachineConfig, the MCO should degrade, because it would otherwise override the value provided by the cluster admin. This PR also includes an E2E test which verifies that this is the case. This new E2E test will not run automatically until openshift/release#75329 is merged.
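
To illustrate the per-pool source-of-truth idea described above, here is a minimal sketch. It is not the Build Controller code from this PR: `MachineConfigSpec`, `BaseImages`, and `baseImagesForPool` are hypothetical stand-ins, and the image references are made up. It only shows each pool resolving its base images from its own rendered MachineConfig rather than from one cluster-wide value.

```go
package main

import (
	"errors"
	"fmt"
)

// MachineConfigSpec is a hypothetical, trimmed-down stand-in for the spec of a
// rendered MachineConfig; only the two image fields are modeled.
type MachineConfigSpec struct {
	OSImageURL                     string
	BaseOSExtensionsContainerImage string
}

// BaseImages holds the images a build for one pool would start from.
type BaseImages struct {
	BaseOS     string
	Extensions string
}

// baseImagesForPool reads the base images from the pool's own rendered
// MachineConfig instead of from a cluster-wide ConfigMap.
func baseImagesForPool(rendered map[string]MachineConfigSpec, pool string) (BaseImages, error) {
	spec, ok := rendered[pool]
	if !ok {
		return BaseImages{}, fmt.Errorf("no rendered MachineConfig for pool %q", pool)
	}
	if spec.OSImageURL == "" {
		return BaseImages{}, errors.New("rendered MachineConfig has an empty OSImageURL")
	}
	return BaseImages{
		BaseOS:     spec.OSImageURL,
		Extensions: spec.BaseOSExtensionsContainerImage,
	}, nil
}

func main() {
	// Made-up image references: one pool on a RHEL9 stream, one on RHEL10.
	rendered := map[string]MachineConfigSpec{
		"worker": {
			OSImageURL:                     "registry.example/rhel9@sha256:aaa",
			BaseOSExtensionsContainerImage: "registry.example/rhel9-ext@sha256:bbb",
		},
		"infra": {
			OSImageURL:                     "registry.example/rhel10@sha256:ccc",
			BaseOSExtensionsContainerImage: "registry.example/rhel10-ext@sha256:ddd",
		},
	}

	for _, pool := range []string{"worker", "infra"} {
		images, err := baseImagesForPool(rendered, pool)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: base=%s extensions=%s\n", pool, images.BaseOS, images.Extensions)
	}
}
```

With this shape, two pools on different OS streams never share a base image by accident, which is the dual-stream case the description mentions.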

- How to verify it

The best way to verify this is to create a cluster and then create a MachineConfig which overrides the OSImageURL value. The Build Controller should build a new OS image based upon the new OSImageURL value.

- Description for the changelog
MachineConfigs should be the source of truth for the Build Controller

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 18, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 18, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@cheesesashimi
Member Author

/test unit verify e2e-gcp-op-ocl

BaseOSContainerImage: m.MachineConfig.Spec.OSImageURL,
BaseOSExtensionsContainerImage: m.MachineConfig.Spec.BaseOSExtensionsContainerImage,
// This value is purposely left empty because the ConfigMap does not actually
// populate this value. However, we want the hashing to be stable.
Member Author

@cheesesashimi cheesesashimi Nov 18, 2025

Note to reviewer: This might be a moot point: if someone upgrades from one OCP release to another, the hashes will change anyway. However, that means old images may get rebuilt in the process, which is undesirable.
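
A rough sketch of the hash-stability concern, under stated assumptions: `imageInputs`, its `UnpopulatedField`, and `hashInputs` are hypothetical, and JSON plus SHA-256 is only one way to derive a digest, not necessarily what the build controller does. The point is that a field which stays in the hashed struct, and stays empty, contributes the same bytes every time, so the digest does not move.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// imageInputs is a hypothetical stand-in for the values that feed the hash.
// UnpopulatedField models a value that is purposely left empty.
type imageInputs struct {
	BaseOSContainerImage           string
	BaseOSExtensionsContainerImage string
	UnpopulatedField               string
}

// hashInputs serializes the struct and returns a hex-encoded SHA-256 digest.
// encoding/json emits struct fields in declaration order, so equal inputs
// always produce equal bytes.
func hashInputs(in imageInputs) (string, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	in := imageInputs{
		BaseOSContainerImage:           "registry.example/os@sha256:aaa",
		BaseOSExtensionsContainerImage: "registry.example/ext@sha256:bbb",
	}

	first, _ := hashInputs(in)
	second, _ := hashInputs(in)
	fmt.Println("stable across calls:", first == second)

	// Populating the previously empty field changes the serialized bytes,
	// and therefore the hash.
	in.UnpopulatedField = "now-set"
	changed, _ := hashInputs(in)
	fmt.Println("changes when populated:", first != changed)
}
```

If a hash like this is used to decide whether an existing image can be reused, any change to the hashed bytes would look like a new build input, which is the rebuild risk the note raises.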

@cheesesashimi cheesesashimi changed the title Removes OSImageURLConfig from the build controller MCO-1972: Removes OSImageURLConfig from the build controller Nov 18, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 18, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 18, 2025

@cheesesashimi: This pull request references MCO-1972 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2025
@cheesesashimi
Member Author

/test e2e-gcp-op-ocl

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from 49a39f6 to 3ffaec0 Compare February 5, 2026 15:25
@cheesesashimi
Member Author

/test unit verify e2e-gcp-op-ocl

@cheesesashimi
Member Author

/test e2e-gcp-op-ocl

@cheesesashimi cheesesashimi marked this pull request as ready for review February 6, 2026 14:34
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2026
@cheesesashimi
Member Author

/test unit

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from adb5a66 to b66544e Compare February 10, 2026 15:41
@pablintino
Contributor

/retest-required
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 11, 2026
@ptalgulk01

Pre-merge verified:

Environment Setup:
OCP Version: 4.22.0-0-2026-02-18-060304-test-ci-ln-ti6cfjk-latest
Platform: AWS

Pre-requisites

  • Create a Containerfile, build it, and push the image to the quay.io/mcoqe/layering repo
  • Get the sha256 digest for the image

Steps

  • Apply the MC with OSImageURL
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: os-layer-custom
spec:
  osImageURL: "quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559"
EOF
machineconfig.machineconfiguration.openshift.io/os-layer-custom created
  • Wait for the MCP update to complete
  • Check the image is applied on node
$ oc debug node/ip-10-0-21-13.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-21-13us-east-2computeinternal-debug-6mczg ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559
                   Digest: sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559
                  Version: 9.8.20260214-0 (2026-02-18T13:06:30Z)

Removing debug pod ...

$ oc get mc rendered-worker-13ea0dd3cdbf18ac59647ba5de9d4e8a  -o jsonpath='{.spec.osImageURL}'
quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559
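
The manual comparison in the last two steps (booted image on the node versus `osImageURL` on the rendered MachineConfig) could be automated roughly as follows. This is only a sketch: `bootedImage` and `nodeMatchesRendered` are hypothetical helpers, and the parsing assumes the `*` marker and the `ostree-unverified-registry:` prefix exactly as they appear in the output above; other output shapes are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// transportPrefix is the prefix seen in the status output above; other
// transports are not handled by this sketch.
const transportPrefix = "ostree-unverified-registry:"

// bootedImage returns the image reference from the first deployment line
// marked with "*" in rpm-ostree status output, without the transport prefix.
func bootedImage(status string) (string, bool) {
	for _, line := range strings.Split(status, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "* ") {
			continue
		}
		ref := strings.TrimSpace(strings.TrimPrefix(line, "* "))
		return strings.TrimPrefix(ref, transportPrefix), true
	}
	return "", false
}

// nodeMatchesRendered reports whether the node booted the image named by the
// rendered MachineConfig's osImageURL.
func nodeMatchesRendered(status, osImageURL string) bool {
	ref, ok := bootedImage(status)
	return ok && ref == osImageURL
}

func main() {
	const image = "quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559"
	status := "State: idle\nDeployments:\n* ostree-unverified-registry:" + image + "\n"

	fmt.Println("node matches rendered config:", nodeMatchesRendered(status, image))
}
```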

@cheesesashimi
Member Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2026
@cheesesashimi
Member Author

/test verify

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from b66544e to 352592e Compare February 27, 2026 17:01
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 27, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Feb 27, 2026

@cheesesashimi: This pull request references MCO-1972 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.


@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview

@openshift-ci
Contributor

openshift-ci bot commented Mar 13, 2026

@pablintino: Overrode contexts on behalf of pablintino: ci/prow/e2e-gcp-op-part2, ci/prow/e2e-hypershift

Details

In response to this:

/lgtm
/override ci/prow/e2e-gcp-op-part2 ci/prow/e2e-hypershift
Overriding known, unrelated, issues.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 5f0d9d7 and 2 for PR HEAD 6303082 in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 3474828 and 1 for PR HEAD 6303082 in total

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 14, 2026
@openshift-merge-robot
Contributor

PR needs rebase.


@coderabbitai

coderabbitai bot commented Mar 14, 2026

Walkthrough

The changes refactor build request and MachineOSBuild construction to source OS image configuration from MachineConfig objects rather than OSImageURLConfig ConfigMaps. This involves removing the OSImageURLConfig field from structs, eliminating API-based constructors that depend on Kubernetes clients, and updating reconcilers and tests to fetch MachineConfig directly. A new validation prevents simultaneous OSImageURL override and OSImageStream configuration.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Build Request Core Refactoring | pkg/controller/build/buildrequest/buildrequest.go, buildrequest_test.go | Updated template inputs and environment variables to source BaseOSImage and ExtensionsImage from MachineConfig.Spec instead of OSImageURLConfig; refactored tests to use fixture constants directly. |
| BuildRequestOpts Struct Updates | pkg/controller/build/buildrequest/buildrequestopts.go, buildrequestopts_test.go | Removed OSImageURLConfig field and associated initialization logic; removed validation of opts.OSImageURLConfig in constructor. |
| MachineOSBuild Constructor Changes | pkg/controller/build/buildrequest/machineosbuild.go, machineosbuild_test.go | Removed three API-based constructors (NewMachineOSBuildOpts, NewMachineOSBuildFromAPIOrDie, NewMachineOSBuildFromAPI); replaced OSImageURLConfig field with MachineConfig in MachineOSBuildOpts; added validation of rendered MachineConfig metadata and annotations. |
| Test Fixtures Updates | pkg/controller/build/fixtures/objects.go | Added RenderedMachineConfig field to ObjectsForTest; updated newMachineConfigsFromPool to return both regular and rendered MachineConfigs; removed getOSImageURLConfigMap() and OSImageURLConfig() helpers. |
| OSBuild Controller Test Updates | pkg/controller/build/osbuildcontroller_test.go | Updated test helpers to return *fixtures.ObjectsForTest and supply lobj.RenderedMachineConfig to MachineOSBuildOpts; modified rendered machine config insertion functions to return *mcfgv1.MachineConfig instead of void; removed Ignition file construction logic. |
| Reconciler Updates | pkg/controller/build/reconciler.go | Updated build request construction to fetch MachineConfig from lister instead of OSImageURLConfig via ConfigMap; added explicit MachineConfig lookups with error handling in multiple reconciliation paths. |
| Common Images Module | pkg/controller/common/images.go | Removed exported GetOSImageURLConfig() function; retained ParseOSImageURLConfigMap() for parsing logic only. |
| Render Controller Validation | pkg/controller/render/render_controller.go, render_controller_test.go | Added validation to prevent simultaneous OSImageURL override and OSImageStream configuration; returns error if both conditions are detected; added corresponding test case. |
| E2E Test Updates | test/e2e-ocl-1of2/onclusterlayering_test.go, test/e2e-ocl-2of2/onclusterlayering_test.go, test/e2e-ocl/onclusterlayering_test.go | Updated MachineOSBuild construction to use new options-based constructor with fetched MachineConfig; removed assertBuildJobIsAsExpected validation helper and its invocations. |
| New E2E Test | test/e2e-techpreview/osimagestreamrender_test.go | Added comprehensive test suite for OSImageStream/OSImageURL interaction, including degradation/recovery validation and helper functions for stream retrieval, MachineConfig creation, pool configuration, and degradation polling. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


Ensures that a cluster admin may either override the OSImageURL field or
set the desired OSImageStream name, but not both. This ensures that
either the cluster admin or the MCO will manage the OS image and
prevents the MCO from overriding this setting.
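
A minimal sketch of the either-or rule this commit message describes. It is not the render controller code from this PR: `validateOSImageSelection` and `errConflictingOSImage` are hypothetical, and treating any non-empty OSImageURL that differs from the default as an override is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflictingOSImage is a hypothetical error for the conflicting state.
var errConflictingOSImage = errors.New("OSImageURL override and OSImageStream name are both set; set only one")

// validateOSImageSelection rejects a configuration in which the admin both
// overrides OSImageURL and names an OSImageStream.
//
// Assumption: an OSImageURL counts as overridden when it is non-empty and
// differs from the default the MCO would pick on its own.
func validateOSImageSelection(osImageStreamName, renderedOSImageURL, defaultOSImageURL string) error {
	overridden := renderedOSImageURL != "" && renderedOSImageURL != defaultOSImageURL
	if overridden && osImageStreamName != "" {
		return errConflictingOSImage
	}
	return nil
}

func main() {
	const defaultImage = "registry.example/default-os@sha256:aaa"
	const customImage = "registry.example/custom-os@sha256:bbb"

	cases := []struct {
		name, stream, url string
	}{
		{"override only", "", customImage},
		{"stream only", "rhel-10", defaultImage},
		{"both set", "rhel-10", customImage},
	}
	for _, c := range cases {
		err := validateOSImageSelection(c.stream, c.url, defaultImage)
		fmt.Printf("%s: %v\n", c.name, err)
	}
}
```

Returning an error for the conflicting case gives the caller a clear reason to surface, so either the cluster admin or the MCO ends up managing the OS image, never both.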
@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from 6303082 to 4b77c26 Compare March 19, 2026 17:11
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Mar 19, 2026
@openshift-ci openshift-ci bot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 19, 2026
@djoshy
Contributor

djoshy commented Mar 19, 2026

/lgtm

@djoshy
Contributor

djoshy commented Mar 19, 2026

/verified by @ptalgulk01

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 19, 2026
@openshift-ci-robot
Contributor

@djoshy: This PR has been marked as verified by @ptalgulk01.


@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 19, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi, djoshy, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [cheesesashimi,djoshy,pablintino]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
test/e2e-techpreview/osimagestreamrender_test.go (2)

96-97: Consider capturing the return value from WaitForRenderedConfig.

The returned rendered config name is discarded. While the test may not need it, capturing it could be useful for debugging if the test fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e-techpreview/osimagestreamrender_test.go` around lines 96 - 97, The
call to helpers.WaitForRenderedConfig currently discards its return value;
change the call in osimagestreamrender_test.go to capture the returned rendered
config name (e.g., renderedName := helpers.WaitForRenderedConfig(t, cs,
poolName, "00-worker")) and then use that variable in subsequent debug output or
t.Logf to aid failure diagnosis; keep the helper invocation and parameters the
same but store and log the return from WaitForRenderedConfig for easier
debugging.

27-29: Consider increasing test timeout for multiple subtests.

The 5-minute context timeout applies to the entire test including all three recovery path subtests. Each subtest involves pool creation, degradation, and recovery operations. This timeout might be tight in slower CI environments.

💡 Suggested adjustment
 func TestOSImageStreamOSImageURL(t *testing.T) {
-	ctx, cancel := context.WithTimeout(context.Background(), time.Minute*5)
+	ctx, cancel := context.WithTimeout(context.Background(), time.Minute*10)
 	t.Cleanup(cancel)

Alternatively, consider creating a fresh context with timeout per subtest.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e-techpreview/osimagestreamrender_test.go` around lines 27 - 29, The
test-wide 5-minute context in TestOSImageStreamOSImageURL may be too short for
all three recovery-path subtests; either increase the top-level timeout (e.g.,
to 10+ minutes) by adjusting the context.WithTimeout call in
TestOSImageStreamOSImageURL, or create a fresh per-subtest context.WithTimeout
inside each t.Run (within the subtest closures that perform pool
creation/degradation/recovery) so each subtest gets its own ample timeout and
calls t.Cleanup(cancel).


📥 Commits

Reviewing files that changed from the base of the PR and between 66ed315 and 4b77c26.

📒 Files selected for processing (16)
  • pkg/controller/build/buildrequest/buildrequest.go
  • pkg/controller/build/buildrequest/buildrequest_test.go
  • pkg/controller/build/buildrequest/buildrequestopts.go
  • pkg/controller/build/buildrequest/buildrequestopts_test.go
  • pkg/controller/build/buildrequest/machineosbuild.go
  • pkg/controller/build/buildrequest/machineosbuild_test.go
  • pkg/controller/build/fixtures/objects.go
  • pkg/controller/build/osbuildcontroller_test.go
  • pkg/controller/build/reconciler.go
  • pkg/controller/common/images.go
  • pkg/controller/render/render_controller.go
  • pkg/controller/render/render_controller_test.go
  • test/e2e-ocl-1of2/onclusterlayering_test.go
  • test/e2e-ocl-2of2/onclusterlayering_test.go
  • test/e2e-ocl/onclusterlayering_test.go
  • test/e2e-techpreview/osimagestreamrender_test.go
💤 Files with no reviewable changes (2)
  • pkg/controller/build/buildrequest/buildrequestopts_test.go
  • pkg/controller/common/images.go

@djoshy
Contributor

djoshy commented Mar 19, 2026

/retest

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

@cheesesashimi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-hypershift | 4b77c26 | link | true | /test e2e-hypershift |
| ci/prow/e2e-aws-ovn | 4b77c26 | link | true | /test e2e-aws-ovn |


@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD c564b22 and 2 for PR HEAD 4b77c26 in total

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 25, 2026

PR needs rebase.


@djoshy djoshy removed their assignment Mar 25, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. qe-approved Signifies that QE has signed off on this PR verified Signifies that the PR passed pre-merge verification criteria

6 participants