Skip to content

Conversation

@muraee
Copy link
Contributor

@muraee muraee commented Nov 6, 2025

Summary

This PR enables per-NodePool customization of the HAProxy image used for worker node API server proxy through the annotation hypershift.openshift.io/haproxy-image.

Problem

Currently, there is no way to override the HAProxy image on a per-NodePool basis. The image is determined globally by:

  • Environment variable IMAGE_SHARED_INGRESS_HAPROXY when shared ingress is enabled
  • Hardcoded default when shared ingress is enabled
  • Release payload image (default)

This limitation prevents users from testing custom HAProxy images or applying image overrides to specific NodePools without affecting the entire cluster.

Solution

Introduced a new NodePool annotation hypershift.openshift.io/haproxy-image that allows users to specify a custom HAProxy image per NodePool. The image resolution follows this priority order:

  1. NodePool annotation (highest priority) - hypershift.openshift.io/haproxy-image
  2. Environment variable - IMAGE_SHARED_INGRESS_HAPROXY when shared ingress enabled
  3. Hardcoded default - when shared ingress enabled
  4. Release payload - default fallback

Implementation Details

API Changes

  • Added HAProxyImageAnnotation constant to api/hypershift/v1beta1/hostedcluster_types.go

Core Logic

  • Introduced resolveHAProxyImage() function in nodepool_controller.go to handle image resolution with proper priority
  • Updated generateHAProxyRawConfig() to use the new resolution function
  • Updated all call sites to use the refactored logic

Testing

  • Added comprehensive unit tests in TestResolveHAProxyImage covering all priority scenarios:
    • NodePool annotation override
    • NodePool annotation overriding shared ingress
    • Empty annotation falling back to shared ingress
    • No annotation with shared ingress enabled
    • No annotation with shared ingress disabled (release payload)

Documentation

  • Created user documentation at docs/content/how-to/automated-machine-management/haproxy-image-override.md
  • Includes usage examples, troubleshooting, and best practices

Testing Performed

✅ All unit tests pass
✅ Linting passes (make lint-fix)
✅ Verification passes (make verify)
✅ 6 test scenarios covering all image resolution priority cases

Related Issues


🤖 Generated with Claude Code

@openshift-ci openshift-ci bot requested review from csrwng and jparrill November 6, 2025 14:27
@openshift-ci openshift-ci bot added the area/api Indicates the PR includes changes for the API label Nov 6, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 6, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: muraee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added area/documentation Indicates the PR includes changes for documentation approved Indicates a PR has been approved by an approver from all required OWNERS files. area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels Nov 6, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 6, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Added a NodePool-scoped HAProxy image override constant and implemented precedence-based HAProxy image resolution (NodePool annotation → shared ingress override → release payload), updating function signatures, call sites, docs, and tests accordingly.

Changes

Cohort / File(s) Summary
API Types
api/hypershift/v1beta1/hostedcluster_types.go
Added new public constant HAProxyImageAnnotation = "hypershift.openshift.io/haproxy-image" for per-NodePool HAProxy image override.
Documentation
docs/content/how-to/automated-machine-management/haproxy-image-override.md
New how-to documenting HAProxy image override behavior, priority rules, examples, verification, rollout notes, and troubleshooting.
Controller: image resolution & signature change
hypershift-operator/controllers/nodepool/nodepool_controller.go
Added resolveHAProxyImage to determine HAProxy image by priority (NodePool annotation → shared ingress → release payload); updated generateHAProxyRawConfig signature to accept nodePool; adjusted imports.
Call Site Updates
hypershift-operator/controllers/nodepool/conditions.go, hypershift-operator/controllers/nodepool/secret_janitor.go, hypershift-operator/controllers/nodepool/nodepool_controller.go (reconcile/token paths)
Updated calls to generateHAProxyRawConfig to pass nodePool where required.
HAProxy Helper
hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go
Removed images import and eliminated prior SharedIngress-based HAProxy image override logic.
Tests
hypershift-operator/controllers/nodepool/nodepool_controller_test.go
Added TestResolveHAProxyImage covering annotation cases, empty annotation, shared ingress enabled/disabled, and release fallback.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to the image resolution precedence and handling of empty/invalid annotation values.
  • Verify every call site updated for the new generateHAProxyRawConfig signature.
  • Review tests (TestResolveHAProxyImage) to ensure they cover all relevant combinations and edge cases.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
hypershift-operator/controllers/nodepool/nodepool_controller_test.go (1)

1479-1606: Great coverage; add a case for shared-ingress default (no env var) and whitespace annotation

  • Add a test where shared ingress is enabled but IMAGE_SHARED_INGRESS_HAPROXY is unset to assert the hardcoded default is used.
  • Optional: once trim logic is added (see resolver comment), add a case where annotation is " " and confirm it falls back.

Apply this minimal diff to include the default shared-ingress case:

@@
-import (
+import (
   "fmt"
   "regexp"
   "strings"
   "testing"
@@
-  fakereleaseprovider "github.com/openshift/hypershift/support/releaseinfo/fake"
+  fakereleaseprovider "github.com/openshift/hypershift/support/releaseinfo/fake"
+  "github.com/openshift/hypershift/support/images"
@@
   testCases := []struct {
     name                string
     nodePoolAnnotations map[string]string
     useSharedIngress    bool
     envVarImage         string
     expectedImage       string
   }{
@@
+    {
+      name:             "When shared ingress enabled and no env var set it should use hardcoded default",
+      useSharedIngress: true,
+      expectedImage:    images.GetSharedIngressHAProxyImage(),
+    },
   }
docs/content/how-to/automated-machine-management/haproxy-image-override.md (2)

14-20: Document the shared‑ingress default explicitly

Consider adding a note that when shared ingress is enabled and IMAGE_SHARED_INGRESS_HAPROXY is unset, a built‑in default image is used.


51-57: Minor: use plural resource name in CLI examples

Most kubectl verbs prefer plural resources. Example: kubectl annotate nodepools my-nodepool -n clusters ....

hypershift-operator/controllers/nodepool/nodepool_controller.go (1)

979-999: Edge case: missing haproxy component in release image

Error message is clear. Consider wrapping with the current release version for faster triage, e.g., include releaseImage.Version() in the error.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between fd07723 and 96f2de9.

⛔ Files ignored due to path filters (1)
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (7)
  • api/hypershift/v1beta1/hostedcluster_types.go (1 hunks)
  • docs/content/how-to/automated-machine-management/haproxy-image-override.md (1 hunks)
  • hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go (0 hunks)
  • hypershift-operator/controllers/nodepool/conditions.go (1 hunks)
  • hypershift-operator/controllers/nodepool/nodepool_controller.go (4 hunks)
  • hypershift-operator/controllers/nodepool/nodepool_controller_test.go (2 hunks)
  • hypershift-operator/controllers/nodepool/secret_janitor.go (1 hunks)
💤 Files with no reviewable changes (1)
  • hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
🔇 Additional comments (8)
hypershift-operator/controllers/nodepool/secret_janitor.go (1)

96-99: LGTM on updated call signature

Passes nodePool into generateHAProxyRawConfig; flow and errors preserved.

hypershift-operator/controllers/nodepool/nodepool_controller_test.go (1)

13-13: Import alias for haproxy constants is appropriate

Keeps tests decoupled from component image name string.

api/hypershift/v1beta1/hostedcluster_types.go (1)

53-57: Annotation constant addition is clear and accurate

Name and docs match resolver precedence; scoped to NodePool usage.

hypershift-operator/controllers/nodepool/nodepool_controller.go (4)

16-20: New imports are appropriate for shared‑ingress image resolution

No impact on existing behavior outside HAProxy image selection.


341-344: Call-site update to pass NodePool is correct

Consistent with new generator signature; error handling unchanged.


418-421: Token path updated consistently

generateHAProxyRawConfig now receives nodePool; good.


1003-1017: Generator uses resolved image correctly

Encapsulates image choice; keeps haproxy generation unchanged.

hypershift-operator/controllers/nodepool/conditions.go (1)

328-331: Plumbing NodePool into HAProxy config generation verified

All 4 call sites (secret_janitor.go:96, conditions.go:328, nodepool_controller.go:341 and 418) correctly pass nodePool. Function signature and all invocations are consistent.

@muraee muraee changed the title feat(nodepool): add HAProxy image override via annotation CNTRLPLANE-1551: feat(nodepool): add HAProxy image override via annotation Nov 6, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 6, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 6, 2025

@muraee: This pull request references CNTRLPLANE-1551 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Summary

This PR enables per-NodePool customization of the HAProxy image used for worker node API server proxy through the annotation hypershift.openshift.io/haproxy-image.

Problem

Currently, there is no way to override the HAProxy image on a per-NodePool basis. The image is determined globally by:

  • Environment variable IMAGE_SHARED_INGRESS_HAPROXY when shared ingress is enabled
  • Hardcoded default when shared ingress is enabled
  • Release payload image (default)

This limitation prevents users from testing custom HAProxy images or applying image overrides to specific NodePools without affecting the entire cluster.

Solution

Introduced a new NodePool annotation hypershift.openshift.io/haproxy-image that allows users to specify a custom HAProxy image per NodePool. The image resolution follows this priority order:

  1. NodePool annotation (highest priority) - hypershift.openshift.io/haproxy-image
  2. Environment variable - IMAGE_SHARED_INGRESS_HAPROXY when shared ingress enabled
  3. Hardcoded default - when shared ingress enabled
  4. Release payload - default fallback

Implementation Details

API Changes

  • Added HAProxyImageAnnotation constant to api/hypershift/v1beta1/hostedcluster_types.go

Core Logic

  • Introduced resolveHAProxyImage() function in nodepool_controller.go to handle image resolution with proper priority
  • Updated generateHAProxyRawConfig() to use the new resolution function
  • Updated all call sites to use the refactored logic

Testing

  • Added comprehensive unit tests in TestResolveHAProxyImage covering all priority scenarios:
  • NodePool annotation override
  • NodePool annotation overriding shared ingress
  • Empty annotation falling back to shared ingress
  • No annotation with shared ingress enabled
  • No annotation with shared ingress disabled (release payload)

Documentation

  • Created user documentation at docs/content/how-to/automated-machine-management/haproxy-image-override.md
  • Includes usage examples, troubleshooting, and best practices

Testing Performed

✅ All unit tests pass
✅ Linting passes (make lint-fix)
✅ Verification passes (make verify)
✅ 6 test scenarios covering all image resolution priority cases

Related Issues


🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@muraee muraee force-pushed the CNTRLPLANE-1551-haproxy-image-override branch from 96f2de9 to 900158f Compare November 6, 2025 14:48
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
docs/content/how-to/automated-machine-management/haproxy-image-override.md (3)

75-94: Consider adding verification for the token secret content.

The verification steps are helpful. Consider adding a step to check the NodePool's token secret to verify the HAProxy image is in the ignition configuration before checking the worker node itself.

Example addition after line 79:

2. Check the token secret for the HAProxy image reference:
   ```bash
   kubectl get secret <nodepool-name>-token -n <controlplane-namespace> -o jsonpath='{.data.payload}' | base64 -d | grep haproxy

---

`96-103`: **Remove extra blank line.**

The important notes are comprehensive and accurate. However, there's an unnecessary extra blank line at line 103.


Apply this diff:

```diff
 5. **Multiple NodePools**: Each NodePool can have a different HAProxy image override
 
-
 ## Troubleshooting

104-126: Consider adding example commands for troubleshooting.

The troubleshooting guidance is helpful. Consider enhancing it with specific commands to inspect NodePool conditions or retrieve ignition configuration from the token secret.

Example addition after line 118:

   ```bash
   # Check NodePool status conditions
   kubectl get nodepool <nodepool-name> -n clusters -o jsonpath='{.status.conditions}' | jq
   
   # Inspect token secret ignition config
   kubectl get secret <nodepool-name>-token -n <controlplane-namespace> -o jsonpath='{.data.payload}' | base64 -d | jq '.storage.files[] | select(.path | contains("kube-apiserver-proxy"))'

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: CodeRabbit UI

**Review profile**: CHILL

**Plan**: Pro

**Cache: Disabled due to data retention organization setting**

**Knowledge base: Disabled due to `Reviews -> Disable Knowledge Base` setting**

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 96f2de914754055a90bbf5691b6825d345efd02f and 900158fa135552eebb1881a2a327820992f27065.

</details>

<details>
<summary>⛔ Files ignored due to path filters (1)</summary>

* `vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go` is excluded by `!vendor/**`, `!**/vendor/**`

</details>

<details>
<summary>📒 Files selected for processing (7)</summary>

* `api/hypershift/v1beta1/hostedcluster_types.go` (1 hunks)
* `docs/content/how-to/automated-machine-management/haproxy-image-override.md` (1 hunks)
* `hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go` (0 hunks)
* `hypershift-operator/controllers/nodepool/conditions.go` (1 hunks)
* `hypershift-operator/controllers/nodepool/nodepool_controller.go` (4 hunks)
* `hypershift-operator/controllers/nodepool/nodepool_controller_test.go` (2 hunks)
* `hypershift-operator/controllers/nodepool/secret_janitor.go` (1 hunks)

</details>

<details>
<summary>💤 Files with no reviewable changes (1)</summary>

* hypershift-operator/controllers/nodepool/apiserver-haproxy/haproxy.go

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (4)</summary>

* api/hypershift/v1beta1/hostedcluster_types.go
* hypershift-operator/controllers/nodepool/secret_janitor.go
* hypershift-operator/controllers/nodepool/conditions.go
* hypershift-operator/controllers/nodepool/nodepool_controller.go

</details>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)</summary>

* GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request

</details>

<details>
<summary>🔇 Additional comments (4)</summary><blockquote>

<details>
<summary>hypershift-operator/controllers/nodepool/nodepool_controller_test.go (2)</summary><blockquote>

`13-13`: **LGTM: Import addition is appropriate.**

The haproxy package import is correctly used in the test to reference `HAProxyRouterImageName`.

---

`1479-1614`: **LGTM: Well-structured test with comprehensive coverage.**

The test function is well-designed with table-driven tests covering the image resolution priority order. The setup properly mocks all dependencies including the client, release provider, and environment variables. The test cases validate annotation precedence, shared ingress fallback, and release payload fallback scenarios.

</blockquote></details>
<details>
<summary>docs/content/how-to/automated-machine-management/haproxy-image-override.md (2)</summary><blockquote>

`1-20`: **LGTM: Clear overview and correct priority order.**

The documentation accurately describes the HAProxy image resolution priority, which matches the implementation described in the PR objectives.

---

`21-68`: **LGTM: Examples are syntactically correct and practical.**

The YAML configuration example and kubectl commands are accurate and provide clear guidance for users to apply and remove the annotation.

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

@muraee muraee force-pushed the CNTRLPLANE-1551-haproxy-image-override branch from 900158f to b5c903d Compare November 6, 2025 15:02
@muraee muraee marked this pull request as draft November 6, 2025 15:03
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 6, 2025
This change enables per-NodePool customization of the HAProxy image
used for worker node API server proxy through the annotation
hypershift.openshift.io/haproxy-image.

The implementation includes:
- New annotation constant HAProxyImageAnnotation in API types
- Image resolution function with priority order:
  1. NodePool annotation (highest priority)
  2. Environment variable when shared ingress enabled
  3. Hardcoded default when shared ingress enabled
  4. Release payload (default)
- Updated all call sites to use the new resolution logic
- Comprehensive unit tests validating all priority scenarios
- User documentation with examples and troubleshooting

The refactored resolveHAProxyImage() function is independently
testable, avoiding complex HAProxy config generation setup.

Signed-off-by: Mulham Raee <[email protected]>
Commit-Message-Assisted-by: Claude (via Claude Code)
@muraee muraee force-pushed the CNTRLPLANE-1551-haproxy-image-override branch from b5c903d to 95392ef Compare November 6, 2025 16:59
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 6, 2025

@muraee: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aks b5c903d link true /test e2e-aks
ci/prow/e2e-aks-4-20 b5c903d link true /test e2e-aks-4-20
ci/prow/unit b5c903d link true /test unit
ci/prow/okd-scos-e2e-aws-ovn b5c903d link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api Indicates the PR includes changes for the API area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants