Skip to content

OCPNODE-3880: Add criocredentialproviderconfig event handler#5487

Open
QiWang19 wants to merge 1 commit intoopenshift:mainfrom
QiWang19:addcriocpcontroller
Open

OCPNODE-3880: Add criocredentialproviderconfig event handler#5487
QiWang19 wants to merge 1 commit intoopenshift:mainfrom
QiWang19:addcriocpcontroller

Conversation

@QiWang19
Copy link
Member

@QiWang19 QiWang19 commented Dec 11, 2025

- What I did
Implement criocredentialprovierconfig that is used by crio-credential-provider plugin to fetch private mirror image pull secrets from the secret object.

The handler creates 97-pool-generated-credentialproviderconfig to rollout configurations to file /etc/kubernetes/credential-providers/[platform]-credential-provider.yaml .

workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it

  1. create cluster CRIOCredentialProviderConfig resource, file updated with a new section name: crio-credential-provider
apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
  name: cluster
spec:
  matchImages:
  - docker.io
  - 123456789.dkr.ecr.us-east-1.amazonaws.com
  - "*.azurecr.io"
  - gcr.io
  - "*.*.registry.io"
  - registry.io:8080/path
$ oc get mc
97-master-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s
97-worker-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s

$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
  Creation Timestamp:  2026-02-16T21:26:17Z
  Generation:          1
  Resource Version:    42066
  UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
  Match Images:
    docker.io
    123456789.dkr.ecr.us-east-1.amazonaws.com
    *.azurecr.io
    gcr.io
    *.*.registry.io
    registry.io:8080/path
Status:
  Conditions:
    Last Transition Time:  2026-02-16T21:26:17Z
    Message:               Success
    Observed Generation:   1
    Reason:                MachineConfigRenderingSucceeded
    Status:                True
    Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
  defaultCacheDuration: 12h0m0s
  matchImages:
  - '*.dkr.ecr.*.amazonaws.com'
  - '*.dkr.ecr.*.amazonaws.com.cn'
  - '*.dkr.ecr-fips.*.amazonaws.com'
  - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
  - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
  name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
  defaultCacheDuration: 1s
  matchImages:
  - docker.io
  - 123456789.dkr.ecr.us-east-1.amazonaws.com
  - '*.azurecr.io'
  - gcr.io
  - '*.*.registry.io'
  - registry.io:8080/path
  name: crio-credential-provider
  tokenAttributes:
    cacheType: Token
    requireServiceAccount: false
    serviceAccountTokenAudience: https://kubernetes.default.svc

  1. create namespace rolebing allow serviceaccount to get secrect in the namespace e.g. namespace:mynamespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: credential-provider-secret-reader
  namespace: mynamespace 
rules:
- apiGroups: [""]
  resources: ["secrets"] 
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: credential-provider-secret-reader-binding
  namespace: mynamespace 
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: credential-provider-secret-reader
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: system:serviceaccount:mynamespace:default
  1. Create ImageDigestMirrorSet/ImageTagMirrorSet mirror configurations
  2. Create a pod containers.image is from mirror source registry
  3. Check crio-credential-provider log journalctl _COMM=crio-credential on the scheduled node

- Description for the changelog

Summary by CodeRabbit

  • New Features

    • Manage CRIO credential providers via a new CRIOCredentialProviderConfig with platform-aware credential-provider config generation and kubelet credential-provider flags on master, worker, and arbiter nodes.
  • Chores / Defaults

    • Cluster defaults updated to add credential-provider file paths and to reload/restart kubelet when those configs change.
  • Security & Permissions

    • Added cluster-scoped RBAC roles/bindings to allow nodes to access service account token-related operations.
  • Tests

    • Added comprehensive unit and controller tests for CRIO credential provider create/update/merge and ignition generation.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 11, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2025
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch 3 times, most recently from 865fa47 to 8a072e4 Compare December 13, 2025 06:29
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 changed the title Addcriocpcontroller OCPNODE-3880: Add criocredentialprovierconfig pcontroller Dec 23, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 23, 2025
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Dec 23, 2025

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did

- How to verify it

apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - "*.*.registry.io"
 - registry.io:8080/path
sh-5.1# cat /etc/kubernetes/credential-providers/gcr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 args:
 - get-credentials
 - --v=3
 defaultCacheDuration: 1m0s
 matchImages:
 - gcr.io
 - '*.gcr.io'
 - '*.pkg.dev'
 - container.cloud.google.com
 name: gcr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - registry.io:8080/path
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - '*.*.registry.io'
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 21, 2026
@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 8a072e4 to 75dfbfd Compare January 26, 2026 15:35
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 26, 2026
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 75dfbfd to 511fde6 Compare January 27, 2026 01:46
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 511fde6 to 0ff0fe8 Compare January 27, 2026 18:42
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 0ff0fe8 to 5d5008b Compare January 28, 2026 15:25
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 5d5008b to 7d79743 Compare January 30, 2026 14:15
@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch 2 times, most recently from 147fbf2 to 3027d6a Compare February 15, 2026 16:17
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 3027d6a to 9a477a8 Compare February 16, 2026 19:53
@QiWang19
Copy link
Member Author

/test all

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 9a477a8 to 58fa6bc Compare February 16, 2026 20:59
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Feb 16, 2026

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did

- How to verify it

create cluster CRIOCredentialProviderConfig resource

apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - gcr.io
 - "*.*.registry.io"
 - registry.io:8080/path
$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
 Creation Timestamp:  2026-02-16T21:26:17Z
 Generation:          1
 Resource Version:    42066
 UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
 Match Images:
   docker.io
   123456789.dkr.ecr.us-east-1.amazonaws.com
   *.azurecr.io
   gcr.io
   *.*.registry.io
   registry.io:8080/path
Status:
 Conditions:
   Last Transition Time:  2026-02-16T21:26:17Z
   Message:               Success
   Observed Generation:   1
   Reason:                MachineConfigRenderingSucceeded
   Status:                True
   Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 12h0m0s
 matchImages:
 - '*.dkr.ecr.*.amazonaws.com'
 - '*.dkr.ecr.*.amazonaws.com.cn'
 - '*.dkr.ecr-fips.*.amazonaws.com'
 - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
 - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
 name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - gcr.io
 - '*.*.registry.io'
 - registry.io:8080/path
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Feb 16, 2026

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did

- How to verify it

  1. create cluster CRIOCredentialProviderConfig resource
apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - gcr.io
 - "*.*.registry.io"
 - registry.io:8080/path
$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
 Creation Timestamp:  2026-02-16T21:26:17Z
 Generation:          1
 Resource Version:    42066
 UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
 Match Images:
   docker.io
   123456789.dkr.ecr.us-east-1.amazonaws.com
   *.azurecr.io
   gcr.io
   *.*.registry.io
   registry.io:8080/path
Status:
 Conditions:
   Last Transition Time:  2026-02-16T21:26:17Z
   Message:               Success
   Observed Generation:   1
   Reason:                MachineConfigRenderingSucceeded
   Status:                True
   Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 12h0m0s
 matchImages:
 - '*.dkr.ecr.*.amazonaws.com'
 - '*.dkr.ecr.*.amazonaws.com.cn'
 - '*.dkr.ecr-fips.*.amazonaws.com'
 - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
 - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
 name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - gcr.io
 - '*.*.registry.io'
 - registry.io:8080/path
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

  1. create namespace rolebing e.g. namespace:mynamespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
 name: credential-provider-secret-reader
 namespace: mynamespace 
rules:
- apiGroups: [""]
 resources: ["secrets"] 
 verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
 name: credential-provider-secret-reader-binding
 namespace: mynamespace 
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: Role
 name: credential-provider-secret-reader
subjects:
 - apiGroup: rbac.authorization.k8s.io
   kind: User
   name: system:serviceaccount:mynamespace:default
  1. Create ImageDigestMirrorSet/ImageTagMirrorSet mirror configurations
  2. Create a pod containers.image is from source
  3. Check crio-credential-provider log journalctl _COMM=crio-credential on the scheduled node

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@QiWang19
Copy link
Member Author

/test all

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 9, 2026

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did
Implement criocredentialprovierconfig that is used by crio-credential-provider plugin to fetch private mirror image pull secrets from the secret object.

The handler creates 97-pool-generated-credentialproviderconfig to rollout configurations to file /etc/kubernetes/credential-providers/[platform]-credential-provider.yaml .

workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it

  1. create cluster CRIOCredentialProviderConfig resource, file updated with a new section name: crio-credential-provider
apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - gcr.io
 - "*.*.registry.io"
 - registry.io:8080/path
$ oc get mc
97-master-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s
97-worker-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s

$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
 Creation Timestamp:  2026-02-16T21:26:17Z
 Generation:          1
 Resource Version:    42066
 UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
 Match Images:
   docker.io
   123456789.dkr.ecr.us-east-1.amazonaws.com
   *.azurecr.io
   gcr.io
   *.*.registry.io
   registry.io:8080/path
Status:
 Conditions:
   Last Transition Time:  2026-02-16T21:26:17Z
   Message:               Success
   Observed Generation:   1
   Reason:                MachineConfigRenderingSucceeded
   Status:                True
   Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 12h0m0s
 matchImages:
 - '*.dkr.ecr.*.amazonaws.com'
 - '*.dkr.ecr.*.amazonaws.com.cn'
 - '*.dkr.ecr-fips.*.amazonaws.com'
 - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
 - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
 name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - gcr.io
 - '*.*.registry.io'
 - registry.io:8080/path
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

  1. create namespace rolebing allow serviceaccount to get secrect in the namespace e.g. namespace:mynamespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
 name: credential-provider-secret-reader
 namespace: mynamespace 
rules:
- apiGroups: [""]
 resources: ["secrets"] 
 verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
 name: credential-provider-secret-reader-binding
 namespace: mynamespace 
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: Role
 name: credential-provider-secret-reader
subjects:
 - apiGroup: rbac.authorization.k8s.io
   kind: User
   name: system:serviceaccount:mynamespace:default
  1. Create ImageDigestMirrorSet/ImageTagMirrorSet mirror configurations
  2. Create a pod containers.image is from mirror source registry
  3. Check crio-credential-provider log journalctl _COMM=crio-credential on the scheduled node

- Description for the changelog

Summary by CodeRabbit

  • New Features

  • Support for managing container runtime credential provider configs via a new CRIOCredentialProviderConfig and kubelet flag support across node types.

  • Cluster defaults updated to ensure kubelet reload/restart when credential provider configs change.

  • Security & Permissions

  • Added cluster-wide RBAC to allow node credential provider management.

  • Tests

  • Added comprehensive tests covering CRIO credential provider lifecycle and config merging.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
install/0000_80_machine-config_00_rbac.yaml (1)

160-162: ⚠️ Potential issue | 🟠 Major

Scope request-serviceaccounts-token-audience to serviceaccounts only.

This rule still grants the custom verb across resources: ["*"], which is broader than the credential-provider flow needs. Please fold it into the existing serviceaccounts rule instead of granting it cluster-wide on every core resource.

Suggested fix
   - apiGroups: [""]
     resources: ["serviceaccounts"]
-    verbs: ["get", "list"]
-  - apiGroups: [""]
-    resources: ["*"]
-    verbs: ["request-serviceaccounts-token-audience"]
+    verbs: ["get", "list", "request-serviceaccounts-token-audience"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@install/0000_80_machine-config_00_rbac.yaml` around lines 160 - 162, The
manifest currently grants the custom verb
"request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups:
[""], which is too broad; remove that standalone rule and instead add
"request-serviceaccounts-token-audience" to the existing serviceaccounts rule
(the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources:
["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure
you remove the resources: ["*"] entry and update the serviceaccounts rule's
verbs array to include "request-serviceaccounts-token-audience" without
duplicating other verbs.
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)

562-565: ⚠️ Potential issue | 🟠 Major

maxRetries still never becomes terminal here.

After Forget(key), re-adding the item with AddAfter turns persistent failures into an endless 1-minute retry loop for every queue using this helper. That defeats the contract implied by maxRetries.

Suggested fix
 	utilruntime.HandleError(err)
 	klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
 	queue.Forget(key)
-	queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 562 - 565, The code calls queue.Forget(key) and then
queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the
intended maxRetries behavior; instead, check the retry count (use
queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against
maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is
below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log
terminal failure). Update the error-handling block around
utilruntime.HandleError and klog.V(2).Infof to implement this conditional
requeue logic using the existing maxRetries variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1224-1228: The status update is only executed when the reconcile
actually applied changes (applied == true), so no-op reconciles don't advance
CRIOCredentialProviderConfig status; move or add a call to
syncCRIOCredentialProviderConfigStatusOnly(...) so it runs even when applied is
false (i.e., after the applied conditional), ensuring you invoke
syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeMachineConfigRendered,
apicfgv1alpha1.ReasonMachineConfigRenderingSucceeded) on successful
reconciliation/no-op so ObservedGeneration and the rendered condition are
advanced regardless of whether MachineConfigs were modified.

---

Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 160-162: The manifest currently grants the custom verb
"request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups:
[""], which is too broad; remove that standalone rule and instead add
"request-serviceaccounts-token-audience" to the existing serviceaccounts rule
(the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources:
["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure
you remove the resources: ["*"] entry and update the serviceaccounts rule's
verbs array to include "request-serviceaccounts-token-audience" without
duplicating other verbs.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The code calls queue.Forget(key) and then
queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the
intended maxRetries behavior; instead, check the retry count (use
queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against
maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is
below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log
terminal failure). Update the error-handling block around
utilruntime.HandleError and klog.V(2).Infof to implement this conditional
requeue logic using the existing maxRetries variable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5e364b1d-5e56-4d04-922d-2361caed86f3

📥 Commits

Reviewing files that changed from the base of the PR and between 0d2b089 and 9b97140.

📒 Files selected for processing (13)
  • install/0000_80_machine-config_00_rbac.yaml
  • manifests/machineconfigcontroller/clusterrole.yaml
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • pkg/daemon/constants/constants.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • manifests/machineconfigcontroller/clusterrole.yaml
  • pkg/controller/container-runtime-config/helpers_test.go
  • pkg/daemon/constants/constants.go
  • test/framework/envtest.go
  • pkg/apihelpers/apihelpers.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 9b97140 to dc39060 Compare March 9, 2026 20:57
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 9, 2026

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did
Implement criocredentialprovierconfig that is used by crio-credential-provider plugin to fetch private mirror image pull secrets from the secret object.

The handler creates 97-pool-generated-credentialproviderconfig to rollout configurations to file /etc/kubernetes/credential-providers/[platform]-credential-provider.yaml .

workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it

  1. create cluster CRIOCredentialProviderConfig resource, file updated with a new section name: crio-credential-provider
apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - gcr.io
 - "*.*.registry.io"
 - registry.io:8080/path
$ oc get mc
97-master-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s
97-worker-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s

$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
 Creation Timestamp:  2026-02-16T21:26:17Z
 Generation:          1
 Resource Version:    42066
 UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
 Match Images:
   docker.io
   123456789.dkr.ecr.us-east-1.amazonaws.com
   *.azurecr.io
   gcr.io
   *.*.registry.io
   registry.io:8080/path
Status:
 Conditions:
   Last Transition Time:  2026-02-16T21:26:17Z
   Message:               Success
   Observed Generation:   1
   Reason:                MachineConfigRenderingSucceeded
   Status:                True
   Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 12h0m0s
 matchImages:
 - '*.dkr.ecr.*.amazonaws.com'
 - '*.dkr.ecr.*.amazonaws.com.cn'
 - '*.dkr.ecr-fips.*.amazonaws.com'
 - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
 - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
 name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - gcr.io
 - '*.*.registry.io'
 - registry.io:8080/path
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

  1. create namespace rolebing allow serviceaccount to get secrect in the namespace e.g. namespace:mynamespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
 name: credential-provider-secret-reader
 namespace: mynamespace 
rules:
- apiGroups: [""]
 resources: ["secrets"] 
 verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
 name: credential-provider-secret-reader-binding
 namespace: mynamespace 
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: Role
 name: credential-provider-secret-reader
subjects:
 - apiGroup: rbac.authorization.k8s.io
   kind: User
   name: system:serviceaccount:mynamespace:default
  1. Create ImageDigestMirrorSet/ImageTagMirrorSet mirror configurations
  2. Create a pod containers.image is from mirror source registry
  3. Check crio-credential-provider log journalctl _COMM=crio-credential on the scheduled node

- Description for the changelog

Summary by CodeRabbit

  • New Features

  • Manage container runtime credential provider configs via a new CRIOCredentialProviderConfig and kubelet flag support across node types.

  • Kubelet startup now accepts additional flags to enable CRIO image credential provider behavior.

  • Chores / Defaults

  • Cluster defaults updated to ensure kubelet reload/restart when credential provider configs change.

  • Security & Permissions

  • Added cluster-wide RBAC to allow node credential provider management.

  • Tests

  • Added comprehensive tests covering CRIO credential provider lifecycle and config merging.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)

562-565: ⚠️ Potential issue | 🟠 Major

Don't requeue from the terminal retry path.

After maxRetries, this still does Forget and then AddAfter, so broken objects never actually drop out of the queue and will retry forever every minute.

Suggested fix
 	utilruntime.HandleError(err)
 	klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
 	queue.Forget(key)
-	queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 562 - 565, The terminal retry path currently calls
utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls
queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite
requeues; update the terminal path so after reaching maxRetries you call
utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return)
but do NOT call queue.AddAfter; locate the logic around maxRetries handling in
container_runtime_config_controller.go where queue.Forget and queue.AddAfter are
invoked and remove the AddAfter call so failed objects are not requeued forever.
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)

659-664: ⚠️ Potential issue | 🟡 Minor

Don't short-circuit the empty-spec verification path.

This still returns before any later assertions run, so TestCrioCredentialProviderConfigCreateEmpty can pass while silently masking mismatches in criocpConfigVerifyOptions for the non-cloud path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`
around lines 659 - 664, The test currently returns early when
verifyOpts.expectMCNilContent is true, which skips later assertions and can mask
mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in
the test that checks verifyOpts.expectMCNilContent so it only asserts that
ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not
return from the test—remove the early return (or scope it so only the specific
check is skipped) so the remaining criocpConfigVerifyOptions-related assertions
still execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 685-692: The current loop sets allMatch to true when any
criocp.Spec.MatchImages entry is found in provider.MatchImages, so a partial
match passes; change the logic so allMatch reflects that every expected image is
present: initialize allMatch = true before the loop over
criocp.Spec.MatchImages, and for each mi check assert.Contains(t,
provider.MatchImages, string(mi)); if the check fails set allMatch = false and
break (or return) immediately. Use the same symbols (allMatch,
criocp.Spec.MatchImages, provider.MatchImages) so the test correctly requires
every expected image to be present.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1208-1210: When handling CRIOCredentialProviderConfig overlaps in
syncCRIOCredentialProviderConfigStatusOnly, also clear the stale
ConditionTypeValidated/ReasonConfigurationPartiallyApplied when overlaps are
resolved: if len(overlappedEntries) == 0 call
syncCRIOCredentialProviderConfigStatusOnly with a success/validated condition
(e.g., apicfgv1alpha1.ConditionTypeValidated, ReasonConfigurationApplied or
similar) and a message indicating rendering succeeded so the previous
ConfigurationPartiallyApplied condition is removed; ensure you update the same
condition type (ConditionTypeValidated) rather than only setting it when
overlaps exist.

In `@pkg/controller/container-runtime-config/helpers.go`:
- Line 1383: The parameter name in function generateDropinUnitCredProviderConfig
is misspelled as generticCredProviderConfigPath; rename it to
genericCredProviderConfigPath in the function signature and update all internal
references and callers to use the corrected identifier so the symbol matches
expected spelling (ensure generateDropinUnitCredProviderConfig and any places
that call it compile after the rename).

---

Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 659-664: The test currently returns early when
verifyOpts.expectMCNilContent is true, which skips later assertions and can mask
mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in
the test that checks verifyOpts.expectMCNilContent so it only asserts that
ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not
return from the test—remove the early return (or scope it so only the specific
check is skipped) so the remaining criocpConfigVerifyOptions-related assertions
still execute.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal retry path currently calls
utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls
queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite
requeues; update the terminal path so after reaching maxRetries you call
utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return)
but do NOT call queue.AddAfter; locate the logic around maxRetries handling in
container_runtime_config_controller.go where queue.Forget and queue.AddAfter are
invoked and remove the AddAfter call so failed objects are not requeued forever.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6b802971-f882-404a-af1a-cf29785423f4

📥 Commits

Reviewing files that changed from the base of the PR and between 9b97140 and dc39060.

📒 Files selected for processing (13)
  • install/0000_80_machine-config_00_rbac.yaml
  • manifests/machineconfigcontroller/clusterrole.yaml
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • pkg/daemon/constants/constants.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • pkg/daemon/constants/constants.go
  • templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
  • install/0000_80_machine-config_00_rbac.yaml
  • test/framework/envtest.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • pkg/apihelpers/apihelpers.go

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from dc39060 to d1eef7d Compare March 9, 2026 21:13
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
install/0000_80_machine-config_00_rbac.yaml (1)

156-162: ⚠️ Potential issue | 🟠 Major

Scope request-serviceaccounts-token-audience to serviceaccounts only.

Granting this custom verb on resources: ["*"] gives system:nodes that capability across every core resource. The new credential-provider flow only needs it on serviceaccounts, so this should be merged into the preceding rule instead of expanding the resource scope cluster-wide.

Suggested fix
 rules:
   - apiGroups: [""]
     resources: ["serviceaccounts"]
-    verbs: ["get", "list"]
-  - apiGroups: [""]
-    resources: ["*"]
-    verbs: ["request-serviceaccounts-token-audience"]
+    verbs: ["get", "list", "request-serviceaccounts-token-audience"]
In OpenShift RBAC, should the custom verb `request-serviceaccounts-token-audience` be granted on `resources: ["serviceaccounts"]` rather than `resources: ["*"]`?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The
second rule grants the custom verb "request-serviceaccounts-token-audience"
across all core resources; narrow its scope by moving that verb into the prior
rule that already targets apiGroups: [""] and resources: ["serviceaccounts"] so
the combined rule lists verbs:
["get","list","request-serviceaccounts-token-audience"] (i.e., remove the
separate rule with resources: ["*"] and ensure only serviceaccounts retain the
custom verb).
pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)

549-565: ⚠️ Potential issue | 🟠 Major

Stop requeueing once maxRetries is exhausted.

queue.Forget(key) followed by queue.AddAfter(key, 1*time.Minute) turns permanent failures into an endless 1-minute retry loop, so invalid objects never actually drop out of the queue despite the controller contract above.

Suggested fix
 	utilruntime.HandleError(err)
 	klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
 	queue.Forget(key)
-	queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 549 - 565, The current handleQueueErr implementation requeues keys
after maxRetries by calling queue.Forget(key) then queue.AddAfter(key,
1*time.Minute), creating an endless retry loop; modify the logic in
Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it
only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as
present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is
dropped permanently instead of being requeued.

1208-1228: ⚠️ Potential issue | 🟠 Major

Clear the stale Validated=False condition when overlaps disappear.

The validated condition is only updated inside the overlap branch. If a user fixes the conflicting matchImages later, this reconcile only writes MachineConfigRendered=True, so the old ConditionTypeValidated / partial-applied status remains stuck on the CR. Update ConditionTypeValidated on the non-overlap path too so meta.SetStatusCondition can replace the stale failure.

Suggested fix
 			if len(overlappedEntries) > 0 {
 				ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, "CRIOCredentialProviderConfig has one or multiple entries that overlap with the original credential provider config. Skip rendering entries: %v.", overlappedEntries)
+			} else {
+				ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, "")
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 1208 - 1228, The code only calls
syncCRIOCredentialProviderConfigStatusOnly(...,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap"
branch, leaving a stale Validated=False when overlaps are later resolved; modify
the non-overlap path (the branch that proceeds to call syncIgnitionConfig and
returns nil/applied) to explicitly clear or set ConditionTypeValidated to a
success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so
meta.SetStatusCondition replaces the previous failure—add this call just before
returning nil or after successful MachineConfig rendering in the code paths that
use syncIgnitionConfig and when applied is true, referencing
syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate
the insertion point.
🧹 Nitpick comments (1)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)

67-71: Reuse the exported credential-provider path constants here.

These literals now duplicate pkg/daemon/constants/constants.go, so a later path change can silently desync controller rendering from the daemon-side policy wiring. Please build these from constants.KubernetesCredentialProvidersDir and constants.KubeletCrioImageCredProviderConfPath instead of hardcoding them again.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 67 - 71, Replace the hardcoded credential provider path constants
in this file: instead of defining genericCredProviderConfigPath and
kubeletCrioImageCredProviderConfPath as string literals, build them from the
exported constants in pkg/daemon/constants: use
constants.KubernetesCredentialProvidersDir to construct the generic credential
provider config filename (generic-credential-provider.yaml) and use
constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image
credential provider config; update the declarations where builtInLabelKey and
these two path constants are defined so the controller reuses
constants.KubernetesCredentialProvidersDir and
constants.KubeletCrioImageCredProviderConfPath rather than duplicating the
literal paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 156-162: The second rule grants the custom verb
"request-serviceaccounts-token-audience" across all core resources; narrow its
scope by moving that verb into the prior rule that already targets apiGroups:
[""] and resources: ["serviceaccounts"] so the combined rule lists verbs:
["get","list","request-serviceaccounts-token-audience"] (i.e., remove the
separate rule with resources: ["*"] and ensure only serviceaccounts retain the
custom verb).

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 549-565: The current handleQueueErr implementation requeues keys
after maxRetries by calling queue.Forget(key) then queue.AddAfter(key,
1*time.Minute), creating an endless retry loop; modify the logic in
Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it
only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as
present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is
dropped permanently instead of being requeued.
- Around line 1208-1228: The code only calls
syncCRIOCredentialProviderConfigStatusOnly(...,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap"
branch, leaving a stale Validated=False when overlaps are later resolved; modify
the non-overlap path (the branch that proceeds to call syncIgnitionConfig and
returns nil/applied) to explicitly clear or set ConditionTypeValidated to a
success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so
meta.SetStatusCondition replaces the previous failure—add this call just before
returning nil or after successful MachineConfig rendering in the code paths that
use syncIgnitionConfig and when applied is true, referencing
syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate
the insertion point.

---

Nitpick comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 67-71: Replace the hardcoded credential provider path constants in
this file: instead of defining genericCredProviderConfigPath and
kubeletCrioImageCredProviderConfPath as string literals, build them from the
exported constants in pkg/daemon/constants: use
constants.KubernetesCredentialProvidersDir to construct the generic credential
provider config filename (generic-credential-provider.yaml) and use
constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image
credential provider config; update the declarations where builtInLabelKey and
these two path constants are defined so the controller reuses
constants.KubernetesCredentialProvidersDir and
constants.KubeletCrioImageCredProviderConfPath rather than duplicating the
literal paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 049b026b-d651-43da-9a25-0d4750052dd9

📥 Commits

Reviewing files that changed from the base of the PR and between dc39060 and d1eef7d.

📒 Files selected for processing (13)
  • install/0000_80_machine-config_00_rbac.yaml
  • manifests/machineconfigcontroller/clusterrole.yaml
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • pkg/daemon/constants/constants.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • pkg/apihelpers/apihelpers.go
  • test/framework/envtest.go
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go

@QiWang19
Copy link
Member Author

QiWang19 commented Mar 9, 2026

/test verify

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from d1eef7d to 5d64533 Compare March 9, 2026 21:51
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Mar 9, 2026

@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

- What I did
Implement criocredentialprovierconfig that is used by crio-credential-provider plugin to fetch private mirror image pull secrets from the secret object.

The handler creates 97-pool-generated-credentialproviderconfig to rollout configurations to file /etc/kubernetes/credential-providers/[platform]-credential-provider.yaml .

workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it

  1. create cluster CRIOCredentialProviderConfig resource, file updated with a new section name: crio-credential-provider
apiVersion: config.openshift.io/v1alpha1
kind: CRIOCredentialProviderConfig
metadata:
 name: cluster
spec:
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - "*.azurecr.io"
 - gcr.io
 - "*.*.registry.io"
 - registry.io:8080/path
$ oc get mc
97-master-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s
97-worker-generated-credentialproviderconfig       f6dc328e4c370e377ea5c878aa4c0b8feeb1b181   3.5.0             2s

$ oc describe criocredentialproviderconfig.config.openshift.io/cluster
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1alpha1
Kind:         CRIOCredentialProviderConfig
Metadata:
 Creation Timestamp:  2026-02-16T21:26:17Z
 Generation:          1
 Resource Version:    42066
 UID:                 05d173ad-7b8d-4a28-afb9-b587fe565e16
Spec:
 Match Images:
   docker.io
   123456789.dkr.ecr.us-east-1.amazonaws.com
   *.azurecr.io
   gcr.io
   *.*.registry.io
   registry.io:8080/path
Status:
 Conditions:
   Last Transition Time:  2026-02-16T21:26:17Z
   Message:               Success
   Observed Generation:   1
   Reason:                MachineConfigRenderingSucceeded
   Status:                True
   Type:                  MachineConfigRendered
Events:                    <none>

sh-5.1# cat /etc/kubernetes/credential-providers/ecr-credential-provider.yaml 
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 12h0m0s
 matchImages:
 - '*.dkr.ecr.*.amazonaws.com'
 - '*.dkr.ecr.*.amazonaws.com.cn'
 - '*.dkr.ecr-fips.*.amazonaws.com'
 - '*.dkr.ecr.us-iso-east-1.c2s.ic.gov'
 - '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
 name: ecr-credential-provider
- apiVersion: credentialprovider.kubelet.k8s.io/v1
 defaultCacheDuration: 1s
 matchImages:
 - docker.io
 - 123456789.dkr.ecr.us-east-1.amazonaws.com
 - '*.azurecr.io'
 - gcr.io
 - '*.*.registry.io'
 - registry.io:8080/path
 name: crio-credential-provider
 tokenAttributes:
   cacheType: Token
   requireServiceAccount: false
   serviceAccountTokenAudience: https://kubernetes.default.svc

  1. create namespace rolebing allow serviceaccount to get secrect in the namespace e.g. namespace:mynamespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
 name: credential-provider-secret-reader
 namespace: mynamespace 
rules:
- apiGroups: [""]
 resources: ["secrets"] 
 verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
 name: credential-provider-secret-reader-binding
 namespace: mynamespace 
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: Role
 name: credential-provider-secret-reader
subjects:
 - apiGroup: rbac.authorization.k8s.io
   kind: User
   name: system:serviceaccount:mynamespace:default
  1. Create ImageDigestMirrorSet/ImageTagMirrorSet mirror configurations
  2. Create a pod containers.image is from mirror source registry
  3. Check crio-credential-provider log journalctl _COMM=crio-credential on the scheduled node

- Description for the changelog

Summary by CodeRabbit

  • New Features

  • Manage CRIO credential providers via a new CRIOCredentialProviderConfig with platform-aware credential-provider config generation and kubelet credential-provider flags on master, worker, and arbiter nodes.

  • Chores / Defaults

  • Cluster defaults updated to add credential-provider file paths and to reload/restart kubelet when those configs change.

  • Security & Permissions

  • Added cluster-scoped RBAC roles/bindings to allow nodes to access service account token-related operations.

  • Tests

  • Added comprehensive unit and controller tests for CRIO credential provider create/update/merge and ignition generation.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (4)
test/framework/envtest.go (1)

119-121: ⚠️ Potential issue | 🟡 Minor

Also wire this CRD through the shared envtest helpers.

NewTestEnv now installs CRIOCredentialProviderConfig, but CheckCleanEnvironment, CleanEnvironment, and CreateObjects still don't list/delete/create that type. Any envtest using the common helpers can leak these objects across cases and can't seed them through the shared setup path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/framework/envtest.go` around lines 119 - 121, NewTestEnv installs
CRIOCredentialProviderConfig but the shared helpers CheckCleanEnvironment,
CleanEnvironment, and CreateObjects don't handle that type; update those helpers
to include CRIOCredentialProviderConfig in their lists/operations so envtest
correctly creates, verifies absence, and deletes these objects. Specifically,
add the CRIOCredentialProviderConfig GVK/Kind into the type lists or switch
cases used by CreateObjects, ensure CheckCleanEnvironment also checks for zero
instances of CRIOCredentialProviderConfig, and have CleanEnvironment remove any
CRIOCredentialProviderConfig resources found so tests don't leak them between
cases.
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)

658-663: ⚠️ Potential issue | 🟡 Minor

Don't short-circuit the empty-spec verification path.

expectMCNilContent currently means “no files at all” and returns before the drop-in assertions run. That makes TestCrioCredentialProviderConfigCreateEmpty skip validation of the non-cloud drop-in case entirely, and it would also reject a valid drop-in-only MachineConfig.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`
around lines 658 - 663, The test short-circuits when
verifyOpts.expectMCNilContent is true, returning immediately after checking
ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions
(affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases);
change the logic in the verification block so that you still assert
len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do
not return—allow execution to continue to the drop-in assertions (the code
referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be
updated to remove the early return and only perform the emptiness check).
pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)

562-565: ⚠️ Potential issue | 🟠 Major

Don't requeue from the terminal branch.

queue.Forget(key) followed by queue.AddAfter(key, 1*time.Minute) makes maxRetries non-terminal again. Persistent failures will keep coming back forever instead of actually dropping out of the queue.

Suggested fix
 	utilruntime.HandleError(err)
 	klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
 	queue.Forget(key)
-	queue.AddAfter(key, 1*time.Minute)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 562 - 565, The terminal/error branch currently calls
queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents
retries from becoming terminal; remove the requeue call so terminal errors are
dropped: in the block handling utilruntime.HandleError(err) and logging via
klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove
queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.

1208-1228: ⚠️ Potential issue | 🟠 Major

Clear ConditionTypeValidated after conflicts are fixed.

This only writes the validated condition on the overlap path. Once the conflicting matchImages are removed, the old Validated=False / ConfigurationPartiallyApplied condition is left behind even though the next reconcile succeeds. Set the same condition type back to a success state in the no-overlap path as well.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 1208 - 1228, The validated condition is only set when overlaps
exist; after overlaps are resolved you must clear/reset that
ConditionTypeValidated to success so old Validated=False /
ConfigurationPartiallyApplied isn't left behind. In the code path around
overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing
overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and
syncIgnitionConfig), add a call to
ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g.
apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>,
"CRIOCredentialProviderConfig validated and no overlapping entries") so the
validated condition is marked successful when no overlaps are present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 658-663: The test short-circuits when
verifyOpts.expectMCNilContent is true, returning immediately after checking
ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions
(affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases);
change the logic in the verification block so that you still assert
len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do
not return—allow execution to continue to the drop-in assertions (the code
referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be
updated to remove the early return and only perform the emptiness check).

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal/error branch currently calls
queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents
retries from becoming terminal; remove the requeue call so terminal errors are
dropped: in the block handling utilruntime.HandleError(err) and logging via
klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove
queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.
- Around line 1208-1228: The validated condition is only set when overlaps
exist; after overlaps are resolved you must clear/reset that
ConditionTypeValidated to success so old Validated=False /
ConfigurationPartiallyApplied isn't left behind. In the code path around
overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing
overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and
syncIgnitionConfig), add a call to
ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g.
apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>,
"CRIOCredentialProviderConfig validated and no overlapping entries") so the
validated condition is marked successful when no overlaps are present.

In `@test/framework/envtest.go`:
- Around line 119-121: NewTestEnv installs CRIOCredentialProviderConfig but the
shared helpers CheckCleanEnvironment, CleanEnvironment, and CreateObjects don't
handle that type; update those helpers to include CRIOCredentialProviderConfig
in their lists/operations so envtest correctly creates, verifies absence, and
deletes these objects. Specifically, add the CRIOCredentialProviderConfig
GVK/Kind into the type lists or switch cases used by CreateObjects, ensure
CheckCleanEnvironment also checks for zero instances of
CRIOCredentialProviderConfig, and have CleanEnvironment remove any
CRIOCredentialProviderConfig resources found so tests don't leak them between
cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 41d65ca0-f596-478f-a2ac-0897802b2836

📥 Commits

Reviewing files that changed from the base of the PR and between d1eef7d and 5d64533.

📒 Files selected for processing (13)
  • install/0000_80_machine-config_00_rbac.yaml
  • manifests/machineconfigcontroller/clusterrole.yaml
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • pkg/daemon/constants/constants.go
  • templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
  • templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
  • pkg/apihelpers/apihelpers.go
  • install/0000_80_machine-config_00_rbac.yaml
  • templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
  • manifests/machineconfigcontroller/clusterrole.yaml

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 5d64533 to 137ce18 Compare March 9, 2026 22:16
@QiWang19
Copy link
Member Author

/verified by @QiWang19

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 10, 2026
@openshift-ci-robot
Copy link
Contributor

@QiWang19: This PR has been marked as verified by @QiWang19.

Details

In response to this:

/verified by @QiWang19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@QiWang19
Copy link
Member Author

@saschagrunert @haircommander PTAL

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from 137ce18 to c8f5763 Compare March 11, 2026 17:37
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Mar 11, 2026
@haircommander
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 11, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: haircommander, QiWang19
Once this PR has been reviewed and has the lgtm label, please assign cheesesashimi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@QiWang19
Copy link
Member Author

/retest-required

ctrl.syncCRIOCredentialProviderConfigStatusOnly(err, apicfgv1alpha1.ConditionTypeMachineConfigRendered, apicfgv1alpha1.ReasonMachineConfigRenderingFailed, "could not generate CRIOCredentialProvider Ignition config: %v", err)
return err
}
if len(overlappedEntries) > 0 {
Copy link
Member

@saschagrunert saschagrunert Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When overlaps exist, this sets ConditionTypeValidated to False with ReasonConfigurationPartiallyApplied. If the user later removes the conflicting entries from the CR, no code path resets this condition back to True. The stale warning persists indefinitely.

Do we need to add an else branch (or an unconditional call after the loop) that sets ConditionTypeValidated to True when len(overlappedEntries) == 0?

builtInLabelKey = "machineconfiguration.openshift.io/mco-built-in"
configMapName = "crio-default-container-runtime"
forceSyncOnUpgrade = "force-sync-on-upgrade"
genericCredProviderConfigPath = "/etc/kubernetes/credential-providers/generic-credential-provider.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: kubeletCrioImageCredProviderConfPath duplicates constants.KubeletCrioImageCredProviderConfPath from pkg/daemon/constants/constants.go. Consider using the constants package instead of redefining the value here.

@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from c8f5763 to c953926 Compare March 12, 2026 18:38
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 12, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 12, 2026

New changes are detected. LGTM label has been removed.

Signed-off-by: Qi Wang <qiwan@redhat.com>
@QiWang19 QiWang19 force-pushed the addcriocpcontroller branch from c953926 to 18b41eb Compare March 12, 2026 20:38
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 13, 2026

@QiWang19: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp-op-2of2 b4132ef link true /test e2e-gcp-op-2of2
ci/prow/e2e-openstack 18b41eb link false /test e2e-openstack

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants