COO-1687: feat: migrate to EndpointSlice service discovery by jan--f · Pull Request #1028 · rhobs/observability-operator

jan--f · 2026-03-09T07:58:48Z

Prometheus Operator defaults to watching the deprecated Endpoints API for service discovery. Switch the operator's own ServiceMonitors to use EndpointSlice explicitly, which eliminates the deprecation log noise from the operator's internal components.

Changes:

- Set serviceDiscoveryRole: EndpointSlice on the ServiceMonitors we own (observability-operator, health-analyzer, thanos-querier) so that prometheus-operator uses the EndpointSlice role for these jobs.
- Add discovery.k8s.io/endpointslices to all Prometheus RBAC roles and ClusterRoles (alongside the existing endpoints permission) so that Prometheus can serve both kinds of ServiceMonitors simultaneously.
- Add discovery.k8s.io/endpointslices to the korrel8r ClusterRole so the correlation tool can read both endpoint representations.
- Add the corresponding kubebuilder markers and update the generated cluster role YAML and CSV.
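
For illustration, the RBAC change described above might look roughly like the following rules fragment. This is a sketch rather than the generated YAML from this PR: the verbs and the other resources in the first rule are assumptions, and only the endpoints and discovery.k8s.io/endpointslices permissions are named in the description.

```yaml
# Hypothetical rules section of a Role or ClusterRole bound to Prometheus.
# Assumption: read-only verbs; the PR description does not list them.
rules:
  # Existing permission, kept so ServiceMonitors using the Endpoints role still work.
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # New permission, needed by ServiceMonitors using the EndpointSlice role.
  - apiGroups: ["discovery.k8s.io"]
    resources: ["endpointslices"]
    verbs: ["get", "list", "watch"]
```

Keeping both rules side by side is what lets a single Prometheus serve jobs of either discovery role.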

The Prometheus CR's global serviceDiscoveryRole is intentionally left unset (defaulting to Endpoints) so that user-created ServiceMonitors continue to work without modification. Users can opt individual ServiceMonitors into EndpointSlice by setting serviceDiscoveryRole: EndpointSlice on them.
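
As a sketch of the per-ServiceMonitor opt-in described above, a user-created ServiceMonitor could set the field as shown below. The object name, namespace, labels, port name, and the v1 API version are assumptions made for the example; only the monitoring.rhobs group and the serviceDiscoveryRole: EndpointSlice setting come from this PR.

```yaml
# Hypothetical user-created ServiceMonitor that opts into EndpointSlice discovery.
apiVersion: monitoring.rhobs/v1      # assumed version of the monitoring.rhobs group
kind: ServiceMonitor
metadata:
  name: example-app                  # hypothetical name
  namespace: example-ns              # hypothetical namespace
spec:
  # Opt this ServiceMonitor into the EndpointSlice role; when the field is
  # omitted, the global default on the Prometheus CR (Endpoints) applies.
  serviceDiscoveryRole: EndpointSlice
  selector:
    matchLabels:
      app: example-app               # hypothetical label
  endpoints:
    - port: metrics                  # hypothetical port name
      interval: 30s
```

ServiceMonitors that omit the field keep using the Endpoints role, which is why existing objects need no changes.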

openshift-ci · 2026-03-09T07:58:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jan--f


Needs approval from an approver in each of these files:

~~OWNERS~~ [jan--f]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2026-03-09T07:59:27Z

@jan--f: This pull request references COO-1687 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.


jan--f · 2026-03-09T14:57:10Z

/retest

jan--f · 2026-03-12T08:18:08Z

/retest


…nitors

The operator's self-monitoring ServiceMonitor and the health-analyzer ServiceMonitor are monitoring.coreos.com objects processed by the platform prometheus-operator on OpenShift, which we don't control. Setting serviceDiscoveryRole: EndpointSlice on them requires the platform Prometheus to have endpointslices access and the platform prometheus-operator to correctly generate TLS-aware scrape configs for the endpointslice role, neither of which is guaranteed across OCP versions.

The thanos-querier ServiceMonitor (monitoring.rhobs) is handled by the obo-prometheus-operator we manage, so it retains the EndpointSlice setting safely.

Fixes TestOperatorMetrics/metrics_ingested_in_Prometheus on OCP clusters.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

simonpasquier · 2026-03-12T13:34:53Z

/lgtm

PeterYurkovich · 2026-03-12T13:44:36Z

/cherry-pick release-1.4

openshift-cherrypick-robot · 2026-03-12T13:44:39Z

@PeterYurkovich: once the present PR merges, I will cherry-pick it on top of release-1.4 in a new PR and assign it to you.


openshift-cherrypick-robot · 2026-03-12T14:31:08Z

@PeterYurkovich: #1028 failed to apply on top of branch "release-1.4":

Applying: feat: migrate to EndpointSlice service discovery
Using index info to reconstruct a base tree...
M	bundle/manifests/observability-operator.clusterserviceversion.yaml
Falling back to patching base and 3-way merge...
Auto-merging bundle/manifests/observability-operator.clusterserviceversion.yaml
CONFLICT (content): Merge conflict in bundle/manifests/observability-operator.clusterserviceversion.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0001 feat: migrate to EndpointSlice service discovery


* feat: migrate to EndpointSlice service discovery

  Prometheus Operator defaults to watching the deprecated Endpoints API for service discovery. Switch the operator's own ServiceMonitors to use EndpointSlice explicitly, which eliminates the deprecation log noise from the operator's internal components.

  Changes:
  - Set serviceDiscoveryRole: EndpointSlice on the ServiceMonitors we own (observability-operator, health-analyzer, thanos-querier) so that prometheus-operator uses the EndpointSlice role for these jobs.
  - Add discovery.k8s.io/endpointslices to all Prometheus RBAC roles and ClusterRoles (alongside the existing endpoints permission) so that Prometheus can serve both kinds of ServiceMonitors simultaneously.
  - Add discovery.k8s.io/endpointslices to the korrel8r ClusterRole so the correlation tool can read both endpoint representations.
  - Add the corresponding kubebuilder markers and update the generated cluster role YAML and CSV.

  The Prometheus CR's global serviceDiscoveryRole is intentionally left unset (defaulting to Endpoints) so that user-created ServiceMonitors continue to work without modification. Users can opt individual ServiceMonitors into EndpointSlice by setting serviceDiscoveryRole: EndpointSlice on them.

  Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
  Signed-off-by: Jan Fajerski <jan@fajerski.name>

* fix: revert serviceDiscoveryRole from monitoring.coreos.com ServiceMonitors

  The operator's self-monitoring ServiceMonitor and the health-analyzer ServiceMonitor are monitoring.coreos.com objects processed by the platform prometheus-operator on OpenShift, which we don't control. Setting serviceDiscoveryRole: EndpointSlice on them requires the platform Prometheus to have endpointslices access and the platform prometheus-operator to correctly generate TLS-aware scrape configs for the endpointslice role, neither of which is guaranteed across OCP versions.

  The thanos-querier ServiceMonitor (monitoring.rhobs) is handled by the obo-prometheus-operator we manage, so it retains the EndpointSlice setting safely.

  Fixes TestOperatorMetrics/metrics_ingested_in_Prometheus on OCP clusters.

  Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Jan Fajerski <jan@fajerski.name>
Co-authored-by: Jan Fajerski <jan@fajerski.name>
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

openshift-ci bot requested review from PeterYurkovich and machine424 March 9, 2026 07:58

openshift-ci bot added the approved label Mar 9, 2026

jan--f changed the title ~~feat: migrate to EndpointSlice service discovery~~ COO-1687: feat: migrate to EndpointSlice service discovery Mar 9, 2026

openshift-ci-robot added the jira/valid-reference label Mar 9, 2026

jan--f force-pushed the endpointslices-migration branch from de800c3 to bc8b9f7 Compare March 12, 2026 10:43

Jan Fajerski and others added 2 commits March 12, 2026 10:50

jan--f force-pushed the endpointslices-migration branch from bc8b9f7 to 3adb1fb Compare March 12, 2026 10:54

openshift-ci bot assigned simonpasquier Mar 12, 2026

openshift-ci bot added the lgtm label Mar 12, 2026

openshift-merge-bot bot merged commit cbd6ba3 into rhobs:main Mar 12, 2026
11 checks passed

PeterYurkovich mentioned this pull request Mar 12, 2026

[release-1.4] COO-1687: feat: migrate to EndpointSlice service discovery #1035

Merged


openshift-merge-bot[bot] merged 2 commits into rhobs:main from jan--f:endpointslices-migration
