@morvencao
Member

@morvencao morvencao commented Aug 27, 2025

Add gRPC server metrics.

@openshift-ci openshift-ci bot requested review from deads2k and qiujian16 August 27, 2025 08:27
Signed-off-by: morvencao <[email protected]>
@morvencao
Member Author

/assign @qiujian16 @skeeey

@qiujian16
Member

@morvencao this looks great. I think we also need a description of which metric values indicate that the service is degraded. In other words, which metrics should an operator check to determine whether the service is unhealthy?

For example:
```
grpc_server_started_total{grpc_method="Publish",grpc_service="io.cloudevents.v1.CloudEventService",grpc_type="unary"} 3
grpc_server_started_total{grpc_method="Subscribe",grpc_service="io.cloudevents.v1.CloudEventService",grpc_type="server_stream"} 4
```
Member

@skeeey skeeey Aug 28, 2025

For grpc_method="Subscribe", grpc_service="io.cloudevents.v1.CloudEventService": do we get these values from StreamServerInfo/UnaryServerInfo? If so, the method and service labels will not be limited to the cloudevents Publish/Subscribe, right?

Member Author

@morvencao morvencao Aug 28, 2025

Right, the labels are not limited to cloudevents Publish/Subscribe; this is just an example.
For other gRPC servers, e.g. cluster-proxy, the service and method will be different.
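
To illustrate the point above: in grpc-go, both `UnaryServerInfo` and `StreamServerInfo` carry a `FullMethod` string of the form `/package.Service/Method`, so per-service and per-method labels can be derived from it for any registered service. The sketch below is a minimal, standard-library-only illustration, not the code from this PR; `splitFullMethod` is a hypothetical helper, and the second method name in `main` is made up to stand in for another gRPC server.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFullMethod (hypothetical helper) splits a gRPC full method name of the
// form "/package.Service/Method" into its service and method parts.
// It returns "unknown" for both parts when no separator is found.
func splitFullMethod(fullMethod string) (service, method string) {
	trimmed := strings.TrimPrefix(fullMethod, "/")
	if i := strings.Index(trimmed, "/"); i >= 0 {
		return trimmed[:i], trimmed[i+1:]
	}
	return "unknown", "unknown"
}

func main() {
	// The first name matches the example in this thread; the second is an
	// invented name standing in for a different gRPC server.
	fullMethods := []string{
		"/io.cloudevents.v1.CloudEventService/Subscribe",
		"/example.v1.ProxyService/Proxy",
	}
	for _, fm := range fullMethods {
		service, method := splitFullMethod(fm)
		fmt.Printf("grpc_service=%q grpc_method=%q\n", service, method)
	}
}
```

Because the labels come from whatever `FullMethod` the interceptor sees, dashboards and alerts should filter on `grpc_service` explicitly rather than assume only the CloudEvents service is present.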

Signed-off-by: morvencao <[email protected]>
@morvencao
Member Author

Added metric values for the healthy and degraded cases, along with a rough operator guide.
Another look, @qiujian16?
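
The operator guide itself is not quoted in this thread. As a rough illustration of how such counters are often read, the sketch below computes the share of completed RPCs that ended with a non-OK status. It assumes a completion counter broken down by status code, such as `grpc_server_handled_total` with a `grpc_code` label as exposed by go-grpc-prometheus; that metric, the `errorRatio` helper, and the sample numbers are assumptions rather than content from this PR, and the ratio that should count as degraded is deployment-specific.

```go
package main

import "fmt"

// errorRatio (hypothetical helper) returns the fraction of completed RPCs
// whose status code is not "OK". The map keys are gRPC status code names and
// the values are counter values, or their increase over a time window.
func errorRatio(handledByCode map[string]float64) float64 {
	var total, failed float64
	for code, n := range handledByCode {
		total += n
		if code != "OK" {
			failed += n
		}
	}
	if total == 0 {
		// No completed RPCs in the window, so there is nothing to flag.
		return 0
	}
	return failed / total
}

func main() {
	// Invented sample values for one method over one window.
	handled := map[string]float64{
		"OK":          75,
		"Unavailable": 20,
		"Internal":    5,
	}
	fmt.Printf("non-OK ratio: %.2f\n", errorRatio(handled))
}
```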

@qiujian16
Member

/approve
Looks good to me.
@skeeey, could you take a final look?

@openshift-ci

openshift-ci bot commented Aug 28, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: morvencao, qiujian16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@skeeey
Member

skeeey commented Aug 28, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Aug 28, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 6fecf80 into open-cluster-management-io:main Aug 28, 2025
2 checks passed
@morvencao morvencao deleted the br_grpc_metrics branch August 28, 2025 09:20