Description
When the agent is started without the proper permissions, it logs the following error and hangs:
1 node_event_consumer.go:72] Error obtaining information about the agent pod [openshift-infra/hawkular-openshift-agent-qzg21]. err=User "system:serviceaccount:openshift-infra:hawkular-openshift-agent" cannot get pods in project "openshift-infra"
If the SA is then given the proper permissions, the pod still hangs. If the pod is restarted, it starts up properly.
By hanging like this, the pod is left in a state where it reports that it is ready and running properly (status 1/1). At the very least, if it cannot continue, it should exit so that a new pod can be started in its place.
In this case, I believe the agent should wait and attempt to connect a few more times after some delay. We could even use a 'readiness probe' here to determine when the agent reaches a ready state.