Replies: 2 comments 2 replies
Karpenter events show which pods are being moved due to consolidation.
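A minimal sketch of how that check could be scripted, assuming you feed it the output of `kubectl get events -A -o json`. The reason `Evicted` and the component name `karpenter` are assumptions, not confirmed in this thread, so verify them against the events your Karpenter version actually emits. The helper name `pods_evicted_by` and the sample data are hypothetical.

```python
import json


def pods_evicted_by(events, reason="Evicted", component="karpenter"):
    """Return (namespace, pod, timestamp, message) for matching Pod events.

    `reason` and `component` are assumptions about what Karpenter records;
    adjust them to match the events seen in your own cluster.
    """
    hits = []
    for ev in events:
        obj = ev.get("involvedObject") or {}
        if obj.get("kind") != "Pod":
            continue  # skip Node, NodeClaim, and other object kinds
        if ev.get("reason") != reason:
            continue
        # Newer events set reportingComponent; older ones use source.component.
        reporter = ev.get("reportingComponent") or (ev.get("source") or {}).get("component", "")
        if component not in reporter:
            continue  # e.g. kubelet evictions caused by node pressure
        hits.append((obj.get("namespace"), obj.get("name"),
                     ev.get("lastTimestamp"), ev.get("message", "")))
    return hits


# Hypothetical sample shaped like `kubectl get events -A -o json` output.
sample = json.loads("""
{"items": [
  {"involvedObject": {"kind": "Pod", "namespace": "staging", "name": "api-7d9f"},
   "reason": "Evicted", "message": "Evicted pod",
   "source": {"component": "karpenter"}, "lastTimestamp": "2024-01-01T00:00:00Z"},
  {"involvedObject": {"kind": "Pod", "namespace": "staging", "name": "worker-1"},
   "reason": "Evicted", "message": "The node was low on resource: memory.",
   "source": {"component": "kubelet"}, "lastTimestamp": "2024-01-01T00:05:00Z"},
  {"involvedObject": {"kind": "Node", "name": "node-a"},
   "reason": "DisruptionTerminating", "message": "Disrupting Node",
   "source": {"component": "karpenter"}, "lastTimestamp": "2024-01-01T00:00:00Z"}
]}
""")

evicted = pods_evicted_by(sample["items"])
for namespace, pod, when, message in evicted:
    print(f"{when} {namespace}/{pod}: {message}")
```

Since events expire, a script like this would only help if it runs before the event TTL passes or if events are exported somewhere durable first.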
You can't use a PDB with a single replica. That would prevent the pod from ever being replaced.
Summary
When a pod gets recreated in our cluster, we need to know whether it was caused by Karpenter node termination or something else (application crash, normal deployment, etc.). Currently there's no straightforward way to identify pods that were evicted due to Karpenter disruption.
Background
We run staging/sandbox environments with many single-replica pods to save costs. Karpenter frequently consolidates nodes due to low utilization, causing pods to be evicted and recreated. Some pods experience downtime during rescheduling.
When investigating a recreated pod, we cannot tell if:
Right now we use this manual investigation process:
We do this multiple times a week. It's time-consuming and doesn't scale.
What we need
A way to identify which pods were evicted by Karpenter. This would help us:
Ideally, when a pod is recreated after Karpenter eviction, there would be some indicator (annotation, label, event, etc.) showing that it was evicted due to Karpenter disruption.
Questions
How do others track pods evicted by Karpenter? Is there a simpler approach than manual log correlation?
We tried Kubernetes Events but they're node-focused and expire after an hour. Are we missing something?
Should Karpenter provide this visibility, or should users build it themselves? I don't think workarounds like SQS → Lambda → CloudWatch are a good idea.