[Granular resource limits] Add support for granular resource quotas #8662
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: norbertcyran. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @norbertcyran. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
FYI: not ready for review yet
Force-pushed from 9a781ed to 48db6ad
Force-pushed from 48db6ad to 475cac9
Force-pushed from 475cac9 to e313251
Force-pushed from e313251 to 080fd15
Ready now
elmiko left a comment
the code is looking good to me, i have a couple questions. i like the tests too.
		continue
	}

	if limitsLeft < resourceDelta*int64(nodeDelta) {
i'm not following the math here, could you explain what resourceDelta*int64(nodeDelta) is calculating?
i might be confused about nodeDelta
nodeDelta is the number of nodes (of the same shape) to be added to the cluster, and resourceDelta is the quantity of a specific resource in a node of that shape. For instance, if we want to add 3 nodes with 4 CPU each, resourceDelta*int64(nodeDelta) will evaluate to 12. This condition basically checks whether adding 12 CPUs to the cluster would exceed the limit.
Perhaps it would be cleaner to call these nodesToBeAdded and resourcesToBeAdded or something similar. However, I'm thinking about adding support for negative deltas later on to remove duplication in the scale down logic (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaledown/planner/planner.go#L164, https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaledown/resource/limits.go).
I can add some comments to clarify what deltas mean, unless you have other suggestions?
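To make the 3-node example concrete, here is a minimal sketch of the check; limitsLeft, resourceDelta and nodeDelta are the names from the snippet above, while the wrapping function and package name are only illustrative, not code from this PR.

package resourcequotas // assumed package name, for illustration only

// exceedsQuota is a hypothetical helper mirroring the condition quoted above.
// nodeDelta is the number of identically-shaped nodes the scale-up would add,
// and resourceDelta is the amount of one resource (e.g. CPU) contributed by a
// single node of that shape, so resourceDelta*int64(nodeDelta) is the total
// amount of that resource the scale-up would add to the cluster.
func exceedsQuota(limitsLeft, resourceDelta int64, nodeDelta int) bool {
	// Example: 3 nodes with 4 CPU each -> resourceDelta = 4, nodeDelta = 3,
	// so the scale-up is rejected unless at least 12 CPUs are still allowed.
	return limitsLeft < resourceDelta*int64(nodeDelta)
}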
this is great, thank you for the explanation. it makes sense to me now.
Perhaps it would be cleaner to call these nodesToBeAdded and resourcesToBeAdded or something similar.
i like this, perhaps names that are more descriptive with what is planned next, but this would definitely help with readability.
I can add some comments to clarify what deltas mean, unless you have other suggestions?
i think changing the variable names would help, and i also like having more comments here. i think even something as brief as what you described here would be helpful.
// NewQuotasTracker calculates resources used by the nodes for every
// quota returned by the Provider. Then, based on usages and limits it calculates
// how many resources can still be added to the cluster. Returns a Tracker object.
func (f *TrackerFactory) NewQuotasTracker(ctx *context.AutoscalingContext, nodes []*corev1.Node) (*Tracker, error) {
just a question out of curiosity: is the intention that a new Tracker will be created on each scan interval of the core?
yes, it will probably be created here, replacing the legacy logic: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaleup/orchestrator/orchestrator.go#L124
Performance-wise it's not ideal, but it's not very different from the current logic, except that the loop over nodes will be repeated for every quota. Still, the complexity will be negligible compared to scheduling simulations and bin-packing. Ideally we'd have a goroutine updating the tracker state in the background, but that seems like a lot of effort with many consistency-related edge cases. At this point I would say that would be a premature optimization, but we might want to improve it in the future.
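For context, a rough sketch of what per-iteration creation could look like at the call site. Only the NewQuotasTracker signature quoted above comes from this PR; the orchestrator type, its field names, and the resourcequotas import path are assumptions made purely for illustration.

package orchestrator

import (
	corev1 "k8s.io/api/core/v1"

	"k8s.io/autoscaler/cluster-autoscaler/context"
	// Assumed import path for the new quotas package introduced in this PR.
	"k8s.io/autoscaler/cluster-autoscaler/resourcequotas"
)

// scaleUpOrchestrator is a hypothetical stand-in for the real orchestrator type.
type scaleUpOrchestrator struct {
	trackerFactory *resourcequotas.TrackerFactory
}

// newTracker builds a fresh Tracker from the current node list on every
// scale-up attempt, replacing the legacy cluster-wide limits computation.
func (o *scaleUpOrchestrator) newTracker(ctx *context.AutoscalingContext, nodes []*corev1.Node) (*resourcequotas.Tracker, error) {
	return o.trackerFactory.NewQuotasTracker(ctx, nodes)
}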
got it, thank you for the explanation =)
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR is part of the granular resource limits initiative (#8703). It implements the foundation for the new resource quotas system. The legacy system supports only cluster-wide resource limits coming from the cloud provider. This PR introduces the possibility of providing multiple quotas that can apply to different subsets of nodes.
For now, the new package is not integrated with the rest of the codebase. This is done on purpose to safely ship the new system in smaller chunks. Therefore, this PR does not introduce any user-facing changes.
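As a rough illustration of what "multiple quotas that can apply to different subsets of nodes" could look like, here is a sketch of the two central shapes; the real types live in the new package and may differ, and every name below is a guess rather than the actual API.

package resourcequotas // assumed package name

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Quota is a hypothetical shape for a single granular quota: a set of resource
// limits plus a predicate selecting the nodes the quota applies to.
type Quota struct {
	ID        string
	Limits    map[corev1.ResourceName]resource.Quantity
	AppliesTo func(node *corev1.Node) bool
}

// Provider returns all quotas to enforce. The legacy behaviour (one
// cluster-wide limit from the cloud provider) corresponds to a Provider that
// returns a single Quota matching every node.
type Provider interface {
	Quotas() ([]*Quota, error)
}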
Which issue(s) this PR fixes:
Part of #8703.
Special notes for your reviewer:
This PR ended up larger than I expected. Caching of node deltas, support for storage and ephemeral storage, and integration with scale up and scale down will be implemented in the next PRs. See the proposal #8702 for more details.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: