
Conversation

@nilo19 (Contributor) commented Nov 12, 2025

What type of PR is this?

/kind design

What this PR does / why we need it:

Add design doc for lb admin state control.

Which issue(s) this PR fixes:

Fixes #
Related: #9633

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note-none (Denotes a PR that doesn't merit a release note.) and kind/design (Categorizes issue or PR as related to design.) labels on Nov 12, 2025.
netlify bot commented Nov 12, 2025

Deploy Preview for kubernetes-sigs-cloud-provide-azure ready!

Latest commit: 787b7e3
Latest deploy log: https://app.netlify.com/projects/kubernetes-sigs-cloud-provide-azure/deploys/69147799ddb1c60008987dc2
Deploy Preview: https://deploy-preview-9634--kubernetes-sigs-cloud-provide-azure.netlify.app

@k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA.) on Nov 12, 2025.
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nilo19

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels on Nov 12, 2025.

### `PreemptScheduled` Event

1. The events informer surfaces a `PreemptScheduled` warning for node `N`.
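For orientation, here is a minimal sketch of what such an event informer could look like with client-go; only the `PreemptScheduled` reason comes from the design text, while the queue and function names are illustrative:

```go
// Sketch: enqueue node names whenever a PreemptScheduled warning event is
// observed for a Node object. Assumes a shared informer factory and a
// rate-limited workqueue owned by a hypothetical admin-state controller.
package adminstate

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

func registerPreemptEventHandler(factory informers.SharedInformerFactory, queue workqueue.RateLimitingInterface) {
	factory.Core().V1().Events().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			ev, ok := obj.(*corev1.Event)
			if !ok {
				return
			}
			// Only react to the Spot eviction warning emitted for Node objects.
			if ev.Type == corev1.EventTypeWarning &&
				ev.Reason == "PreemptScheduled" &&
				ev.InvolvedObject.Kind == "Node" {
				queue.Add(ev.InvolvedObject.Name)
			}
		},
	})
}
```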


Is the event read from IMDS or API?

Contributor Author


The Kubernetes API.


Are you talking about Kubernetes node events? Is there an existing mechanism to pass preempt events from IMDS to node events? https://learn.microsoft.com/en-us/azure/virtual-machines/windows/scheduled-events
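For reference, polling IMDS Scheduled Events directly would look roughly like this; the endpoint, the `Metadata: true` header, and the `Preempt` event type are from the linked Azure documentation, while the struct and function names are illustrative:

```go
// Sketch: poll the IMDS Scheduled Events endpoint for Preempt notices.
package adminstate

import (
	"encoding/json"
	"net/http"
	"time"
)

type scheduledEvent struct {
	EventId   string   `json:"EventId"`
	EventType string   `json:"EventType"` // e.g. "Preempt" for Spot evictions
	Resources []string `json:"Resources"`
	NotBefore string   `json:"NotBefore"`
}

type scheduledEventsDoc struct {
	DocumentIncarnation int              `json:"DocumentIncarnation"`
	Events              []scheduledEvent `json:"Events"`
}

func pollScheduledEvents() (*scheduledEventsDoc, error) {
	req, err := http.NewRequest(http.MethodGet,
		"http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Metadata", "true") // IMDS requires this header

	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var doc scheduledEventsDoc
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return nil, err
	}
	return &doc, nil
}
```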

Contributor Author


Why would we use IMDS instead of watching Kubernetes events? cloud-node-manager is a lightweight component for node initialization, designed to replace the node controller in CCM; we probably don't want to put many features into it.

### High-Level Flow

1. Kubernetes adds the `node.kubernetes.io/out-of-service` taint to a node or emits a `PreemptScheduled` event indicating an imminent Spot eviction.
2. For Spot events, cloud-provider-azure patches the node with `cloudprovider.azure.microsoft.com/draining=spot-eviction` (one-time) so the signal persists beyond controller restarts; manual application of this taint is treated the same way.
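A minimal sketch of the one-time patch in step 2, assuming a client-go clientset; the label key and value come from the document, while the helper name and use of a strategic merge patch are illustrative:

```go
// Sketch: persist the Spot-eviction signal as a node label so it survives
// controller restarts.
package adminstate

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

func markNodeDrainingForSpotEviction(ctx context.Context, kubeClient kubernetes.Interface, nodeName string) error {
	// Label key/value from the design doc; applied once per eviction signal.
	patch := []byte(`{"metadata":{"labels":{"cloudprovider.azure.microsoft.com/draining":"spot-eviction"}}}`)
	_, err := kubeClient.CoreV1().Nodes().Patch(ctx, nodeName, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```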


"cloudprovider.azure.microsoft.com/draining=spot-eviction"

Is it a label or an annotation? Why not ask the user to patch the node with this directly, instead of adding a taint or emitting an event?

Contributor Author


A label. The user can patch this manually. But if we detect a Spot eviction event, we would patch the node proactively and trigger the logic. We could trigger the logic directly without patching the node, but that is not robust: the controller manager could restart in the middle, and the signal would be lost.


### High-Level Flow

1. Kubernetes adds the `node.kubernetes.io/out-of-service` taint to a node or emits a `PreemptScheduled` event indicating an imminent Spot eviction.


it looks like "node.kubernetes.io/out-of-service" is a well-known taint
https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-out-of-service

However, it's not added by Kubernetes native controllers but manually by the user. What effect do you propose here?


Do you expect the user/caller to drain the pods on the node right after tainting it? Since this requires calling the SLB control plane to set AdminState=Down, it might fail or take a long time. I think it's better to have the controller signal back on the node that AdminState=Down has been set, so that the caller can proceed to drain the node.

Contributor Author


The user can manually taint the node. Besides, we taint the node when it is shut down and remove the taint when the node comes back.
I am a bit worried about a node that quickly flips between shutdown and ready states; in that case we may generate a lot of AdminState=Down requests.
Let me reconsider the logic here. How do you think we can prevent this case? @robbiezhang

Also, for the callback signal you mentioned, we can patch the node again after setting AdminState to Down, and clear that patch after setting it back to None. Let me see how to refine the document.


- Introduce an `AdminStateManager` interface responsible for:
  - Translating a Kubernetes node into all relevant backend pool IDs (leveraging existing helpers such as `reconcileBackendPoolHosts`, `getBackendPoolIDsForService`, and the multiple-SLB bookkeeping).
  - Calling the Azure SDK (`LoadBalancerBackendAddressPoolsClient`) to update the specific backend address with `AdminState`.
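One possible Go shape for this interface; apart from the `AdminStateManager` name and the SDK client mentioned above, every identifier below is illustrative:

```go
// Sketch of the AdminStateManager interface described above.
package adminstate

import (
	"context"

	corev1 "k8s.io/api/core/v1"
)

// BackendPoolTarget identifies one backend address within one backend pool.
type BackendPoolTarget struct {
	LoadBalancerName string
	BackendPoolID    string
	AddressName      string
}

type AdminStateManager interface {
	// BackendPoolTargets resolves a node into every backend pool entry that
	// references it, reusing the existing pool-mapping helpers.
	BackendPoolTargets(ctx context.Context, node *corev1.Node) ([]BackendPoolTarget, error)

	// SetAdminState updates a single backend address via the
	// LoadBalancerBackendAddressPoolsClient with the given admin state
	// ("Down" or "None").
	SetAdminState(ctx context.Context, target BackendPoolTarget, state string) error
}
```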


What's the throttling limit for this API? Does it allow concurrent updates on different backend addresses? Consider the scenario where an upgrader taints multiple nodes at the same time: what latency do we expect for setting the AdminState on all of them? Are the updates sequential, parallel, or batched?

1. Fetch the node object from the informer cache and compute desired admin state (`Down` when tainted, `None` when cleared).
2. Resolve backend entries using existing pool-mapping helpers.
3. Issue Azure SDK calls immediately (within the same event) so the time between taint observation and traffic cutover is bounded only by ARM latency.
4. Record success/failure metrics and retry via the queue if needed.
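A sketch of one worker pass over steps 1-4, assuming the hypothetical `AdminStateManager` sketched earlier plus a node lister and a rate-limited workqueue that drives retries; all names are illustrative:

```go
// Sketch: one worker iteration implementing the numbered steps above.
package adminstate

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	listerscorev1 "k8s.io/client-go/listers/core/v1"
)

// adminStateController wraps the hypothetical AdminStateManager; the struct
// and field names are not from the design doc.
type adminStateController struct {
	nodeLister listerscorev1.NodeLister
	manager    AdminStateManager
}

func (c *adminStateController) sync(ctx context.Context, nodeName string) error {
	// Step 1: read the node from the informer cache and compute the desired state.
	node, err := c.nodeLister.Get(nodeName)
	if err != nil {
		return err // the caller requeues on error
	}
	desired := "None"
	if nodeIsDraining(node) {
		desired = "Down"
	}

	// Step 2: resolve backend entries via the pool-mapping helpers.
	targets, err := c.manager.BackendPoolTargets(ctx, node)
	if err != nil {
		return err
	}

	// Step 3: issue the SDK calls inline so cutover time is bounded by ARM latency.
	for _, t := range targets {
		if err := c.manager.SetAdminState(ctx, t, desired); err != nil {
			// Step 4: record a failure metric / emit an event here, then let
			// the rate-limited workqueue retry.
			return err
		}
	}
	return nil
}

// nodeIsDraining reports whether the out-of-service taint or the draining
// label from the design doc is present on the node.
func nodeIsDraining(node *corev1.Node) bool {
	if node.Labels["cloudprovider.azure.microsoft.com/draining"] == "spot-eviction" {
		return true
	}
	for _, t := range node.Spec.Taints {
		if t.Key == "node.kubernetes.io/out-of-service" {
			return true
		}
	}
	return false
}
```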


The controller should emit an event on failure so that we/customers can observe it.

The controller should also expose latency metrics so that we can evaluate the effectiveness. This feature is useless if the latency is longer than 10s.

Contributor Author


Definitely.
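For illustration, the observability hooks discussed above could look roughly like this; the metric and event names are made up, and registration with the controller's metrics registry is omitted:

```go
// Sketch: latency metric and failure event for the admin-state update path.
package adminstate

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

var adminStateUpdateLatency = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "azure_lb_admin_state_update_duration_seconds",
		Help:    "Time from taint/event observation to the AdminState update completing.",
		Buckets: prometheus.ExponentialBuckets(0.5, 2, 8), // 0.5s .. 64s
	},
	[]string{"result"},
)

func recordAdminStateUpdate(recorder record.EventRecorder, node *corev1.Node, start time.Time, err error) {
	result := "success"
	if err != nil {
		result = "failure"
		// Emit a warning event on the node so failures are visible to customers.
		recorder.Eventf(node, corev1.EventTypeWarning, "AdminStateUpdateFailed",
			"failed to set load balancer AdminState: %v", err)
	}
	adminStateUpdateLatency.WithLabelValues(result).Observe(time.Since(start).Seconds())
}
```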

- Restore admin state to `None` when a node returns to service so steady-state behavior is unaffected.
- Avoid additional user-facing APIs, annotations, or CLI switches.

## Non-Goals
Contributor


It looks like a non-goal is optimizing LB backend pool membership when an underlying node VM goes offline (hard down)? I assume the existing Azure LB backend pool polling (LB regularly polls backend pool IP addresses) is sufficient to automatically evict downed VMs from the backend pool?

Contributor Author


Do you mean the LB health probe by "regularly polls"?

