|
| 1 | +## Release Signoff Checklist |
| 2 | + |
| 3 | +- [ ] Enhancement is `implementable` |
| 4 | +- [ ] Design details are appropriately documented from clear requirements |
| 5 | +- [ ] Test plan is defined |
| 6 | +- [ ] Graduation criteria for dev preview, tech preview, GA |
| 7 | +- [ ] User-facing documentation is created in [website](https://github.com/open-cluster-management-io/open-cluster-management-io.github.io/) |
| 8 | + |
| 9 | +## Summary |
| 10 | + |
| 11 | +This enhancement introduces a global mechanism for defining and managing namespaces on Managed Clusters directly from |
| 12 | +the hub. Namespaces can be associated with a ManagedClusterSet, enabling centralized application of Kubernetes access |
| 13 | +controls through ClusterPermissions. Both the namespaces and their access controls are bound and managed by the |
| 14 | +ManagedClusterSet, and applied consistently to all Managed Clusters in the ManagedClusterSet. This provides |
| 15 | +multi-cluster governance and improves security consistency across fleets. |
| 16 | + |
| 17 | +The solution extends the existing ManagedClusterSet API to include namespace configuration and leverages the |
| 18 | +registration agent on each managed cluster to ensure namespace consistency. Each managed namespace is labeled and |
| 19 | +annotated to track its management state and support multiple hub scenarios. |
| 20 | + |
| 21 | +## Motivation |
| 22 | + |
| 23 | +Currently, creating consistent namespaces across multiple clusters in a ManagedClusterSet requires manual coordination |
| 24 | +using ManifestWork and ClusterPermission resources. This approach is error-prone, lacks automation, and makes it |
| 25 | +difficult to maintain consistency as the fleet scales. |
| 26 | + |
| 27 | +This enhancement addresses these challenges by introducing a declarative approach to namespace management at the |
| 28 | +ManagedClusterSet level. The global mechanism enables: |
| 29 | + |
| 30 | +- **Centralized Configuration**: Define namespaces once at the ManagedClusterSet level rather than per cluster |
| 31 | +- **Automatic Propagation**: Namespaces are automatically created and maintained across all clusters in the set |
| 32 | +- **Consistent Security**: ClusterPermissions can be applied uniformly across the managed namespaces |
| 33 | +- **Simplified Operations**: Reduce operational overhead for managing namespaces across large fleets |
| 34 | +- **Multi-tenancy Support**: Enable better isolation and resource organization for workloads like Virtual Machines |
| 35 | + |
| 36 | +This enhancement focuses specifically on creating uniform namespace management across Managed Clusters in a |
| 37 | +ManagedClusterSet, not managing all namespaces on the Managed Cluster. |
| 38 | + |
| 39 | +### Goals |
| 40 | + |
| 41 | +1. Update `ManagedClusterSet` API to support namespaces that span all clusters in a ManagedClusterSet (Global Namespaces) |
| 42 | +2. Associate ClusterPermissions with the ManagedClusterSet, scoped to its Global Namespaces |
| 43 | +3. Work with the existing RBAC tooling |
| 44 | + |
| 45 | +### Non-Goals |
| 46 | + |
| 47 | +1. Managing all namespaces on the Managed Cluster |
| 48 | +2. Requiring workloads (ManifestWork) to be deployed in the managed namespaces |
| 49 | + |
| 50 | +## Proposal |
| 51 | + |
| 52 | +This section provides detailed implementation specifics for the namespace management enhancement. |
| 53 | + |
| 54 | +### User Stories |
| 55 | + |
| 56 | +#### Story 1 |
| 57 | +Create a Global Namespace. This namespace will be created on all Managed Clusters in a ManagedClusterSet, and the |
| 58 | +appropriate ClusterPermissions will be applied. |
| 59 | + |
| 60 | +#### Story 2 |
| 61 | +Delete a Global Namespace. This will either remove the namespace on all Managed Clusters or leave it in place but stop |
| 62 | +managing it from the hub. |
| 63 | + |
| 64 | +### Implementation Details/Notes/Constraints [optional] |
| 65 | + |
| 66 | +Global Namespaces address a clear need, particularly for resources like virtual machines. They help guarantee |
| 67 | +consistency in the architecture and assist when moving VMs and displaying VMs. While this could be accomplished using |
| 68 | +ClusterPermissions and ManifestWork, it would not be automatic and the integration with ManagedClusterSet would be less |
| 69 | +clear. |
| 70 | + |
| 71 | +#### Controller Implementation |
| 72 | + |
| 73 | +The implementation involves two main components: |
| 74 | + |
| 75 | +1. **Hub Controller**: Monitors ManagedClusterSet resources for namespace configuration changes and propagates these |
| 76 | + changes to the status field of ManagedCluster resources that belong to the clusterset. |
| 77 | + |
| 78 | +2. **Registration Agent**: Runs on each managed cluster and monitors the ManagedCluster status for namespace |
| 79 | + configuration updates. When changes are detected, the agent creates, updates, or removes namespaces according to the |
| 80 | + specified configuration. |
| 81 | + |
| 82 | +#### Namespace Lifecycle Management |
| 83 | + |
| 84 | +The namespace lifecycle follows these stages: |
| 85 | + |
| 86 | +1. **Creation**: When a namespace is added to a ManagedClusterSet configuration, the hub controller updates all |
| 87 | + associated ManagedCluster statuses. The registration agent on each cluster detects this change and creates the |
| 88 | + namespace with appropriate labels and annotations. |
| 89 | + |
| 90 | +2. **Update**: Changes to namespace configuration (such as deletion strategy) are propagated through the same mechanism. |
| 91 | + |
| 92 | +3. **Removal**: When a namespace is removed from the ManagedClusterSet or when a cluster is removed from the set, the |
| 93 | + registration agent follows the configured deletion strategy (Keep or Delete). |
| 94 | + |
| 95 | +#### Label and Annotation Strategy |
| 96 | + |
| 97 | +Each managed namespace includes: |
| 98 | +- **Labels**: `clusterset.open-cluster-management.io/<hub-hash>: "true"` for quick filtering |
| 99 | +- **Annotations**: Detailed JSON configuration including clusterset name and deletion strategy for proper cleanup coordination |
| 100 | + |
| 101 | +This approach ensures that multiple hubs can manage different namespaces on the same cluster without conflicts. |
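A minimal sketch of how the label key and annotation value described above could be built. The `ClusterSetEntry` type, its JSON field names, and the helper functions are hypothetical; the proposal does not define how the hub hash is computed, so it is passed in as an opaque string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyPrefix is the label/annotation key prefix used in this proposal.
const keyPrefix = "clusterset.open-cluster-management.io/"

// ClusterSetEntry is a hypothetical shape for one element of the
// annotation value: which clusterset manages the namespace and how
// the namespace should be treated when management ends.
type ClusterSetEntry struct {
	ClustersetName   string `json:"clustersetName"`
	DeletionStrategy string `json:"deletionStrategy"`
}

// ManagedKey returns the key shared by the label and the annotation
// for a given hub. The hub hash is treated as an opaque input here.
func ManagedKey(hubHash string) string {
	return keyPrefix + hubHash
}

// AnnotationValue serializes the entries as a JSON array so the agent
// can parse them back during cleanup.
func AnnotationValue(entries []ClusterSetEntry) (string, error) {
	b, err := json.Marshal(entries)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	key := ManagedKey("hub1")
	val, err := AnnotationValue([]ClusterSetEntry{
		{ClustersetName: "global", DeletionStrategy: "Keep"},
		{ClustersetName: "set1", DeletionStrategy: "Delete"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("label:      %s=true\n", key)
	fmt.Printf("annotation: %s=%s\n", key, val)
}
```

Using the same key for the label and the annotation keeps the per-hub bookkeeping in one place: the label answers "is this hub managing the namespace?", and the annotation answers "on behalf of which clustersets?".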
| 102 | + |
| 103 | +### Risks and Mitigation |
| 104 | + |
| 105 | +1. **Deleting a Global Namespace**: Mitigated by implementing a configurable deletion strategy (Keep/Delete) that allows |
| 106 | + administrators to choose whether namespaces should be preserved or removed when no longer managed. The |
| 107 | + annotation-based tracking ensures proper cleanup coordination. |
| 108 | + |
| 109 | +2. **Adopting existing namespaces**: When a namespace already exists on a cluster, the registration agent will adopt it |
| 110 | + by adding the appropriate labels and annotations without disrupting existing resources. A validation webhook can |
| 111 | + prevent conflicts with system namespaces. |
| 112 | + |
| 113 | +3. **System namespace protection**: Implementation includes safeguards to prevent management of critical system |
| 114 | + namespaces (kube-system, kube-public, etc.) through validation at both the API level and agent level. |
| 115 | + |
| 116 | +4. **Multi-hub conflicts**: The hub-hash based labeling system prevents conflicts when multiple OCM hubs attempt to |
| 117 | + manage the same cluster, ensuring each hub only manages its own namespace configurations. |
| 118 | + |
| 119 | +5. **Agent failures**: If the registration agent fails or becomes unavailable, namespace configurations remain in place. |
| 120 | + When the agent recovers, it reconciles the current state with the desired configuration, ensuring eventual consistency. |
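The system namespace safeguard in item 3 could be sketched as a simple check shared by the API validation and the agent. This is an illustration only: the function names are hypothetical, and the entries beyond `kube-system` and `kube-public` (which the proposal names) are assumptions. The `kube-` prefix check reflects the Kubernetes convention that this prefix is reserved for system namespaces.

```go
package main

import (
	"fmt"
	"strings"
)

// protectedNamespaces lists namespaces that must never be managed by a
// clusterset. kube-system and kube-public come from the proposal; the
// remaining entries are assumptions added for illustration.
var protectedNamespaces = map[string]bool{
	"kube-system":     true,
	"kube-public":     true,
	"kube-node-lease": true, // assumption
	"default":         true, // assumption
}

// IsProtected reports whether a namespace name is off-limits. The
// "kube-" prefix is reserved by Kubernetes for system namespaces.
func IsProtected(name string) bool {
	return protectedNamespaces[name] || strings.HasPrefix(name, "kube-")
}

// RejectProtected returns an error naming every protected namespace in
// the requested list, or nil if the list is acceptable.
func RejectProtected(namespaces []string) error {
	var rejected []string
	for _, ns := range namespaces {
		if IsProtected(ns) {
			rejected = append(rejected, ns)
		}
	}
	if len(rejected) > 0 {
		return fmt.Errorf("namespaces cannot be managed by a clusterset: %s",
			strings.Join(rejected, ", "))
	}
	return nil
}

func main() {
	fmt.Println(RejectProtected([]string{"team-a", "team-b"}))
	fmt.Println(RejectProtected([]string{"team-a", "kube-system"}))
}
```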
| 121 | + |
| 122 | + |
| 123 | +## Design Details |
| 124 | + |
| 125 | +### API Changes |
| 126 | + |
| 127 | +#### ManagedClusterSet API Enhancement |
| 128 | + |
| 129 | +The `ManagedClusterSet` API is extended with a new field to specify namespaces that should be managed across all |
| 130 | +clusters in the set: |
| 131 | + |
| 132 | +```go |
| 133 | +// ManagedNamespaceConfig describes the configuration to manage namespaces on the clusters |
| 134 | +type ManagedNamespaceConfig struct { |
| 135 | + // namespaces is a list of namespaces that will be managed across the clusterset. |
| 136 | + // Each namespace will be created on all clusters belonging to this ManagedClusterSet. |
| 137 | + // +optional |
| 138 | + Namespaces []string `json:"namespaces,omitempty"` |
| 139 | + |
| 140 | + // lifecycleStrategy defines the strategy when a namespace is removed from management, |
| 141 | + // the cluster is removed from the clusterset, or the clusterset is removed. |
| 142 | + // Valid values are: |
| 143 | + // - "Keep": Preserve the namespace and its contents on the managed cluster |
| 144 | + // - "Delete": Remove the namespace and all its contents from the managed cluster |
| 145 | + // +kubebuilder:validation:Enum=Keep;Delete |
| 146 | + // +optional |
| 147 | + LifecycleStrategy string `json:"lifecycleStrategy,omitempty"` |
| 148 | +} |
| 149 | +``` |
| 150 | + |
| 151 | +#### ManagedCluster API Status Enhancement |
| 152 | + |
| 153 | +The `ManagedCluster` status is enhanced to reflect the namespace configuration inherited from associated clustersets: |
| 154 | + |
| 155 | +```go |
| 156 | +type ClusterSet struct { |
| 157 | + // name is the name of the clusterset |
| 158 | + Name string `json:"name,omitempty"` |
| 159 | + |
| 160 | + // namespaces is a list of namespaces that will be managed by the cluster, |
| 161 | + // inherited from the related clusterset. These namespaces will be created |
| 162 | + // and maintained by the registration agent on the managed cluster. |
| 163 | + Namespaces []string `json:"namespaces,omitempty"` |
| 164 | + |
| 165 | + // lifecycleStrategy defines the strategy when a namespace is removed from management, |
| 166 | + // the cluster is removed from the clusterset, or the clusterset is removed. |
| 167 | + // Valid values are "Keep" or "Delete". |
| 168 | + LifecycleStrategy string `json:"lifecycleStrategy,omitempty"` |
| 169 | + |
| 170 | + // conditions are the status conditions of the managed namespace |
| 171 | + Conditions []metav1.Condition `json:"conditions,omitempty"` |
| 172 | +} |
| 173 | +``` |
| 174 | + |
| 175 | +Since a `ManagedCluster` might belong to multiple clustersets, this will be a list in the status of the `ManagedCluster`. |
| 176 | + |
| 177 | +#### Controller Workflow |
| 178 | + |
| 179 | +1. **Hub Controller**: When a user updates the namespace configuration on a `ManagedClusterSet`, the hub controller |
| 180 | + identifies all `ManagedCluster` resources that belong to this clusterset and updates their status to reflect the new namespace configuration. |
| 181 | + |
| 182 | +2. **Registration Agent**: The registration agent on each managed cluster watches for changes in the `ManagedCluster` |
| 183 | + status and reconciles the actual namespace state with the desired configuration. |
| 184 | + |
| 185 | +3. **Conflict Resolution**: When a cluster belongs to multiple clustersets with overlapping namespace configurations, |
| 186 | + the registration agent merges the configurations and tracks the source of each namespace through annotations. |
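The merge in step 3 can be sketched as follows. This is a simplified illustration, not the agent's implementation: `ClusterSetStatus` mirrors only part of the `ClusterSet` status type above, and `Source` and `MergeNamespaces` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// ClusterSetStatus mirrors the relevant fields of the ClusterSet status
// entry in this proposal (conditions omitted).
type ClusterSetStatus struct {
	Name              string
	Namespaces        []string
	LifecycleStrategy string
}

// Source records which clusterset requested a namespace and with which
// lifecycle strategy. Hypothetical helper type.
type Source struct {
	ClustersetName    string
	LifecycleStrategy string
}

// MergeNamespaces builds, for every namespace, the list of clustersets
// that request it. A namespace requested by several clustersets appears
// once, with one Source per clusterset.
func MergeNamespaces(sets []ClusterSetStatus) map[string][]Source {
	merged := map[string][]Source{}
	for _, cs := range sets {
		seen := map[string]bool{} // ignore duplicates within one clusterset
		for _, ns := range cs.Namespaces {
			if seen[ns] {
				continue
			}
			seen[ns] = true
			merged[ns] = append(merged[ns], Source{
				ClustersetName:    cs.Name,
				LifecycleStrategy: cs.LifecycleStrategy,
			})
		}
	}
	// Sort sources so the resulting annotation is stable across reconciles.
	for ns := range merged {
		srcs := merged[ns]
		sort.Slice(srcs, func(i, j int) bool {
			return srcs[i].ClustersetName < srcs[j].ClustersetName
		})
	}
	return merged
}

func main() {
	merged := MergeNamespaces([]ClusterSetStatus{
		{Name: "set1", Namespaces: []string{"shared", "team-a"}, LifecycleStrategy: "Delete"},
		{Name: "global", Namespaces: []string{"shared"}, LifecycleStrategy: "Keep"},
	})
	names := make([]string, 0, len(merged))
	for ns := range merged {
		names = append(names, ns)
	}
	sort.Strings(names)
	for _, ns := range names {
		fmt.Println(ns, merged[ns])
	}
}
```

Keeping one `Source` per clusterset, rather than collapsing to a single strategy, preserves the information the agent needs later when only one of the clustersets stops managing the namespace.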
| 187 | + |
| 188 | +Each namespace will have a label with the key `clusterset.open-cluster-management.io/<hub-hash>`, indicating that the |
| 189 | +namespace is currently managed by an OCM hub. |
| 190 | + |
| 191 | +Each namespace will also have a set of annotations, one per managing hub, recording the clustersets that manage it. |
| 192 | + |
| 193 | +An example of a managed namespace (`hub1` and `hub2` stand in for hub hashes): |
| 194 | + |
| 195 | +```yaml |
| 196 | +apiVersion: v1 |
| 197 | +kind: Namespace |
| 198 | +metadata: |
| 199 | +  name: some-namespace |
| 200 | +  annotations: |
| 201 | +    clusterset.open-cluster-management.io/hub1: "[{clustersetName: global, deletionStrategy: Keep}, {clustersetName: set1, deletionStrategy: Delete}]" |
| 202 | +    clusterset.open-cluster-management.io/hub2: "[{clustersetName: global, deletionStrategy: Keep}]" |
| 203 | +  labels: |
| 204 | +    clusterset.open-cluster-management.io/hub1: "true" |
| 205 | +    clusterset.open-cluster-management.io/hub2: "true" |
| 206 | +``` |
| 207 | +
|
| 208 | +This ensures that information is correctly recorded when there are multiple agents from different hubs and the cluster |
| 209 | +belongs to multiple clustersets. Labels help to quickly filter namespaces that are currently being managed by the hub, |
| 210 | +while annotations contain more detailed information to handle removal. |
| 211 | +
|
| 212 | +### Return status of the managed namespace |
| 213 | +
|
| 214 | +When the registration agent applies a namespace to the managed cluster, it should also report the result in the |
| 215 | +status of the ManagedCluster, using status conditions to indicate: |
| 216 | + |
| 217 | +1. whether the namespace was successfully applied to the cluster. |
| 218 | +2. whether the namespace is in the deleting state. |
| 219 | +
|
| 220 | +### Management Removal |
| 221 | +
|
| 222 | +The management of the namespace is removed when: |
| 223 | +1. The cluster is removed from the clusterset. |
| 224 | +2. The clusterset is deleted. |
| 225 | +3. The cluster is deleted. |
| 226 | +4. The namespace is explicitly removed from the ManagedClusterSet configuration. |
| 227 | +
|
| 228 | +The registration agent needs to check if the related labels or annotations should be removed or updated when the above |
| 229 | +cases occur. |
| 230 | +
|
| 231 | +#### Detailed Removal Process |
| 232 | +
|
| 233 | +1. **Configuration Change Detection**: The registration agent continuously monitors the ManagedCluster status for |
| 234 | + changes in namespace configuration. |
| 235 | +
|
| 236 | +2. **Cleanup Decision**: When a namespace is no longer managed, the agent checks the deletion strategy: |
| 237 | + - **Keep**: Removes OCM-specific labels and annotations but preserves the namespace and its contents |
| 238 | + - **Delete**: Removes the entire namespace and all its contents |
| 239 | +
|
| 240 | +3. **Multi-hub Coordination**: If multiple hubs manage the same cluster, the agent only removes labels and annotations |
| 241 | + specific to the hub that is no longer managing the namespace, preserving management by other hubs. |
| 242 | +
|
| 243 | +4. **Graceful Degradation**: If the hub becomes unavailable, the agent continues to maintain existing namespaces until |
| 244 | + connectivity is restored and new instructions are received. |
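The cleanup decision in steps 2 and 3 could be sketched as a pure function over one hub's annotation entries. The names are hypothetical, and the sketch assumes two rules the proposal does not spell out: a namespace stays managed while any other clusterset of the same hub still references it, and it is never deleted while another hub still labels it.

```go
package main

import "fmt"

// Entry is one element of a hub's annotation on the namespace.
type Entry struct {
	ClustersetName   string
	DeletionStrategy string // "Keep" or "Delete"
}

// Action is what the agent should do with the namespace.
type Action string

const (
	ActionKeepManaging Action = "KeepManaging" // still referenced by this hub
	ActionRelease      Action = "Release"      // strip this hub's label and annotation
	ActionDelete       Action = "Delete"       // delete the namespace
)

// Decide returns the action to take when the clusterset named `removed`
// stops managing the namespace, plus the entries that remain for this
// hub. otherHubs reports whether another hub's label is still present.
func Decide(entries []Entry, removed string, otherHubs bool) (Action, []Entry) {
	var remaining []Entry
	found := false
	strategy := ""
	for _, e := range entries {
		if e.ClustersetName == removed {
			found = true
			strategy = e.DeletionStrategy
			continue
		}
		remaining = append(remaining, e)
	}
	switch {
	case !found:
		// The clusterset never managed this namespace: nothing to do.
		return ActionKeepManaging, entries
	case len(remaining) > 0:
		// Assumption: other clustersets of this hub keep the namespace alive.
		return ActionKeepManaging, remaining
	case strategy == "Delete" && !otherHubs:
		return ActionDelete, nil
	default:
		// "Keep", or "Delete" while another hub still manages the namespace.
		return ActionRelease, nil
	}
}

func main() {
	entries := []Entry{{"global", "Keep"}, {"set1", "Delete"}}
	action, rest := Decide(entries, "set1", false)
	fmt.Println(action, rest)
	action, rest = Decide([]Entry{{"set1", "Delete"}}, "set1", false)
	fmt.Println(action, rest)
}
```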
| 245 | +
|
| 246 | +### Open Questions [optional] |
| 247 | +
|
| 248 | +1. **Cross-cluster resource dependencies**: How should the system handle cases where resources in managed namespaces |
| 249 | + have dependencies across clusters in the same ManagedClusterSet? |
| 250 | +
|
| 251 | +2. **Namespace quotas and limits**: Should the global namespace management include propagation of ResourceQuotas and |
| 252 | + LimitRanges to ensure consistent resource constraints across all clusters? |
| 253 | +
|
| 254 | +3. **Integration with GitOps workflows**: How should this feature integrate with existing GitOps deployments that may |
| 255 | + already manage some of these namespaces? |
| 256 | +
|
| 257 | +4. **Monitoring and observability**: What metrics and events should be exposed to help administrators monitor the health |
| 258 | + and status of global namespace management across their fleet? |
| 259 | +
|
| 260 | +### Test Plan |
| 261 | +
|
| 262 | +Consider the following in developing a test plan for this enhancement: |
| 263 | +- Integration tests should cover the following cases: |
| 264 | + - add multiple namespaces to a clusterset |
| 265 | + - remove a namespace from a clusterset |
| 266 | + - add a cluster to, and remove a cluster from, a clusterset |
| 267 | + - delete a clusterset |
| 268 | + - add a cluster to multiple clustersets |
| 269 | +
|
| 270 | +### Graduation Criteria |
| 271 | +
|
| 272 | +**Note:** *Section not required until targeted at a release.* |
| 273 | +
|
| 274 | +Define graduation milestones. |
| 275 | +
|
| 276 | +These may be defined in terms of API maturity, or as something else. Initial proposal |
| 277 | +should keep this high-level with a focus on what signals will be looked at to |
| 278 | +determine graduation. |
| 279 | +
|
| 280 | +Consider the following in developing the graduation criteria for this |
| 281 | +enhancement: |
| 282 | +
|
| 283 | +- [alpha] |
| 284 | + - namespaces configured in a clusterset are synced to all clusters in the clusterset |
| 285 | + - the clusterset status reports the state of its managed namespaces |
| 286 | +- [beta] |
| 287 | + - a lifecycle strategy can be specified for each namespace |
| 288 | + - the workload namespace is enforced |
| 289 | +- [graduate] |
| 290 | + - performance tests are performed |
| 291 | + - at least 3 consumers |
| 292 | +
|
| 293 | +Clearly define what graduation means by either linking to the [API doc definition](https://kubernetes.io/docs/concepts/overview/kubernetes-api/#api-versioning), |
| 294 | +or by redefining what graduation means. |
| 295 | +
|
| 296 | +In general, we try to use the same stages (alpha, beta, stable), regardless how the functionality is accessed. |
| 297 | +
|
| 298 | +[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions |
| 299 | +[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/ |
| 300 | +
|
| 301 | +### Upgrade / Downgrade Strategy |
| 302 | +
|
| 303 | +Upgrade/Downgrade will not be impacted. |
| 304 | +
|
| 305 | +### Version Skew Strategy |
| 306 | +
|
| 307 | +An agent running an older version ignores the new status field, so it does not manage namespaces on its cluster until it is upgraded. Existing behavior is unaffected. |
| 308 | +
|
| 309 | +## Implementation History |
| 310 | +
|
| 311 | +Major milestones in the life cycle of a proposal should be tracked in `Implementation |
| 312 | +History`. |
| 313 | + |
| 314 | +## Drawbacks |
| 315 | + |
| 316 | +The idea is to find the best form of an argument why this enhancement should _not_ be implemented. |
| 317 | + |
| 318 | +## Alternatives |
| 319 | + |
| 320 | +### Using ManifestWork |
| 321 | + |
| 322 | +We could build a hub controller that creates a ManifestWork to apply the namespaces to each managed cluster |
| 323 | +based on the configuration in the clusterset. The benefit is that we would not need to implement |
| 324 | +label/annotation management for the namespace. The issues with this approach are: |
| 325 | +1. it introduces a dependency of the registration component on the work component |
| 326 | +2. it requires an additional ManifestWork for each cluster. |
| 327 | + |
| 328 | +## Infrastructure Needed [optional] |
| 329 | + |
| 330 | +Use this section if you need things from the project. Examples include a new |
| 331 | +subproject, repos requested, github details, and/or testing infrastructure. |
| 332 | + |
| 333 | +Listing these here allows the community to get the process for these resources |
| 334 | +started right away. |