Commit 0a27c02

Addressed comments
1 parent 8808148 commit 0a27c02

1 file changed

+60
-133
lines changed

enhancements/okd/okd-featureset.md

Lines changed: 60 additions & 133 deletions
@@ -83,10 +83,12 @@ This proposal introduces a new "OKD" feature set to the OpenShift API configurat
8383
#### Scenario 1: Installing a new OKD cluster
8484

8585
1. OKD cluster administrator initiates a cluster installation using `openshift-install` built for OKD
86-
2. The installer automatically sets the feature set to "OKD" in the cluster configuration
87-
3. The cluster installs with the OKD feature set enabled by default
88-
4. The cluster has access to TechPreview features while maintaining upgrade capability
89-
5. The administrator can upgrade the cluster to newer OKD versions without changing the feature set
86+
2. During cluster bootstrap, the Cluster Version Operator (CVO) starts up
87+
3. The CVO detects it is running on an OKD build (via build metadata)
88+
4. If no FeatureGate resource exists, the CVO assumes the OKD feature set
89+
5. If a FeatureGate resource exists with Default ("") feature set, the CVO automatically migrates it to "OKD" by patching the resource
90+
6. The cluster operates with the OKD feature set, providing access to select TechPreview features while maintaining upgrade capability
91+
7. The administrator can upgrade the cluster to newer OKD versions without changing the feature set
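The startup decision in steps 3 to 5 can be sketched as a small pure function. This is only an illustration under stated assumptions: `decideStartupAction`, the `Action` type, and its values are hypothetical names, not the CVO's actual code, and the `isOKD` argument stands in for whatever build-metadata check the CVO performs.

```go
package main

import "fmt"

// FeatureSet mirrors the string-valued feature set names used in this proposal.
type FeatureSet string

const (
	Default FeatureSet = ""
	OKD     FeatureSet = "OKD"
)

// Action is a hypothetical label for what the CVO does at startup.
type Action string

const (
	ActionNone    Action = "none"         // leave the cluster as it is
	ActionAssume  Action = "assume-okd"   // no FeatureGate resource: assume OKD
	ActionMigrate Action = "patch-to-okd" // Default feature set: patch it to OKD
)

// decideStartupAction is an illustrative helper, not the real CVO logic.
// current is nil when no FeatureGate resource exists yet.
func decideStartupAction(isOKD bool, current *FeatureSet) Action {
	if !isOKD {
		return ActionNone
	}
	if current == nil {
		return ActionAssume
	}
	if *current == Default {
		return ActionMigrate
	}
	return ActionNone
}

func main() {
	def := Default
	fmt.Println(decideStartupAction(true, nil))   // OKD build, no FeatureGate resource
	fmt.Println(decideStartupAction(true, &def))  // OKD build, Default feature set
	fmt.Println(decideStartupAction(false, &def)) // non-OKD build
}
```

Keeping the decision separate from the API call that patches the resource makes each branch easy to unit test.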
9092

9193
#### Scenario 2: Attempting to enable OKD feature set on OpenShift
9294

@@ -99,9 +101,9 @@ This proposal introduces a new "OKD" feature set to the OpenShift API configurat
99101

100102
1. OKD cluster administrator has a running cluster with `featureSet: OKD`
101103
2. Administrator attempts to change the feature set to "Default"
102-
3. The API validation rejects the change with error: "OKD may not be changed"
104+
3. The API validation rejects the change with error: "OKD cannot transition to Default"
103105
4. The cluster continues operating with the OKD feature set
104-
5. The administrator must reinstall the cluster if a different feature set is required
106+
5. The administrator must reinstall the cluster if the Default featureset is required
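The rejection in this scenario is enforced by validation rules on the FeatureGate spec. The same check, written as plain Go purely for illustration, might look like the sketch below. `validateTransition` is a hypothetical helper, and it models only the OKD to Default rule described here, not the full set of rules.

```go
package main

import (
	"errors"
	"fmt"
)

// FeatureSet mirrors the string-valued feature set names used in this proposal.
type FeatureSet string

const (
	Default              FeatureSet = ""
	OKD                  FeatureSet = "OKD"
	TechPreviewNoUpgrade FeatureSet = "TechPreviewNoUpgrade"
)

// validateTransition is an illustrative helper that rejects only the
// OKD -> Default change; every other transition is left to other rules.
func validateTransition(oldSet, newSet FeatureSet) error {
	if oldSet == OKD && newSet == Default {
		return errors.New("OKD cannot transition to Default")
	}
	return nil
}

func main() {
	fmt.Println(validateTransition(OKD, Default))
	fmt.Println(validateTransition(OKD, TechPreviewNoUpgrade))
}
```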
105107

106108
#### Scenario 4: Attempting to Change OKD to TechPreviewNoUpgrade or DevPreviewNoUpgrade
107109
1. OKD cluster administrator has a running cluster with `featureSet: OKD`
@@ -156,7 +158,7 @@ And various operator-specific CRDs across multiple API groups.
156158

157159
#### Standalone Clusters
158160

159-
The OKD feature set is specifically designed for standalone OKD clusters and is the primary use case for this enhancement. The feature set will be enabled by default during installation and cannot be changed afterwards.
161+
The OKD feature set is specifically designed for standalone OKD clusters and is the primary use case for this enhancement. The feature set will be enabled by default during installation and can be changed to TechPreviewNoUpgrade or DevPreviewNoUpgrade afterwards.
160162

161163
#### Single-node Deployments or MicroShift
162164

@@ -196,6 +198,24 @@ var AllFixedFeatureSets = []FeatureSet{
196198
}
197199
```
198200

201+
There is also a test to ensure that all featuregates enabled in the Default featureset are also enabled in the OKD featureset:
202+
203+
```go
204+
// Check that all Default featuregates are in OKD
205+
missingInOKD := defaultEnabled.Difference(okdEnabled)
206+
207+
if missingInOKD.Len() > 0 {
208+
	missingList := missingInOKD.List()
209+
	sort.Strings(missingList)
210+
211+
	t.Errorf(
212+
		"ClusterProfile %q: OKD featureset is missing %d featuregate(s) that are enabled in Default:\n - %s\n\nAll featuregates enabled in Default must also be enabled in OKD.",
213+
		clusterProfile,
214+
		missingInOKD.Len(),
215+
		strings.Join(missingList, "\n - "),
216+
	)
217+
}
218+
```
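For readers who want to try the check outside the test suite, here is a standard-library-only sketch of the same set-difference idea. `missingFrom` and the gate names are made up for illustration; the real test works on the set types and featuregate lists in the API repository.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// missingFrom returns, in sorted order, every gate that is enabled in
// required but not enabled in actual. It is an illustrative helper only.
func missingFrom(required, actual map[string]bool) []string {
	var missing []string
	for gate, enabled := range required {
		if enabled && !actual[gate] {
			missing = append(missing, gate)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical gate names, not real featuregates.
	defaultEnabled := map[string]bool{"GateA": true, "GateB": true}
	okdEnabled := map[string]bool{"GateA": true, "GateC": true}

	if missing := missingFrom(defaultEnabled, okdEnabled); len(missing) > 0 {
		fmt.Printf("OKD featureset is missing %d featuregate(s) enabled in Default:\n - %s\n",
			len(missing), strings.Join(missing, "\n - "))
	}
}
```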
199219
#### Validation Rules
200220

201221
The FeatureGate spec includes Kubernetes validation rules:
@@ -211,11 +231,12 @@ This ensures:
211231

212232
#### Platform Detection
213233

214-
The installer and cluster operators must be able to detect whether they are running on OKD vs OpenShift to:
215-
- Automatically enable the OKD feature set during installation on OKD clusters
234+
The CVO must be able to detect whether it is running on OKD vs OpenShift to:
235+
- Automatically enable the OKD feature set on OKD clusters
236+
- Automatically migrate existing OKD clusters from Default to OKD feature set
216237
- Prevent the OKD feature set from being enabled on OpenShift
217238

218-
This detection is typically done through:
239+
This detection is done through the `version.IsOKD()` function in the CVO, which typically checks:
219240
- Build tags during compilation (`scos` for OKD, `ocp` for OpenShift)
220241
- Cluster version metadata
221242
- Installation metadata persisted during cluster creation
@@ -277,6 +298,7 @@ Instead of creating a new OKD feature set, we could modify the TechPreviewNoUpgr
277298
- Documentation and user expectations would be confusing
278299
- API contracts would be violated (TechPreviewNoUpgrade explicitly blocks upgrades)
279300
- Difficult to maintain and reason about platform-specific behavior
301+
- Would require a reinstallation for an upgrade
280302

281303
### Alternative 2: Use CustomNoUpgrade for OKD
282304

@@ -288,6 +310,7 @@ Instead of creating a dedicated OKD feature set, use the existing CustomNoUpgrad
288310
- No clear differentiation between OKD and custom configurations
289311
- Does not provide a default, curated experience for OKD users
290312
- Makes it harder to manage and communicate what features are enabled on OKD
313+
- Would require a reinstallation for an upgrade
291314

292315
### Alternative 3: Create an OKDTechPreview Feature Set
293316

@@ -298,6 +321,16 @@ Create a separate OKDTechPreview feature set in addition to the OKD feature set
298321
- The OKD feature set can already include appropriate TechPreview features
299322
- OKD's role as a community distribution means users expect to adopt new features
300323
- Can be reconsidered in the future if the use case becomes clearer
324+
- May not support upgrades, which could force users to reinstall in order to move to a newer version
325+
326+
### Alternative 4: Create an OKD cluster profile
327+
328+
Instead of creating an OKD feature set, create a new cluster profile that enables certain featuregates in addition to the Default featuregates.
329+
330+
**Rejected because:**
331+
- A cluster profile is outside the scope of our goal
332+
- Implementing a cluster profile is difficult and affects more components than is required
333+
- OKD featureset is a "featuregate profile" rather than a cluster profile
301334

302335
## Open Questions [optional]
303336

@@ -319,54 +352,25 @@ Create a separate OKDTechPreview feature set in addition to the OKD feature set
319352

320353
**Integration Tests:**
321354
- API server correctly rejects attempts to enable OKD on OpenShift clusters
322-
- API server correctly rejects attempts to change from OKD to other feature sets
355+
- API server correctly rejects attempts to change from OKD to Default
323356
- FeatureGate custom resource can be created with OKD feature set on OKD clusters
324357
- Feature gates are correctly applied when OKD feature set is enabled
325358

326-
**E2E Tests:**
327-
- OKD cluster installs successfully with OKD feature set enabled by default
328-
- OKD cluster can be upgraded with OKD feature set enabled
329-
- Features enabled by the OKD feature set function correctly
330-
- OpenShift cluster installation/configuration rejects OKD feature set
331-
332359
**Upgrade Tests:**
333360
- OKD clusters with OKD feature set can upgrade from version N to N+1
334361
- Feature set remains OKD after upgrade
335362
- Feature gates are correctly maintained across upgrades
336363

337364
## Graduation Criteria
338365

339-
The OKD feature set will be introduced as a stable feature, not following the typical Dev Preview -> Tech Preview -> GA progression, because:
340-
341-
1. It is part of the OpenShift API contract
342-
2. The feature set mechanism is already GA
343-
3. This is adding a new value to an existing, stable enum
344-
4. OKD is already a mature distribution
345-
346-
### Initial Release
347-
348-
The OKD feature set will be considered stable when:
349-
350-
- [ ] All API changes are merged to openshift/api
351-
- [ ] CRD manifests are generated for all relevant API resources
352-
- [ ] Validation logic is implemented and tested
353-
- [ ] Installer changes to enable OKD feature set by default are implemented
354-
- [ ] CI jobs for OKD with the new feature set are passing
355-
- [ ] Documentation is updated to describe the OKD feature set
356-
357-
### Ongoing Requirements
358-
359-
- Maintain compatibility with feature gate framework changes
360-
- Keep CRD manifests in sync as new API versions are added
361-
- Document any new features added to the OKD feature set
362-
- Ensure CI continues to test OKD with the feature set enabled
366+
**N/A as this is OKD**
363367

364368
## Upgrade / Downgrade Strategy
365369

366370
### Upgrade Strategy
367371

368372
**OKD Clusters:**
369-
- Existing OKD clusters without the OKD feature set: During the upgrade to the first version supporting the OKD feature set, the feature set should be automatically enabled if the cluster is detected as OKD. This should be handled by the cluster-version-operator or similar component.
373+
- Existing OKD clusters without the OKD feature set: During the upgrade to the first version supporting the OKD feature set, the CVO automatically migrates the Default feature set to OKD during startup. The CVO detects the OKD build via `version.IsOKD()` and patches the FeatureGate resource accordingly.
370374
- OKD clusters with OKD feature set already enabled: No changes needed; the feature set persists across upgrades.
371375
- Upgrades are explicitly supported with the OKD feature set enabled.
372376

@@ -375,12 +379,7 @@ The OKD feature set will be considered stable when:
375379

376380
### Downgrade Strategy
377381

378-
**Downgrading from a version with OKD feature set to a version without:**
379-
- If an OKD cluster with the OKD feature set is downgraded to a version that does not recognize the OKD feature set:
380-
- The cluster-version-operator should handle the unknown feature set gracefully
381-
- Ideally, the cluster should continue operating but may log warnings about the unknown feature set
382-
- This scenario should be tested to ensure it does not break the cluster
383-
- If necessary, clusters may need to be reinstalled rather than downgraded
382+
**Downgrading from a version with OKD feature set to a version without is not supported**
384383

385384
**General downgrade considerations:**
386385
- Downgrades are generally not supported in OpenShift/OKD
@@ -389,51 +388,20 @@ The OKD feature set will be considered stable when:
389388

390389
### Migration Path for Existing OKD Clusters
391390

392-
OKD clusters deployed before the introduction of the OKD feature set will need a migration strategy:
393-
394-
**Option 1: Automatic migration during upgrade**
395-
- Detect OKD clusters using build metadata or cluster version
396-
- Automatically enable the OKD feature set during CVO upgrade
397-
- Log the change clearly for administrator awareness
391+
OKD clusters deployed before the introduction of the OKD feature set will be automatically migrated:
398392

399-
**Option 2: Manual migration**
400-
- Require administrators to manually set the OKD feature set
401-
- Provide clear documentation and tooling
402-
- May be more transparent but requires user action
393+
**Automatic migration during upgrade:**
394+
- The CVO detects OKD clusters using build metadata (`version.IsOKD()`)
395+
- During CVO startup, if the FeatureGate resource has Default ("") feature set, the CVO automatically patches it to "OKD"
396+
- The migration is logged clearly for administrator awareness
397+
- If the patch fails, the CVO logs a warning and continues with Default, retrying on next restart
398+
- After successful migration, the CVO restarts to cleanly apply the new feature set
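The migration flow above, including the fall-back when the patch fails, can be sketched as follows. This is a minimal illustration under stated assumptions: `migrateIfNeeded`, `patchFunc`, and `errPatchFailed` are hypothetical names, and the function passed in stands in for the real API call that patches the FeatureGate resource.

```go
package main

import (
	"errors"
	"fmt"
)

// FeatureSet mirrors the string-valued feature set names used in this proposal.
type FeatureSet string

const (
	Default FeatureSet = ""
	OKD     FeatureSet = "OKD"
)

// patchFunc is a hypothetical stand-in for patching the FeatureGate resource.
type patchFunc func(target FeatureSet) error

// errPatchFailed simulates an API failure in the example below.
var errPatchFailed = errors.New("patch failed")

// migrateIfNeeded returns the feature set to run with and whether a restart
// is needed. On patch failure it keeps Default so the next start can retry.
func migrateIfNeeded(isOKD bool, current FeatureSet, patch patchFunc) (FeatureSet, bool) {
	if !isOKD || current != Default {
		return current, false // nothing to migrate
	}
	if err := patch(OKD); err != nil {
		fmt.Printf("warning: migration to OKD failed, continuing with Default: %v\n", err)
		return Default, false
	}
	fmt.Println("migrated feature set from Default to OKD; restart required")
	return OKD, true
}

func main() {
	succeed := func(FeatureSet) error { return nil }
	fail := func(FeatureSet) error { return errPatchFailed }

	fmt.Println(migrateIfNeeded(true, Default, succeed))
	fmt.Println(migrateIfNeeded(true, Default, fail))
}
```

Returning the effective feature set rather than exiting on error reflects the "log a warning and continue with Default" behavior described in the list.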
403399

404-
**Recommendation:** Implement Option 1 (automatic migration) with clear logging and documentation. This provides the smoothest upgrade experience for OKD users.
400+
This automatic migration provides the smoothest upgrade experience for OKD users without requiring manual intervention.
405401

406402
## Version Skew Strategy
407403

408-
The OKD feature set introduces version skew considerations:
409-
410-
**Control Plane and Node Version Skew:**
411-
- Feature gates are primarily evaluated at the control plane (API server) level
412-
- Nodes respect feature gates propagated through the kubelet configuration
413-
- Standard OpenShift version skew policies apply (e.g., nodes can be N-2 versions behind control plane)
414-
- The OKD feature set does not introduce new version skew constraints beyond existing feature gate behavior
415-
416-
**Component Version Skew:**
417-
- All components must be aware of the OKD feature set enum value
418-
- Components from older versions that do not recognize "OKD" as a valid value may fail validation
419-
- This is mitigated by:
420-
- Synchronizing API changes across all components
421-
- Using CI to test component compatibility
422-
- Following standard OpenShift component versioning
423-
424-
**Operator Version Skew:**
425-
- Operators must handle clusters with the OKD feature set enabled
426-
- Operators should either:
427-
- Explicitly support the OKD feature set
428-
- Treat it equivalently to an appropriate existing feature set (likely Default + TechPreview)
429-
- Gracefully handle unknown feature sets
430-
431-
**API Client Version Skew:**
432-
- Clients using older API definitions may not recognize the OKD feature set value
433-
- This is acceptable as long as:
434-
- Clients do not actively reject unknown enum values
435-
- The API server continues to accept and persist the OKD value
436-
- Clients can read the raw value even if they don't understand it
404+
**We plan to deliver this as part of a single release so there will be no version skew.**
437405

438406
## Operational Aspects of API Extensions
439407

@@ -476,19 +444,13 @@ The OKD feature set itself does not introduce new SLIs, but relies on existing i
476444

477445
### Failure Modes
478446

479-
**Failure Mode 1: Invalid feature set value**
480-
- **Symptom:** API server rejects FeatureGate resource with validation error
481-
- **Impact:** Cluster administrators cannot modify the FeatureGate configuration
482-
- **Detection:** API server logs show validation errors; CLI commands return error messages
483-
- **Recovery:** Correct the feature set value to a valid option (Default, TechPreviewNoUpgrade, CustomNoUpgrade, OKD, or empty string)
484-
485-
**Failure Mode 2: Attempt to change OKD feature set to Default**
447+
**Failure Mode 1: Attempt to change OKD feature set to Default**
486448
- **Symptom:** API server rejects update with error "OKD cannot be transitioned to Default"
487449
- **Impact:** Cluster administrators cannot change the feature set from OKD to default
488450
- **Detection:** API server logs show validation errors; CLI commands return error messages
489451
- **Recovery:** This is expected behavior; the cluster must be reinstalled if the Default featureset is required
490452

491-
**Failure Mode 3: Version skew in component awareness of OKD feature set**
453+
**Failure Mode 2: Version skew in component awareness of OKD feature set**
492454
- **Symptom:** Components fail to start or report degraded status due to unknown feature set value
493455
- **Impact:** Specific operators or components may not function correctly
494456
- **Detection:** Operator logs show errors about unknown feature set; operator conditions show Degraded=True
@@ -539,11 +501,6 @@ oc adm node-logs --role=master --path=kube-apiserver/audit.log | grep -i feature
539501

540502
**Important:** The OKD feature set cannot be disabled once enabled. This is by design to ensure cluster consistency.
541503

542-
**If OKD feature set must be removed:**
543-
1. Back up all critical cluster data and configurations
544-
2. Plan for cluster downtime
545-
3. Reinstall the cluster with the desired feature set
546-
4. Restore applications and data to the new cluster
547504

548505
### Graceful Degradation
549506

@@ -556,40 +513,10 @@ The OKD feature set validation is enforced at the API level:
556513

557514
### Impact on Cluster Health
558515

559-
**When OKD feature set is functioning correctly:**
560-
- No impact on cluster health indicators
561-
- Cluster operates normally with features enabled according to the OKD feature set definition
562-
563-
**When OKD feature set is configured incorrectly:**
564-
- API server prevents invalid configurations through validation
565-
- Cluster continues operating with last known good configuration
566-
- No automatic reconciliation or rollback occurs
567-
- Manual intervention required to correct configuration issues
516+
Stability depends on which features beyond the Default featureset are enabled in the OKD featureset. These features must be chosen carefully so that they do not compromise the stability of the cluster.
568517

569518
## Infrastructure Needed [optional]
570-
571-
**CI Infrastructure:**
572-
- OKD build and test jobs must be updated to expect the OKD feature set by default
573-
- E2E test suites should include scenarios with the OKD feature set enabled
574-
- Upgrade test jobs for OKD should verify feature set persistence
575-
576-
**Build Infrastructure:**
577-
- No changes needed; existing OKD build infrastructure can accommodate this change
578-
- The `scos` build tag will continue to differentiate OKD from OpenShift builds
579-
580-
**Documentation:**
581-
- Update OKD installation documentation to explain the OKD feature set
582-
- Add troubleshooting guides for feature set related issues
583-
- Document the differences between OKD and OpenShift feature sets
584-
- Provide guidance on which features are enabled in the OKD feature set
585-
586-
**Repository Infrastructure:**
587-
- No new repositories required
588-
- Changes are made to existing openshift/api repository and will be vendored to other repos
589-
- The OpenShift Kubernetes repo will need changes
590-
- The Cluster Config Operator repo will need changes to allow the OKD featureset to allow upgrades
591-
- Generated CRD manifests will be committed to the repository
592-
519+
**N/A**
593520
## Implementation History
594521

595522
- 2025-08-12: Initial PR opened (https://github.com/openshift/api/pull/2451)
