247 changes: 247 additions & 0 deletions enhancements/olm/optional-serviceaccount-field.md
@@ -0,0 +1,247 @@
---
title: olmv1-optional-serviceaccount-field
authors:
- "@rashmigottipati"
reviewers:
- "@grokspawn"
- "@joelanford"
- "@trgeiger"
approvers:
- "@joelanford"
api-approvers:
- "@everettraven"
creation-date: 2025-11-25
last-updated: 2025-11-25
tracking-link:
- https://issues.redhat.com/browse/OPRUN-4144
replaces:
superseded-by:
---

# OLMv1: Make ServiceAccount Field Optional in the ClusterExtension API

## Summary

One of the core design principles of OLMv1 is Secure By Default. This enhancement upholds and strengthens that principle while addressing a usability issue.

The proposal is to make the `spec.serviceAccount` field in the ClusterExtension API optional in Tech Preview. For extensions without a ServiceAccount, OLMv1 uses a synthetic identity with zero permissions and relies on Kubernetes impersonation, allowing administrators to explicitly grant the necessary privileges. For extensions with a ServiceAccount, OLMv1 continues to use token-based authentication, preserving backward compatibility.
Contributor

> For extensions with a ServiceAccount, OLMv1 continues to use token based authentication, preserving backward compatibility.

Why can you not switch the existing to impersonation? From a permissions perspective OLM would need to be able to impersonate service accounts. Does it not have that permission today?

Member Author

@rashmigottipati Dec 4, 2025

OLM already has a mechanism to impersonate ServiceAccounts, and this is implemented behind the SyntheticPermissions feature gate. The proposal retained the existing token based auth for backwards compatibility, but from a permissions perspective it's possible to fully switch to impersonation and remove all token based usage.

For now, we will be switching to impersonation only in TP to ensure it works well and is viable before fully switching to impersonation and eliminating the token based route.


## Motivation

Currently, the `spec.serviceAccount` field is required on every `ClusterExtension`, which creates usability and security challenges.

From a usability standpoint, users must understand the exact permissions their extension requires, create a corresponding ServiceAccount, and configure ClusterRoleBindings appropriately. Ensuring that the service account has the correct permissions is a manual and often complex process, leading to failed installations and a frustrating experience, particularly for new users. User feedback indicates a strong preference to avoid configuring RBAC for each ClusterExtension, with some users resorting to granting cluster admin privileges just to satisfy the requirement.
Contributor

How could you make this process less frustrating for users so that the least-privilege approach is not frustrating?

I've reviewed the docs that explain how this is done. It involves many steps including extracting manifests and merging different rules to create the appropriate Roles and ClusterRoles required.

It looks like it could be scripted. Why is a CLI utility (Extension of existing OLM CLI) not on the cards to automate this process and spit out the required least-privilege RBAC for the user?

Member Author

You're right that the existing process can be complex and frustrating for users trying to follow a least privilege approach. One way to make this process less frustrating is by making the spec.serviceAccount field optional and have two sets of extensions:

  • not requiring permissions can run with synthetic identity with zero permissions, removing the need to create SA and configure RBAC
  • requiring permissions, where the cluster admin can grant via RoleBindings and ClusterRoleBindings, or even synthetic user bindings keeping the least privilege design intact

Regarding scripting, I think in the future it would be possible to provide a CLI utility to generate the Roles/ClusterRoles and necessary bindings. It would require redesigning the existing permission model. For now, it's outside the scope of what this EP is trying to solve, as the main focus is to remove the mandatory SA requirement and ensure a safe default.

Contributor

> not requiring permissions can run with synthetic identity with zero permissions, removing the need to create SA and configure RBAC

How can a CE require zero permissions? IIUC the synthetic identity is used to create all of the resources for the CE is it not? So it will need to be able to have permissions for deployments and other resources within the CE bundle?

> Regarding scripting, I think in the future it would be possible to provide a CLI utility to generate the Roles/ClusterRoles and necessary bindings.

If you did this first, do you still believe you'd need this EP?

I think in discussion with Joe I've understood better what we are trying to achieve in this EP since I reviewed it yesterday. But AFAICT this EP currently uses the argument of "RBAC is hard to configure, so folks are doing highly privileged things" as a motivation to implement the synthetic groups.

If you instead were using an argument of "even if RBAC was easy to set up, some folks would still want to apply blanket permissions to a group and not care about individual CEs", I think it would create a stronger argument for why this feature is important


From a security perspective, the complexity of correctly configuring RBAC frequently drives users toward over-privileged solutions, such as granting cluster-admin access to satisfy the installation requirements. This behavior directly conflicts with OLMv1’s principle of being secure by default and introduces unnecessary risk to the cluster.

Before arriving at this design, we evaluated simpler alternatives such as removing the field entirely and implicitly using a cluster-admin service account, or making the field optional with cluster-admin as the default. While these approaches improve ease of use, they conflict with the principle of least privilege.

By making the spec.serviceAccount field optional and introducing synthetic identities with zero permissions by default, this enhancement provides a safer and simpler installation experience. It preserves backward compatibility and still allows the use of custom ServiceAccounts when fine-grained control is needed, aligning usability with the principle of least privilege.
Contributor

Up to this point I'm not seeing how this changes the complexity for an end user.

In the old world:

  • They create a service account
  • They have to create Roles/ClusterRoles and the appropriate bindings to add permissions to the service account
  • They provide the service account to the ClusterExtension to use

The second bullet is the complex part

In the new world:

  • They have to create Roles/ClusterRoles and the appropriate bindings to add permissions to the synthetic service account

The complex part still exists?

Member Author

I completely agree with your statement that the complex part is creating Roles/ClusterRoles and the corresponding bindings. The complex part still exists today, but only for extensions that require permissions, and this proposal does not eliminate that.

What's changing is that not all extensions would be forced to go through that complexity.

So in the old world:
Every extension must:

  • create a service account
  • create Roles/ClusterRoles
  • bind them
  • provide the SA to the ClusterExtension

In the new world:
Only extensions that need permissions require RBAC
So if an extension does not need any permissions, users don't have to do the following:

  • create service account
  • create any RBAC
  • create the necessary bindings
  • provide SA to the extension

Contributor

> Only extensions that need permissions require RBAC

Can you provide an example of an extension that does not require any permissions? I'm struggling to think what that would look like


### User Stories

- As a cluster admin, I want extensions to run with zero privileges by default, so that the cluster remains secure unless I explicitly grant permissions
Contributor

In the current world, I create a service account and bind nothing to it to achieve this, right?

- As an extension author, I want to retain the ability to use custom ServiceAccounts for fine grained RBAC, allowing my extensions to operate with the privileges they need
- As a cluster admin, I want to grant permissions to multiple extensions using group bindings to apply the same permissions to all extensions without a ServiceAccount
Contributor

This user story I'm not quite following. The Why part of the user story is missing, can you expand?

As a <persona>, I would like <something>, so that I can <achieve some goal>

(Nit, none of your user stories follow this format currently)


### Goals

- Make `spec.serviceAccount` optional in Tech Preview
- Ensure extensions run with zero permissions by default when no ServiceAccount is provided, using Kubernetes impersonation
Contributor

How does an extension get installed and run if there are no permissions?

Member Author

OLM impersonates a synthetic identity with zero permissions when the field is unset. So any extensions that don't need API access can be installed and run without any RBAC setup. For extensions that need permissions, users would have to bind RBAC to the synthetic identity.

The extension itself does not need any permissions to be installed and run. OLM handles that.

- Introduce synthetic identities for ClusterExtensions without a ServiceAccount
- Preserve existing behavior (token based approach) when a ServiceAccount is specified, maintaining backward compatibility
- Simplify RBAC management for new users by allowing permissions to be granted via standard ClusterRoleBindings and group bindings
Contributor

This already exists today? This is exactly how users assign permissions to a service account?

What is a group binding?

Member Author

Yes, this already exists today and this is how users assign permissions to a service account.

This EP focuses on simplifying RBAC management in the sense of making it flexible and allowing extensions to run with zero permissions by default without requiring a service account.

A group binding is a standard Kubernetes ClusterRoleBinding or RoleBinding whose subject is a group rather than a user. We will be using olm:clusterextensions as the group, so any ClusterRoleBinding that references this group gives permissions to all the synthetic identities in the group.

- Provide clear documentation and examples for using optional ServiceAccounts, synthetic identities, and RBAC bindings

### Non-Goals

- Removing the ServiceAccount field
- Removing or replacing existing token based authentication
Contributor

Changing this under the hood actually would be a good goal IMO, if we can make it backward compatible. (need to understand how OLM v1 is deployed and what permissions it has before we can continue this, anyone able to link me?)

- Auto generating RBAC based on extension requirements
Contributor

Why?

From what I understand of the complaints from users, this is the biggest part of their frustration and is what leads them to using over-privileged permissions

- Making any changes to stable/GA behavior; this enhancement affects only Tech Preview
Contributor

You still have to consider how you move this EP to GA. We should never assume this is TP only


## Proposal

### How It Works

**When the `spec.serviceAccount` field is unset:**
- OLM impersonates a synthetic identity:
  - **user**: `olm:clusterextension:<ceName>`
  - **group**: `olm:clusterextensions`
Contributor

Ok so this is something we don't have today. This would allow a user to create one binding that applies the same set of permissions to the group only, vs having to bind the role to a number of service accounts.

This is nice, but promotes lazy RBAC management and will mean that cluster extensions are still over-privileged

Member Author

Yup this does not exist as of today.

And yes, with the group binding it's possible for cluster extensions to be over-privileged, as you say. Our recommendation to users would be to create a synthetic user binding for extensions that require specific RBAC. I think the goal here is to provide a safe default while still allowing for finer control where needed.

Contributor

So I'm trying to think through the various cases we might see

  • Admin doesn't care about CE permissions, grants cluster admin to the olm:clusterextensions group. Now every CE is installed as cluster admin on this cluster
  • Admin does care about CE permissions, generates (with a future CLI) the relevant RBAC and applies it for a service account
    • If this is after we switch SAs to impersonation, no SA is actually needed and the SA is synthetic
    • Or they use the olm:clusterextension:<ceName> synthetic identity

Having two ways to do the same thing is generally not desirable, so the plan long term would be to deprecate the service account field and eventually remove it in favour of folks using the olm:clusterextension:<ceName> version?

What if I wanted to share the RBAC between multiple CEs in a namespace? I know that's not recommended, but I assume people still do it?

- The identity has zero permissions by default
Contributor

As does a service account

Member Author

Agreed. The difference here is that with a synthetic identity, extensions that don't need permissions can run without requiring users to create a SA and the necessary RBAC, simplifying the manual steps

- Administrators would explicitly grant permissions via standard Kubernetes RBAC (e.g., `ClusterRoleBinding`).
- No ServiceAccount resource needs to be created
Contributor

This feels like the least of the concerns for OLM users?

Member Author

You're right, it's just a side benefit of this approach


**When the `spec.serviceAccount` field is set:**
- OLM honors it and authenticates via the existing token based mechanism
- The specified `ServiceAccount` must exist and have appropriate RBAC rules
- Tokens are fetched and managed as usual
- Preserves the existing GA behavior and is fully backward compatible with existing extensions

---

### Installation Flows

**Creating an extension without ServiceAccount:**
1. User creates ClusterExtension without specifying `spec.serviceAccount`
2. OLM creates synthetic identity: user `olm:clusterextension:<ceName>`, group `olm:clusterextensions`
3. Installation fails (zero permissions)
4. User creates ClusterRoleBinding granting permissions to the synthetic identity
5. Installation succeeds automatically

**Creating an extension with ServiceAccount:**
1. User creates ServiceAccount with appropriate RBAC
2. User creates ClusterExtension with `spec.serviceAccount` set
3. OLM uses token-based authentication with that ServiceAccount (existing behavior)
4. Installation succeeds using ServiceAccount's permissions

### API Changes

The API structure remains the same, but the field becomes optional only when the SyntheticPermissions feature gate is enabled.
- Upstream exposes the field with omitempty.
- Behavior gating is done via the SyntheticPermissions feature gate.
- OpenShift applies its own TechPreviewNoUpgrade gating on the downstream CRD by adding the `+openshift:enable:FeatureSets=TechPreviewNoUpgrade` annotation to ensure the field is optional only in Tech Preview and remains required in GA.
Contributor

You need to consider what happens when this is GA. Do you think there's any impact to existing users?


```go
// serviceAccount specifies the identity for OLM controller operations.
// Optional in Tech Preview.
//
// When OMITTED: OLM uses synthetic identity "olm:clusterextension:<ceName>"
// with ZERO permissions. Admins grant permissions via ClusterRoleBinding.
Contributor

And rolebindings? Why cluster level? Not always needed surely?

//
// When SET: OLM uses existing token based authentication.
// The ServiceAccount must exist on the cluster with appropriate RBAC.
//
// +optional
ServiceAccount ServiceAccountReference `json:"serviceAccount,omitempty"`
```
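
One detail that may be worth checking against the test plan's JSON serialization item: with Go's `encoding/json`, `omitempty` has no effect on a non-pointer struct field, so an unset `ServiceAccount` value would still be serialized as an empty object. The sketch below uses stand-in types (not the operator-controller types) to show the difference. `valueSpec`, `pointerSpec`, and `marshal` are hypothetical names introduced only for illustration, and the pointer variant is one possible option rather than the proposal's chosen design.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ServiceAccountReference is a stand-in with an assumed single field;
// the real type lives in the operator-controller API package.
type ServiceAccountReference struct {
	Name string `json:"name"`
}

// valueSpec holds the field by value, as in the snippet above.
type valueSpec struct {
	ServiceAccount ServiceAccountReference `json:"serviceAccount,omitempty"`
}

// pointerSpec is a hypothetical variant that holds the field by pointer.
type pointerSpec struct {
	ServiceAccount *ServiceAccountReference `json:"serviceAccount,omitempty"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A zero-valued struct field is still emitted despite omitempty.
	fmt.Println(marshal(valueSpec{})) // {"serviceAccount":{"name":""}}
	// A nil pointer field is omitted.
	fmt.Println(marshal(pointerSpec{})) // {}
}
```

If the field stays a struct value, the "does not include `spec.serviceAccount`" check would presumably need another mechanism (for example a pointer field, or `omitzero` on newer Go versions).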

#### CRD Generation Behavior:
- Default CRD (stable channel): Field remains required.
- Tech Preview CRD (experimental channel): Field becomes optional, with description updated to explain synthetic identity behavior.

#### RBAC Examples

Grant cluster-admin to specific extension:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sample-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: User
  name: "olm:clusterextension:<ceName>"
```

Grant cluster-admin to all extensions in the same group:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sample-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: Group
  name: "olm:clusterextensions"
```

#### Deletion Behavior

- **Extensions with a ServiceAccount:**
  - Existing behavior applies; all resources owned by the CE are deleted, including associated RBAC
Contributor

Why is the service account and RBAC associated to the service account deleted? They are not owned by the ClusterExtension, they are owned by the cluster admin

- **Extensions without a ServiceAccount (synthetic identity):**
  - No actual ServiceAccount exists, and OLM does not delete any RBAC bindings created by the admin. Admins are responsible for removing any ClusterRoleBindings or group bindings associated with synthetic identities if desired.
Contributor

This is how it should be for service accounts too, I'm surprised it isn't


### Implementation

- The synthetic identity behavior relies on the SyntheticPermissions feature gate in operator-controller.
- This gate is Tech Preview and must be enabled for extensions without a ServiceAccount

**When ServiceAccount is unset**, OLM uses Kubernetes impersonation via client-go:

```go
// Synthetic identity
impersonationConfig := rest.ImpersonationConfig{
	UserName: fmt.Sprintf("olm:clusterextension:%s", ce.Name),
	Groups:   []string{"olm:clusterextensions"},
}
```

**When ServiceAccount is set**, OLM continues using the existing token-based authentication mechanism (no code changes for this path).

Controller changes: Add synthetic identity generation for unset case, configure impersonation when ServiceAccount is empty.
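
As a rough illustration of the controller change described above, the following sketch picks between the two paths based on whether a ServiceAccount name is present. It is not the operator-controller implementation: `impersonation` is a local stand-in for client-go's `rest.ImpersonationConfig` so the example runs with the standard library only, and `syntheticIdentity` and `authFor` are hypothetical helper names.

```go
package main

import "fmt"

// impersonation is a local stand-in for client-go's rest.ImpersonationConfig.
type impersonation struct {
	UserName string
	Groups   []string
}

// syntheticGroup is the shared group named in this proposal.
const syntheticGroup = "olm:clusterextensions"

// syntheticIdentity builds the per-extension user and shared group.
func syntheticIdentity(ceName string) impersonation {
	return impersonation{
		UserName: fmt.Sprintf("olm:clusterextension:%s", ceName),
		Groups:   []string{syntheticGroup},
	}
}

// authFor reports whether impersonation should be used. When the
// ServiceAccount name is empty it returns the synthetic identity and true;
// otherwise it returns false, meaning the existing token path applies.
func authFor(ceName, serviceAccountName string) (impersonation, bool) {
	if serviceAccountName != "" {
		return impersonation{}, false
	}
	return syntheticIdentity(ceName), true
}

func main() {
	id, ok := authFor("my-extension", "")
	fmt.Println(ok, id.UserName, id.Groups)

	_, ok = authFor("my-extension", "installer-sa")
	fmt.Println(ok)
}
```

In a real controller the returned values would presumably be copied into the REST config's impersonation settings before building the client used for that extension.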

### Risks and Mitigations

1. Users expect instant installation
   - Mitigate with clear error messages, copy-paste RBAC examples, and documentation
2. Users are unfamiliar with synthetic identities
   - Provide docs support for generating bindings
3. Overly permissive group bindings
   - Document when to use group vs per-extension bindings

### Benefits
This approach satisfies both security and usability: zero-permission default when unset, explicit privilege delegation, simplified installation for new users (one manual step of RBAC binding instead of ServiceAccount + RBAC), and preserves backward compatibility.
Contributor

  • Service accounts have zero permissions by default
  • They require explicit privilege delegation
  • The act of creating a service account and binding to that is not particularly cumbersome when put in scope of the RBAC creation - we appear to be solving the easy part, not the complex part?


### Drawbacks

- Installation requires extra manual step of creating RBAC
- Learning curve for synthetic identities

We choose this approach despite these drawbacks because its advantages outweigh them: it is secure by default, offers better UX than ServiceAccount + RBAC, and simplifies the implementation.
Contributor

I don't think the first two are real benefits given earlier comments.

On simplified implementation, I think we can improve the implementation of the service account token/impersonation story without having to make any breaking changes.


## Test Plan

- Unit tests around ClusterExtension creation for the new behavior to ensure that extensions without a ServiceAccount are correctly handled:
  - Synthetic identity is generated correctly
  - JSON serialization: ClusterExtension without a ServiceAccount does not include `spec.serviceAccount` in the output
- E2E tests that exercise Tech Preview ClusterExtensions without a ServiceAccount:
  - Installation workflow succeeds after creating the appropriate RBAC for the synthetic identity
  - Scenarios with some extensions using ServiceAccounts and the rest using synthetic identities
  - Failure scenarios: installation fails gracefully when RBAC is missing or insufficient

## Graduation Criteria

This enhancement is introduced as Tech Preview. Promotion to GA will require:

- Ability to utilize the enhancement end to end
- End user documentation, relative API stability
- Sufficient e2e and unit test coverage
- Gather feedback from users rather than just developers
- No significant security concerns and bugs discovered during TP

## Upgrade / Downgrade Strategy

**Upgrading to Tech Preview**:
- Existing ClusterExtensions (with ServiceAccount set) continue using token based auth
- ServiceAccount field becomes optional for new ClusterExtensions
- New ClusterExtensions without ServiceAccount use impersonation
- No RBAC changes required for previously installed extensions
- No user action required in the upgrade scenario

**Downgrading from Tech Preview**:
- Before downgrading, all ClusterExtensions that rely on synthetic identities must have a ServiceAccount added
- RBAC bindings must also be applied to ensure extensions have the correct permissions post downgrade
- Without this step, downgrading could break extension installations, as older OLM versions require the `spec.serviceAccount` field to be set
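
The pre-downgrade check described above could be sketched as a simple filter over the installed extensions. This is an illustrative assumption, not a tool the proposal defines: `extension` is a minimal stand-in for a ClusterExtension, and `needsServiceAccount` is a hypothetical helper name.

```go
package main

import "fmt"

// extension is a minimal stand-in for a ClusterExtension.
type extension struct {
	Name               string
	ServiceAccountName string // empty when spec.serviceAccount is unset
}

// needsServiceAccount returns the names of extensions that rely on a
// synthetic identity and therefore need a ServiceAccount before a downgrade.
func needsServiceAccount(exts []extension) []string {
	var names []string
	for _, e := range exts {
		if e.ServiceAccountName == "" {
			names = append(names, e.Name)
		}
	}
	return names
}

func main() {
	exts := []extension{
		{Name: "alpha"},
		{Name: "beta", ServiceAccountName: "beta-installer"},
		{Name: "gamma"},
	}
	fmt.Println(needsServiceAccount(exts))
}
```

Each extension reported this way would need a ServiceAccount and equivalent RBAC bindings in place before the downgrade proceeds.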

## Alternatives

1. Remove SA field, and default to cluster-admin:
   - Delegates full responsibility to OLM as the package manager
   - OLM would manage all ClusterExtensions using its own elevated privileges
   - Even though this solves the delegation use case, it breaks the principle of least privilege as every extension would run with cluster-admin by default
2. Make SA field optional; honor if provided, default to OLM cluster-admin if not provided:
   - Not secure by default
   - Users could unintentionally grant cluster-admin privileges to extensions if they skip setting the SA field because it's optional
3. Keep SA field required, only add impersonation
   - This doesn't solve the usability problem
4. Always use impersonation (even when SA is set)
   - Breaks backward compatibility, requires more extensive migration, and changes existing behavior
Comment on lines +241 to +244
Contributor

These are basically the same right?

Does OLM have the ability (RBAC) to impersonate already? Or is that something that would need to be added?

If it needs to be added, is that something that would require cluster admin intervention, or does CVO handle OLM's RBAC?

Do you have data on how many people are using OLMv1 in the field? If we required a one time RBAC change (create a binding allowing impersonation of a certain service account), how many people would actually be affected?

Member Author

Yes OLM has the ability to impersonate already. It's implemented behind the SyntheticPermissions feature gate.

I don't have the exact stats on how many people are using olmv1. But I will try to find out more about this.
From what I know so far, the adoption appears to be relatively low.

Contributor

Understanding those statistics might help us make informed decisions about what we can and cannot change at this point. If only 0.1% of clusters are affected then we may argue we can make broader/more disruptive changes and help that 0.1% with the change for example

5. Do nothing
   - Poor UX
   - Security issues with binding to privileged SAs remain