43 changes: 43 additions & 0 deletions COMMIT_MSG.txt
@@ -0,0 +1,43 @@
feat(manifests): production overlay for OpenShift (LVM, SCC, credentials, DB init)

Add and wire production overlay resources and patches for deploying the
Ambient Code Platform on OpenShift (e.g. chadsno2026) with LVM Storage,
restricted SCC, and shared PostgreSQL.

Storage (LVM)
- Add PVC patches for backend, postgresql, minio, and ambient-api-server-db
  to use storageClassName: lvms-vg1 (LVM Storage) instead of the default.
- See README-storage.md for LVMCluster and volume setup.

Credentials
- postgresql-credentials: db.host postgresql, db.user/db.password postgres/postgres123, db.name postgres.
- minio-credentials: MinIO credentials for state sync.
- unleash-credentials: DATABASE_URL to shared postgresql/unleash, API tokens,
  default admin password, database-ssl.

ServiceAccounts and SCC (nonroot)
- postgresql-sa, minio-sa, ambient-api-server-db-sa: dedicated SAs for stateful
  workloads so they can use nonroot SCC (no seccomp in patch; nonroot forbids it).
- postgresql-fsgroup-patch, minio-fsgroup-patch: fsGroup and runAsUser for RHEL
  compatibility and volume permissions.
- ambient-api-server-db-scc-patch: runAsUser and securityContext for
  ambient-api-server-db to run under nonroot.
- README-SCC.md: oc adm policy add-scc-to-user nonroot and rollout restart steps.

ambient-api-server
- ambient-api-server-wait-db-patch: add wait-for-db init container (pg_isready
  loop) and keep migration init so API server starts only after DB is ready.

Unleash
- unleash-init-db-patch: init container (RHEL postgresql-16 image) that waits
  for shared PostgreSQL, creates database "unleash" if missing, then verifies
  connectivity to the unleash database (retries) before main container starts.
- kustomization: register credentials resources and all patches (PVC, SA, SCC,
  wait-db, fsgroup, unleash-init-db); postgresql-json-patch removed (use
  postgres:16 with fsGroup/runAsUser instead of RHEL image for shared Postgres).

Docs
- README-vertex.md: Vertex/Google Cloud notes (unchanged from existing content).

Files: 18 changed (README-SCC, README-vertex, *-sa, *-scc-patch, *-wait-db-patch,
kustomization, *-credentials, *-fsgroup-patch, pvc-patch-*, unleash-init-db-patch).
19 changes: 19 additions & 0 deletions components/manifests/overlays/production/README-SCC.md
@@ -0,0 +1,19 @@
# OpenShift Security Context Constraints (SCC)

The **postgresql** and **ambient-api-server-db** deployments run as UID 999, and **minio** runs as UID 1000, so that each can write to its data volume. OpenShift's default **restricted-v2** SCC only allows UIDs in the namespace's assigned range, so these pods must use the **nonroot** SCC.
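
The range check behind that rule can be sketched in a few lines of shell. This is only an illustration: it assumes the range is expressed as `<start>/<size>`, the form normally stored in the namespace annotation `openshift.io/sa.scc.uid-range`, and the sample value and the function name `uid_in_range` are made up for the example.

```bash
# Sketch: decide whether a fixed UID fits a namespace's UID range.
# Assumes the range string has the form "<start>/<size>".
uid_in_range() {
  range="$1"
  uid="$2"
  start="${range%%/*}"   # text before the slash
  size="${range##*/}"    # text after the slash
  [ "$uid" -ge "$start" ] && [ "$uid" -lt $((start + size)) ]
}

sample_range="1000680000/10000"   # hypothetical annotation value
for uid in 999 1000 1000680001; do
  if uid_in_range "$sample_range" "$uid"; then
    echo "UID $uid is inside $sample_range"
  else
    echo "UID $uid is outside $sample_range"
  fi
done
```

To compare against a real cluster, the annotation can be read with `oc get namespace ambient-code -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}'` and passed in place of the sample value.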

## One-time grant (cluster-admin)

After deploying, grant the nonroot SCC to all three service accounts:

```bash
oc adm policy add-scc-to-user nonroot -z postgresql -n ambient-code
oc adm policy add-scc-to-user nonroot -z minio -n ambient-code
oc adm policy add-scc-to-user nonroot -z ambient-api-server-db -n ambient-code
```

Then restart the deployments so pods are recreated with the new SCC:

```bash
oc rollout restart deployment/postgresql deployment/minio deployment/ambient-api-server-db -n ambient-code
```
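
After the restart, it can help to confirm which SCC each pod was actually admitted under. OpenShift records this in the pod annotation `openshift.io/scc`. The helper names below (`report_scc`, `check_pods`) are illustrative and not part of this repository, and the pod names passed to `check_pods` are whatever `oc get pods` shows in your cluster.

```bash
# Sketch: compare the SCC a pod was admitted under with the expected one.
EXPECTED_SCC="nonroot"

# Pure comparison step; takes a pod name and the annotation value.
report_scc() {
  pod="$1"
  scc="$2"
  if [ "$scc" = "$EXPECTED_SCC" ]; then
    echo "OK   $pod admitted under $scc"
  else
    echo "FAIL $pod admitted under ${scc:-<none>} (expected $EXPECTED_SCC)"
    return 1
  fi
}

# Cluster query; requires a logged-in oc session.
# Usage: check_pods <namespace> <pod> [<pod> ...]
check_pods() {
  ns="$1"
  shift
  status=0
  for pod in "$@"; do
    scc="$(oc get pod "$pod" -n "$ns" -o jsonpath='{.metadata.annotations.openshift\.io/scc}')"
    report_scc "$pod" "$scc" || status=1
  done
  return "$status"
}
```

A pod that still reports `restricted-v2` was most likely created before the grant, or its deployment does not reference the dedicated service account.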
76 changes: 76 additions & 0 deletions components/manifests/overlays/production/README-vertex.md
@@ -0,0 +1,76 @@
# Vertex AI (ANTHROPIC_VERTEX_PROJECT_ID) on OpenShift

The production overlay uses Vertex AI by default (`USE_VERTEX=1`). You must set your GCP project ID and provide credentials.

## 1. Set your project ID and region

Patch the operator ConfigMap (replace with your values):

```bash
export KUBECONFIG=~/.kube/kubeconfig-noingress
oc patch configmap operator-config -n ambient-code --type merge -p '{
  "data": {
    "ANTHROPIC_VERTEX_PROJECT_ID": "YOUR_GCP_PROJECT_ID",
    "CLOUD_ML_REGION": "us-central1"
  }
}'
```

- **ANTHROPIC_VERTEX_PROJECT_ID**: Your Google Cloud project ID where Claude is enabled on Vertex AI.
- **CLOUD_ML_REGION**: Vertex AI region (e.g. `us-central1`, `europe-west1`, or `global` for some setups).
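
If you would rather keep the two values in shell variables, the patch payload can be generated instead of hand-edited. The following is a minimal sketch: the function name `vertex_patch_json` is illustrative, and it assumes the values contain no characters that need JSON escaping (GCP project IDs and region names normally do not).

```bash
# Sketch: build the merge-patch payload for operator-config from two values.
vertex_patch_json() {
  project_id="$1"
  region="$2"
  printf '{"data":{"ANTHROPIC_VERTEX_PROJECT_ID":"%s","CLOUD_ML_REGION":"%s"}}\n' \
    "$project_id" "$region"
}

# Print the payload for inspection; nothing is sent to the cluster here.
vertex_patch_json "YOUR_GCP_PROJECT_ID" "us-central1"
```

The output can then be passed to the same command as above, for example `oc patch configmap operator-config -n ambient-code --type merge -p "$(vertex_patch_json YOUR_GCP_PROJECT_ID us-central1)"`, and read back with `oc get configmap operator-config -n ambient-code -o jsonpath='{.data.ANTHROPIC_VERTEX_PROJECT_ID}'`.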

## 2. Create the GCP credentials secret

The operator and runner need a GCP service account key (or Application Default Credentials file) to call Vertex AI.

**Option A – Application Default Credentials (e.g. from your laptop):**

```bash
gcloud auth application-default login
oc create secret generic ambient-vertex -n ambient-code \
  --from-file=ambient-code-key.json="$HOME/.config/gcloud/application_default_credentials.json" \
  --dry-run=client -o yaml | oc apply -f -
```

**Option B – Service account key file:**

```bash
oc create secret generic ambient-vertex -n ambient-code \
  --from-file=ambient-code-key.json=/path/to/your-service-account-key.json
```

The key file must be the JSON for a service account that has Vertex AI (and optionally Model Garden) permissions.
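
Before creating the secret, a quick sanity check on the file can catch a wrong path or the wrong kind of JSON. The sketch below relies on the top-level `"type"` field, which is `service_account` in a service account key and `authorized_user` in the Application Default Credentials file written by `gcloud auth application-default login`. The function names and the demo file contents are made up for illustration, and the `sed` extraction is a rough text match, not a JSON parser.

```bash
# Sketch: read the "type" field of a GCP credentials file without jq.
credential_type() {
  sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Accept only the two credential kinds described in Options A and B.
check_credentials() {
  file="$1"
  kind="$(credential_type "$file")"
  case "$kind" in
    service_account|authorized_user)
      echo "ok: $kind"
      ;;
    *)
      echo "unexpected credential type: ${kind:-missing}" >&2
      return 1
      ;;
  esac
}

# Demo against a throwaway file with a made-up minimal body.
demo_file="$(mktemp)"
printf '{\n  "type": "service_account",\n  "project_id": "example"\n}\n' > "$demo_file"
check_credentials "$demo_file"
rm -f "$demo_file"
```

Run `check_credentials /path/to/your-service-account-key.json` (or the ADC path from Option A) before `oc create secret`; a non-zero exit status means the file is probably not the one you intended.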

## 3. Restart the operator

Restart the operator so that it picks up the updated ConfigMap and uses the new project ID for new sessions:

```bash
oc rollout restart deployment/agentic-operator -n ambient-code
oc rollout status deployment/agentic-operator -n ambient-code --timeout=120s
```

## 4. Verify

- In the UI, create or open a session; it should use Vertex AI for that project.
- Check operator logs:
  `oc logs -l app=agentic-operator -n ambient-code --tail=50 | grep -i vertex`

## Optional: change the default in the overlay

To bake your project ID into the overlay (e.g. for Git-managed deploys), edit:

- `overlays/production/operator-config-openshift.yaml`

Set `ANTHROPIC_VERTEX_PROJECT_ID` and `CLOUD_ML_REGION` to your values, then re-apply the overlay.

## Disable Vertex AI (use direct Anthropic API)

To use an Anthropic API key instead of Vertex:

```bash
oc patch configmap operator-config -n ambient-code --type merge -p '{"data":{"USE_VERTEX":"0"}}'
oc rollout restart deployment/agentic-operator -n ambient-code
```

Then configure `ANTHROPIC_API_KEY` in the UI (Settings → Runner Secrets) per project.
@@ -0,0 +1,10 @@
# ServiceAccount for ambient-api-server-db (PostgreSQL sidecar).
# Grant nonroot SCC so it can run as UID 999 (postgres):
# oc adm policy add-scc-to-user nonroot -z ambient-api-server-db -n ambient-code
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ambient-api-server-db
  labels:
    app: ambient-api-server
    component: database
@@ -0,0 +1,22 @@
# Use dedicated SA (granted nonroot SCC) and a hardened securityContext; no seccompProfile is set (nonroot forbids it).
# After deploy: oc adm policy add-scc-to-user nonroot -z ambient-api-server-db -n ambient-code
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambient-api-server-db
spec:
  template:
    spec:
      serviceAccountName: ambient-api-server-db
      securityContext:
        runAsNonRoot: true
        fsGroup: 999
      containers:
      - name: postgresql
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 999
@@ -0,0 +1,43 @@
# Add init container to wait for ambient-api-server-db before running migration.
# Uses postgres image for pg_isready; DB user from secret (default: ambient).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambient-api-server
spec:
  template:
    spec:
      initContainers:
      - name: wait-for-db
        image: postgres:16
        command:
        - sh
        - -c
        - |
          until pg_isready -h ambient-api-server-db -p 5432 -U ambient 2>/dev/null; do
            echo "waiting for ambient-api-server-db..."
            sleep 2
          done
          echo "db ready"
      - name: migration
        image: quay.io/ambient_code/vteam_api_server:latest
        imagePullPolicy: IfNotPresent
        command:
        - /usr/local/bin/ambient-api-server
        - migrate
        - --db-host-file=/secrets/db/db.host
        - --db-port-file=/secrets/db/db.port
        - --db-user-file=/secrets/db/db.user
        - --db-password-file=/secrets/db/db.password
        - --db-name-file=/secrets/db/db.name
        - --alsologtostderr
        - -v=10
        volumeMounts:
        - name: db-secrets
          mountPath: /secrets/db
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: false
          capabilities:
            drop:
            - ALL
44 changes: 43 additions & 1 deletion components/manifests/overlays/production/kustomization.yaml
@@ -12,21 +12,56 @@ namespace: ambient-code
# Manage this secret separately: oc apply -f github-app-secret.yaml -n ambient-code
resources:
- ../../base
- postgresql-credentials.yaml
- minio-credentials.yaml
- unleash-credentials.yaml
- ambient-api-server-route.yaml
- route.yaml
- backend-route.yaml
- public-api-route.yaml
- unleash-route.yaml
- operator-config-openshift.yaml
- ambient-api-server-db-sa.yaml
- postgresql-sa.yaml
- minio-sa.yaml

# Patches for production environment
# Storage: PVC patches use OpenShift local storage (lvms-vg1). See README-storage.md.
# Unleash: init container to create database (RHEL doesn't support init scripts)
# PostgreSQL: standard postgres:16 with runAsUser 999 + fsGroup 999 (fsgroup patch); RHEL JSON patch removed
patches:
- path: namespace-patch.yaml
  target:
    kind: Namespace
    name: ambient-code
- path: pvc-patch-backend.yaml
  target:
    kind: PersistentVolumeClaim
    name: backend-state-pvc
- path: pvc-patch-postgresql.yaml
  target:
    kind: PersistentVolumeClaim
    name: postgresql-data
- path: pvc-patch-minio.yaml
  target:
    kind: PersistentVolumeClaim
    name: minio-data
- path: pvc-patch-ambient-api-server-db.yaml
  target:
    kind: PersistentVolumeClaim
    name: ambient-api-server-db-data
- path: ambient-api-server-db-scc-patch.yaml
  target:
    group: apps
    kind: Deployment
    name: ambient-api-server-db
    version: v1
- path: ambient-api-server-wait-db-patch.yaml
  target:
    group: apps
    kind: Deployment
    name: ambient-api-server
    version: v1
- path: frontend-oauth-deployment-patch.yaml
  target:
    kind: Deployment
@@ -35,12 +70,19 @@ patches:
  target:
    kind: Service
    name: frontend-service
- path: postgresql-json-patch.yaml
# RHEL postgresql-json-patch removed: RHEL image needs /var/lib/pgsql created as root; use standard postgres:16 (runAsUser 999 + fsGroup 999) instead.
- path: postgresql-fsgroup-patch.yaml
  target:
    group: apps
    kind: Deployment
    name: postgresql
    version: v1
- path: minio-fsgroup-patch.yaml
  target:
    group: apps
    kind: Deployment
    name: minio
    version: v1
- path: unleash-init-db-patch.yaml
  target:
    group: apps
12 changes: 12 additions & 0 deletions components/manifests/overlays/production/minio-credentials.yaml
@@ -0,0 +1,12 @@
# MinIO credentials for production
# Change root-password and secret-key before deploying; the values below are placeholders
apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
type: Opaque
stringData:
  root-user: "admin"
  root-password: "changeme123"
  access-key: "admin"
  secret-key: "changeme123"
21 changes: 21 additions & 0 deletions components/manifests/overlays/production/minio-fsgroup-patch.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# OpenShift: SA (nonroot SCC) + fsGroup + PodSecurity hardening (allowPrivilegeEscalation, capabilities, runAsNonRoot); no seccompProfile (nonroot forbids it)
# Grant nonroot SCC: oc adm policy add-scc-to-user nonroot -z minio -n ambient-code
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  template:
    spec:
      serviceAccountName: minio
      securityContext:
        fsGroup: 1000
      containers:
      - name: minio
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
8 changes: 8 additions & 0 deletions components/manifests/overlays/production/minio-sa.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# ServiceAccount for MinIO (runs as UID 1000).
# Grant nonroot SCC: oc adm policy add-scc-to-user nonroot -z minio -n ambient-code
apiVersion: v1
kind: ServiceAccount
metadata:
  name: minio
  labels:
    app: minio
@@ -0,0 +1,16 @@
# PostgreSQL credentials for production
# Change db.password and ensure unleash-credentials database-url matches
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-credentials
  labels:
    app: postgresql
    app.kubernetes.io/name: postgresql
type: Opaque
stringData:
  db.host: "postgresql"
  db.port: "5432"
  db.name: "postgres"
  db.user: "postgres"
  db.password: "postgres123"
@@ -0,0 +1,21 @@
# OpenShift: SA (nonroot SCC) + fsGroup + PodSecurity hardening (allowPrivilegeEscalation, capabilities, runAsNonRoot); no seccompProfile (nonroot forbids it)
# Grant nonroot SCC: oc adm policy add-scc-to-user nonroot -z postgresql -n ambient-code
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql
spec:
  template:
    spec:
      serviceAccountName: postgresql
      securityContext:
        fsGroup: 999
      containers:
      - name: postgresql
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 999
9 changes: 9 additions & 0 deletions components/manifests/overlays/production/postgresql-sa.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# ServiceAccount for shared PostgreSQL (runs as UID 999).
# Grant nonroot SCC: oc adm policy add-scc-to-user nonroot -z postgresql -n ambient-code
apiVersion: v1
kind: ServiceAccount
metadata:
  name: postgresql
  labels:
    app: postgresql
    app.kubernetes.io/name: postgresql
@@ -0,0 +1,6 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ambient-api-server-db-data
spec:
  storageClassName: lvms-vg1