
feat: automated Vertex AI model discovery with feature-flag-gated availability #667

@jeremyeder

Description


Problem

Adding a new model to ACP requires code changes across 5+ files in 3 components (frontend dropdown, runner model map, tests), plus a CI build and a release. This has happened repeatedly.

IT controls which models are available in our Vertex AI project. When they enable a new model, we shouldn't need a code change to surface it to users. Model availability should be a runtime configuration concern, not a build artifact.

Requirements

1. Automated model discovery

A GitHub Action (daily cron + manual trigger) that probes our Vertex AI project to determine which models are currently accessible. Google does not provide a "list available publisher models" API — the only reliable method is probing inference endpoints with minimal requests:

| Publisher | Endpoint | Available | Not available |
| --- | --- | --- | --- |
| Anthropic (Claude) | `publishers/anthropic/models/{id}:rawPredict` | HTTP 200 or 400 | HTTP 404 |
| Google (Gemini) | `publishers/google/models/{id}:generateContent` | HTTP 200 or 400 | HTTP 404 |
| Google (Imagen) | `publishers/google/models/{id}:predict` | HTTP 200 or 400 | HTTP 404 |

This technique was validated against project gcp-jboyer-san-gemini / region us-east5. Cost is negligible (~$0.001 per full scan). The probe payloads are minimal (5 tokens max output).
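The probe logic above can be sketched roughly as follows. This is a hypothetical sketch, not the GHA implementation: the URL template, payload, and function names are assumptions; only the status-code semantics (200/400 = available, 404 = not enabled) come from the table above.

```python
# Hypothetical sketch of a single-model availability probe.
import json
import urllib.error
import urllib.request

VERTEX_BASE = "https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}"


def probe_url(project: str, region: str, publisher: str, model_id: str, verb: str) -> str:
    """Build the inference endpoint used to probe one model."""
    base = VERTEX_BASE.format(project=project, region=region)
    return f"{base}/publishers/{publisher}/models/{model_id}:{verb}"


def classify(status: int) -> bool:
    """Per the table above: 200/400 mean the model is accessible, 404 means it is not."""
    if status in (200, 400):
        return True
    if status == 404:
        return False
    raise ValueError(f"unexpected status {status}")


def probe(url: str, token: str, payload: dict) -> bool:
    """POST a minimal request (e.g. max 5 output tokens) and classify the response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as e:  # 400 and 404 surface as HTTPError
        return classify(e.code)
```

The GHA would iterate this over the probe manifest and emit an availability report.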

The list of model IDs to probe should be maintained in a single manifest file. When new model families are announced, this file is the only thing that needs updating — and it does not require a release.
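One possible shape for that manifest (fields and model IDs are illustrative, not a committed schema):

```json
{
  "region": "us-east5",
  "models": [
    {"publisher": "anthropic", "id": "claude-sonnet-4", "verb": "rawPredict"},
    {"publisher": "google", "id": "gemini-2.0-flash", "verb": "generateContent"},
    {"publisher": "google", "id": "imagen-3.0-generate-002", "verb": "predict"}
  ]
}
```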

2. Feature-flag-gated model availability

Discovered models should be tied to Unleash feature flags following our existing convention (#653):

  • Flag naming: models.<model-slug>.enabled
  • Tagged scope: workspace so workspace admins can enable/disable per-workspace
  • Newly discovered models should be disabled by default (opt-in, not opt-out)
  • The GHA should auto-create flags for newly discovered models via the Unleash Admin API
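A rough sketch of the auto-creation step. The Admin API path, project name, and tagging shape are assumptions to verify against our Unleash version; the naming convention is the one above from #653.

```python
# Hypothetical sketch: create a disabled flag for a newly discovered model.
import json
import urllib.request


def flag_name(model_slug: str) -> str:
    """Naming convention from #653: models.<model-slug>.enabled"""
    return f"models.{model_slug}.enabled"


def create_flag_payload(model_slug: str) -> dict:
    # Unleash creates new flags disabled, which satisfies the opt-in requirement.
    return {
        "name": flag_name(model_slug),
        "type": "release",
        "description": f"Availability gate for {model_slug} (auto-created by discovery GHA)",
    }


def create_flag(unleash_url: str, admin_token: str, model_slug: str) -> None:
    # Project segment ("default") and header format are assumptions.
    req = urllib.request.Request(
        f"{unleash_url}/api/admin/projects/default/features",
        data=json.dumps(create_flag_payload(model_slug)).encode(),
        headers={"Authorization": admin_token, "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req).close()
```

Tagging the flag with the workspace scope would be a follow-up call, since tags are managed separately from flag creation.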

3. Dynamic model serving (no hardcoded lists)

The frontend model dropdown and runner model-to-Vertex-ID mapping must be driven by runtime configuration, not hardcoded arrays. The backend should serve available models by combining:

  • A model registry (metadata: display name, Vertex AI ID, publisher, sort order)
  • Unleash flag evaluation (is this model enabled for this workspace?)

The frontend and runner consume this API instead of maintaining their own static lists.
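The backend combination step is simple to sketch. Registry fields follow the metadata list above; the `is_enabled` callback stands in for per-workspace Unleash evaluation and is an assumption about the eventual interface.

```python
# Minimal sketch: registry entries filtered by per-workspace flag evaluation.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ModelEntry:
    slug: str          # e.g. "claude-sonnet-4"
    display_name: str
    vertex_id: str     # illustrative; real IDs come from the registry
    publisher: str
    sort_order: int


def available_models(
    registry: List[ModelEntry],
    is_enabled: Callable[[str], bool],  # evaluates a flag for the requesting workspace
) -> List[ModelEntry]:
    """Return only models whose models.<slug>.enabled flag is on, in display order."""
    enabled = [m for m in registry if is_enabled(f"models.{m.slug}.enabled")]
    return sorted(enabled, key=lambda m: m.sort_order)
```

The frontend dropdown and the runner both consume the output of this one code path, so enabling a flag is sufficient to surface a model everywhere.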

4. Zero-release model enablement

After this is implemented, the lifecycle for a new model should be:

  1. IT enables model in Vertex AI → nothing happens yet
  2. GHA runs (daily) → discovers model → creates disabled Unleash flag + updates registry
  3. Admin enables flag in Unleash → model appears in ACP
  4. No code changes. No releases. No PRs.

Scope

  • In scope: Claude models (runner supports these today), plus discovery of Gemini/Imagen/embeddings for future use
  • Out of scope: Runner support for non-Claude models (Gemini, Llama, Mistral) — these should be discovered but flagged off by default
  • Out of scope: Multi-region probing — single configurable region for now
  • Runner constraint: Only Claude models work via Claude Agent SDK. Non-Claude models in the dropdown without runner support would confuse users. Use defaultEnabled: false for non-Claude.

Key Architecture Decisions to Make

  • Where to store model metadata (Vertex ID, display name, publisher): ConfigMap, database, Unleash variants, or a committed JSON file? The registry needs to be updatable without a release.
  • How the GHA authenticates to GCP (service account), Unleash (admin token), and optionally K8s (if updating a ConfigMap directly).
  • How the runner resolves Vertex IDs dynamically — it currently uses a hardcoded dict. Options: fetch from backend API at startup, read from a mounted ConfigMap, or accept the full Vertex ID from the operator.
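To make the first option concrete, here is a sketch of the runner fetching the mapping from the backend at startup, with the current hardcoded dict kept as a backward-compatible fallback. The endpoint path, response shape, and model IDs are assumptions.

```python
# Hypothetical sketch: runner resolves Vertex IDs from the backend at startup.
import json
import urllib.request

# Stand-in for today's hardcoded mapping, retained only as a fallback.
FALLBACK = {"claude-sonnet-4": "claude-sonnet-4@20250514"}  # illustrative IDs


def load_model_map(backend_url: str) -> dict:
    """Fetch slug -> Vertex ID from the backend; fall back if unreachable."""
    try:
        with urllib.request.urlopen(f"{backend_url}/api/models") as resp:
            models = json.load(resp)
        return {m["slug"]: m["vertex_id"] for m in models}
    except OSError:
        return dict(FALLBACK)


def resolve_vertex_id(model_map: dict, slug: str) -> str:
    if slug not in model_map:
        raise KeyError(f"model {slug!r} not in registry")
    return model_map[slug]
```

The ConfigMap-mount option would replace `load_model_map` with a file read; the rest is unchanged, which keeps the decision reversible.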

Acceptance Criteria

  • A GHA runs daily, probes Vertex AI, and reports which models are available
  • New models discovered by the GHA get Unleash flags created automatically (disabled by default)
  • Frontend model dropdown is populated from backend API, not hardcoded
  • Runner resolves model-to-Vertex-ID mapping dynamically, not from a hardcoded dict
  • Enabling a model flag in Unleash makes it appear in the frontend without any code change or deployment
  • Disabling a model flag removes it from the frontend
  • Existing models continue to work throughout the migration (backward compatible)
  • The probe manifest (list of model IDs to check) is a single maintainable file
