feat(go/plugins/vertexai/modelgarden): Support Claude Opus 4.5, Llama, Mistral by cabljac · Pull Request #5175 · genkit-ai/genkit

cabljac · 2026-04-23T09:35:00Z

Extends the Vertex AI Model Garden Go plugin with new Anthropic, Meta Llama, and Mistral / Codestral models, plus a Dev UI sample that exercises them end-to-end.

Models

Anthropic (existing plugin, more models): claude-opus-4-5@20251101, claude-opus-4-1-20250805, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, plus the previously supported Claude 3.5 / 3.7 / Opus 4 / Sonnet 4 entries.
Meta Llama (MaaS, new plugin): meta/llama-4-maverick-17b-128e-instruct-maas, meta/llama-4-scout-17b-16e-instruct-maas, meta/llama-3.3-70b-instruct-maas. Llama 4 variants register as Multimodal; Llama 3.3 70B is text-only. Routed through compat_oai against the Vertex MaaS OpenAI-compatible endpoint.
Mistral / Codestral (MaaS, new plugin): mistral-small-2503, mistral-medium-3, mistral-ocr-2505, codestral-2. No maintained Mistral-GCP Go SDK exists, and Vertex does not serve Mistral via the OpenAI-compatible chat completions endpoint — see Routing notes below.

Plugin scaffolding

New Llama and Mistral plugin types alongside the existing Anthropic type, each with Init, Name, DefineModel(name, opts), and a top-level LlamaModel(g, id) / MistralModel(g, id) lookup, mirroring the existing Anthropic shape.
Shared resolveVertexMaasEnv helper for GOOGLE_CLOUD_PROJECT / GCLOUD_PROJECT / GOOGLE_CLOUD_LOCATION / GOOGLE_CLOUD_REGION resolution.
Shared provider = "vertexai" constant so every modelgarden plugin registers models under the same vertexai/<model> namespace.
Anthropic.DefineModel now takes the same mutex and initted guard as Llama.DefineModel / Mistral.DefineModel, so calling it before Init returns an explicit "not initialized" error instead of using a zero-value client.
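The initialization guard described above can be sketched roughly as follows. This is a minimal illustration only: the plugin type, its fields, and the exact error text are hypothetical stand-ins, not the real Anthropic / Llama / Mistral types.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// plugin is a hypothetical stand-in for a Model Garden plugin type; the
// real types carry a client and more state.
type plugin struct {
	mu      sync.Mutex
	initted bool
	models  map[string]bool
}

// errNotInitialized is an assumed error value; the PR only states that the
// returned error says "not initialized".
var errNotInitialized = errors.New("modelgarden: plugin not initialized")

// Init marks the plugin ready. A second call is treated as an error here.
func (p *plugin) Init() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.initted {
		return errors.New("modelgarden: plugin already initialized")
	}
	p.models = make(map[string]bool)
	p.initted = true
	return nil
}

// DefineModel takes the same lock as Init and refuses to run before it,
// so it can never touch zero-valued state.
func (p *plugin) DefineModel(name string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.initted {
		return errNotInitialized
	}
	p.models[name] = true
	return nil
}

func main() {
	p := &plugin{}
	fmt.Println(p.DefineModel("claude-opus-4-5@20251101")) // error: Init has not run yet
	if err := p.Init(); err != nil {
		fmt.Println("init:", err)
		return
	}
	fmt.Println(p.DefineModel("claude-opus-4-5@20251101")) // <nil>
}
```

Holding one mutex across both methods is what makes the initted check reliable when Init and DefineModel race.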

Routing notes

Vertex's /endpoints/openapi/chat/completions works for Meta Llama but not for Mistral — Vertex returns 400 FAILED_PRECONDITION even with a subscribed project. Mistral is only served at per-model …/publishers/mistralai/models/<id>:rawPredict (and :streamRawPredict for SSE). The rawPredict response is already OpenAI-shaped JSON, so only the URL needs to change.

mistral_transport.go introduces an http.RoundTripper that wraps the oauth2 transport, intercepts outbound /chat/completions requests, reads model + stream from the body, rewrites the URL to the per-model rawPredict (or streamRawPredict) path, and delegates back. The body bytes are preserved (incl. tools, response_format, stream_options), GetBody is set for openai-go retries, and the model id is url.PathEscape-encoded to prevent path/query injection.
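As a rough sketch of the rewriting approach described above (not the code in mistral_transport.go): the names rewriteTransport, rawPredictPath, and roundTripFunc are hypothetical, and the sketch assumes the rawPredict path shares the same /v1/projects/…/locations/… prefix as the chat completions URL. A stub transport stands in for the oauth2 transport, so nothing is sent over the network.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// chatSuffix is the OpenAI-compatible path the sketch intercepts.
const chatSuffix = "/endpoints/openapi/chat/completions"

// rawPredictPath peeks at an OpenAI-style chat body and returns the escaped
// per-model path under root (assumed to be the project/location prefix).
func rawPredictPath(root string, body []byte) (string, error) {
	var peek struct {
		Model  string `json:"model"`
		Stream bool   `json:"stream"`
	}
	if err := json.Unmarshal(body, &peek); err != nil {
		return "", fmt.Errorf("decode chat body: %w", err)
	}
	if peek.Model == "" {
		return "", errors.New("chat body has no model field")
	}
	verb := "rawPredict"
	if peek.Stream {
		verb = "streamRawPredict" // SSE variant
	}
	return root + "/publishers/mistralai/models/" + url.PathEscape(peek.Model) + ":" + verb, nil
}

// rewriteTransport (hypothetical name) wraps an authenticated transport.
type rewriteTransport struct{ base http.RoundTripper }

func (t *rewriteTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.Body == nil || !strings.HasSuffix(req.URL.Path, chatSuffix) {
		return t.base.RoundTrip(req) // not a chat call: pass through untouched
	}
	body, err := io.ReadAll(req.Body)
	req.Body.Close() // close on success and on read error alike
	if err != nil {
		return nil, err
	}
	escaped, err := rawPredictPath(strings.TrimSuffix(req.URL.Path, chatSuffix), body)
	if err != nil {
		return nil, err
	}
	decoded, err := url.PathUnescape(escaped)
	if err != nil {
		return nil, err
	}
	out := req.Clone(req.Context()) // never mutate the caller's request
	out.URL.Path, out.URL.RawPath = decoded, escaped
	out.Body = io.NopCloser(bytes.NewReader(body)) // body bytes are forwarded unchanged
	out.ContentLength = int64(len(body))
	out.GetBody = func() (io.ReadCloser, error) { // lets a client replay the body on retry
		return io.NopCloser(bytes.NewReader(body)), nil
	}
	return t.base.RoundTrip(out)
}

// roundTripFunc adapts a function to http.RoundTripper for the demo below.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

func main() {
	stub := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		fmt.Println("outbound:", r.URL.String())
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody}, nil
	})
	client := &http.Client{Transport: &rewriteTransport{base: stub}}
	body := strings.NewReader(`{"model":"mistral-small-2503","stream":true}`)
	resp, err := client.Post("https://example.invalid/v1/projects/demo/locations/us-central1"+chatSuffix, "application/json", body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	resp.Body.Close()
}
```

Setting both URL.Path and URL.RawPath keeps an escaped model id from being double-encoded when the URL is serialized.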

MistralModels keys are bare ids (e.g. mistral-small-2503), so the openai-go SDK already serialises the form rawPredict expects. MistralModel(g, id) and Mistral.DefineModel(name, opts) strip an optional mistralai/ prefix as a convenience so callers used to publisher-qualified ids keep working.
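The prefix handling could look something like the sketch below. The helper names bareModelID and registryKey are hypothetical; only the mistralai/ prefix and the vertexai/<model> namespace come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	provider        = "vertexai"   // shared namespace for every modelgarden plugin
	publisherPrefix = "mistralai/" // optional prefix callers may pass
)

// bareModelID strips the optional publisher prefix; bare ids pass through.
func bareModelID(id string) string {
	return strings.TrimPrefix(id, publisherPrefix)
}

// registryKey builds the vertexai/<model> name used for lookup.
func registryKey(id string) string {
	return provider + "/" + bareModelID(id)
}

func main() {
	fmt.Println(registryKey("mistralai/mistral-small-2503"))
	fmt.Println(registryKey("codestral-2"))
}
```

Both spellings of an id resolve to the same key, which is what lets publisher-qualified callers keep working.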

Llama is unchanged from the OpenAI-compat path; only Mistral takes the rawPredict detour.

Sample

go/samples/modelgarden registers four flows for Dev UI smoke testing:

opus45Flow — claude-opus-4-5@20251101
llamaFlow — meta/llama-3.3-70b-instruct-maas
mistralFlow — mistral-small-2503
codestralFlow — codestral-2

Each plugin is constructed with an explicit Location because Vertex MaaS regional availability differs per publisher (Anthropic in us-east5, Llama / Mistral in us-central1), so all four flows work simultaneously with just GOOGLE_CLOUD_PROJECT set.

Tests

mistral_transport_test.go — pure in-process tests covering rawPredict / streamRawPredict URL rewrites, publisher-prefix strip, non-chat passthrough, missing-model error, body field preservation (tools, response_format), GetBody population, and URL escaping of malformed model ids.
models_test.go — resolveVertexMaasEnv exercised for explicit args, primary / secondary env fallback, and panic paths.
internal_test.go — DefineModel on Anthropic / Llama / Mistral covered for the nil ai.ModelOptions branch (in addition to the existing pre-Init error path in define_model_test.go).
Coverage on the new code: mistral_transport.RoundTrip 89.7%, peekModelAndStream 100%, resolveVertexMaasEnv 100%. Package coverage rose from 32.6% to 48.8%; remaining 0% functions (Init, Name, model lookups) require GCP credentials and are exercised by the live tests.
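For orientation, the resolution order that models_test.go exercises might be sketched as below. The function name resolveEnv, its signature, and the injected getenv parameter are assumptions made so the sketch can be tested without touching the process environment; the precedence (explicit argument, then the first env var, then the second) is inferred from the order in which the variables are listed above.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns the first non-empty value, or "" if there is none.
func firstNonEmpty(vals ...string) string {
	for _, v := range vals {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveEnv is a hypothetical stand-in for resolveVertexMaasEnv.
// It panics when a value cannot be resolved from any source.
func resolveEnv(getenv func(string) string, project, location string) (string, string) {
	project = firstNonEmpty(project, getenv("GOOGLE_CLOUD_PROJECT"), getenv("GCLOUD_PROJECT"))
	if project == "" {
		panic("modelgarden: project not set; pass it explicitly or set GOOGLE_CLOUD_PROJECT")
	}
	location = firstNonEmpty(location, getenv("GOOGLE_CLOUD_LOCATION"), getenv("GOOGLE_CLOUD_REGION"))
	if location == "" {
		panic("modelgarden: location not set; pass it explicitly or set GOOGLE_CLOUD_LOCATION")
	}
	return project, location
}

func main() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("configuration error:", r)
		}
	}()
	// An explicit location wins over the environment; the project falls
	// back to whatever the environment provides.
	project, location := resolveEnv(os.Getenv, "", "us-central1")
	fmt.Println(project, location)
}
```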

gemini-code-assist

Code Review

This pull request introduces support for Meta Llama and Mistral/Codestral models in the Vertex AI Model Garden plugin by implementing new Llama and Mistral structures that utilize OpenAI-compatible endpoints. It also updates the Anthropic model list, including the addition of Claude Opus 4.5 and the deprecation of older versions. Key feedback includes addressing ineffective pointer receiver assignments in Init methods, resolving undefined provider variables that may cause compilation errors, and mitigating potential panics from risky type assertions when registering models.

cabljac · 2026-04-23T10:20:22Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces support for Meta Llama and Mistral/Codestral models to the Vertex AI Model Garden plugin. It includes new plugin implementations for Llama and Mistral that utilize OpenAI-compatible endpoints, along with corresponding live tests and model definitions. Additionally, the Anthropic model list was updated to include Claude Opus 4.5 and mark Claude 3 Haiku as deprecated. Review feedback identified a missing definition for the provider variable in the new files and a naming inconsistency between the plugin provider and the model namespace that could affect model discovery.

cabljac · 2026-05-08T09:37:06Z

/gemini review

gemini-code-assist

Code Review

This pull request expands the Vertex AI Model Garden plugin by adding support for Meta Llama and Mistral/Codestral models, implemented via OpenAI-compatible endpoints. It also updates the Anthropic model definitions to include Claude Opus 4.5 and marks Claude 3 Haiku as deprecated. Review feedback suggests correcting the capability flags for the Llama 3.3 70B model, which is currently incorrectly marked as multimodal, and refactoring the environment variable resolution logic to eliminate duplication across the new plugin files.

… Mistral support

- Add claude-opus-4-5@20251101 to the existing Anthropic Model Garden list.
- Add a new Llama plugin covering the three Meta Llama MaaS models (llama-4-maverick, llama-4-scout, llama-3.3-70b). The plugin reuses compat_oai.OpenAICompatible with an oauth2-wrapped HTTP client against the Vertex MaaS OpenAI-compatible endpoint.
- Add a new Mistral plugin covering mistral-medium-3, mistral-ocr-2505, mistral-small-2503 and codestral-2. The JS parity path uses the native @mistralai/mistralai-gcp SDK, which has no Go equivalent, so this plugin also routes through the Vertex MaaS OpenAI-compatible endpoint.
- Live tests for each new provider skip unless GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION are set.

… check

Reassigning a pointer receiver inside Init does not update the caller's reference, so the block was a no-op that misleadingly implied nil-safety. Plugins are always registered via a non-nil pointer (genkit.WithPlugins(&Llama{...})), so removing it is safe.
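A small standalone illustration of why such a block is a no-op. The plugin type and the initReassign method are hypothetical and only mimic the shape of the removed code.

```go
package main

import "fmt"

// plugin is a hypothetical stand-in for a plugin struct.
type plugin struct{ Location string }

// initReassign mimics the removed nil check: assigning to the receiver
// rebinds the method's local copy of the pointer, not the caller's variable.
func (p *plugin) initReassign() {
	if p == nil {
		p = &plugin{} // visible only inside this method
	}
	p.Location = "us-central1"
}

func main() {
	var p *plugin
	p.initReassign()      // legal: methods can be called on a nil pointer receiver
	fmt.Println(p == nil) // true: the caller's pointer is still nil
}
```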

… compat_oai namespaces

Move the shared provider = "vertexai" constant out of anthropic.go, where it was coupled to an unrelated plugin, and into models.go alongside the other package-level shared state. Expand the comment to document the shared namespace design, the collision with googlegenai.VertexAI, and the next-major-version TODO.

Also set l.oai.Provider / m.oai.Provider to the shared provider value rather than each plugin's own Name(). The embedded compat_oai is never registered directly, so changing its Provider has no external effect, but it makes compat_oai.ListActions / ResolveAction consistent with the static DefineModel(provider, ...) registrations.

…nv helper, guard Anthropic.DefineModel

- Llama 3.3 70B Instruct is text-only on Vertex MaaS; switch its Supports to BasicText. Llama 4 Maverick/Scout remain Multimodal.
- Centralise resolveVertexMaasEnv in models.go and use it from anthropic.go, llama.go, and mistral.go (Anthropic's inline ~20-line block collapses to one call).
- Anthropic.DefineModel now acquires the mutex and checks initted, matching Llama/Mistral. Calling DefineModel before Init returned a zero-valued client; it now returns a "not initialized" error.
- Add define_model_test.go covering DefineModel-before-Init for all three plugins.

…redict + Dev UI sample

Vertex AI does not serve Mistral via the OpenAI-compatible /endpoints/openapi/chat/completions endpoint that compat_oai targets; that endpoint returns 400 FAILED_PRECONDITION even with a subscribed project. Mistral lives at /publishers/mistralai/models/<id>:rawPredict (or :streamRawPredict for SSE) and the rawPredict response is already OpenAI-shaped, so only the URL needs to change.

Add mistralVertexTransport, an http.RoundTripper wrapping the oauth2 transport. It intercepts /chat/completions requests, reads the model field and stream flag from the body, and rewrites the URL to the per-model rawPredict / streamRawPredict path while leaving the body bytes intact. The inner oauth2 transport still adds the Bearer token, and compat_oai's existing OpenAI-shaped request building, SSE parsing, tool-call handling, and response conversion all keep working.

Re-key MistralModels with bare ids (mistral-small-2503, codestral-2, mistral-medium-3, mistral-ocr-2505) so the body the openai-go SDK emits already carries the bare id that rawPredict expects. MistralModel and Mistral.DefineModel strip an optional "mistralai/" prefix so callers used to publisher-qualified ids keep working without code changes.

Add a Dev UI sample under go/samples/modelgarden that exercises Claude 3.5 Sonnet v2, Claude Opus 4.5, Llama 3.3 70B, Mistral Small 2503, and Codestral 2. The Anthropic flows now pass MaxTokens (required by the Anthropic plugin), and the Claude 3.5 model id includes its required @20241022 stamp.

Tests:
- mistral_transport_test.go: rawPredict and streamRawPredict URL rewrite, publisher-prefix strip, non-chat passthrough, missing-model error, body field preservation (tools, response_format), and GetBody population so openai-go retries replay the request.
- mistral_live_test.go: add bare-id lookup subtest and a streaming subtest covering :streamRawPredict end-to-end.

…rawPredict path

A model id from the request body lands directly in the rawPredict URL path. Without escaping, an id containing "/", "?", or "#" could inject extra path segments, query strings, or fragments into the outbound request. Pass it through url.PathEscape and add a test pinning that behavior.
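To make the risk concrete, here is a small sketch comparing verbatim concatenation with url.PathEscape. The helpers unsafePath, safePath, and parsedQuery are hypothetical, and the host is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
)

const modelsPrefix = "/publishers/mistralai/models/"

// unsafePath concatenates the id verbatim, so "?" and "#" keep their URL meaning.
func unsafePath(id string) string { return modelsPrefix + id + ":rawPredict" }

// safePath escapes the id so it stays a single path segment.
func safePath(id string) string { return modelsPrefix + url.PathEscape(id) + ":rawPredict" }

// parsedQuery reports what a URL parser sees as the query string for path.
func parsedQuery(path string) string {
	u, err := url.Parse("https://example.invalid" + path)
	if err != nil {
		return ""
	}
	return u.RawQuery
}

func main() {
	id := "x?alt=sse#frag" // hostile id carrying a query and a fragment
	fmt.Printf("unsafe query: %q\n", parsedQuery(unsafePath(id)))
	fmt.Printf("safe query:   %q\n", parsedQuery(safePath(id)))
	fmt.Println(safePath(id))
}
```

With the unescaped id the parser sees an injected query string and the :rawPredict suffix ends up in the fragment; with the escaped id the whole value stays inside the path.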

…k edge cases, and nil opts

Adds three cheap unit tests that push package coverage from 32.6% to 48.8% without touching live (credential-gated) paths:
- models_test.go: resolveVertexMaasEnv now exercised for explicit args, primary env fallback, secondary env fallback, and the panic paths when neither project nor location can be resolved. (0% to 100%.)
- mistral_transport_test.go: peekModelAndStream now covered for empty bodies and malformed JSON. (66.7% to 100%.)
- internal_test.go: each plugin's DefineModel exercised on the nil ai.ModelOptions branch. (50-57% to 75-85%.)

Remaining 0% functions (Init, Name, AnthropicModel/LlamaModel/MistralModel) require GCP credentials and are covered by the live tests.

…e Anthropic flow

Vertex AI MaaS regional availability differs per publisher: Claude models live in us-east5 / europe-west4 while Llama and Mistral live in us-central1. The old sample assumed a single GOOGLE_CLOUD_LOCATION env which forced users to pick one region. Pass each plugin its own Location so all four flows work simultaneously with just GOOGLE_CLOUD_PROJECT set.

Drop the second Anthropic flow (claude-sonnet-4-5-20250929) since most projects ship with only Claude Opus 4.5 enabled. Comment documents how to add more variants once they are enabled.

cabljac · 2026-05-11T15:48:30Z

Happy to split out the mistral stuff if needed

…ead error

Previously a failure from io.ReadAll in mistralVertexTransport.RoundTrip returned the error without closing req.Body, leaking the underlying reader. The caller had no way to recover the body because GetBody was not yet set on the rewritten request. Close unconditionally, and add a test using a Reader that returns an error to verify Close runs in that path.
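The close-on-every-path behavior can be sketched in isolation like this. The names readAndClose and failingBody are hypothetical; the real fix lives inside mistralVertexTransport.RoundTrip.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingBody is a test double whose Read always fails and whose Close
// records that it was called.
type failingBody struct{ closed bool }

func (f *failingBody) Read([]byte) (int, error) { return 0, errors.New("read failed") }
func (f *failingBody) Close() error             { f.closed = true; return nil }

// readAndClose reads the whole body and closes it whether or not the read
// succeeds, so a failed read cannot leak the underlying reader.
func readAndClose(body io.ReadCloser) ([]byte, error) {
	defer body.Close()
	return io.ReadAll(body)
}

func main() {
	fb := &failingBody{}
	_, err := readAndClose(fb)
	fmt.Println("error:", err, "closed:", fb.closed)
}
```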

cabljac · 2026-05-11T16:04:18Z

Closing in favour of a split into two stacked PRs for easier review:

feat(go/plugins/vertexai/modelgarden): add Claude Opus 4.5, Llama, shared scaffolding #5296 (compat_oai half): Claude Opus 4.5 model additions, new Llama plugin, shared scaffolding (resolveVertexMaasEnv, provider const, Anthropic.DefineModel guard). Low-risk; pattern-matches the existing Anthropic plugin.
feat(go/plugins/vertexai/modelgarden): add Mistral via per-model rawPredict #5297 (Mistral half; draft, stacked on #5296): new Mistral / Codestral plugin via per-model :rawPredict (Vertex does not serve Mistral via the OpenAI-compatible chat completions endpoint), an http.RoundTripper that rewrites outbound URLs in-flight, and the corresponding sample flows and tests.

Branch feat/go-modelgarden-new-models stays around as the historical record (it is the source the two new branches were cut from).

github-project-automation Bot added this to Genkit Backlog Apr 23, 2026

github-actions Bot added the go label Apr 23, 2026

gemini-code-assist Bot reviewed Apr 23, 2026

cabljac force-pushed the feat/go-modelgarden-new-models branch from 1f71e45 to 93d4ed3 Compare April 23, 2026 10:19

gemini-code-assist Bot reviewed Apr 23, 2026

Comment thread go/plugins/vertexai/modelgarden/llama.go

Comment thread go/plugins/vertexai/modelgarden/llama.go Outdated

gemini-code-assist Bot reviewed May 8, 2026

Comment thread go/plugins/vertexai/modelgarden/models.go

Comment thread go/plugins/vertexai/modelgarden/llama.go Outdated

cabljac added 3 commits May 11, 2026 12:40

cabljac force-pushed the feat/go-modelgarden-new-models branch from a8e4f88 to f41ad13 Compare May 11, 2026 11:41

cabljac added 6 commits May 11, 2026 12:58

chore(go/plugins/vertexai/modelgarden): gofmt llama.go

7fcd03c

This was referenced May 11, 2026

feat(go/plugins/vertexai/modelgarden): add Claude Opus 4.5, Llama, shared scaffolding #5296

Merged

feat(go/plugins/vertexai/modelgarden): add Mistral via per-model rawPredict #5297

Open

cabljac closed this May 11, 2026

github-project-automation Bot moved this to Done in Genkit Backlog May 11, 2026

cabljac wants to merge 10 commits into main from feat/go-modelgarden-new-models
