feat(go/plugins/vertexai/modelgarden): add Claude Opus 4.5, Llama, shared scaffolding#5296

Draft
cabljac wants to merge 3 commits into main from feat/go-modelgarden-compat-oai

Conversation

@cabljac (Contributor) commented May 11, 2026

Extends the Vertex AI Model Garden Go plugin with new Anthropic Claude models, a new Meta Llama plugin, and shared scaffolding used by all modelgarden plugins.

This PR is the first half of the work originally proposed in #5175. The Mistral plugin (which requires a different transport approach because Vertex does not serve Mistral via the OpenAI-compatible endpoint) is stacked on top of this PR in a follow-up.

Models

  • Anthropic (existing plugin, more models): claude-opus-4-5@20251101, claude-opus-4-1-20250805, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, plus the previously supported Claude 3.5 / 3.7 / Opus 4 / Sonnet 4 entries.
  • Meta Llama (MaaS, new plugin): meta/llama-4-maverick-17b-128e-instruct-maas, meta/llama-4-scout-17b-16e-instruct-maas, meta/llama-3.3-70b-instruct-maas. Llama 4 variants register as Multimodal; Llama 3.3 70B is text-only. Routed through compat_oai against the Vertex MaaS OpenAI-compatible endpoint.

Plugin scaffolding

  • New Llama plugin type alongside the existing Anthropic, with Init, Name, DefineModel(name, opts), and a top-level LlamaModel(g, id) lookup mirroring the Anthropic shape.
  • Shared resolveVertexMaasEnv helper for GOOGLE_CLOUD_PROJECT / GCLOUD_PROJECT / GOOGLE_CLOUD_LOCATION / GOOGLE_CLOUD_REGION resolution.
  • Shared provider = "vertexai" constant so every modelgarden plugin registers models under the same vertexai/<model> namespace.
  • Anthropic.DefineModel now uses the same mutex and initted guard as Llama.DefineModel, so calling it before Init returns an explicit "not initialized" error instead of using a zero-value client.

Sample

go/samples/modelgarden registers two flows for Dev UI smoke testing:

  • opus45Flow — claude-opus-4-5@20251101
  • llamaFlow — meta/llama-3.3-70b-instruct-maas

Each plugin is constructed with an explicit Location because Vertex MaaS regional availability differs per publisher (Anthropic in us-east5, Llama in us-central1).

Tests

  • define_model_test.go — pre-Init error path for both plugins.
  • internal_test.go — nil ai.ModelOptions branch of DefineModel for both plugins.
  • models_test.go — resolveVertexMaasEnv covered for explicit args, primary / secondary env fallback, and panic paths.
  • llama_live_test.go — basic + streaming generation against Llama 3.3 70B (credential-gated).
  • anthropic_live_test.go — Opus 4.5 subtest (credential-gated).

Test plan

  • go test ./plugins/vertexai/modelgarden/... (unit, no creds)
  • Live: GOOGLE_CLOUD_PROJECT=… GOOGLE_CLOUD_LOCATION=… go test -v -run 'TestAnthropicLive|TestLlamaLive' ./plugins/vertexai/modelgarden/...
  • Dev UI smoke: cd go/samples/modelgarden && GOOGLE_CLOUD_PROJECT=<project> genkit start -- go run ., then run each flow from Dev UI.

…ared scaffolding

Extends the Vertex AI Model Garden Go plugin with new Anthropic Claude
models and a new Meta Llama plugin, plus shared scaffolding used by all
modelgarden plugins.

Models:
- Anthropic: claude-opus-4-5@20251101, claude-opus-4-1-20250805,
  claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, plus the
  previously supported 3.5 / 3.7 / Opus 4 / Sonnet 4 entries.
- Meta Llama (MaaS, new plugin): meta/llama-4-maverick-17b-128e-
  instruct-maas, meta/llama-4-scout-17b-16e-instruct-maas,
  meta/llama-3.3-70b-instruct-maas. Llama 4 variants register as
  Multimodal; Llama 3.3 70B is text-only. Routed through compat_oai
  against the Vertex MaaS OpenAI-compatible endpoint.

Plugin scaffolding:
- New Llama plugin alongside the existing Anthropic, with Init, Name,
  DefineModel(name, opts), and a top-level LlamaModel(g, id) lookup
  mirroring the Anthropic shape.
- Shared resolveVertexMaasEnv helper for GOOGLE_CLOUD_PROJECT /
  GCLOUD_PROJECT / GOOGLE_CLOUD_LOCATION / GOOGLE_CLOUD_REGION
  resolution.
- Shared provider = "vertexai" constant so every modelgarden plugin
  registers models under the same vertexai/<model> namespace.
- Anthropic.DefineModel now takes the same mutex and initted guard as
  Llama.DefineModel, so calling it before Init returns an explicit
  "not initialized" error instead of using a zero-value client.

Sample:
- go/samples/modelgarden registers an Anthropic and a Llama flow for
  Dev UI smoke testing. Each plugin is constructed with an explicit
  Location because Vertex MaaS regional availability differs per
  publisher (Anthropic in us-east5, Llama in us-central1).

Tests:
- define_model_test.go covers the pre-Init error path for both plugins.
- internal_test.go covers the nil ai.ModelOptions branch of DefineModel
  for both plugins.
- models_test.go covers resolveVertexMaasEnv for explicit args, primary
  and secondary env fallback, and panic paths.
- llama_live_test.go exercises basic + streaming generation against
  meta/llama-3.3-70b-instruct-maas (credential-gated).
- anthropic_live_test.go adds a claude-opus-4-5@20251101 subtest
  (credential-gated).

@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, which leverages OpenAI-compatible endpoints and Google OAuth2 authentication. It also refactors environment variable resolution into a shared helper function used by both the Anthropic and Llama plugins. Additionally, the Anthropic plugin was updated with initialization checks in DefineModel and support for a new model version. Comprehensive unit, white-box, and live tests were added to ensure the reliability of the new features. Feedback was provided regarding the Llama plugin's initialization, specifically suggesting the use of context.Background() for the OAuth2 client to prevent issues with token refreshes if the initial context is short-lived.

Comment thread go/plugins/vertexai/modelgarden/llama.go Outdated
Match the existing per-plugin convention seen in go/samples/anthropic
and go/samples/compat_oai/{anthropic,custom,openai}. The existing
go/samples/modelgarden sample stays single-flow Anthropic-only; a new
go/samples/modelgarden-llama sample exercises the new Llama plugin in
isolation.
@cabljac cabljac marked this pull request as draft May 11, 2026 16:25
…a oauth2 client

The oauth2.NewClient and TokenSource outlive Init's ctx because every
later generate call goes through them. If a caller passes a short-lived
ctx to Init (e.g. one with a timeout, or one cancelled after plugin
setup), token refresh on later calls would fail with the original ctx
cancelled. Bind both to context.Background().

Reported by gemini-code-assist on #5296.
@cabljac (Contributor, Author) commented May 11, 2026

/gemini review


@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, leveraging OpenAI-compatible endpoints, and refactors the Anthropic plugin to share environment variable resolution logic. It also adds initialization guards and mutex locking to model definition methods. Feedback highlights a potential issue with the Llama baseURL construction regarding missing publisher segments and suggests using safe type assertions when registering models to prevent potential panics.

Comment thread go/plugins/vertexai/modelgarden/llama.go
Comment thread go/plugins/vertexai/modelgarden/llama.go
@cabljac (Contributor, Author) commented May 12, 2026

/gemini review


@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, leveraging OpenAI-compatible endpoints, and refactors environment variable resolution into a shared utility used by both Llama and Anthropic plugins. Additionally, it updates the Anthropic model list with Claude 4.5 Opus and adds several test suites and a usage sample. Feedback suggests improving the Llama plugin's initialization by using context.WithoutCancel to preserve context values and wrapping errors with more descriptive information for better diagnosability.

Comment on lines +74 to +78
ts, err := google.DefaultTokenSource(context.Background(), "https://www.googleapis.com/auth/cloud-platform")
if err != nil {
	panic(fmt.Errorf("modelgarden llama: obtaining default Google token source: %w", err))
}
httpClient := oauth2.NewClient(context.Background(), ts)

Severity: medium

Using a non-cancelled context for the DefaultTokenSource and oauth2.NewClient is appropriate as these objects must outlive the Init context. To better align with project standards, consider using context.WithoutCancel(ctx) instead of context.Background(). This ensures that while the context is not cancelled, it still carries potentially useful values (like trace IDs) from the parent context. Additionally, ensure any errors from these calls are wrapped with contextual information (e.g., 'failed to initialize vertex modelgarden client') to improve diagnosability.

References
  1. When performing critical background operations that must survive client disconnection, use a context that is not cancelled (e.g., context.WithoutCancel) and wrap resulting errors with contextual information.
