test(agents): added agent conformance spec and js harness by pavelgj · Pull Request #5256 · genkit-ai/genkit

pavelgj · 2026-05-07T00:56:56Z

No description provided.

Upgrade postcss from 8.4.31 to 8.5.12 across all workspaces and add new markdown/MDX-related dependencies (remark, rehype, unified, etc.) to the lockfile.

…lify workspaceAgent

- Rename `simple-agent` to `custom-agent` and update all imports, routes, and log messages accordingly
- Refactor `workspace-builder.ts` to use `defineAgent` instead of `defineCustomAgent`, leveraging the standard agent API for model calls, tool dispatch, streaming, and message management
- Extract `emitArtifact` tool to module scope using `ai.defineTool`

Add new type exports for `agent` and `agents-conformance` modules to the common types barrel file.

…ance tests

Introduce Phase 2 of agent conformance testing with `defineCustomAgent` and four new custom agents (blocking, failing, withArtifacts, withCustomState). Add support for detach/background execution, abort, artifact streaming/deduplication, and custom state persistence.

- Add `defineCustomAgent` API in session-flow.ts for agents with fixed deterministic logic (no programmable model needed)
- Implement artifact support with `addArtifact` on session and streaming via `agentArtifacts` chunks
- Add custom state read/write (`getCustomState`/`setCustomState`) persisted across invocations
- Add detach support (`detach` flag, `waitUntilCompleted` helper) for background agent execution
- Add abort support to cancel pending agent snapshots
- Extend test spec YAML with 10 new tests (16 total) covering detach, abort, artifacts, and custom state categories
- Update conformance testing docs with custom agent table and test coverage summary

- Update expectChunks to semi-strict type-aware matching logic
- Preserve null values in deepStrip to distinguish null vs absent
- Fix abort expectPreviousStatus to support YAML null (~) correctly
- Add stateContains message subsequence assertions to test specs
- Update conformance testing docs to reflect matching semantics
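The null-versus-absent distinction in `deepStrip` can be illustrated with a minimal sketch. The helper's implementation is not shown on this page, so `deepStripSketch` below is a hypothetical stand-in that assumes "stripping" means dropping `undefined`-valued keys while keeping explicit `null` values.

```typescript
// Hypothetical stand-in for the harness's deepStrip helper (not the PR's code).
// Assumption: stripping removes keys whose value is undefined, recursively,
// while explicit null values are kept so that null and absent stay distinct.
type Json =
  | null
  | boolean
  | number
  | string
  | Json[]
  | { [key: string]: Json | undefined };

function deepStripSketch(value: Json | undefined): Json | undefined {
  if (value === null || typeof value !== "object") {
    return value; // primitives, null, and undefined pass through unchanged
  }
  if (Array.isArray(value)) {
    return value.map((item) => deepStripSketch(item) as Json);
  }
  const out: { [key: string]: Json } = {};
  for (const [key, child] of Object.entries(value)) {
    if (child === undefined) continue; // absent: drop the key entirely
    out[key] = deepStripSketch(child) as Json; // null survives this step
  }
  return out;
}

const stripped = deepStripSketch({
  status: null,
  error: undefined,
  nested: { a: 1, b: undefined },
});
console.log(JSON.stringify(stripped)); // {"status":null,"nested":{"a":1}}
```

With this behavior, an expectation of `status: null` can be told apart from an expectation that says nothing about `status` at all.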

Add `errorContains` snapshot assertion for subset-matching on `snapshot.error`. Introduce three new conformance tests: server-managed state ignoring init state, pure detach without payload, and failed snapshot error details. Update test count from 16 to 19 in docs.
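Subset-matching on `snapshot.error` can be sketched as a recursive check that every field in the expectation also appears in the actual value. The spec's exact rules are not shown here, so `containsSubset` and the sample error shape are hypothetical, and the length-sensitive array comparison is an assumption.

```typescript
// Hypothetical sketch of subset matching for an `errorContains`-style assertion.
// Assumption: every key in `expected` must exist in `actual` with a matching
// value; extra keys in `actual` are ignored. Not the PR's implementation.
function containsSubset(actual: unknown, expected: unknown): boolean {
  if (expected === null || typeof expected !== "object") {
    return actual === expected; // primitives and null must match exactly
  }
  if (actual === null || typeof actual !== "object") {
    return false;
  }
  if (Array.isArray(expected)) {
    if (!Array.isArray(actual) || actual.length !== expected.length) {
      return false;
    }
    const actualItems: unknown[] = actual;
    return expected.every((item, i) => containsSubset(actualItems[i], item));
  }
  const actualRecord = actual as Record<string, unknown>;
  return Object.entries(expected as Record<string, unknown>).every(
    ([key, value]) => key in actualRecord && containsSubset(actualRecord[key], value)
  );
}

// Illustrative error payload; the field names are made up for this example.
const snapshotError = {
  status: "INTERNAL",
  message: "boom",
  details: { retryable: false },
};
console.log(containsSubset(snapshotError, { status: "INTERNAL" })); // true
console.log(containsSubset(snapshotError, { status: "NOT_FOUND" })); // false
```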

Standardize terminology throughout the agent conformance testing documentation and tooling by replacing "invocations" with "steps" to better describe the ordered sequence of operations in test cases.

gemini-code-assist

Code Review

This pull request establishes a comprehensive Agent Conformance Testing framework, introducing a shared YAML specification, detailed documentation, and a reference JavaScript test harness. It includes new Zod schemas for the agent wire protocol and conformance test format to ensure cross-language compatibility. Feedback focuses on improving maintainability by reducing schema duplication across packages and enhancing type safety by replacing 'any' types with more specific schemas or 'unknown' in both the core types and the new translator test application.

I am having trouble creating individual review comments. Click here to see my feedback.

js/ai/tests/agents_spec_test.ts (54-117)

These Zod schemas are duplicated from genkit-tools/common/src/types/agents-conformance.ts. The comment on line 51 explains this is because js/ai does not depend on genkit-tools/common. This duplication is a significant maintainability risk, as changes in the canonical schemas can be missed here, leading to inconsistencies and test failures.

Consider refactoring the package dependencies to allow js/ai to import these schemas directly from genkit-tools/common. This would make the test harness more robust and easier to maintain.

genkit-tools/common/src/types/agent.ts (90)

The status property is typed as z.any(), which is very permissive and reduces type safety. If the structure of status is known, it should be defined with a more specific schema. If it's truly unknown, z.unknown() is a safer alternative to z.any() as it forces validation on consumers. Since this property doesn't seem to be used in the current test spec, now would be a good time to improve its type definition for future use.

  status: z.unknown().optional(),
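The trade-off behind this suggestion can be shown without Zod, since `z.any()` infers to TypeScript's `any` and `z.unknown()` infers to `unknown`. The sketch below uses plain interfaces; the type names and the string-valued `status` are hypothetical, chosen only for illustration.

```typescript
// Plain-TypeScript illustration of the any-vs-unknown trade-off; no Zod import.
// The shapes below are hypothetical and not taken from agent.ts.
interface WithAnyStatus {
  status?: any;
}
interface WithUnknownStatus {
  status?: unknown;
}

function describeLoose(s: WithAnyStatus): string {
  // Compiles without complaint, but throws at runtime if status is not a string.
  return s.status.toUpperCase();
}

function describeChecked(s: WithUnknownStatus): string {
  // Calling s.status.toUpperCase() directly would not compile here:
  // an unknown value has to be narrowed before it can be used.
  return typeof s.status === "string" ? s.status.toUpperCase() : "UNRECOGNIZED";
}

console.log(describeChecked({ status: "done" })); // DONE
console.log(describeChecked({ status: 42 })); // UNRECOGNIZED

try {
  describeLoose({ status: 42 });
} catch (err) {
  console.log("loose version failed at runtime:", (err as Error).message);
}
```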

js/testapps/agents/web/src/pages/Translator.tsx (71)

Casting the result of runFlow to any bypasses TypeScript's type safety. This can lead to runtime errors if the API response structure changes. It would be safer to define a type for the expected response and then validate or cast to that specific type. For example, you could use a Zod schema to parse the response.
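One way to act on this is to validate the untyped result before using it. The sketch below swaps the suggested Zod schema for a hand-written type guard to stay dependency-free; the `TranslatorResponse` shape and its `translation` field are assumptions, since the actual response structure is not shown in the review.

```typescript
// Hypothetical response shape for the translator flow; the real fields are not
// shown in the review, so `translation` is an assumption for illustration.
interface TranslatorResponse {
  translation: string;
}

// Hand-written guard standing in for a Zod schema's parse step.
function isTranslatorResponse(value: unknown): value is TranslatorResponse {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).translation === "string"
  );
}

function parseTranslatorResponse(value: unknown): TranslatorResponse {
  if (!isTranslatorResponse(value)) {
    throw new Error("Unexpected translator response shape");
  }
  return value;
}

// `raw` stands in for the untyped result of the flow call.
const raw: unknown = JSON.parse('{"translation":"bonjour"}');
console.log(parseTranslatorResponse(raw).translation); // bonjour
```

A response that changes shape then fails loudly at the boundary instead of surfacing later as an undefined field in the UI.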

Add `expectError` assertion to the agent test spec runner, allowing tests to verify that an invocation throws an expected error message. Update the "server-managed agent ignores init state" test to instead verify that sending `state` to a server-managed agent throws a FAILED_PRECONDITION error, rather than silently ignoring it.
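An `expectError`-style check can be sketched as a helper that runs a step and fails unless it throws a message containing the expected text. The runner's real API is not shown on this page, so `expectErrorSketch`, `runStep`, and the sample error text are hypothetical.

```typescript
// Hypothetical sketch of an `expectError`-style check; `runStep` and the error
// text are placeholders, not the harness's real API.
async function expectErrorSketch(
  runStep: () => Promise<unknown>,
  expectedSubstring: string
): Promise<void> {
  let didThrow = false;
  let thrown: unknown;
  try {
    await runStep();
  } catch (err) {
    didThrow = true;
    thrown = err;
  }
  if (!didThrow) {
    throw new Error(
      `expected an error containing "${expectedSubstring}", but the step succeeded`
    );
  }
  const message = thrown instanceof Error ? thrown.message : String(thrown);
  if (!message.includes(expectedSubstring)) {
    throw new Error(
      `expected error to contain "${expectedSubstring}", got "${message}"`
    );
  }
}

// Stand-in for sending `state` to a server-managed agent.
const sendStateToServerManaged = async (): Promise<never> => {
  throw new Error("FAILED_PRECONDITION: state is not accepted here");
};

expectErrorSketch(sendStateToServerManaged, "FAILED_PRECONDITION").then(() =>
  console.log("assertion passed")
);
```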

Remove the optional `newSnapshotId` property from `AgentInitSchema` as it is no longer needed in the agent initialization configuration.

Update the "abort completed agent" test to expect that a snapshot in the "done" state remains "done" after an abort request. Terminal states (done, failed, aborted) cannot be overridden — only "pending" can transition to "aborted".
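The transition rule described above can be written as a small function. This is a sketch of the stated rule only; `SnapshotStatus` and `statusAfterAbort` are hypothetical names, and the status list is limited to the four values mentioned in the text.

```typescript
// Statuses named in the PR text; "pending" is the only non-terminal one here.
type SnapshotStatus = "pending" | "done" | "failed" | "aborted";

// Hypothetical helper: returns the status a snapshot has after an abort request.
// Terminal statuses are returned unchanged; only "pending" becomes "aborted".
function statusAfterAbort(current: SnapshotStatus): SnapshotStatus {
  return current === "pending" ? "aborted" : current;
}

for (const status of ["pending", "done", "failed", "aborted"] as const) {
  console.log(`${status} -> ${statusAfterAbort(status)}`);
}
// pending -> aborted
// done -> done
// failed -> failed
// aborted -> aborted
```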

pavelgj added 12 commits May 6, 2026 20:47

feat: implemented session flow

0dfa9b5

chore(js): update pnpm-lock.yaml with postcss and mdx dependencies

efb739f

renames

aae12cd

samples

92e2054

remove inputvars

bb44dcf

feat: export agent and agents-conformance types from common types

f659f43

fmt and type fixes

fe12606

docs: rename "invocations" to "steps" in agent conformance spec

29b3326

github-project-automation Bot added this to Genkit Backlog May 7, 2026

github-actions Bot added the docs, js, tooling, config, and test labels May 7, 2026

pavelgj added 3 commits May 6, 2026 20:58

undo

41768bd

undo

8c92fb0

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

12bc777

gemini-code-assist Bot reviewed May 7, 2026

pavelgj added 8 commits May 6, 2026 21:07

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

cea9ee9

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

a2577df

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

b7926df

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

ae5425e

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

4d1bfd1

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

1b4184a

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

96c749b

pavelgj added 17 commits May 8, 2026 19:24

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

33bc652

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

8b17625

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

ce35692

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

ac27b49

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

275031d

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

87cad90

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

7d892a6

resume/respond/restart improvements

82f3626

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

af544a0

s/session flow/agent/g

30d9521

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

0610a3c

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

454f658

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

9d416de

feat: remove unused newSnapshotId field from AgentInitSchema

361946c

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

68169f0

Merge branch 'pj/agents-sample' into pj/agents-conformance-tests

aab83af

pavelgj wants to merge 40 commits into pj/agents-sample from pj/agents-conformance-tests