
testing: Acceptance Criteria Tests #474

@josephjclark

Establish an effective suite of acceptance criteria tests.

These tests are designed to be reviewed manually by a product stakeholder or by an LLM. The results may be sent to langfuse for analysis.

They focus on the quality and style of answers to key questions. They should be easy for a Joe or a Brandon to audit, to ensure the quality and voice of the AI Assistant throughout development (particularly important after model version updates).

Here are the principles of acceptance criteria tests:

  • They are implemented as HTTP requests against the bun server, but this is not surfaced
  • They include live model calls
  • They are likely evaluated by an LLM to determine pass/fail status
  • Test suites must be richly defined and easily evaluated. Tests might be markdown files, for example, with a question, some data, and a set of natural language assertions
  • When designing new AI features, the product owner may specify some hero questions/conversations to drive development. Those questions would make good acceptance criteria tests
  • We may want the results to be checked into git so that responses can be compared
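
To make the principles above concrete, here is a minimal sketch of how a markdown-defined test could be parsed, sent to the server, and prepared for LLM evaluation. Everything specific in it is an assumption for illustration: the `# Question` / `# Data` / `# Assertions` file layout, the `/chat` path, the request payload shape, and the function names are hypothetical and not part of any agreed design.

```typescript
// Hypothetical shape of one acceptance test parsed from a markdown file.
interface AcceptanceTest {
  question: string;
  data: string;
  assertions: string[];
}

// Parse a markdown file that uses "# Question", "# Data" and "# Assertions"
// headings. This layout is an assumption, not an agreed format.
function parseAcceptanceTest(markdown: string): AcceptanceTest {
  const sections = new Map<string, string[]>();
  let bucket: string[] | null = null;
  for (const line of markdown.split("\n")) {
    const heading = /^#\s+(.+)$/.exec(line);
    if (heading) {
      // Start collecting lines under a new section.
      bucket = [];
      sections.set((heading[1] ?? "").trim().toLowerCase(), bucket);
    } else if (bucket) {
      bucket.push(line);
    }
  }
  const text = (name: string) => (sections.get(name) ?? []).join("\n").trim();
  // Each bullet under "# Assertions" is one natural language assertion.
  const assertions: string[] = [];
  for (const line of sections.get("assertions") ?? []) {
    const item = /^\s*[-*]\s+(.*)$/.exec(line);
    if (item) assertions.push((item[1] ?? "").trim());
  }
  return { question: text("question"), data: text("data"), assertions };
}

// Send the question to the server over HTTP. The "/chat" path and the
// payload fields are hypothetical placeholders for the real endpoint.
async function askAssistant(baseUrl: string, test: AcceptanceTest): Promise<string> {
  const res = await fetch(`${baseUrl}/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content: test.question, context: test.data }),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return await res.text();
}

// Build the prompt an LLM evaluator could use to judge each assertion.
function buildJudgePrompt(test: AcceptanceTest, answer: string): string {
  const checks = test.assertions.map((a, i) => `${i + 1}. ${a}`).join("\n");
  return [
    `Question:\n${test.question}`,
    `Answer:\n${answer}`,
    `For each assertion, reply PASS or FAIL with a one-line reason:\n${checks}`,
  ].join("\n\n");
}
```

Keeping parsing and prompt construction as pure functions means they can be checked without a live model call. Only `askAssistant` touches the network, and its raw response could be written to a file if results are to be checked into git.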


Status

Done
