@thpierce
Contributor

Fixes

Summary

feat(evals): enhance framework with validation, discovery, and reliability improvements

Changes

  • Add recursive task file discovery for nested task directories
  • Implement ToolCallValidator for verifying tool call sequences
  • Add verbose mode with captured data output and validation reasoning
  • Fix git diff captor to use correct working directory
  • Enhance tool call logging with input parameters
  • Configure Bedrock client with connection pooling and retries
  • Add file tools constant for validator filtering
  • Update documentation with new command structure
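For readers unfamiliar with the first item, recursive task file discovery can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `discover_task_files` and the `*_tasks.py` naming pattern are assumptions made for the example.

```python
from pathlib import Path
from typing import List


def discover_task_files(root: Path, pattern: str = "*_tasks.py") -> List[Path]:
    """Return every file under `root`, at any depth, whose name matches `pattern`.

    Both the function name and the default pattern are hypothetical; the
    framework's real naming convention may differ.
    """
    if not root.is_dir():
        # A missing or non-directory root yields no tasks instead of raising.
        return []
    # Path.rglob descends into nested directories; sorting keeps the
    # discovery order deterministic across runs and platforms.
    return sorted(p for p in root.rglob(pattern) if p.is_file())
```

Calling `discover_task_files(Path("tasks"))` would then pick up matching files in `tasks/` and in any nested subdirectory, which is the behavior the first bullet describes.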

User experience

N/A

Checklist

If an item doesn't apply to your change, please leave it unchecked.

  • I have reviewed the contributing guidelines
  • I have performed a self-review of this change
  • Changes have been tested
  • Changes are documented

Is this a breaking change? (Y/N) N

RFC issue number: N/A

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

Contributor

@yiyuan-he left a comment

LGTM

@thpierce requested a review from wangzlei as a code owner November 14, 2025 02:35
@thpierce added this pull request to the merge queue Nov 14, 2025
Merged via the queue into awslabs:main with commit 33bd025 Nov 14, 2025
141 checks passed
@thpierce deleted the investigations branch November 14, 2025 02:51
@github-project-automation (bot) moved this from To triage to Done in awslabs/mcp Project Nov 14, 2025
Projects

Status: Done

3 participants