#12 introduces the `component-schemas` tool, which provides structured JSON Schema data for PatternFly React components.
We need to benchmark the updated MCP server using actual AI agents in Cursor (or similar AI coding assistants).
Implement a benchmark that:
- Uses an AI agent: Integrate with Cursor's AI agent or use an LLM API (e.g., Claude, GPT-4) to generate actual AI responses
- Tests on a "Golden Set": Create a curated set of prompts that represent common developer queries about PatternFly components
- Compares Two Scenarios:
  - Before: AI agent using only the `use-patternfly-docs` and `fetch-docs` tools (parsing unstructured documentation)
  - After: AI agent using all tools, including `component-schemas` (structured JSON Schema data)
- Measures Metrics:
  - Accuracy: Did the agent provide the correct prop names/types?
  - Completeness: Did it find all relevant props?
  - Response Time: How long did it take to generate a response?
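The accuracy and completeness metrics above can be scored mechanically once each golden prompt has a known-correct set of props. A minimal sketch (the function name and the example `Button` props are illustrative assumptions, not part of the issue) treats accuracy as precision and completeness as recall over prop names:

```python
def score_response(expected_props: set[str], reported_props: set[str]) -> dict:
    """Score one agent response against the golden answer.

    Accuracy     = fraction of reported props that are correct (precision).
    Completeness = fraction of expected props that were found (recall).
    """
    correct = expected_props & reported_props
    accuracy = len(correct) / len(reported_props) if reported_props else 0.0
    completeness = len(correct) / len(expected_props) if expected_props else 1.0
    return {"accuracy": accuracy, "completeness": completeness}


# Hypothetical golden answer for a Button prompt: the agent reported one
# wrong prop ("size") and missed one expected prop ("onClick").
scores = score_response(
    expected_props={"variant", "isDisabled", "onClick"},
    reported_props={"variant", "isDisabled", "size"},
)
print(scores)  # accuracy and completeness are both 2/3 here
```

Extracting the reported prop names from free-form LLM output is the harder part; a regex over backticked identifiers or a second "extract the props you mentioned" LLM call are both plausible approaches.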
Success Criteria
- Golden set of prompts
- Benchmark script that runs prompts against the MCP server (with and without `component-schemas`)
- LLM responses captured and evaluated for accuracy
- Document results and share findings
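To make the success criteria concrete, here is a rough sketch of what the golden set and the benchmark runner could look like. Everything here is an assumption for illustration: the prompts, the expected props, and the `ask_agent` callable (which would wrap the LLM plus MCP server in one of the two configurations) are all placeholders:

```python
import json
import time

# Hypothetical golden set: each entry pairs a developer prompt with the
# props a correct answer must mention.
GOLDEN_SET = [
    {"prompt": "What props does the PatternFly Button accept for disabling it?",
     "expected_props": ["isDisabled", "isAriaDisabled"]},
    {"prompt": "Which prop controls the Alert variant?",
     "expected_props": ["variant"]},
]


def run_benchmark(ask_agent, golden_set):
    """Run every golden prompt through `ask_agent` and record completeness
    (naive substring match against expected props) plus response time."""
    results = []
    for case in golden_set:
        start = time.perf_counter()
        answer = ask_agent(case["prompt"])
        elapsed = time.perf_counter() - start
        found = [p for p in case["expected_props"] if p in answer]
        results.append({
            "prompt": case["prompt"],
            "completeness": len(found) / len(case["expected_props"]),
            "response_time_s": round(elapsed, 3),
        })
    return results


# Stub agent for illustration only; a real run would call the LLM with the
# MCP server attached, once with and once without `component-schemas`.
stub_agent = lambda prompt: "Use the isDisabled prop (boolean)."
print(json.dumps(run_benchmark(stub_agent, GOLDEN_SET), indent=2))
```

Running the same golden set against both configurations and diffing the per-prompt scores would produce the before/after comparison the issue asks for.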