Add accuracy_sample_count #2414

v-shobhit · 2025-12-17T13:15:23Z

In the future, benchmarks (like gpt-oss) may have separate performance and accuracy datasets.

This PR adds a config field, accuracy_sample_count, that sets the number of samples in the accuracy eval dataset. It is separate from the existing performance_sample_count, which will be used for the size of the performance eval dataset.

This new field defaults to performance_sample_count for backwards compatibility.
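A minimal sketch of the fallback described above, for illustration only. It is not the loadgen implementation: the `SampleCounts` class, the `resolve_accuracy_sample_count` helper, and the use of `None` to mean "not set" are all hypothetical. The cap against the dataset size is an assumption based on the commit titled "cap count to QSL->TotalSampleCount()".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleCounts:
    """Hypothetical stand-in for the relevant settings, not loadgen's real type."""
    performance_sample_count: int
    accuracy_sample_count: Optional[int] = None  # None means "not set" (assumption)


def resolve_accuracy_sample_count(cfg: SampleCounts, total_sample_count: int) -> int:
    """Pick the number of samples to use for the accuracy eval dataset."""
    # Fall back to performance_sample_count when the new field is unset,
    # so existing configs keep behaving as before.
    requested = cfg.accuracy_sample_count
    if requested is None:
        requested = cfg.performance_sample_count
    # Assumed cap: never request more samples than the dataset holds.
    return min(requested, total_sample_count)
```

Under these assumptions, a config that sets only performance_sample_count resolves to that same value for accuracy, while a config that sets both uses the accuracy value, limited by the total number of samples available.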

github-actions · 2025-12-17T13:15:32Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

nvzhihanj · 2025-12-17T16:43:01Z

@pgmpablo157321 @tanvi-mlcommons @mrmhodak please help review this PR. The accuracy sample count is a new field we are adding to separate the accuracy and performance test datasets. Can you suggest what else is needed for this feature?

mrmhodak · 2025-12-17T17:47:54Z

@pgmpablo157321: Please take a look to see if you agree with this.

add accuracy_sample_count

940612d

v-shobhit requested a review from a team as a code owner December 17, 2025 13:15

v-shobhit and others added 3 commits December 17, 2025 13:25

cap count to QSL->TotalSampleCount()

f424dee

[Automated Commit] Format Codebase

6368a90

empty commit to re-trigger test

3f2f719

v-shobhit force-pushed the shobhitv/acc_sample_count branch from 865c33b to 3f2f719 Compare December 17, 2025 13:37
