feat: add custom output directory support to vf-eval by ericlau98 · Pull Request #665 · PrimeIntellect-ai/verifiers

ericlau98 · 2025-12-24T18:24:13Z

Summary

Add an optional path argument to the -s/--save-results flag in vf-eval.
When a path is provided, results are saved to that custom location instead of the default ./outputs.
Add an output_dir field to EvalConfig and update path_utils.py to use it.

Usage

vf-eval gsm8k -s                 # save to default location
vf-eval gsm8k -s /custom/path    # save to custom location

Note

Enables saving eval outputs to a custom directory and adds a new RL config.

CLI: --save-results/-s now optionally accepts a PATH; without a path it behaves as a boolean flag. The chosen path is propagated as output_dir.
Plumb output_dir through the stack: add output_dir to EvalConfig, pass it from verifiers/scripts/eval.py, and make get_eval_results_path in verifiers/utils/path_utils.py honor it while preserving default ./outputs behavior.
Add configs/vf-rl/math-python.toml with env.id and run_name set to math-python.

Written by Cursor Bugbot for commit cd4c172. This will update automatically on new commits.

Allow users to specify a custom output path via the -s/--save-results flag. When a path is provided, results are saved there instead of the default ./outputs or environment-local outputs directory.

Usage:

vf-eval gsm8k -s                 # default location
vf-eval gsm8k -s /custom/path    # custom location
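For illustration, here is a minimal sketch of how a flag with an optional value can be parsed with argparse and then mapped onto an output directory. It is not the PR's actual diff: the parser name `vf-eval-sketch` and the helpers `split_save_results` and `resolve_results_dir` are hypothetical, and the fallback is simplified to ./outputs only (the environment-local outputs directory mentioned above is not modeled).

```python
import argparse
from pathlib import Path
from typing import Optional, Tuple, Union

# Default location named in the PR description.
DEFAULT_OUTPUT_DIR = Path("./outputs")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the vf-eval argument parser."""
    parser = argparse.ArgumentParser(prog="vf-eval-sketch")
    parser.add_argument("env_id")
    # nargs="?" makes the value optional:
    #   flag absent       -> default (False)
    #   bare "-s"         -> const (True)
    #   "-s /custom/path" -> the string "/custom/path"
    parser.add_argument(
        "-s", "--save-results",
        nargs="?", const=True, default=False, metavar="PATH",
    )
    return parser


def split_save_results(value: Union[bool, str]) -> Tuple[bool, Optional[Path]]:
    """Split the dual-use value into (save_results, output_dir)."""
    if isinstance(value, str):
        return True, Path(value)
    return bool(value), None


def resolve_results_dir(output_dir: Optional[Path]) -> Path:
    """Honor a custom directory if given, otherwise fall back to the default."""
    return output_dir if output_dir is not None else DEFAULT_OUTPUT_DIR


if __name__ == "__main__":
    args = build_parser().parse_args(["gsm8k", "-s", "/custom/path"])
    save, output_dir = split_save_results(args.save_results)
    print(save, resolve_results_dir(output_dir))
```

One thing to keep in mind with this pattern: because the value is optional, a bare -s placed directly before a positional argument may consume that argument as PATH, so the flag is safest after the environment name.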

CLAassistant · 2025-12-24T18:24:29Z

All committers have signed the CLA.

cursor

Config file references wrong environment ID

The math-python.toml configuration file sets id = "primeintellect/wiki-search" and run_name = "wiki-search", but it should reference the math-python environment instead. The environments/math_python/ directory exists with proper metadata indicating the environment ID is math-python. This appears to be a copy-paste error from another config file, causing the math-python config to load an entirely different environment when used.

configs/vf-rl/math-python.toml#L3-L4

verifiers/configs/vf-rl/math-python.toml

Lines 3 to 4 in 0e0e872

    
           [env] 
        
           id = "primeintellect/wiki-search"

configs/vf-rl/math-python.toml#L17-L18

verifiers/configs/vf-rl/math-python.toml

Lines 17 to 18 in 0e0e872

    
           [trainer.args] 
        
           run_name = "wiki-search"

The config file had copy-paste errors referencing wiki-search instead of math-python for both the environment ID and run_name.

ericlau98 · 2025-12-24T22:14:25Z

Updated the math-python.toml based purely on what the Bugbot said, otherwise didn't touch that file. All existing functionality is still built in, nothing changes about default behaviour. I added the ability to define a custom output path using the -s/--save-results flag to allow for streamlining custom workflows. @willccbb Please let me know if you need anything else from me, would love to have this merged.

mikasenghaas

nice, lgtm

willccbb · 2026-01-03T05:25:33Z

Can ignore the bugbot comment, not related to the PR.

What's the main use case for you for custom save paths? I get that it's maybe nice to have sometimes, but this might conflict with some other planned changes + not sure we're gonna want to maintain the plumbing here as-is. Definitely a nice feature idea though, wanna maybe open as an issue? Can make sure we incorporate something like this into upcoming updates.

ericlau98 · 2026-01-03T05:43:08Z

> Can ignore the bugbot comment, not related to the PR.
>
> What's the main use case for you for custom save paths? I get that it's maybe nice to have sometimes, but this might conflict with some other planned changes + not sure we're gonna want to maintain the plumbing here as-is. Definitely a nice feature idea though, wanna maybe open as an issue? Can make sure we incorporate something like this into upcoming updates.

@willccbb The use case is for automation! The change maintains existing functionality should no custom path be specified. Are there any specific concerns I'm not thinking of?

willccbb · 2026-01-03T05:51:45Z

we're planning a more comprehensive update to vf-eval / related features to support better configurability + more robustness for building on top of it; currently it's intended as a pretty lightweight manual helper script

will leave this open + think about it a bit more

for a stopgap, would prefer if we can do this without the dual-use of -s and just have a more explicit flag
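As a rough sketch of what the "more explicit flag" stopgap could look like, the example below keeps -s as a plain boolean and adds a separate option for the directory. The flag name `--output-dir`, the parser name, and the choice that passing a directory implies saving are all assumptions for illustration, not something decided in this thread.

```python
import argparse
from pathlib import Path
from typing import List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: -s stays boolean, the path gets its own flag."""
    parser = argparse.ArgumentParser(prog="vf-eval-sketch")
    parser.add_argument("env_id")
    parser.add_argument("-s", "--save-results", action="store_true")
    # Hypothetical flag name; the thread does not specify one.
    parser.add_argument("--output-dir", type=Path, default=None)
    return parser


def parse_cli(argv: List[str]) -> Tuple[str, bool, Optional[Path]]:
    """Return (env_id, save_results, output_dir) for the given argv."""
    args = build_parser().parse_args(argv)
    # Assumption: giving an output directory implies that results are saved.
    save_results = args.save_results or args.output_dir is not None
    return args.env_id, save_results, args.output_dir


if __name__ == "__main__":
    print(parse_cli(["gsm8k", "-s", "--output-dir", "/custom/path"]))
```

With separate flags, -s never takes a value, so its position relative to the environment name no longer matters.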

ericlau98 changed the base branch from dev to main December 24, 2025 18:27

cursor bot reviewed Dec 24, 2025

fix: correct environment ID in math-python.toml config

cd4c172

ericlau98 force-pushed the feat/custom-output-dir branch from 00d66a2 to cd4c172 on December 24, 2025 19:00

mikasenghaas approved these changes Dec 26, 2025

willccbb marked this pull request as draft January 3, 2026 05:35

ericlau98 wants to merge 2 commits into PrimeIntellect-ai:main from ericlau98:feat/custom-output-dir
