refactor(gaia): regroup all hyperparameters in constants.py#371
refactor(gaia): regroup all hyperparameters in constants.py#371simonrosenberg wants to merge 2 commits intomainfrom
Conversation
d5476db to
ab84b31
Compare
This commit consolidates key constant values used in the GAIA benchmark implementation into a single constants.py file. Changes: - Create benchmarks/gaia/constants.py with essential constants: - Dataset configuration (DATASET_NAME, DATASET_YEAR, splits, levels) - Docker/image configuration (GAIA_BASE_IMAGE, TARGET_TYPE, IMAGE_TAG_PREFIX) - Runtime configuration (DEFAULT_RUNTIME_API_URL, DEFAULT_STARTUP_TIMEOUT) - Default values (model name, iterations, workers, etc.) - Update run_infer.py to import constants module and use constants - Update build_images.py to import constants module and use constants - Update eval_infer.py to import constants module and use constants Fixes #363 Co-authored-by: openhands <openhands@all-hands.dev>
ab84b31 to
fcd433e
Compare
|
@OpenHands /codereview-roasted |
|
I'm on it! simonrosenberg can track my progress at all-hands.dev |
Code Review Summary for PR #371: "refactor(gaia): regroup all hyperparameters in constants.py"🟡 Verdict: Acceptable - No behavioral changes, but incomplete execution✅ Behavioral Analysis: PASSAll constant values match their original hardcoded counterparts exactly:
No breaking changes detected.
|
|
@OpenHands Dead Code - These constants are defined but never used:DATASET_SPLIT_TEST, GAIA_LEVELS, DEFAULT_MAX_ITERATIONS, DEFAULT_CRITIC, DEFAULT_OUTPUT_DIR, DEFAULT_NUM_WORKERS, DEFAULT_MAX_ATTEMPTS If there is a mismatch with args_parser, the values should come from args_parser. Update constant values to match args parser Once you are done, update the PR description so it matches the current state of the code |
|
I'm on it! simonrosenberg can track my progress at all-hands.dev |
…faults Remove dead code constants that were never used: - DATASET_SPLIT_TEST (default split comes from args_parser) - GAIA_LEVELS (not used for validation) - DEFAULT_MAX_ITERATIONS (args_parser default: 100) - DEFAULT_CRITIC (args_parser default: 'pass') - DEFAULT_OUTPUT_DIR (args_parser default: './eval_outputs') - DEFAULT_NUM_WORKERS (args_parser default: 1) - DEFAULT_MAX_ATTEMPTS (args_parser default: 3) These constants were duplicating values already defined in benchmarks/utils/args_parser.py, which is the shared argument parser used by all benchmarks including GAIA. Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI've addressed the dead code issue in PR #371 by removing the unused constants from Changes MadeRemoved unused constants:
Added documentation note explaining that CLI argument defaults are defined in Checklist
PR Link |
Summary
This PR consolidates all constant values and hyperparameters used in the GAIA benchmark implementation into a single
benchmarks/gaia/constants.pyfile, making it easier to review and modify benchmark configuration.Fixes #363
Changes
New File:
benchmarks/gaia/constants.pyCreated a centralized constants module with all constant values organized by category:
DATASET_NAME,DATASET_YEAR,DATASET_SPLIT_VALIDATIONGAIA_BASE_IMAGE,TARGET_TYPE,IMAGE_TAG_PREFIXDEFAULT_RUNTIME_API_URL,DEFAULT_STARTUP_TIMEOUTDEFAULT_MODEL_NAMENote: Default values for CLI arguments (max_iterations, num_workers, output_dir, max_attempts, critic, split) are defined in
benchmarks/utils/args_parser.py, which is the shared argument parser used by all benchmarks.Updated Files
run_infer.py: Updated to import and use constants fromconstants.pybuild_images.py: Updated to import and use constants fromconstants.pyeval_infer.py: Updated to import and use constants fromconstants.pyBenefits
Finaltype hints for better IDE support and type checkingargs_parser.pyto avoid duplication and potential mismatches