CLI overrides don't get propagated for inherited configs #1483

@roclark

Describe the bug

When a config inherits from another config, as the teacher config does from the policy config for distillation, CLI overrides to the parent config's settings are not reflected in the child config.

For example, the teacher config in distillation uses the same config as the policy but changes the model_name and a few parallelism sizes. If you change something in the policy by overriding it in the CLI, like policy.max_total_sequence_length, only the policy setting is updated, not the teacher's. This causes problems with larger sequence lengths, because the mismatch between the two values can raise errors.
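To illustrate the behavior described above, here is a minimal sketch using plain Python dicts. It assumes the teacher config is materialized as a copy of the policy config at load time and that CLI overrides are applied afterwards; that ordering is an assumption, not a confirmed description of how NeMo-RL loads configs. The helper names (`load_config`, `apply_override`) and all values are hypothetical.

```python
import copy

# Hypothetical base config: key names follow the issue, values are made up.
policy = {
    "model_name": "student-model",
    "max_total_sequence_length": 8192,
    "tensor_model_parallel_size": 1,
}


def load_config():
    """Build the full config, copying policy into teacher at load time (assumed)."""
    teacher = copy.deepcopy(policy)
    # The teacher only changes the model name and a parallelism size.
    teacher.update({"model_name": "teacher-model", "tensor_model_parallel_size": 4})
    return {"policy": copy.deepcopy(policy), "teacher": teacher}


def apply_override(config, dotted_key, value):
    """Set a nested value from a dotted key such as 'policy.some_key'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node[name]
    node[leaf] = value


config = load_config()
# The override is applied after the teacher was already copied from the policy.
apply_override(config, "policy.max_total_sequence_length", 16384)

print(config["policy"]["max_total_sequence_length"])   # 16384
print(config["teacher"]["max_total_sequence_length"])  # 8192, the override is not propagated
```

Under these assumptions the teacher keeps the old value because it holds its own copy of the setting by the time the override is applied.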

While settings like these could instead be overridden for both configs, this can be confusing for users: the YAML file shows the setting in only one place, so it is not obvious that it needs to be overridden a second time, or that the override does not apply in both locations.

Steps/Code to reproduce bug

  1. Set up the NeMo-RL container and run a RayCluster.
  2. Run the examples/run_distillation_math.py example with uv run examples/run_distillation_math.py policy.max_total_sequence_length.
  3. Observe the max_total_sequence_length for the teacher.

Expected behavior

If I override a config value that another config inherits from, I would expect both of them to be updated. Otherwise, it is unclear which value the inheriting config is using.
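One way to get the expected behavior, sketched here with plain dicts, is to apply CLI overrides to the parent first and only then derive the child from it. This is an illustration only, not a proposal for the actual fix; the `resolve` helper, the `teacher_overrides` dict, and all values are hypothetical.

```python
import copy


def resolve(base, child_overrides):
    """Derive a child config: start from the base, then apply the child's own settings."""
    merged = copy.deepcopy(base)
    merged.update(child_overrides)
    return merged


# Hypothetical values; only the key names come from the issue.
policy = {
    "model_name": "student-model",
    "max_total_sequence_length": 8192,
    "tensor_model_parallel_size": 1,
}
# Settings the teacher explicitly changes relative to the policy.
teacher_overrides = {"model_name": "teacher-model", "tensor_model_parallel_size": 4}

cli_overrides = {"policy.max_total_sequence_length": 16384}

# Apply CLI overrides BEFORE deriving the teacher, so inherited keys pick them up.
for dotted_key, value in cli_overrides.items():
    section, key = dotted_key.split(".", 1)
    if section == "policy":
        policy[key] = value
    elif section == "teacher":
        teacher_overrides[key] = value

teacher = resolve(policy, teacher_overrides)

print(policy["max_total_sequence_length"])   # 16384
print(teacher["max_total_sequence_length"])  # 16384, inherited from the overridden policy
print(teacher["model_name"])                 # teacher-model, explicit teacher setting kept
```

With this ordering, a key the teacher sets explicitly still wins, while a key it merely inherits follows the overridden policy value.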

Labels: bug (Something isn't working)
