Description
Describe the bug
When one config inherits from another, as the teacher config does for distillation, overrides to the parent config's settings are not reflected in the child config.
For example, the teacher config in distillation reuses the policy config but changes `model_name` and a few parallelism sizes. If you override a policy setting on the CLI, such as `policy.max_total_sequence_length`, only the policy value is updated; the teacher keeps its original value. This causes problems with larger sequence lengths, where the mismatch between the two values can raise errors.
Although settings like these could be overridden separately for both configs, this is confusing for users: a glance at the YAML file shows the setting in only one place, so it is not obvious that it must be overridden a second time, or that a single override does not reach both configs.
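The behavior described above can be reproduced in a minimal sketch with plain dictionaries. It assumes the child config is materialized as a copy of the parent before CLI overrides are applied; that mechanism is a guess, not something confirmed for NeMo-RL. The helper names (`build_config`, `apply_override`), the model names, and the values 512 and 4096 are all hypothetical.

```python
import copy


def build_config():
    """Build a config in which the teacher starts as a copy of the policy."""
    # Hypothetical values, for illustration only.
    policy = {"model_name": "policy-model", "max_total_sequence_length": 512}
    # Assumption: inheritance is resolved here, before any CLI override is seen.
    teacher = copy.deepcopy(policy)
    teacher["model_name"] = "teacher-model"
    return {"policy": policy, "teacher": teacher}


def apply_override(config, dotted_key, value):
    """Set a single value addressed by a dotted key such as 'policy.x'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node[key]
    node[leaf] = value


config = build_config()
apply_override(config, "policy.max_total_sequence_length", 4096)

print(config["policy"]["max_total_sequence_length"])   # 4096
print(config["teacher"]["max_total_sequence_length"])  # 512 (stale value)
```

Because the teacher was copied before the override arrived, the two sequence lengths diverge, which matches the mismatch reported here.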
Steps/Code to reproduce bug
- Set up the NeMo-RL container and run a RayCluster.
- Run the `examples/run_distillation_math.py` example with `uv run examples/run_distillation_math.py policy.max_total_sequence_length`.
- Observe the `max_total_sequence_length` for the teacher.
Expected behavior
If I override a config value that another config inherits from, I would expect both configs to be updated. Otherwise, it is unclear which value the inheriting config is actually using.
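One way to get the expected behavior is to apply overrides to the parent first and only then derive the child. The sketch below illustrates that ordering with plain dictionaries; it is not a proposed patch for NeMo-RL, and `BASE_POLICY`, `TEACHER_DIFF`, `resolve`, and all values are hypothetical names introduced for illustration.

```python
import copy

# Hypothetical base values, for illustration only.
BASE_POLICY = {"model_name": "policy-model", "max_total_sequence_length": 512}
# Settings the teacher deliberately changes relative to the policy.
TEACHER_DIFF = {"model_name": "teacher-model"}


def resolve(policy_overrides, teacher_overrides=None):
    """Apply policy overrides first, then derive the teacher from the result."""
    policy = {**BASE_POLICY, **policy_overrides}
    teacher = {
        **copy.deepcopy(policy),       # inherit the already-overridden policy
        **TEACHER_DIFF,                # then apply the teacher's own differences
        **(teacher_overrides or {}),   # explicit teacher overrides win last
    }
    return {"policy": policy, "teacher": teacher}


resolved = resolve({"max_total_sequence_length": 4096})
print(resolved["teacher"]["max_total_sequence_length"])  # 4096
print(resolved["teacher"]["model_name"])                 # teacher-model
```

With this ordering a single policy override reaches the teacher, while an explicit teacher override can still set a different value when that is intended.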