Problem
When monitor_type: wandb is configured, Trinity creates two separate wandb runs: {name}_explorer and {name}_trainer. This makes it harder to correlate explorer and trainer metrics (e.g., comparing rollout accuracy against actor loss), and clutters the wandb project with twice as many runs.
Proposed Solution
A primary process in the launcher creates the wandb run, and both Explorer and Trainer join as
secondary writers via wandb.init(id=run_id, resume="allow", mode="shared").
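As a rough illustration of the secondary-writer call, the sketch below builds the keyword arguments for wandb.init from an optional run ID. The helper names wandb_init_kwargs and join_run are hypothetical, and the sketch assumes a wandb SDK version that accepts mode="shared"; only the id, resume, and mode arguments come from the proposal itself.

```python
from typing import Any, Dict, Optional


def wandb_init_kwargs(project: str, name: str, run_id: Optional[str]) -> Dict[str, Any]:
    """Build wandb.init keyword arguments for one process (hypothetical helper).

    run_id is None -> the process creates its own independent run.
    run_id is set  -> the process joins that run as a secondary writer.
    """
    kwargs: Dict[str, Any] = {"project": project, "name": name}
    if run_id is not None:
        # These three arguments are the ones named in the proposal.
        kwargs.update(id=run_id, resume="allow", mode="shared")
    return kwargs


def join_run(project: str, name: str, run_id: Optional[str]):
    """Call wandb.init with the arguments above (requires wandb to be installed)."""
    import wandb  # imported lazily so wandb_init_kwargs works without wandb

    return wandb.init(**wandb_init_kwargs(project, name, run_id))
```

Keeping the argument construction separate from the wandb.init call makes the shared/standalone decision easy to unit-test without network access.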
Planned Changes:
- config.py: Add wandb_run_id field to MonitorConfig (auto-populated by launcher)
- monitor.py: Add init_wandb_primary() / finish_wandb_primary() helpers; modify WandbMonitor to support shared (secondary) mode
- launcher.py: In both(), create the primary run before spawning actors and pass wandb_run_id through config
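The planned changes could fit together roughly as follows. This is a simplified sketch, not Trinity's actual code: MonitorConfig here is a stand-in with only two fields, the spawn callback and init_fn parameter are hypothetical (init_fn exists only so the flow can be exercised without wandb), and opening the primary run with mode="shared" is an assumption, since the proposal specifies that mode only for the secondary writers.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Optional


@dataclass
class MonitorConfig:
    """Simplified stand-in for the real config class."""
    monitor_type: str = "wandb"
    wandb_run_id: Optional[str] = None  # auto-populated by the launcher


def init_wandb_primary(project: str, name: str, init_fn: Optional[Callable[..., Any]] = None):
    """Create the single run that Explorer and Trainer will later join."""
    if init_fn is None:
        import wandb  # lazy import; only needed when no init_fn is injected
        init_fn = wandb.init
    # Assumption: the primary also opens the run in shared mode.
    return init_fn(project=project, name=name, mode="shared")


def finish_wandb_primary(run) -> None:
    """Close the primary run once both actors are done."""
    run.finish()


def both(config: MonitorConfig, project: str, name: str,
         spawn: Callable[[str, MonitorConfig], None],
         init_fn: Optional[Callable[..., Any]] = None) -> None:
    """Create the primary run, then hand its ID to both actors via config."""
    run = init_wandb_primary(project, name, init_fn)
    try:
        shared_config = replace(config, wandb_run_id=run.id)
        spawn("explorer", shared_config)
        spawn("trainer", shared_config)
    finally:
        finish_wandb_primary(run)
```

Copying the config with replace leaves the caller's object untouched, so the standalone entry points still see wandb_run_id as None.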
Expected Outcome:
- In shared mode, all metrics are prefixed with their role (explorer/ or trainer/), producing two clean top-level sections in the wandb UI
- wandb.define_metric assigns independent step axes (explorer/step and trainer/step) so the two processes' step counters don't conflict
- The standalone entry points explore() and train() remain unchanged: they still create their own independent runs, since wandb_run_id stays None.
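The role prefixing and independent step axes described above might look like the following minimal sketch. The function names prefix_metrics and define_step_axes are hypothetical; define_metric with a step_metric argument and a glob pattern is the existing wandb API, and run is assumed to be the object returned by wandb.init.

```python
from typing import Any, Dict


def prefix_metrics(role: str, metrics: Dict[str, Any], step: int) -> Dict[str, Any]:
    """Prefix every metric with its role and attach the role-specific step."""
    out = {f"{role}/{key}": value for key, value in metrics.items()}
    out[f"{role}/step"] = step
    return out


def define_step_axes(run, role: str) -> None:
    """Plot all of a role's metrics against that role's own step counter."""
    run.define_metric(f"{role}/step")
    run.define_metric(f"{role}/*", step_metric=f"{role}/step")
```

A secondary writer would call define_step_axes(run, "explorer") once after joining, then log with run.log(prefix_metrics("explorer", {"accuracy": 0.5}, step=3)), so the step travels inside the logged dictionary rather than through wandb's global step.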