Fix qwen moe Load balancing loss calculation outside training #42133

diegoakel · 2025-11-10T18:29:22Z

What does this PR do?

Removes the calculation of load balancing loss for Qwen MoE models when outside training mode.

Fixes #42100
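
For readers without the diff at hand, the idea can be sketched roughly as follows. This is a dependency-free illustration, not the code changed in this PR: `ToyMoEHead` and `load_balancing_loss` are hypothetical names, the `training` flag only mimics the `torch.nn.Module` convention, and the formula is a simplified Switch-Transformer-style auxiliary loss rather than the library's exact function.

```python
from typing import List, Optional


def load_balancing_loss(router_probs: List[List[float]], top_k: int) -> float:
    """Simplified auxiliary loss: num_experts * sum_e(fraction_e * mean_prob_e).

    router_probs holds one row of per-expert routing probabilities per token.
    """
    num_tokens = len(router_probs)
    num_experts = len(router_probs[0])

    # Count how often each expert appears in a token's top-k selection.
    counts = [0] * num_experts
    for row in router_probs:
        ranked = sorted(range(num_experts), key=lambda e: row[e], reverse=True)
        for expert in ranked[:top_k]:
            counts[expert] += 1

    # Fraction of routing slots assigned to each expert.
    fraction = [c / (num_tokens * top_k) for c in counts]
    # Mean routing probability each expert receives across tokens.
    mean_prob = [
        sum(row[e] for row in router_probs) / num_tokens
        for e in range(num_experts)
    ]
    return num_experts * sum(f * p for f, p in zip(fraction, mean_prob))


class ToyMoEHead:
    """Hypothetical stand-in for a model head with a train/eval switch."""

    def __init__(self, top_k: int = 1) -> None:
        self.top_k = top_k
        self.training = True  # same convention as torch.nn.Module

    def train(self, mode: bool = True) -> "ToyMoEHead":
        self.training = mode
        return self

    def eval(self) -> "ToyMoEHead":
        return self.train(False)

    def aux_loss(self, router_probs: List[List[float]]) -> Optional[float]:
        # Outside training the loss is never used, so skip the work entirely.
        if not self.training:
            return None
        return load_balancing_loss(router_probs, self.top_k)
```

In this sketch, calling `aux_loss` after `eval()` returns `None`, while the same call after `train()` returns a float.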

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@Rocketknight1

github-actions · 2025-11-11T14:06:51Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen3_moe, qwen3_omni_moe, qwen3_vl_moe

Rocketknight1 · 2025-11-11T15:43:56Z

Posted in #42100 as well, but do you know how output_router_logits is getting set? The load balancing loss should only be computed when that's set to True.
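
To make the question concrete, here is a rough sketch of how such a flag is commonly resolved: a per-call argument wins, and the config default applies otherwise. `ToyConfig`, `resolve_output_router_logits` and `should_compute_aux_loss` are hypothetical names for illustration, and the combined condition is an assumption about the intended behaviour, not a quote from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical config holding the default for the flag."""

    output_router_logits: bool = False


def resolve_output_router_logits(call_value: Optional[bool], config: ToyConfig) -> bool:
    # An explicit per-call value overrides the config; None means "use the default".
    return call_value if call_value is not None else config.output_router_logits


def should_compute_aux_loss(
    call_value: Optional[bool], config: ToyConfig, training: bool
) -> bool:
    # Assumed gate: the flag must be on AND the model must be in training mode.
    return resolve_output_router_logits(call_value, config) and training
```

Under these assumptions, a config with `output_router_logits=True` would still trigger the loss at inference time unless the `training` check is present, which may be how the flag ends up set in the reported case.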

fix qwen moe lb loss calc outside training

191210c

diegoakel marked this pull request as draft November 10, 2025 18:33

uses self.training and fix test

9646216

diegoakel marked this pull request as ready for review November 10, 2025 19:09

github-actions bot requested review from ArthurZucker and Rocketknight1 November 10, 2025 19:10

diegoakel added 3 commits November 11, 2025 12:45

missing self.training

eaea27b

Merge branch 'main' into fix-qwen3moe-lb-loss

ed48315

forget the fix-copies

082b2a6
