Refactor: Improve clarity and robustness in route_tokens_to_experts #42114

danielquintas8 · 2025-11-09T20:25:36Z

What does this PR do?

This PR includes two small improvements for route_tokens_to_experts in MoE models:

Improves Readability: Renames the function's input argument from hidden_states to router_logits, which more accurately reflects that the function receives the output of the gating network.
Improves Robustness: Changes the softmax dimension from dim=1 to dim=-1. The two are equivalent for the 2D tensor the function currently receives, but dim=-1 always normalizes over the last dimension, so the logic remains correct if the tensor ever gains additional leading dimensions.
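
The difference between the two softmax dimensions can be illustrated with a small sketch. The shapes, the variable names other than router_logits, and the batched reshape are assumptions made for illustration only; they are not taken from the modeling files touched by this PR.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes for illustration: 6 tokens routed across 4 experts.
num_tokens, num_experts = 6, 4
router_logits = torch.randn(num_tokens, num_experts)

# On a 2D (num_tokens, num_experts) tensor, dim=1 and dim=-1 name the same axis.
same_2d = torch.allclose(
    F.softmax(router_logits, dim=1),
    F.softmax(router_logits, dim=-1),
)

# Assumed scenario: the logits gain a leading batch dimension.
# Now dim=1 is the token axis, while dim=-1 is still the expert axis.
batched_logits = router_logits.reshape(2, 3, num_experts)
probs_last = F.softmax(batched_logits, dim=-1)
probs_dim1 = F.softmax(batched_logits, dim=1)

# Only dim=-1 yields a probability distribution over experts for each token.
sums_last = probs_last.sum(dim=-1)
sums_dim1 = probs_dim1.sum(dim=-1)
```

Here same_2d is True, and only sums_last consists entirely of ones. With dim=1 on the batched tensor the normalization runs across tokens instead of experts, which is the kind of silent error the change guards against.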

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@ArthurZucker
@Cyrilvallez

github-actions · 2025-11-12T17:10:20Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: dbrx, ernie4_5_moe, flex_olmo, hunyuan_v1_moe, jamba, olmoe, qwen2_moe, qwen3_moe, qwen3_next, qwen3_omni_moe

danielquintas8 and others added 3 commits November 9, 2025 19:29

Refactor routing logic in multiple models to use dim=-1 for softmax in route_tokens_to_experts method and correct argument naming

97440f3

modeling files

229ab6c

Merge branch 'main' into refactor-moe-routing

292a120
