Fix missing mm_token_type_ids when training new Qwen VLMs with liger kernel by apardyl · Pull Request #6234 · huggingface/trl

apardyl · 2026-07-01T10:24:20Z

What does this PR do?

Newer versions of the Qwen model (e.g., Qwen3.5) introduce a new input field, mm_token_type_ids, which is currently handled only in the standard loss computation path. This PR extends support for this field to the Liger Kernel path, enabling training of newer Qwen VLMs with the Liger Kernel loss. Simple tests are included.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

AI writing disclosure

We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.

No AI usage: the PR was written entirely by a human.
AI-assisted: some parts were suggested or improved by AI, but the PR was written and reviewed by a human.
AI-generated: the PR was mostly or fully generated by an AI tool.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

Note

Low Risk
Narrow, conditional forward-arg change gated on pixel_values; standard loss path already handled the field. Risk is mainly incorrect VLM forward inputs for Qwen3.5+ with Liger.

Overview
Fixes Qwen3.5+ VLM GRPO training with Liger by threading mm_token_type_ids through the fused-loss path, which previously omitted this input while the standard log-prob forward already included it.

_get_last_hidden_state now accepts mm_token_type_ids and adds it to backbone model_inputs when pixel_values is present (alongside existing Qwen fields like image_grid_thw). The Liger loss step passes inputs.get("mm_token_type_ids") into that helper.
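The conditional forwarding described above can be sketched as follows. This is a minimal illustration, not TRL's actual code: `build_backbone_inputs` is a hypothetical name, plain lists stand in for tensors, and nesting the `mm_token_type_ids` check under the `pixel_values` check is an assumption based on the summary's wording.

```python
from typing import Any, Optional


def build_backbone_inputs(
    input_ids: Any,
    attention_mask: Any,
    pixel_values: Optional[Any] = None,
    image_grid_thw: Optional[Any] = None,
    mm_token_type_ids: Optional[Any] = None,
) -> dict:
    """Hypothetical helper: collect keyword arguments for the backbone forward."""
    model_inputs = {"input_ids": input_ids, "attention_mask": attention_mask}
    # Multimodal fields are forwarded only when an image batch is present.
    if pixel_values is not None:
        model_inputs["pixel_values"] = pixel_values
        if image_grid_thw is not None:
            model_inputs["image_grid_thw"] = image_grid_thw
        # The field this PR threads through the fused-loss path.
        if mm_token_type_ids is not None:
            model_inputs["mm_token_type_ids"] = mm_token_type_ids
    return model_inputs


# Caller side: optional fields are read with .get(), so batches for
# models that do not use them simply pass None.
inputs = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "pixel_values": [[0.5]],
    "mm_token_type_ids": [[0, 1, 0]],
}
kwargs = build_backbone_inputs(
    inputs["input_ids"],
    inputs["attention_mask"],
    pixel_values=inputs.get("pixel_values"),
    image_grid_thw=inputs.get("image_grid_thw"),
    mm_token_type_ids=inputs.get("mm_token_type_ids"),
)
```

Reading the field with `inputs.get(...)` keeps the call safe for models whose processors never emit it, which is presumably why the summary describes the change as low risk.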

The tests add Qwen3.5 to test_train_vlm_and_liger (requires transformers ≥ 5.2.0) and introduce a new test, test_get_last_hidden_state_passes_mm_token_type_ids, which asserts that scored batches include the field and that the backbone forward receives matching tensors.
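A test of this shape could look roughly like the sketch below. It is not the test from the PR: `get_last_hidden_state` here is a hypothetical stand-in for the trainer helper, the backbone is a `unittest.mock.MagicMock`, and lists replace tensors so the example runs without PyTorch.

```python
from unittest.mock import MagicMock


def get_last_hidden_state(backbone, input_ids, attention_mask,
                          pixel_values=None, mm_token_type_ids=None):
    """Hypothetical stand-in for the trainer helper (not TRL's implementation)."""
    model_inputs = {"input_ids": input_ids, "attention_mask": attention_mask}
    if pixel_values is not None:
        model_inputs["pixel_values"] = pixel_values
        if mm_token_type_ids is not None:
            model_inputs["mm_token_type_ids"] = mm_token_type_ids
    return backbone(**model_inputs).last_hidden_state


def test_backbone_receives_mm_token_type_ids():
    backbone = MagicMock()  # records the keyword arguments it is called with
    batch = {
        "input_ids": [[1, 2, 3]],
        "attention_mask": [[1, 1, 1]],
        "pixel_values": [[0.5]],
        "mm_token_type_ids": [[0, 1, 0]],
    }
    get_last_hidden_state(
        backbone,
        batch["input_ids"],
        batch["attention_mask"],
        pixel_values=batch.get("pixel_values"),
        mm_token_type_ids=batch.get("mm_token_type_ids"),
    )
    # The backbone forward must see exactly the field from the batch.
    received = backbone.call_args.kwargs
    assert received["mm_token_type_ids"] == batch["mm_token_type_ids"]


def test_text_only_batch_omits_field():
    backbone = MagicMock()
    get_last_hidden_state(backbone, [[1, 2]], [[1, 1]])
    assert "mm_token_type_ids" not in backbone.call_args.kwargs
```

Inspecting `call_args` on a mock checks the forward inputs directly, without loading a model; an end-to-end check such as test_train_vlm_and_liger covers the real model path.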

Reviewed by Cursor Bugbot for commit 8a0962a. Bugbot is set up for automated code reviews on this repo.

Fix missing mm_token_type_ids input when training with liger kernel e…

8a0962a

…nabled

apardyl wants to merge 1 commit into huggingface:main from apardyl:qwenligerfix
