Fix missing mm_token_type_ids when training new Qwen VLMs with liger kernel #6234

Open
apardyl wants to merge 1 commit into huggingface:main from apardyl:qwenligerfix
apardyl commented Jul 1, 2026

Contributor

What does this PR do?

Newer versions of the Qwen model (e.g., Qwen3.5) introduce a new input field, mm_token_type_ids, which is currently handled only in the standard loss computation path. This PR extends support for this field to the Liger Kernel path as well, enabling training of newer Qwen VLMs with the Liger Kernel loss. Simple tests included.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

AI writing disclosure

We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.

  • No AI usage: the PR was written entirely by a human.
  • AI-assisted: some parts were suggested or improved by AI, but the PR was written and reviewed by a human.
  • AI-generated: the PR was mostly or fully generated by an AI tool.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.


Note

Low Risk
Narrow, conditional forward-arg change gated on pixel_values; standard loss path already handled the field. Risk is mainly incorrect VLM forward inputs for Qwen3.5+ with Liger.

Overview
Fixes Qwen3.5+ VLM GRPO training with Liger by threading mm_token_type_ids through the fused-loss path, which previously omitted this input while the standard log-prob forward already included it.

_get_last_hidden_state now accepts mm_token_type_ids and adds it to backbone model_inputs when pixel_values is present (alongside existing Qwen fields like image_grid_thw). The Liger loss step passes inputs.get("mm_token_type_ids") into that helper.
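The gating described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `build_backbone_inputs` is a hypothetical stand-in for the input-assembly step inside `_get_last_hidden_state`, and the extra `is not None` check on each optional field is an assumption.

```python
from typing import Any, Optional


def build_backbone_inputs(
    input_ids: Any,
    attention_mask: Any,
    pixel_values: Optional[Any] = None,
    image_grid_thw: Optional[Any] = None,
    mm_token_type_ids: Optional[Any] = None,
) -> dict:
    """Hypothetical helper: assemble keyword arguments for the backbone forward."""
    model_inputs = {"input_ids": input_ids, "attention_mask": attention_mask}

    # Multimodal fields are only forwarded when the batch carries images.
    if pixel_values is not None:
        model_inputs["pixel_values"] = pixel_values
        if image_grid_thw is not None:
            model_inputs["image_grid_thw"] = image_grid_thw
        # The field this PR threads through the Liger path
        # (the None check here is an assumption of this sketch).
        if mm_token_type_ids is not None:
            model_inputs["mm_token_type_ids"] = mm_token_type_ids

    return model_inputs
```

With this shape, a text-only batch never sees the new key, while an image batch forwards `mm_token_type_ids` next to `image_grid_thw`.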

Tests add Qwen3.5 to test_train_vlm_and_liger (transformers ≥ 5.2.0) and test_get_last_hidden_state_passes_mm_token_type_ids, which asserts scored batches include the field and that backbone forward receives matching tensors.
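The second test's idea, checking that the backbone forward receives the same field the batch carried, can be illustrated with a recording fake. Everything here is hypothetical (`RecordingBackbone`, `forward_with_multimodal_fields`, and the plain-list "tensors"); the real test in the PR exercises the trainer and actual tensors.

```python
class RecordingBackbone:
    """Hypothetical fake backbone that records the kwargs of each forward call."""

    def __init__(self):
        self.calls = []

    def __call__(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs["input_ids"]  # placeholder for a hidden state


def forward_with_multimodal_fields(backbone, batch):
    """Hypothetical stand-in for the helper under test."""
    model_inputs = {
        "input_ids": batch["input_ids"],
        "attention_mask": batch["attention_mask"],
    }
    # Forward multimodal fields only when images are present.
    if batch.get("pixel_values") is not None:
        for key in ("pixel_values", "image_grid_thw", "mm_token_type_ids"):
            if batch.get(key) is not None:
                model_inputs[key] = batch[key]
    return backbone(**model_inputs)


def check_backbone_receives_mm_token_type_ids():
    backbone = RecordingBackbone()
    batch = {
        "input_ids": [1, 2, 3],
        "attention_mask": [1, 1, 1],
        "pixel_values": [0.1, 0.2],
        "image_grid_thw": [[1, 2, 2]],
        "mm_token_type_ids": [0, 1, 1],
    }
    forward_with_multimodal_fields(backbone, batch)
    received = backbone.calls[-1]
    # The forward call must see exactly the values the batch carried.
    assert received["mm_token_type_ids"] == batch["mm_token_type_ids"]
    return received
```

Recording keyword arguments keeps the check independent of any model weights, so the assertion targets only the plumbing.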

Reviewed by Cursor Bugbot for commit 8a0962a.
