Fix teacher quantization kwargs and guard eval callback in GKD example #6251

Open: sergiopaniego wants to merge 2 commits into main from fix-gkd-example-teacher-kwargs

sergiopaniego (Member) commented Jul 2, 2026

What does this PR do?

Fixes two bugs in examples/scripts/gkd.py:

  • Under --load_in_4bit, the teacher's quantization block set model_kwargs (the student) instead of teacher_model_kwargs, so the teacher loaded unquantized. Aligns with distillation.py.
  • LogCompletionsCallback requires a prompt column and crashed eval on conversational (messages) datasets (ValueError: Column 'prompt' doesn't exist.); it is now attached only when that column exists.
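The first fix above can be illustrated with a minimal sketch. It is not the actual code from gkd.py: `build_kwargs` and the `use_cache` entries are hypothetical, and plain dicts stand in for the real quantization config so the snippet runs without any ML dependencies. Only the names `model_kwargs` and `teacher_model_kwargs` come from the description.

```python
def build_kwargs(quantization_config=None):
    """Hypothetical helper: assemble separate kwargs for the student and the teacher."""
    model_kwargs = {"use_cache": False}          # student (placeholder entry)
    teacher_model_kwargs = {"use_cache": True}   # teacher (placeholder entry)

    if quantization_config is not None:
        model_kwargs["quantization_config"] = quantization_config

    if quantization_config is not None:
        # The bug pattern described above is writing into model_kwargs here,
        # which leaves teacher_model_kwargs without any quantization settings.
        teacher_model_kwargs["quantization_config"] = quantization_config

    return model_kwargs, teacher_model_kwargs


student_kwargs, teacher_kwargs = build_kwargs({"load_in_4bit": True})
```

With this shape, both dicts carry the quantization settings when a config is passed, and neither does when it is `None`.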

Before submitting

AI writing disclosure

  • AI-assisted: some parts were suggested or improved by AI, but the PR was written and reviewed by a human.

Note

Low Risk
Example-script-only fixes with no changes to core TRL training logic or APIs.

Overview
Fixes two bugs in the GKD example script (gkd.py).

When --load_in_4bit (or another quantization option) is enabled, the quantization config is now applied to teacher_model_kwargs instead of mistakenly updating the student's model_kwargs, so the teacher actually loads quantized, consistent with distillation.py.

LogCompletionsCallback is only registered when the eval split has a prompt column, avoiding a crash on conversational datasets that only expose messages.
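The column guard described here could look roughly like the sketch below. The helper name `completion_callbacks` and the `make_callback` factory are hypothetical and only illustrate the check; the real script constructs LogCompletionsCallback itself, and its exact arguments are not shown in this PR description.

```python
def completion_callbacks(eval_column_names, make_callback):
    """Hypothetical helper: return the callbacks to attach for evaluation.

    The completions logger is only created when the eval split has a
    'prompt' column; conversational datasets that expose only 'messages'
    get no such callback.
    """
    if "prompt" in eval_column_names:
        return [make_callback()]
    return []


# A conversational split is skipped, a prompt-style split gets the callback.
conversational = completion_callbacks(["messages"], lambda: "log-completions")
prompt_style = completion_callbacks(["prompt", "completion"], lambda: "log-completions")
```

Checking the column names before constructing the callback means the failure is avoided up front rather than surfacing as a ValueError during evaluation.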

Reviewed by Cursor Bugbot for commit 758035b. Bugbot is set up for automated code reviews on this repo.

bot-ci-comment (Bot) commented Jul 2, 2026

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sergiopaniego force-pushed the fix-gkd-example-teacher-kwargs branch from 634868a to 3b9a209 on July 2, 2026 15:53
sergiopaniego changed the title from "Fix GKD example: teacher quantization kwargs and eval on conversational datasets" to "Fix teacher quantization kwargs and guard eval callback in GKD example" on Jul 2, 2026
sergiopaniego requested a review from kashif on July 2, 2026 16:02