Align KTO with DPO: Support tool calling by qgallouedec · Pull Request #6259 · huggingface/trl

qgallouedec · 2026-07-02T19:41:15Z

KTOTrainer ignored the tools column during tokenization, so tool schemas were never rendered into the prompt. Tool-calling datasets trained as if no tools were defined.

The existing test_train_toolcall_data (mirrors DPO's, uses trl-internal-testing/toolcall) now genuinely exercises tool rendering; it passes.

Matches DPO's behavior: tool calling is supported on the text path; the vision collator does not wire tools through in either trainer.

#4786

Note

Low Risk
Small change to KTO preprocessing only; it corrects tokenization for datasets that already carry a tools column and does not touch loss, reference model, or auth paths.

Overview
KTO text-path training now renders tool schemas in prompts, matching DPO behavior for tool-calling datasets.

During dataset tokenization, KTOTrainer reads each example’s optional tools field (JSON-parsing when stored as a string) and forwards it into apply_chat_template for both the generation prompt and the full prompt+completion sequence. Previously those columns were ignored, so KTO trained on tool data as if no tools existed.

For vision datasets, "tools" is added to the trainer’s signature columns so it is retained when unused columns are removed (the vision collator still does not wire tools through, same as DPO).

^{Reviewed by Cursor Bugbot for commit c198c65. Bugbot is set up for automated code reviews on this repo. Configure here.}

bot-ci-comment · 2026-07-02T19:43:52Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Align KTO with DPO: Support tool calling

4795778

qgallouedec requested a review from albertvillanova July 2, 2026 19:41

Merge branch 'main' into kto-tool-calling-support

c198c65

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Align KTO with DPO: Support tool calling#6259

Align KTO with DPO: Support tool calling#6259
qgallouedec wants to merge 2 commits into
mainfrom
kto-tool-calling-support

qgallouedec commented Jul 2, 2026 •

edited by cursor Bot

Loading

Uh oh!

bot-ci-comment Bot commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

qgallouedec commented Jul 2, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bot-ci-comment Bot commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qgallouedec commented Jul 2, 2026 •

edited by cursor Bot

Loading