Skip to content

Align KTO doc with DPO and fix Logged metrics wording#6258

Merged
qgallouedec merged 6 commits into
mainfrom
docs-align-kto-and-logged-metrics
Jul 3, 2026
Merged

Align KTO doc with DPO and fix Logged metrics wording#6258
qgallouedec merged 6 commits into
mainfrom
docs-align-kto-and-logged-metrics

Conversation

@qgallouedec

@qgallouedec qgallouedec commented Jul 2, 2026

Copy link
Copy Markdown
Member

Docs only.

KTO doc (kto_trainer.md) restructured to mirror dpo_trainer.md: added "Looking deeper into the KTO method" (loss + Loss Types table), "Customization" (constraints, model init, PEFT, Liger), "Tool Calling", and "Training Vision Language Models"; rewrote the "Logged metrics" section, which was wrong (listed non-existent rewards/chosen_sum, count/chosen, …); aligned shared wording with DPO. Kept KTO-only content (experimental warning, Example script, Usage tips).

Logged metrics wording across trainer docs:

  • "reward metrics" → "metrics" in 11 docs (the lists include loss, learning_rate, entropy, num_tokens, … — not just rewards), matching tpo_trainer.md.
  • DPO loss description fixed: was "cross-entropy loss" (copied from SFT), now "DPO loss" — DPO's loss is the preference loss, not token cross-entropy.

Notes: KTO's metrics list includes entropy/num_tokens (from two in-flight PRs) and the "Tool Calling" section assumes tool support from a separate PR: land those first. All referenced dataset/model IDs are real.


Note

Low Risk
Changes are limited to Markdown documentation with no runtime or training code modified.

Overview
Documentation-only updates across TRL trainer guides.

KTO (kto_trainer.md) is expanded to follow the same structure as DPO: overview/badge tweaks, dataset format examples, a Looking deeper into the KTO method section (loss math + loss_type table), a corrected Logged metrics list (replacing obsolete *_sum / count/* names), Customization (Liger/PEFT/constraints), plus Tool calling and VLM sections. Contributor credit and section ordering are adjusted; usage tips stay at the end.

Logged metrics intros in 11 trainer docs (CPO, DPO, GRPO, Nash-MD, Online DPO, ORPO, Reward, RLOO, SFT, TPO, XPO) now say metrics instead of reward metrics, since many logged fields are not rewards.

DPO clarifies that logged loss is the average DPO preference loss, not token cross-entropy (which had been copied from SFT).

Reviewed by Cursor Bugbot for commit 0bf3551. Bugbot is set up for automated code reviews on this repo. Configure here.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83869672d5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +140 to +142
* `num_tokens`: The total number of tokens processed so far.
* `loss`: The average KTO loss over the current logging interval.
* `entropy`: The average entropy of the model's predicted token distribution over non-masked tokens.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove KTO metrics that are never logged

For KTO, _compute_loss only appends kl, rewards/*, logps/*, and logits/* to self._metrics, and log() just averages those; unlike the SFT/Reward trainers, it never updates a num_tokens counter or computes entropy. Users following this section will look for num_tokens and entropy in KTO runs, but those fields are not emitted.

Useful? React with 👍 / 👎.

The [`experimental.kto.KTOTrainer`] fully supports fine-tuning models with _tool calling_ capabilities. In this case, each dataset example should include:

* The conversation messages (prompt and completion), including any tool calls (`tool_calls`) and tool responses (`tool` role messages)
* The list of available tools in the `tools` column, typically provided as JSON schemas

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop documenting unsupported KTO tool schemas

KTO's preprocessing does not consume the documented tools column: tokenize_fn only forwards example.get("chat_template_kwargs", {}) into processing_class.apply_chat_template, and _set_signature_columns_if_needed does not retain tools for raw datasets. Examples that follow this documented format therefore render prompts without the available tool schemas, so tool-calling fine-tunes that rely on the standard tools column train on the wrong prompt.

Useful? React with 👍 / 👎.

@bot-ci-comment

bot-ci-comment Bot commented Jul 2, 2026

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread docs/source/dpo_trainer.md Outdated
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Comment thread docs/source/kto_trainer.md Outdated
@qgallouedec qgallouedec merged commit c2af123 into main Jul 3, 2026
3 checks passed
@qgallouedec qgallouedec deleted the docs-align-kto-and-logged-metrics branch July 3, 2026 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants