Pull requests: huggingface/trl

Pull requests list

Align KTO with DPO: Support tool calling
#6259 opened Jul 2, 2026 by qgallouedec Member
Align KTO doc with DPO and fix Logged metrics wording
#6258 opened Jul 2, 2026 by qgallouedec Member
Align KTO with DPO: Log entropy metric
#6257 opened Jul 2, 2026 by qgallouedec Member
Align KTO with DPO: Log num_tokens metric
#6256 opened Jul 2, 2026 by qgallouedec Member
Raise a clear error when GKD student and teacher vocab sizes differ
#6252 opened Jul 2, 2026 by sergiopaniego Member
2 tasks done
Fix teacher quantization kwargs and guard eval callback in GKD example
#6251 opened Jul 2, 2026 by sergiopaniego Member
2 tasks done
implement message level rollout with linear trajectories
#6250 opened Jul 2, 2026 by AmineDiro Member
Align KTO with DPO: Align models import
#6248 opened Jul 2, 2026 by qgallouedec Member
Use trl's guarded is_liger_kernel_available in DPOTrainer
#6247 opened Jul 2, 2026 by qgallouedec Member
Align KTO with DPO: Align Liger loss naming
#6244 opened Jul 2, 2026 by albertvillanova Member
Fix activation offload storage dedupe reuse
#6241 opened Jul 2, 2026 by winglian Contributor
8 tasks
Fix BrowserGym example dependencies
#6240 opened Jul 2, 2026 by burtenshaw Collaborator
Drop vLLM 0.15 support
#6239 opened Jul 2, 2026 by qgallouedec Member
Environment-owned reward
#6238 opened Jul 1, 2026 by qgallouedec Member
1 task done
Dopd opsd routing
#6237 opened Jul 1, 2026 by ucalyptus Contributor Draft
Remove the PAPO trainer
#6235 opened Jul 1, 2026 by qgallouedec Member
Fix missing mm_token_type_ids when training new Qwen VLMs with liger kernel
#6234 opened Jul 1, 2026 by apardyl Contributor
4 of 8 tasks
Align ORPO with DPO: support iterable and dict eval datasets
#6230 opened Jul 1, 2026 by DaoyuanLi2816 Contributor
Add ORPOTrainer tests to align coverage with DPO
#6229 opened Jul 1, 2026 by DaoyuanLi2816 Contributor
Fix vLLM server-mode generation in OnlineDPOTrainer
#6228 opened Jun 30, 2026 by qgallouedec Member