Pull requests: huggingface/trl

Pull requests list

[NashMD] fix the edge case where the model is a peft model
#3473 opened May 20, 2025 by kashif
5 tasks
Allow an user to train from a local dataset
#3470 opened May 19, 2025 by gogo2464
1 of 5 tasks
add support for image inputs in GRPO
#3460 opened May 16, 2025 by hellopahe
LD-DPO support
#3458 opened May 16, 2025 by AIR-hl
1 of 5 tasks
Update grpo.py to fix bugs for cli grpo --reward_funcs my_lib.my_reward
#3454 opened May 16, 2025 by wa008
4 of 5 tasks
[SFT] add warning if dataset's input_ids exceed max_length
#3449 opened May 15, 2025 by HERIUN
1 of 5 tasks
Fix logging docs
#3447 opened May 14, 2025 by xingyaoww Draft
2 of 5 tasks
🛠️ quantization support for vllm generation
#3428 opened May 8, 2025 by shirinyamani
5 tasks
Reintroducing step method in ppo_trainer
#3410 opened May 3, 2025 by jskaf34
2 of 5 tasks
fix setup chat format
#3404 opened May 2, 2025 by qgallouedec Draft
5 tasks
[DPO] Truncation leading to zero'd out samples
#3398 opened May 1, 2025 by LeonEricsson
2 of 5 tasks
Reintroduce generate method for PPOTrainer
#3374 opened Apr 27, 2025 by CloseChoice
4 tasks done
An Unified Example Format Checker
#3373 opened Apr 27, 2025 by innerNULL
1 of 5 tasks
add support for reward func using nn.Module in GRPOTrainer
#3372 opened Apr 27, 2025 by Tavish9
1 of 5 tasks
[Feat] Suppport SGLang as rollout engine of GRPO trainer
#3370 opened Apr 27, 2025 by ryang-max
2 of 8 tasks
Environments
#3367 opened Apr 26, 2025 by August-murr Draft
add vllm support for token ids as input
#3280 opened Apr 11, 2025 by wybryan