generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Fixing SFTTrainer.compute_loss crash with accelerate
#3048
opened Mar 10, 2025 by jamesbraza
Passing custom BOS/EOS token to GPROTrainer.generation_config
#3046
opened Mar 10, 2025 by jamesbraza
Fixing JSD loss computation in GKDTrainer as per definition
#3043
opened Mar 10, 2025 by abhigoyal1997
fix temperature inconsistency in GRPO trainer
#3029
opened Mar 8, 2025 by Aladoro
1 of 5 tasks
Fixing GRPO reward_func being a model with DeepSpeed ZeRO-3
#2984
opened Feb 28, 2025 by jamesbraza
Feature: Add SGLang as inference backend for generation in GRPO
#2981
opened Feb 28, 2025 by jhinpan
5 tasks done
Provide more accurate error messages to make the program more robust.
#2932
opened Feb 22, 2025 by dignfei
4 tasks
Add the metrics completion_length_max and completion_length_min
#2930
opened Feb 22, 2025 by dignfei
4 tasks
Remove CUDA synchronization in mean_token_accuracy
#2902
opened Feb 19, 2025 by cyyever
1 task done
[Discussion] Agentic Framework Based on VLLM and E2B for RL
#2880
opened Feb 17, 2025 by August-murr • Draft
[GRPO] Reduce steps where loss starts to remain at 0, accelerate training
#2869
opened Feb 15, 2025 by zhangsheng377
Using model_wrapped can improve the generation speed by approximately 10 times on a single GPU.
#2859
opened Feb 14, 2025 by dignfei
[draft] Use vLLM in LogCompletionsCallback
#2797
opened Feb 7, 2025 by tchang1997 • Draft
2 of 4 tasks