huggingface / trl Public

generated from fastai/nbdev_template

Pull requests: huggingface/trl

89 Open 1,692 Closed

Pull requests list

Add basic support for FSDP/Lora when using TRL/VLLM

#3735 opened Jul 14, 2025 by ojh31

5 tasks

Add warn0 utility and replace warnings.warn with rank-aware warnings in trainer

#3734 opened Jul 14, 2025 by yafshar

1 of 5 tasks

[WIP] Fix ppo example accelerator initialization error

#3732 opened Jul 14, 2025 by ccs96307 • Draft

2 of 5 tasks

🏗️ Refactor top-entropy in GRPO

#3727 opened Jul 12, 2025 by qgallouedec

Remove the negative value of KL divergence

#3710 opened Jul 9, 2025 by ENg-122

⚰️ Remove deprecated

#3704 opened Jul 8, 2025 by qgallouedec

5 tasks

[GRPO] Log generation entropy

#3700 opened Jul 7, 2025 by LeonEricsson • Draft

2 of 5 tasks

FSDP2+GRPO

#3687 opened Jul 3, 2025 by SalmanMohammadi

5 tasks

Support FSDP2 in GRPOTrainer

#3670 opened Jun 30, 2025 by thepowerfuldeez

[SFT] Dry up the sft tests

#3657 opened Jun 27, 2025 by kashif

5 tasks

feat: Initial implementation of RePO trainer and components

#3655 opened Jun 26, 2025 by celsowm

5 tasks

Ensure Chat Template Safe Prompt Truncation

#3646 opened Jun 25, 2025 by pramodith

4 of 5 tasks

[WIP] vllm-server-spec-dec-support

#3643 opened Jun 24, 2025 by shirinyamani

5 tasks

GRPO: Pack Responses within the same group.

#3642 opened Jun 24, 2025 by pramodith • Draft

4 of 5 tasks

🔍 Add guidance on choosing max_length value and include visualizati…

#3630 opened Jun 22, 2025 by qgallouedec

5 tasks

Add Entropy Control to GRPOTrainer

#3628 opened Jun 22, 2025 by 1485840691

Feature: Add SGLang support for GRPO Trainer

#3627 opened Jun 21, 2025 by PrinsYin • Draft

5 tasks

[WIP] [SFT] SFT doc rewrite

#3619 opened Jun 18, 2025 by qgallouedec

5 tasks

ClearML logging of visualization in RewardTrainer evaluation

#3602 opened Jun 16, 2025 by ioverho

2 of 5 tasks

Fix: corrected fsdp in GRPO trainer

#3582 opened Jun 13, 2025 by tryumanshow

2 of 5 tasks

Check rewards shapes in RewardTrainer

#3577 opened Jun 13, 2025 by ioverho

4 tasks done

Chisquare regularized DPO

#3573 opened Jun 12, 2025 by asparius

[WIP] 🥳 new rloo

#3533 opened Jun 3, 2025 by shirinyamani

5 tasks

Push KTAE impl

#3518 opened May 30, 2025 by SamComber

5 tasks

intuit

#3513 opened May 29, 2025 by shirinyamani

5 tasks
