Probably a more reasonable method of packing #2466

Open
AIR-hl opened this issue Dec 12, 2024 · 1 comment
Labels
✨ enhancement New feature or request 🧒 good second issue Good for contributors with basic project familiarity 🙋 help from community wanted Open invitation for community members to contribute 🏋 SFT Related to SFT

Comments

@AIR-hl
Contributor

AIR-hl commented Dec 12, 2024

Feature request

Hi! Here I am again :). Following #1850, I noticed that the packing method of SFTTrainer simply truncates overly long data, which obviously destroys a lot of text information.

While browsing the LLaMA-Factory code, I found that it takes a different, more elegant approach to packing. In short, it uses a greedy knapsack algorithm to fill each packed sequence as much as possible, maximizing its utilization.
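To make the idea concrete, here is a minimal sketch of greedy knapsack packing over sequence lengths. It is only an illustration of the general technique, not the LLaMA-Factory or trl code: the function name `greedy_knapsack_pack` is hypothetical, and skipping sequences longer than the capacity is an assumption made here for simplicity.

```python
import bisect


def greedy_knapsack_pack(lengths, capacity):
    """Group sequence indices into bins whose total length is <= capacity.

    Hypothetical helper for illustration; not the trl or LLaMA-Factory API.
    Assumption: sequences longer than `capacity` are skipped.
    """
    # Sort (length, index) pairs ascending so bisect can find the largest fit.
    items = sorted((n, i) for i, n in enumerate(lengths) if n <= capacity)
    bins = []
    while items:
        remaining = capacity
        current = []
        while items:
            # Number of items whose length is <= the remaining space.
            pos = bisect.bisect_right(items, (remaining, float("inf")))
            if pos == 0:
                break  # nothing left fits in this bin
            n, i = items.pop(pos - 1)  # largest item that still fits
            current.append(i)
            remaining -= n
        bins.append(current)
    return bins


lengths = [5, 3, 7, 2, 4]
bins = greedy_knapsack_pack(lengths, capacity=8)
print(bins)  # [[2], [0, 1], [4, 3]]
```

Each bin holds indices of whole examples, so no example is cut in the middle; the cost is some unused space in bins such as the first one above (7 of 8 tokens used).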

Maybe trl can be modified accordingly in the future. What do you think?

Motivation

https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/processors/supervised.py#L130

@qgallouedec
Member

That's very interesting! It would be a nice improvement.

If you want to tackle this problem, you should be aware that packing will be implemented differently (in a simpler way) in the near future, see #2405. You should branch from there.

@qgallouedec qgallouedec added ✨ enhancement New feature or request 🙋 help from community wanted Open invitation for community members to contribute 🏋 SFT Related to SFT 🧒 good second issue Good for contributors with basic project familiarity labels Dec 13, 2024