Probably a more reasonable method of packing
#2466
Labels: ✨ enhancement, 🧒 good second issue, 🙋 help from community wanted, 🏋 SFT
Feature request
Hi! Here I am again :) Following #1850, I noticed that the packing method of SFTTrainer truncates overly long samples directly and crudely, which obviously destroys a lot of text information. While browsing the LLaMA-Factory code, I found that they use a different, more elegant approach to packing. In short, they use a greedy knapsack algorithm to fill each packed sequence as fully as possible, maximizing its utilization.
Maybe trl can be modified accordingly in the future. What do you think?
Motivation
https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/processors/supervised.py#L130
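To make the idea concrete, here is a minimal sketch of greedy knapsack packing over sequence lengths. It is not copied from LLaMA-Factory or trl; the function name `greedy_knapsack` and its parameters are hypothetical, and it assumes every sample has already been tokenized and is no longer than the packing capacity.

```python
import bisect


def greedy_knapsack(lengths, capacity):
    """Group sequence lengths into bins holding at most `capacity` tokens.

    Hypothetical helper for illustration only. Each bin is filled greedily:
    repeatedly take the longest remaining sequence that still fits.
    """
    if any(length > capacity for length in lengths):
        # Assumption: oversized samples are handled (split or dropped) upstream.
        raise ValueError("a sequence is longer than the packing capacity")

    remaining = sorted(lengths)  # ascending order, so bisect can be used
    bins = []
    while remaining:
        current, room = [], capacity
        while True:
            # Index of the largest length that fits in the room left.
            idx = bisect.bisect_right(remaining, room) - 1
            if idx < 0:
                break  # nothing left fits into this bin
            room -= remaining[idx]
            current.append(remaining.pop(idx))
        bins.append(current)
    return bins


if __name__ == "__main__":
    print(greedy_knapsack([5, 3, 7, 2, 4, 6], capacity=10))
    # [[7, 3], [6, 4], [5, 2]]
```

In this toy run every bin is filled to exactly 10 tokens, so no sample is cut in the middle. A real implementation would also need to map the lengths back to the tokenized samples and build the matching attention masks.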