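model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is actually true. This only happens for evals in the middle of training; the final eval after training finishes works correctly #5444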

Open

1 task done

aliencaocao opened this issue Sep 15, 2024 · 0 comments · May be fixed by #5451

Labels

pending

Contributor

aliencaocao commented Sep 15, 2024 (edited)

Reminder

I have read the README and searched the existing issues.

System Info

llamafactory version: 0.9.1.dev0
Platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
Python version: 3.10.12
PyTorch version: 2.4.1+cu124 (GPU)
Transformers version: 4.45.0.dev0
Datasets version: 2.21.0
Accelerate version: 0.34.2
PEFT version: 0.12.0
TRL version: 0.9.6
GPU type: Tesla V100-PCIE-32GB
DeepSpeed version: 0.15.1
Bitsandbytes version: 0.43.3

Reproduction

In qwen2_vl.yaml, I wrote:

do_sample: false
max_new_tokens: 512

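When I used py-spy to inspect the arguments actually passed in, it showed do_sample=true and no max_new_tokens, and max_len was set to my cutoff length. What I actually want is to limit the generation length.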
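As an alternative to attaching a profiler, the arguments reaching generate can be recorded with a small wrapper. This is only a minimal sketch: `log_generate_kwargs` and `DummyModel` are hypothetical names invented for illustration, and the dummy class stands in for a real model so the snippet runs on its own.

```python
import functools


def log_generate_kwargs(model, log):
    """Wrap model.generate so each call appends its keyword arguments to `log`."""
    original = model.generate

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        log.append(dict(kwargs))  # copy, so later mutation does not alter the record
        return original(*args, **kwargs)

    model.generate = wrapper  # shadow the method on this instance only
    return model


class DummyModel:
    """Hypothetical stand-in for a real model; it generates nothing."""

    def generate(self, inputs=None, **kwargs):
        return []


calls = []
model = log_generate_kwargs(DummyModel(), calls)
model.generate(inputs=[1, 2, 3], do_sample=True, max_length=1024)
print(calls)
```

With a real model, the same wrapper could be applied before training starts, and `calls` compared between a mid-training eval and the final eval.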

According to the transformers source code
https://github.com/huggingface/transformers/blob/8bd2b1e8c23234cd607ca8d63f53c1edfea27462/src/transformers/generation/utils.py#L2967
do_sample should already be false by this point.

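After repeated testing, these parameters are passed correctly only in the final eval after training completes; none of the evals in the middle of training receive them.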
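One pattern that would produce exactly these symptoms is a trainer whose periodic eval calls evaluate without forwarding the generation arguments, while the final eval forwards them explicitly. The toy below is an assumption about the shape of the problem, not LLaMA-Factory or transformers code: `ToyTrainer`, its fallback to a default length cap, and the value 1024 are all hypothetical.

```python
class ToyTrainer:
    """Hypothetical trainer used only to illustrate the symptom."""

    def __init__(self, default_max_length):
        self.default_max_length = default_max_length
        self.seen = []  # generation arguments used by each eval, in order

    def evaluate(self, **gen_kwargs):
        gen_kwargs = dict(gen_kwargs)
        # Fall back to a default length cap when the caller gave no length limit.
        if "max_length" not in gen_kwargs and "max_new_tokens" not in gen_kwargs:
            gen_kwargs["max_length"] = self.default_max_length
        self.seen.append(gen_kwargs)

    def train(self, steps, eval_every):
        for step in range(1, steps + 1):
            if step % eval_every == 0:
                self.evaluate()  # periodic eval: no generation arguments forwarded


yaml_gen_kwargs = {"do_sample": False, "max_new_tokens": 512}  # values from the YAML
trainer = ToyTrainer(default_max_length=1024)  # 1024 stands in for the cutoff length
trainer.train(steps=4, eval_every=2)
trainer.evaluate(**yaml_gen_kwargs)  # final eval: arguments forwarded explicitly
print(trainer.seen)
```

In this toy, the two mid-training evals see only the default max_length and no do_sample at all, so sampling would fall back to whatever the model's own defaults are; only the final eval sees the YAML values.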

Expected behavior

The model.generate parameters are passed through correctly.

Others

No response

github-actions bot added the pending label

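aliencaocao changed the title ~~model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is actually true~~ model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is actually true. This only happens for evals in the middle of training; the final eval after training finishes works correctly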

aliencaocao linked a pull request that will close this issue

Correctly pass gen_kwarg to eval during model runs #5451

Open

2 tasks
