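model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is still true. This only happens in evals during training; the final eval after training finishes is fine #5444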

Open
1 task done
aliencaocao opened this issue Sep 15, 2024 · 0 comments · May be fixed by #5451
Labels
pending This problem is yet to be addressed

Comments

@aliencaocao
Contributor

aliencaocao commented Sep 15, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.1.dev0
  • Platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version: 2.4.1+cu124 (GPU)
  • Transformers version: 4.45.0.dev0
  • Datasets version: 2.21.0
  • Accelerate version: 0.34.2
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: Tesla V100-PCIE-32GB
  • DeepSpeed version: 0.15.1
  • Bitsandbytes version: 0.43.3

Reproduction

In qwen2_vl.yaml, I wrote:

do_sample: false
max_new_tokens: 512

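When I used py-spy to inspect the arguments actually passed in, it showed do_sample=true and no max_new_tokens, while max_len was set to my cutoff length. What I actually want is to limit the generation length.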
image

According to the transformers source code
https://github.com/huggingface/transformers/blob/8bd2b1e8c23234cd607ca8d63f53c1edfea27462/src/transformers/generation/utils.py#L2967
_sample should already be false at this point

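After repeated experiments, I found that this parameter is only passed correctly in the final eval after training completes; none of the evals that run during training receive it.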
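
As an alternative to attaching a profiler, the keyword arguments reaching generate can be recorded on every call and compared between mid-training and final evals. The following is only a minimal sketch: `log_generate_kwargs` and `DummyModel` are hypothetical names introduced for illustration, the values in the two simulated calls are made up, and nothing here is part of llamafactory or transformers. It also only sees explicit keyword arguments, so settings carried inside a generation config object would not show up.

```python
import functools


def log_generate_kwargs(model, log):
    """Hypothetical helper: wrap model.generate so each call records its kwargs in `log`."""
    original = model.generate

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        # Record only the settings discussed in this issue; absent keys appear as None.
        keys = ("do_sample", "max_new_tokens", "max_length")
        log.append({k: kwargs.get(k) for k in keys})
        return original(*args, **kwargs)

    # Shadow the bound method on this instance only.
    model.generate = wrapper
    return model


class DummyModel:
    """Stand-in for a real model so the sketch runs on its own (assumption)."""

    def generate(self, *args, **kwargs):
        return []


calls = []
model = log_generate_kwargs(DummyModel(), calls)

# Made-up values that mimic the two situations described above.
model.generate(do_sample=True, max_length=1024)      # like an eval during training
model.generate(do_sample=False, max_new_tokens=512)  # like the final eval

for index, seen in enumerate(calls):
    print(index, seen)
```

With a real model, the same wrapper would be applied once before training starts, and the log inspected after each eval.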

Expected behavior

The model.generate parameters are passed in correctly.

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Sep 15, 2024
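@aliencaocao aliencaocao changed the title from "model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is still true" to "model.generate parameters set in the YAML have no effect: I set do_sample: false, but a profiler shows it is still true. This only happens in evals during training; the final eval after training finishes is fine" Sep 16, 2024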
@aliencaocao aliencaocao linked a pull request Sep 16, 2024 that will close this issue
2 tasks