fix ppo_freeze mat1 mat2 should have the same dtype #5480
base: main
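Please provide more detailed information. As it stands, this PR causes a significant increase in VRAM usage.
With this configuration file, an error is raised at runtime. After the change, training can continue.
While debugging,
Provided. Could you please take a look?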
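For readers unfamiliar with the error in the title, here is a minimal sketch of how a "mat1 and mat2" dtype mismatch can arise when frozen and trainable layers hold different dtypes. It is not the code from this PR: the two `nn.Linear` layers, their sizes, and the variable names are hypothetical stand-ins, and the assumption is that the frozen part stays in bfloat16 while the trainable part is in float32.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a frozen layer kept in bfloat16 and a
# trainable layer left in float32 (the nn.Linear default).
frozen = nn.Linear(4, 4).to(torch.bfloat16)
frozen.requires_grad_(False)
trainable = nn.Linear(4, 2)

x = torch.randn(3, 4, dtype=torch.bfloat16)
hidden = frozen(x)  # activations come out in bfloat16

# Feeding bfloat16 activations into float32 weights fails in the matmul.
try:
    trainable(hidden)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"RuntimeError: {err}")

# One possible fix: cast the activations to the dtype of the weights.
out = trainable(hidden.to(trainable.weight.dtype))

# float32 parameters take twice the memory of bfloat16 ones, which is
# why blanket upcasting of a whole model raises VRAM usage.
print(frozen.weight.element_size(), trainable.weight.element_size())
```

Casting activations at the boundary keeps the frozen weights in half precision. Converting the whole model to float32 would also remove the mismatch, but at roughly double the parameter memory, which may be related to the VRAM concern raised above.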
@@ -133,6 +133,25 @@ def _setup_freeze_tuning(
        else:
            param.requires_grad_(False)

    '''When using ppo_freeze with the qwen1.5-0.5b model, add a dtype conversion the first time the actor model is loaded
Can we change this to English? :)
What does this PR do?
Fixes # (issue)
Before submitting