A question about pipeline parallelism #173

Open
Cheung-Z opened this issue May 14, 2024 · 2 comments

Comments

@Cheung-Z

Cheung-Z commented May 14, 2024

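Hello author,
I have recently been running the pipeline-parallel code and trying to reproduce it on Qwen, but I ran into a problem: the pipemodel is loaded from the basemodel, so taking a 7B model as an example, peak memory usage is effectively 2 * 14 GB. Is that right? (Although the memory can be freed afterwards with del model.)
It looks like my hope of running Qwen 32B on 8 * 32 GB V100s is gone....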
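For reference, the arithmetic behind the 2 * 14 GB figure can be sketched as below. This is only a back-of-the-envelope estimate, assuming fp16 weights (2 bytes per parameter), decimal gigabytes, and that both the base model and the pipeline copy are resident at the same time. The helper names `weights_gb` and `peak_load_gb` are hypothetical, and gradients, optimizer state, and activations are not counted.

```python
BYTES_PER_PARAM_FP16 = 2  # assumption: weights are stored in fp16


def weights_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Size of the weights alone, in decimal GB."""
    return num_params * bytes_per_param / 1e9


def peak_load_gb(num_params: float, copies: int = 2) -> float:
    """Peak size while `copies` full copies of the weights coexist."""
    return copies * weights_gb(num_params)


for name, n in [("7B", 7e9), ("32B", 32e9)]:
    print(f"{name}: one copy {weights_gb(n):.0f} GB, "
          f"two copies at load time {peak_load_gb(n):.0f} GB")
```

Under these assumptions the 7B case matches the 2 * 14 GB described above, and the same doubling applied to a 32B model gives 2 * 64 GB during loading.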

@realRav3n

Hi, have you managed to solve this problem?

@yuanzhoulvpi2017
Owner

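At the time I only meant to release the simplest possible demo, so it is not written very thoroughly. This problem has not been solved yet either.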

  1. If you really want to train, I suggest simply using DeepSpeed ZeRO-2, or FSDP. These are ready-made solutions, and they are simpler too.
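As an illustration of the DeepSpeed ZeRO-2 suggestion, a minimal config might look like the sketch below. It is not taken from this repository: the batch-size values are placeholders, and fp16 is chosen only because the issue mentions V100 GPUs, which lack native bf16 support. Note that ZeRO stage 2 partitions optimizer state and gradients across ranks, while each rank still holds a full copy of the parameters.

```python
import json

# Illustrative DeepSpeed config; the numeric values are placeholder assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,   # placeholder
    "gradient_accumulation_steps": 8,      # placeholder
    "fp16": {"enabled": True},             # V100 has no native bf16
    "zero_optimization": {
        "stage": 2,                        # shard optimizer state + gradients
        "overlap_comm": True,              # overlap gradient reduction with backward
        "contiguous_gradients": True,      # reduce memory fragmentation
    },
}

# Serialize to JSON, the format DeepSpeed reads its config from.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```

If the training script happens to use the Hugging Face Trainer (an assumption, the thread does not say), a file with this content can be passed through the `deepspeed` argument of `TrainingArguments`.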
