How do I train with multiple GPUs? I see that `distributed.py` only sets a single GPU, and when I change it to `[0,1,2,3]` I immediately get this error:

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
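That error means `torch.distributed` was never initialized, which is only required for `DistributedDataParallel`, not for plain `nn.DataParallel`. A minimal single-process initialization sketch (assuming the `gloo` backend and a local rendezvous address, both chosen here for illustration):

```python
import os
import torch.distributed as dist

# Minimal single-process init sketch (gloo backend, works without GPUs).
# Only needed when using DistributedDataParallel; nn.DataParallel does
# not require a process group at all.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

ok = dist.is_initialized()
print(ok)  # True

dist.destroy_process_group()
```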
My understanding is that `distributed.py` only configures the number of worker processes; multi-GPU training is actually already enabled by `dist_model = nn.DataParallel(model).cuda()`.
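To illustrate the point above: `nn.DataParallel` splits each batch across the listed GPUs by itself, with no `init_process_group` call. A minimal sketch, using a hypothetical `nn.Linear` stand-in for the repo's model:

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the repo's actual model.
model = nn.Linear(8, 2)

if torch.cuda.is_available():
    # device_ids selects the GPUs; DataParallel scatters the batch
    # across them and gathers the outputs -- no process group needed.
    dist_model = nn.DataParallel(model, device_ids=[0, 1, 2, 3]).cuda()
else:
    # Without CUDA, DataParallel simply forwards to the wrapped module.
    dist_model = nn.DataParallel(model)

x = torch.randn(4, 8)
out = dist_model(x)
print(out.shape)  # torch.Size([4, 2])
```

Note that `device_ids=[0, 1, 2, 3]` assumes four visible GPUs; with fewer, the list must be shortened.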
The code only uses a single process; does that affect training time? Also, what should `batch_size` be set to for multi-GPU training? Training on four 1080Ti cards with `batch_size=96` takes more than 50 hours.