How to train on multiple GPUs? #93
Can you tell me on which line the error is reported?
When I run the following on multiple GPUs, it reports an error (please help):
python scripts/segmentation_train.py --data_name NC2016 --data_dir "/PublicFile/xp_data/NC2016/" --out_dir "./results/NC2016/trainv1" --image_size 256 --num_channels 128 --class_cond False --num_res_blocks 2 --num_heads 1 --learn_sigma True --use_scale_shift_norm False --attention_resolutions 16 --diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False --lr 1e-4 --batch_size 8 --multi_gpu 0,1,2
It appears that the script prints "training..." and then fails.
Some part of your module is on different GPUs. Did you get the same error when running on the example dataset? If the example cases work, then the problem is in your data-loading process.
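A quick way to confirm this is to print the devices before training starts. The sketch below is illustrative only (report_devices is a hypothetical helper, not part of this repository); it lists the device of every parameter, buffer, and batch tensor, so any tensor stuck on a different GPU shows up immediately:

```python
# Hypothetical diagnostic helper, not part of the repository: lists the
# devices of all model parameters/buffers and of one data batch. If more
# than one device appears, that mismatch is what raises the RuntimeError.
import torch

def report_devices(model, batch):
    print("parameter devices:", {p.device for p in model.parameters()})
    print("buffer devices:   ", {b.device for b in model.buffers()})
    tensors = batch if isinstance(batch, (list, tuple)) else [batch]
    for i, t in enumerate(tensors):
        if torch.is_tensor(t):
            print(f"batch tensor {i} device:", t.device)
```

Calling it with your model and one batch from the NC2016 loader, e.g. `report_devices(model, next(iter(dataloader)))`, should reveal whether the data or a submodule was placed on the wrong GPU.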
I ran into exactly the same error. Did you manage to solve it?
When I use --multi_gpu 0,1,2, I get this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
How should I change the code to fix this?
Thanks!
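Not being the author, I can only sketch the usual fix: when the model is wrapped in torch.nn.DataParallel (which is what a --multi_gpu flag commonly maps to), both the wrapped model and every input batch should live on the primary device (cuda:0); DataParallel then scatters across the listed GPUs by itself. ToyNet below is a placeholder, not this repository's network:

```python
# Minimal self-contained sketch (ToyNet is a placeholder, not the repo's model):
# keep the DataParallel-wrapped model and the input batch on cuda:0 so no
# "found at least two devices" error is raised; scattering across GPUs 0,1,2
# happens inside DataParallel.
import torch
import torch.nn as nn

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

device = torch.device("cuda:0")
model = nn.DataParallel(ToyNet(), device_ids=[0, 1, 2]).to(device)

x = torch.randn(8, 3, 256, 256).to(device)   # move the batch to the primary GPU
out = model(x)                                # DataParallel scatters/gathers internally
print(out.device)                             # output is gathered back onto cuda:0
```

If the error persists, look for tensors created inside the model or the data pipeline with an explicit device (e.g. torch.zeros(..., device="cuda:0")); those also break replicas that DataParallel runs on the other GPUs.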