Dear authors, Happy Chinese New Year!
Could you tell me whether the model supports multi-GPU training? When I run the original code on a machine with two A100s (using python run.py --args ...), it trains on only one GPU by default instead of using both.
If I simply wrap the model with torch.nn.DataParallel() for training, will that cause any problems? Or is there a more appropriate way? I have also tried launching run.py with accelerate launch; it does use both GPUs for training, but it hits multi-threading-related errors at the K-means step.
I hope you can answer my questions. Thank you very much!
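For context, this is the wrapping pattern the question refers to, as a minimal sketch; the nn.Linear and the dummy batch are hypothetical stand-ins for the repo's actual model and data, not code from this project:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the repo's model; only the wrapping pattern matters.
model = nn.Linear(128, 10)

if torch.cuda.device_count() > 1:
    # DataParallel replicates the module on every visible GPU, splits each
    # input batch along dim 0, and gathers the outputs back on cuda:0.
    model = nn.DataParallel(model)
model = model.to("cuda")

x = torch.randn(64, 128, device="cuda")  # dummy batch
out = model(x)                           # forward pass is split across both GPUs
print(out.shape)                         # torch.Size([64, 10])
```

Note that DataParallel is single-process and generally slower than DistributedDataParallel, which is why launcher-based approaches like accelerate launch are usually preferred.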
I used accelerate launch for multi-GPU training. Yes, the K-means portion does not support multi-GPU, so I switch to a single GPU for the K-means and testing parts.
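For anyone hitting the same K-means errors, here is a minimal sketch of that workflow under stated assumptions: the model, data, and loss below are hypothetical placeholders, and the point is only the prepare()/is_main_process pattern, i.e. train across GPUs with Accelerate, then run the non-multi-GPU K-means step on the main process alone:

```python
import torch
import torch.nn as nn
from accelerate import Accelerator  # pip install accelerate

accelerator = Accelerator()  # picks up the devices from `accelerate launch`

# Hypothetical tiny model and data, just to make the sketch self-contained.
model = nn.Linear(128, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = torch.utils.data.DataLoader(torch.randn(256, 128), batch_size=32)

# prepare() moves everything to the right devices and shards the loader.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for batch in loader:
    loss = model(batch).pow(2).mean()  # dummy loss
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()

# The K-means step is not multi-GPU safe, so run it on the main process
# only and make the other ranks wait at the barriers.
accelerator.wait_for_everyone()
if accelerator.is_main_process:
    pass  # run the single-GPU K-means / testing step here
accelerator.wait_for_everyone()
```

Launched as, e.g., accelerate launch run.py --args ..., this keeps training on both A100s while the K-means and testing parts execute on a single GPU, matching the workaround described above.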