Understanding Multidevice Strategy #81

Open

gateway opened this issue Oct 13, 2020 · 7 comments

@gateway

gateway commented Oct 13, 2020

I have been trying to figure out how to max out both of the GPUs in my system.

Tue Oct 13 15:15:00 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN RTX           Off  | 00000000:01:00.0 Off |                  N/A |
| 41%   41C    P8    15W / 280W |    292MiB / 24220MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 21%   50C    P8     6W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2061      G   /usr/lib/xorg/Xorg                191MiB |
|    0   N/A  N/A      2745      G   ...mviewer/tv_bin/TeamViewer       13MiB |
|    0   N/A  N/A      2949      G   /usr/bin/gnome-shell               83MiB |
+-----------------------------------------------------------------------------+

GPU 0 has the most memory.

I'm trying to understand -multidevice_strategy: how many layers are there? It's not very clear to me what would work best for 2 GPUs, one with more memory than the other, or at least what a good starting point would be.

I have just tried a value of 20 and this was the result:

[screenshot of the result]

@ProGamerGov
Owner

ProGamerGov commented Oct 14, 2020

@gateway The multidevice strategy simply splits the model's layers across different devices. The order of the layers in the model dictates which layers go on which device, based on your specified -multidevice_strategy params. There's no special formula for what values to use, so you'll probably have to experiment a bit to find the best possible settings.

Edit:

This may help: https://www.reddit.com/r/deepdream/comments/dnsg65/multigpu_strategy/
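
If it helps to picture what the cut points do, here is a minimal PyTorch-style sketch of the idea (this is not the actual neural-style-pt code, and the function and variable names are made up for illustration): the -multidevice_strategy indices mark where the ordered list of layers gets cut, and each resulting chunk is placed on the next device from the -gpu list, so you need one more device than you have cut points.

import torch.nn as nn

# Hypothetical sketch of strategy-based model splitting (names are illustrative only).
# layers:   the model's layers, in order
# strategy: the cut points (layer indices), e.g. [12]
# devices:  one device per chunk, e.g. ["cuda:0", "cuda:1"]
def split_model(layers, strategy, devices):
    assert len(devices) == len(strategy) + 1, "need one more device than cut points"
    chunks, start = [], 0
    for cut, device in zip(list(strategy) + [len(layers)], devices):
        # Each slice of layers becomes a chunk that lives on its own device.
        chunks.append((nn.Sequential(*layers[start:cut]).to(device), device))
        start = cut
    return chunks

def run(chunks, x):
    # Activations are moved onto each chunk's device before that chunk runs.
    for chunk, device in chunks:
        x = chunk(x.to(device))
    return x

Since the earliest convolutional layers see the full-resolution image, they tend to be the most memory-hungry part of a VGG-style network, which is worth keeping in mind when deciding where to cut between a larger and a smaller card.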

@IridiumMaster

I was trying this on two Google Cloud A100s. The device list is below. I used the following parameters:
neural-style -multidevice_strategy 3,7,12 -gpu 0,1 -style_image myPainting63.jpg -content_image Headcrop4.jpg -model_file vgg19-d01eb7cb.pth -image_size 3000 -backend cudnn -optimizer lbfgs -num_iterations 2500 -output_image g63.png -original_colors 1
I get the error:
"The number of -multidevice_strategy layer indices minus 1, must be equal to the number of -gpu devices."

It's not clear to me what I am doing wrong here. Could you please help? I ran some other code to validate that these CUDA devices could be detected by PyTorch.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      Off  | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P0    56W / 350W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  A100-SXM4-40GB      Off  | 00000000:00:05.0 Off |                    0 |
| N/A   32C    P0    53W / 350W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@ProGamerGov
Owner

@IridiumMaster The -multidevice_strategy parameter tells the code where to slice / cut the model, and in your case 2 GPUs means you want to have the model cut into 2 pieces (one for each GPU). So, for two GPUs you should only specify one value for -multidevice_strategy.
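
For example, taking your command above, something like the following should pass that check (the single cut point of 12 here is only an illustrative value; where exactly to cut is still a matter of experimentation):

neural-style -multidevice_strategy 12 -gpu 0,1 -style_image myPainting63.jpg -content_image Headcrop4.jpg -model_file vgg19-d01eb7cb.pth -image_size 3000 -backend cudnn -optimizer lbfgs -num_iterations 2500 -output_image g63.png -original_colors 1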

@IridiumMaster

Thanks kindly, that worked very well for me.

@robertgoacher

@ProGamerGov In neural-style-pt/examples/scripts/starry_stanford_bigger.sh you aren't using the multiple GPU setting for the lower-resolution images. Is that because there is no benefit (in speed or memory) from splitting the layers over multiple GPUs at those lower resolutions? I'm just trying to get an understanding of when multiple GPUs are best used.

@ProGamerGov
Owner

@robertgoacher Using multiple GPUs can be a bit slower than using a single GPU. I think there's also a small increase in memory usage that results from using multiple GPUs.

@robertgoacher

@ProGamerGov Thank you so much for your reply; I really appreciate it.

I think I understand this now...but please correct me if I'm wrong.

So you need to use the multiple GPU strategy for high-resolution style transfers because individual GPUs don't normally have enough memory to do the inference? If you have a GPU with lots of memory (for example an NVIDIA A100 GPU with 40GB of memory) you might be able to complete a high-resolution render on that GPU without needing the multiple GPU strategy? But if you do need multiple GPUs, you can split the processing (and therefore the memory usage) across them, at the cost of some speed and a small increase in total memory usage?
