
cuda memory out #3

Open

cliobeach opened this issue Jul 2, 2019 · 4 comments
@cliobeach

No matter which server I choose (K80, V100, or 8×K80), I still get the same error. This is the error when I use the V100:

CUDA out of memory. Tried to allocate 98.00 MiB (GPU 0; 15.75 GiB total capacity; 14.64 GiB already allocated; 36.88 MiB free; 20.85 MiB cached)

How can I fix this?

@yu4u
Owner

yu4u commented Jul 2, 2019

What command did you execute?
How about decreasing the batch size like this?

```
python train.py --data_dir [PATH/TO/appa-real-release] TRAIN.BATCH_SIZE 16
```
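
The trailing `TRAIN.BATCH_SIZE 16` pair looks like a yacs-style config override. As a rough sketch of how such an override is typically wired up (the default values and the `get_cfg` helper below are illustrative assumptions, not necessarily this repo's actual code):

```python
import argparse
from yacs.config import CfgNode as CN

# Hypothetical defaults; the real project's config keys may differ.
_C = CN()
_C.TRAIN = CN()
_C.TRAIN.BATCH_SIZE = 128  # the high default mentioned below

def get_cfg(opts):
    """Clone the defaults and apply command-line overrides."""
    cfg = _C.clone()
    cfg.merge_from_list(opts)  # e.g. ["TRAIN.BATCH_SIZE", "16"]
    cfg.freeze()
    return cfg

parser = argparse.ArgumentParser()
parser.add_argument("opts", nargs=argparse.REMAINDER, default=[],
                    help="config overrides given as KEY VALUE pairs")
args = parser.parse_args(["TRAIN.BATCH_SIZE", "16"])
print(get_cfg(args.opts).TRAIN.BATCH_SIZE)  # -> 16
```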

@cliobeach
Author

cliobeach commented Jul 2, 2019 via email

@xsacha

xsacha commented Oct 3, 2019

You just need to choose a batch size that fits your GPU memory.
For example, a GTX 660 has only 2 GB and can only handle a batch size of 4. If you have two cards, you can use the --multi_gpu flag and increase the batch size to 8.
The default batch size is quite high (128) and probably requires around 64 GB of GPU memory.

Eight K80s will definitely give you enough memory, but a single K80 will not.

Just try the --multi_gpu flag.
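
To illustrate why the usable batch size scales with the number of cards, here is a minimal PyTorch sketch of how `nn.DataParallel` splits each batch across the visible GPUs (the toy model is an assumption for illustration, not this repo's network):

```python
import torch
import torch.nn as nn

# Toy model for illustration only; requires a CUDA-enabled build.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten())

if torch.cuda.device_count() > 1:
    # DataParallel replicates the model and splits each input batch
    # along dim 0, so per-GPU memory use scales with batch_size / num_gpus.
    model = nn.DataParallel(model)
model = model.cuda()

batch = torch.randn(8, 3, 64, 64).cuda()  # 8 samples: 4 per GPU on 2 cards
out = model(batch)
print(out.shape)
```

This is why two 2 GB cards can handle a batch of 8 when one card can only handle 4: each replica only ever sees its own slice of the batch.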

@cliobeach
Author

cliobeach commented Oct 6, 2019 via email
