cuda memory out #3
What command did you execute?
I used the command: !python train.py --data_dir ./data/appa-real-release --tensorboard tf_log
I will try the new command you suggested. Thank you.
…------------------ Original Message ------------------
From: "Yusuke Uchida" <[email protected]>
Date: Tuesday, July 2, 2019, 11:52 AM
To: "yu4u/age-estimation-pytorch" <[email protected]>
Cc: "clio" <[email protected]>; "Author" <[email protected]>
Subject: Re: [yu4u/age-estimation-pytorch] cuda memory out (#3)
What command did you execute?
How about decreasing the batch size like this?
python train.py --data_dir [PATH/TO/appa-real-release] TRAIN.BATCH_SIZE 16
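The intuition behind the suggestion: activation memory grows roughly linearly with batch size, so dropping from the default 128 to 16 cuts that part of the footprint by about 8x. A minimal sketch of the arithmetic (the per-sample figure here is a hypothetical illustrative number, not measured from this repo's model):

```python
# Rough, illustrative estimate of activation memory vs. batch size.
# per_sample_mib is a made-up number, NOT measured from this model.

def activation_memory_mib(batch_size, per_sample_mib=100.0):
    """Activation memory scales approximately linearly with batch size."""
    return batch_size * per_sample_mib

default_mem = activation_memory_mib(128)  # default batch size in train.py
reduced_mem = activation_memory_mib(16)   # suggested TRAIN.BATCH_SIZE 16

print(f"batch 128: ~{default_mem / 1024:.1f} GiB of activations")
print(f"batch  16: ~{reduced_mem / 1024:.1f} GiB of activations")
print(f"reduction factor: {default_mem / reduced_mem:.0f}x")  # -> 8x
```

Model weights and optimizer state are a fixed cost on top of this, so the total does not shrink by the full 8x, but on a nearly full card the activation term is usually what tips it over.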
You just need to choose a suitable batch size for your GPU memory.
Example: a GTX 660 has only 2 GB and can only handle a batch size of 4. If you have two cards, you can use the --multi_gpu flag and increase the batch size to 8.
The default batch size is quite high (128) and probably requires 64 GB of memory.
If you have 8 x K80 you will definitely have enough memory, but not for 1 x K80.
Just try the --multi_gpu flag.

Thanks Sacha, I already fixed this. The original batch size was 128, so it should be changed before running, according to the GPU.
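For context on the --multi_gpu suggestion: data parallelism splits each batch across the available cards, so each GPU only holds its share of the activations (in PyTorch this is typically done with torch.nn.DataParallel; whether --multi_gpu wraps exactly that is an assumption here). A minimal sketch of the batch-splitting idea, with no GPU required:

```python
# Sketch of how data parallelism divides a batch across devices.
# Conceptually mirrors the scatter step of torch.nn.DataParallel;
# it does not use torch, and the device count is illustrative.

def split_batch(batch, num_gpus):
    """Split a batch into near-equal chunks, one per GPU."""
    chunk = (len(batch) + num_gpus - 1) // num_gpus  # ceiling division
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]

batch = list(range(8))          # batch size 8, as in Sacha's example
chunks = split_batch(batch, 2)  # two GTX 660s
print([len(c) for c in chunks])  # -> [4, 4]: each card sees half the batch
```

This is why two 2 GB cards can handle batch size 8 when one card tops out at 4: each device's memory only has to cover its own chunk.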
No matter which server I chose, K80, V100, or 8 x K80, I still get the same error.
This is the error when I use a V100:
CUDA out of memory. Tried to allocate 98.00 MiB (GPU 0; 15.75 GiB total capacity; 14.64 GiB already allocated; 36.88 MiB free; 20.85 MiB cached)
How can I fix this?
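Reading the numbers in that error message shows why the V100 fails despite its 16 GB: the run has already claimed 14.64 of the 15.75 GiB, leaving 36.88 MiB free while the next tensor needs 98 MiB, so reducing the batch size (rather than switching cards) is the fix. A quick check of the arithmetic, with the figures taken directly from the error above:

```python
# Figures copied from the CUDA OOM message above (all converted to MiB).
total     = 15.75 * 1024   # 15.75 GiB total capacity
allocated = 14.64 * 1024   # already allocated
free      = 36.88          # reported free
cached    = 20.85          # cached by the allocator
requested = 98.00          # size of the failed allocation

print(f"used: {allocated / total:.0%} of the card")
print(f"shortfall: {requested - free - cached:.2f} MiB")
assert requested > free + cached  # even releasing the cache would not help
```

Because the shortfall persists regardless of which single card is used, the only remedies are a smaller TRAIN.BATCH_SIZE or splitting the batch across multiple GPUs.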