Out-of-memory error during training despite having enough memory #21

Open
montensorrt opened this issue Sep 7, 2021 · 1 comment

Comments

@montensorrt
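Hello, and thank you for sharing your dense object localization algorithm. I ran into a problem while trying to reproduce it, described below.
I am running training on Windows 10. After the data and models were prepared, training fails as follows:
No matter which model I run, as soon as the forward pass starts, the error is always raised at the first conv layer:
VGG16_FPN.py:
    def forward(self, x):
        f = []
        x = self.layer1(x)
seg_hrnet.py:
    def forward(self, x):
        residual = x
        out = self.conv1(x)
It always fails with an out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 4.50 GiB (GPU 0; 12.00 GiB total capacity; 886.66 MiB already allocated; 5.14 GiB free; 4.94 GiB reserved in total by PyTorch)
It is trying to allocate 4.5 GiB (Tried to allocate 4.50 GiB), yet apart from what PyTorch already holds, my machine still has about 5 GiB available (5.14 GiB free; 4.94 GiB reserved in total by PyTorch).
I do not know what is going wrong. Which platform was your original code trained on, Ubuntu? Or do you have any ideas for how to solve this?
I look forward to your reply. Thank you very much!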

@taohan10200
Owner

Tips: The training process takes ~50 hours on the NWPU dataset with two TITAN RTX GPUs (48 GB of memory in total).

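Hello, we trained on Ubuntu. With the default batch size of 8 in this code, the GPU memory requirement is fairly large. If you only have a single 12 GB GPU, I suggest reducing the batch size or the image crop size.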
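
To make the suggestion above concrete, here is a rough back-of-the-envelope sketch of how the size of a single dense feature map scales with batch size and crop size. The function name `activation_bytes` and all shapes below are hypothetical and are not taken from the repository's config. The baseline shape was picked only because it works out to 4.50 GiB in float32; it is not confirmed to be the tensor that failed in the traceback.

```python
GIB = 1024 ** 3


def activation_bytes(batch_size, channels, height, width, bytes_per_element=4):
    """Bytes needed to store one dense (N, C, H, W) feature map.

    bytes_per_element=4 assumes float32 activations.
    """
    return batch_size * channels * height * width * bytes_per_element


# Hypothetical shapes for illustration only (assumption, not the repo's settings).
baseline = activation_bytes(batch_size=8, channels=64, height=1536, width=1536)

# Memory grows linearly with batch size: a quarter of the batch, a quarter of the memory.
smaller_batch = activation_bytes(batch_size=2, channels=64, height=1536, width=1536)

# Memory grows with crop area: halving both sides also cuts it to a quarter.
smaller_crop = activation_bytes(batch_size=8, channels=64, height=768, width=768)

for name, size in [
    ("baseline", baseline),
    ("smaller batch", smaller_batch),
    ("smaller crop", smaller_crop),
]:
    print(f"{name}: {size / GIB:.3f} GiB")
```

This only counts one output tensor. Real training also keeps activations for every layer, gradients, and optimizer state, so the actual saving from a smaller batch or crop should be checked by running the training script itself.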
