I got this error when training CBNetV2, but training other models works fine #57
Comments
The config is as follows
Hello, have you solved this problem? I'm running into it too.
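Just replace mmdet's models folder. The files in this repo are probably a bit different from the mmdet I had installed; replacing mmdet/models in the environment with the one from this repo fixed it.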
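Before replacing anything, it can help to confirm which copy of `mmdet` Python actually imports, since a project checkout and an installed package can both be present. A minimal sketch using only the standard library; `locate_package` is a hypothetical helper name, not part of mmdet:

```python
import importlib.util
import os


def locate_package(name):
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        # Not installed, or a namespace package without a single origin file.
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


if __name__ == "__main__":
    # "mmdet" is the package discussed in this thread; prints None if it is not importable.
    print(locate_package("mmdet"))
```

If the printed directory is not the one you expect, the `models` folder you replaced may not be the one being imported.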
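Thanks so much, man! This had been bothering me for a long time and it's finally solved!
May I ask which versions of PyTorch and CUDA you are using?
torch 1.9.1, CUDA 11.1
Could I also ask which version of mmcv-full you are using?
1.6.1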
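To compare environments like this, the installed versions can be collected in one go. A small sketch, assuming the distributions are named `torch`, `mmcv-full` and `mmdet` as they are on PyPI; `installed_version` and `report` are hypothetical helper names:

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7
    metadata = None


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    if metadata is None:
        # Fallback for older interpreters; pkg_resources ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names=("torch", "mmcv-full", "mmdet")):
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    for name, version in report().items():
        print(name, version)
```

The CUDA version PyTorch was built against is not a separate distribution; it is exposed as `torch.version.cuda` once `torch` is imported.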
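Thanks!! I just ran it in the same environment as yours and got a NaN loss. Have you run into that?
Try adjusting the learning rate, or debug to see where the NaN comes from.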
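One way to narrow down "where the NaN is" is to check each loss term separately rather than only the summed loss. A rough sketch, assuming you can get the per-term losses as a dict of name to number (or list of numbers), for example by converting each value to a Python float; `non_finite_losses` is a hypothetical helper and the loss names in the usage example are only illustrative:

```python
import math


def non_finite_losses(losses):
    """Return the names of loss entries whose value is NaN or infinite.

    `losses` maps a loss name to a number or to a list/tuple of numbers.
    """
    bad = []
    for name, value in losses.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        if any(not math.isfinite(float(v)) for v in values):
            bad.append(name)
    return bad


if __name__ == "__main__":
    example = {"loss_cls": 0.7, "loss_bbox": float("nan")}
    print(non_finite_losses(example))  # prints ['loss_bbox']
```

Knowing which term goes non-finite first usually tells you whether to look at the learning rate, the data, or a specific head.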
OK, got it. Thanks!
Traceback (most recent call last):
File "./tools/train.py", line 234, in <module>
main()
File "./tools/train.py", line 221, in main
meta=meta)
File "/disk/sde/wzr/mmm/mmdet/apis/train.py", line 208, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 59, in train_step
output = self.module.train_step(*inputs[0], **kwargs[0])
File "/disk/sde/wzr/mmm/mmdet/models/detectors/base.py", line 248, in train_step
losses = self(**data)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 140, in new_func
output = old_func(*new_args, **new_kwargs)
File "/disk/sde/wzr/mmm/mmdet/models/detectors/base.py", line 172, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/disk/sde/wzr/mmm/mmdet/models/detectors/two_stage.py", line 142, in forward_train
**kwargs)
File "/disk/sde/wzr/mmm/mmdet/models/dense_heads/base_dense_head.py", line 330, in forward_train
outs = self(x)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/disk/sde/wzr/mmm/mmdet/models/dense_heads/anchor_head.py", line 169, in forward
return multi_apply(self.forward_single, feats)
File "/disk/sde/wzr/mmm/mmdet/core/utils/misc.py", line 30, in multi_apply
return tuple(map(list, zip(*map_results)))
File "/disk/sde/wzr/mmm/mmdet/models/dense_heads/rpn_head.py", line 64, in forward_single
x = self.rpn_conv(x)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/disk/sdb/wzr/.conda/envs/wzr_env2/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
self.padding, self.dilation, self.groups)
TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not tuple
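Reading the traceback: `multi_apply(self.forward_single, feats)` calls `forward_single` once per element of `feats`, so `rpn_conv` expects each element to be a Tensor. The error says an element was itself a tuple, which suggests the features arrived nested one level deeper than this RPN head expects. That would be consistent with the mismatched `mmdet/models` reported in this thread, though the thread does not confirm the exact cause. A small sketch for inspecting such a structure before it reaches the head; `describe` is a hypothetical debugging helper, not part of mmdet:

```python
def describe(obj, max_items=4):
    """Return a short string describing how tuples/lists are nested in obj."""
    if isinstance(obj, (tuple, list)):
        inner = ", ".join(describe(o, max_items) for o in obj[:max_items])
        if len(obj) > max_items:
            inner += ", ..."
        return f"{type(obj).__name__}[{inner}]"
    # Tensor-like leaves: show the type name and shape if one is available.
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"{type(obj).__name__}{tuple(shape)}"
    return type(obj).__name__


if __name__ == "__main__":
    # Stand-in values; with real features the leaves would be tensors.
    print(describe(((1, 2), (3, 4))))  # prints tuple[tuple[int, int], tuple[int, int]]
```

Calling something like `print(describe(x))` just before `outs = self(x)` would show whether `x` is a flat tuple of tensors or a tuple of tuples.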