Some problems when training from Cityscapes to Foggy Cityscapes #25

Open
DanZhang123 opened this issue Nov 25, 2019 · 17 comments
@DanZhang123

Hello, when I train Cityscapes --> Foggy Cityscapes, I run into the problem of the RPN box-regression loss becoming NaN. What should I do?
Looking forward to your reply, thank you very much!

@Ronales commented Dec 23, 2019

@DanZhang123 Can you share a Google Drive link for Foggy Cityscapes? I have been looking for this dataset for a long time. Thanks in advance!

@DanZhang123 (Author)

@Ronales commented Dec 27, 2019

@DanZhang123 But I can't download the Foggy Cityscapes dataset; there is no download link.

@Ronales commented Dec 27, 2019

@DanZhang123 I get it. Can you tell me whether that is the full foggy version for detection? Thanks in advance!

@DanZhang123 (Author)

We use the processed Foggy Cityscapes version; the link is https://drive.google.com/file/d/1mA0L5-1U_Vo-S8-cv12QBmhgG9FFf6nf/view?usp=sharing

@bill987 commented Mar 25, 2020

@DanZhang123 I also run into the problem of the RPN box-regression loss becoming NaN. Have you solved it?

@edwardaaa

@bill987 I meet the same problem. I tried clipping the gradients during training, but the performance on the target data is poor. Have you solved the problem?
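For context, here is a minimal, self-contained sketch of the gradient clipping mentioned above (not this repo's training loop; the model, optimizer, and max_norm value are only placeholders):

```python
import torch
import torch.nn as nn

# Stand-ins for the detector and its optimizer; in the real script these
# would be the Faster R-CNN model and its SGD optimizer.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()  # stand-in for the total detection loss

optimizer.zero_grad()
loss.backward()
# Clip the global gradient norm before the optimizer step so a single
# bad batch cannot push the RPN regression loss to NaN.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
optimizer.step()
```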

@bill987 commented Apr 2, 2020

@edwardaaa
The issue was due to some bounding-box annotations with very small width and height.

This resolved it:

not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10

@edwardaaa

@bill987 Thanks a lot! I resolved it by setting flipped to False. Now I meet a new problem.

Net_D and Net_D_Pixel do not work; the loss for D does not go down.

Have you met the same problem? Thank you!

@Tomlk commented Jun 2, 2020

> @edwardaaa
> The issue was due to some bounding-box annotations with very small width and height.
>
> This resolved it:
>
> not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10

Where should I change this line? Please!

@edwardaaa

What I did was set cfg.TRAIN.USE_FLIPPED = False.
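For anyone looking for where that flag lives, here is a minimal sketch, assuming the py-faster-rcnn-style config module (lib/model/utils/config.py) this codebase inherits; the exact import path may differ:

```python
# Sketch only, assuming the usual faster-rcnn.pytorch config layout.
from model.utils.config import cfg

# Set this before the roidb is built (i.e. before the training script builds
# the dataset), otherwise the "Appending horizontally-flipped training
# examples..." step has already added the flipped entries.
cfg.TRAIN.USE_FLIPPED = False
```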

@Tomlk commented Jun 13, 2020

> What I did was set cfg.TRAIN.USE_FLIPPED = False.

Thank you. Another question: is the result for Cityscapes -> Foggy Cityscapes OK?

@edwardaaa

> > What I did was set cfg.TRAIN.USE_FLIPPED = False.
>
> Thank you. Another question: is the result for Cityscapes -> Foggy Cityscapes OK?

I tried PASCAL -> Clipart and the result is not OK; the mAP is lower than in the paper.

@Daipuwei

I have a question: why can't this code load gt_box?

```
Loaded dataset cityscape_trainval for training
Set proposal method: gt
Appending horizontally-flipped training examples...
cityscape_trainval gt roidb loaded from /home/yulin/daipuwei_code/DA_Detection-master/DA_Detection-master/data/cache/cityscape_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 3998 images...
after filtering, there are 3998 images...
Loaded dataset `cityscape_car_trainval` for training
Set proposal method: gt
Appending horizontally-flipped training examples...
cityscape_car_trainval gt roidb loaded from /home/yulin/daipuwei_code/DA_Detection-master/DA_Detection-master/data/cache/cityscape_car_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 11226 images...
after filtering, there are 11226 images...
3998 source roidb entries
11226 target roidb entries
Loading pretrained weights from /home/yulin/daipuwei_code/DA_Detection-master/DA_Detection-master/data/pretrained_model/resnet101_caffe.pth
tensor([])
tensor([])
Traceback (most recent call last):
  File "trainval_net_global_local.py", line 180, in <module>
    data_s = next(data_iter_s)
  File "/home/yulin/.conda/envs/dpw/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 264, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/yulin/.conda/envs/dpw/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 264, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/yulin/daipuwei_code/DA_Detection-master/DA_Detection-master/lib/roi_data_layer/roibatchLoader.py", line 192, in __getitem__
    not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10
IndexError: too many indices for tensor of dimension 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "trainval_net_global_local.py", line 183, in <module>
    data_s = next(data_iter_s)
  File "/home/yulin/.conda/envs/dpw/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 264, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/yulin/.conda/envs/dpw/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 264, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/yulin/daipuwei_code/DA_Detection-master/DA_Detection-master/lib/roi_data_layer/roibatchLoader.py", line 192, in __getitem__
    not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10
IndexError: too many indices for tensor of dimension 1
```
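The tensor([]) printed just before the traceback suggests some entries carry no ground-truth boxes, so gt_boxes is a 1-D empty tensor and gt_boxes[:, 2] fails. Here is a hedged sketch of one possible guard (the helper name and call site are illustrative, not the repo's code):

```python
import torch

def filter_small_boxes(gt_boxes, min_size=10):
    """Drop boxes narrower or shorter than min_size pixels.

    Skips the filter entirely for entries without annotations (the
    box-less images that show up as tensor([]) in the log above).
    """
    if gt_boxes.dim() < 2 or gt_boxes.size(0) == 0:
        return gt_boxes
    not_keep = ((gt_boxes[:, 2] - gt_boxes[:, 0]) < min_size) | \
               ((gt_boxes[:, 3] - gt_boxes[:, 1]) < min_size)
    keep = torch.nonzero(~not_keep).view(-1)
    return gt_boxes[keep]

# Empty input passes through untouched; the 5-pixel-wide box gets dropped.
print(filter_small_boxes(torch.tensor([])))
print(filter_small_boxes(torch.tensor([[0., 0., 5., 50., 1.],
                                       [0., 0., 50., 50., 1.]])))
```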

@Vaishnvi

> We use the processed Foggy Cityscapes version; the link is https://drive.google.com/file/d/1mA0L5-1U_Vo-S8-cv12QBmhgG9FFf6nf/view?usp=sharing

Hi @DanZhang123,

Thank you for the work and code; it's very helpful. Can you please confirm how many epochs the model was trained for the Cityscapes to Foggy Cityscapes DA? I couldn't find those details in the paper. Also, the paper mentions 70k iterations, but the code uses 100k iterations per epoch.

Thanks and regards,
Vaishnavi Khindkar

@fosterr commented Feb 14, 2021

Maybe somebody needs this info:

not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10

throws

TypeError: unsupported operand type(s) for |: 'int' and 'Tensor'

on my machine. I changed it to:

not_keep = ((gt_boxes[:,2] - gt_boxes[:,0]) < 10) | ((gt_boxes[:,3] - gt_boxes[:,1]) < 10)

The line is located in lib/roi_data_layer/roibatchLoader.py.

Trained for half an hour and haven't gotten a NaN yet.
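To make the precedence issue concrete: in Python, | binds tighter than <, so the unparenthesized line parses as a chained comparison w < (10 | h) < 10, which either raises an error (the exact exception depends on the PyTorch version) or computes the wrong mask. A small self-contained check with dummy boxes:

```python
import torch

# Dummy boxes in (x1, y1, x2, y2) format: a thin box, a flat box, a normal box.
gt_boxes = torch.tensor([[0., 0., 5., 50.],
                         [0., 0., 50., 5.],
                         [0., 0., 50., 50.]])

w = gt_boxes[:, 2] - gt_boxes[:, 0]
h = gt_boxes[:, 3] - gt_boxes[:, 1]

# Fully parenthesized version: True where the box is too small to keep.
not_keep = (w < 10) | (h < 10)
print(not_keep)             # tensor([ True,  True, False])
print(gt_boxes[~not_keep])  # only the third box survives
```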

@sysuzgg commented Nov 27, 2022

> Maybe somebody needs this info:
>
> not_keep = (gt_boxes[:,2] - gt_boxes[:,0]) < 10 | (gt_boxes[:,3] - gt_boxes[:,1]) < 10
>
> throws TypeError: unsupported operand type(s) for |: 'int' and 'Tensor' on my machine. I changed it to:
>
> not_keep = ((gt_boxes[:,2] - gt_boxes[:,0]) < 10) | ((gt_boxes[:,3] - gt_boxes[:,1]) < 10)
>
> The line is located in lib/roi_data_layer/roibatchLoader.py.
>
> Trained for half an hour and haven't gotten a NaN yet.

@fosterr After modifying it as in your answer, the test result is 0. Was your test result correct after training?
