
Different size of test images #16

Open
Farah189 opened this issue Dec 1, 2019 · 15 comments

@Farah189 commented Dec 1, 2019

Hi,
The trained network detects objects only in images that have the same dimensions as the training images. Can it be modified to detect objects in test images with different dimensions?

@javiribera (Owner)

This problem occurs because the linear layers are created during training depending on the size of the input training images (see Figure 3 of the paper). If you then input test images of a different size, the size of the feature maps before the linear layers won't match the number of neurons of the linear layers. I guess you could easily fix this by replacing the linear layers with global average pooling layers.
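
For illustration, a minimal PyTorch sketch of that fix (the class and layer names here are hypothetical, not the actual layers in unet_model.py): spatial GAP collapses the feature maps to one value per channel, so any linear layer left on top depends only on the channel count, not on the image resolution.

```python
import torch.nn as nn

class CountHead(nn.Module):
    """Size-agnostic head: GAP over space, then a linear layer on channels only."""
    def __init__(self, in_channels):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)     # (B, C, H, W) -> (B, C, 1, 1)
        self.fc = nn.Linear(in_channels, 1)    # size depends on C, not on H or W

    def forward(self, feats):                  # feats: (B, C, H, W), any H and W
        x = self.gap(feats).flatten(1)         # (B, C)
        return self.fc(x)                      # (B, 1) estimated count
```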

@Farah189 (Author) commented Dec 2, 2019

Yes, it gives an error about mismatched feature-map sizes before the linear layer. I will try the suggested pooling layer.
I will close this issue in a few days after trying it. Thanks

Farah189 closed this as completed Dec 6, 2019
@javiribera (Owner)

Could you post how you fixed this issue? The code may be helpful for other people.
Thank you,

@Farah189 (Author)

Using a global average pooling layer has eliminated the size-mismatch error at the FC layer. But it is not estimating the number of objects correctly, which will, in turn, affect the GMM. I have to experiment a bit more to figure it out.

@Farah189 (Author)

Hi,
I have tried GAP on the concatenation of the innermost and last layers. I have also tried it on only the last layer. But every time the estimated count comes out below 1.
I have also tried using GAP+ReLU instead of FC+SoftPlus, but it still gives the same problem. So the use of GAP only partially solves this problem, or maybe I am not using it correctly. This is just an update on the problem.
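
For reference, a rough sketch of the two heads being compared (assumed PyTorch; 256 is the training image side mentioned later in this thread, and the exact wiring in the real unet_model.py may differ):

```python
import torch.nn as nn

# Head with a fixed input size: flatten -> fully connected -> SoftPlus.
fc_head = nn.Sequential(
    nn.Flatten(),                 # (B, 1, 256, 256) -> (B, 65536)
    nn.Linear(256 * 256, 1),      # weight shape fixed at training time
    nn.Softplus(),                # keeps the estimated count non-negative
)

# Attempted size-agnostic replacement: GAP -> ReLU.
gap_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),      # (B, 1, H, W) -> (B, 1, 1, 1) for any H, W
    nn.Flatten(),                 # (B, 1)
    nn.ReLU(),                    # non-negative, but the output is the map's mean,
)                                 # which stays below 1 for a sparse probability map
```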

@javiribera (Owner)

None of the options you said you tried restricts the activation values to be below 1, so this sounds like a software bug. Please post your unet_model.py file so that we can see exactly which layers you modified.

javiribera reopened this Jan 13, 2020
@Farah189 (Author) commented Jan 13, 2020

GAP takes the average over the whole layer. So the average of the final 256x256 probability map will definitely be below 1, since there are only a few activated pixels (as per my understanding). Multiplying it by the number of pixels will still not solve the problem, because the estimate will then be way larger than it should be (see the toy example below).
I am still using the network from your first paper, the one based on U-Net. You can adjust the dimensions of the GAP according to the dimensions of your images.
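
A tiny numeric illustration of that point, on a synthetic map (the blob size and activation values are made up for the example):

```python
import torch

# A 256x256 "probability map" with 5 objects, each a 3x3 blob of activation 0.9.
pmap = torch.zeros(256, 256)
for r, c in [(10, 10), (50, 200), (120, 60), (200, 180), (240, 30)]:
    pmap[r:r+3, c:c+3] = 0.9

print(pmap.mean())                # ~0.0006: the GAP output, far below 1
print(pmap.mean() * 256 * 256)    # = pmap.sum() = 40.5, way above the true count of 5
```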

@Farah189 (Author)

Kindly let me know if you are looking into it. I am just a bit curious whether I was right about the problem that arises when using a GAP layer.

@javiribera (Owner)

So your current problem is that, with the unet_model.py you posted above, you always get an estimated count below 1?
Also, I'm going to need you to include all the information requested here: https://github.com/javiribera/locating-objects-without-bboxes#creating-an-issue

@Farah189 (Author)

I do not want you to reproduce the same thing, as it seems to be a waste of time. I just need your opinion on whether I am right about this:
"GAP takes the average over the whole layer. So the average of the final 256x256 probability map will definitely be below 1, since there are only a few activated pixels (as per my understanding)."
I studied it before using it. So correct me if my understanding is wrong, as you are more experienced.

@javiribera (Owner)

I think your intuition is correct. In fact, I remember I once tried something similar and experimented with a GAP layer connected to the probability map. The difference is that I also multiplied the result by a constant, which can be trained (see the sketch below). I did not get better results than in my paper, but you can experiment further with this since your problem is different (mismatching tensor sizes).

Also note that in the unet_model.py you attached above, your GAP is applied to the concatenation of both the probability map and the activations of the lateral network. It is hard to get an intuition for this case.
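
A sketch of that trainable-constant variant (assumed PyTorch; `ScaledGAPCount` is a hypothetical name, not code from this repository):

```python
import torch
import torch.nn as nn

class ScaledGAPCount(nn.Module):
    """Count estimate = (spatial mean of the probability map) * trainable constant."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(1.0))   # learned with the network

    def forward(self, prob_map):                       # prob_map: (B, 1, H, W)
        return prob_map.mean(dim=(2, 3)).squeeze(1) * self.scale   # (B,) counts
```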

@Farah189 (Author)

I have tried multiplying by a constant too. And I have also tried using only the probability layer. But then I read about GAP in detail to understand it, and learned that it will not solve the problem.
I will keep trying to solve this issue. Thanks a lot for your guidance. It helped me understand more about the network structure.

@danielyyt

Does the code now support different sizes of test images?

@Farah189 (Author)

> Does the code now support different sizes of test images?

I have not checked it again. But I think most networks with fully connected layers (as in this U-Net variant) have the limitation that training and test images must be the same size.

@javiribera (Owner)

> Does the code now support different sizes of test images?

No
