
Sometimes fails to meet pre_nms_topk with only two classes #23

Open
td-anne opened this issue Jul 8, 2023 · 4 comments

td-anne commented Jul 8, 2023

I am running DETA on a data set with only one real class (and one N/A class; in particular, various tensors are n by 2). Some long runs fail with RuntimeError: selected index k out of range at the line below:

pre_nms_inds.append(torch.topk(prop_logits_b.sigmoid() * lvl_mask, pre_nms_topk)[1])

If I understand correctly, topk should only fail like this when the requested k (here pre_nms_topk, which is 1000) is larger than the number of available elements; specifically, I believe this can only happen if the length of lvl_mask is less than 1000. (Perhaps my data augmentation has produced an unreasonably tiny image? I thought they were all rescaled.) I don't fully understand where in the code this occurs, but would it be harmful to trim the k supplied to topk down to the available length?
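
Something like the following is what I have in mind; just a sketch, assuming the masked scores are the (last) dimension that topk currently runs over:

scores = prop_logits_b.sigmoid() * lvl_mask
k = min(pre_nms_topk, scores.shape[-1])  # never request more elements than the tensor holds
pre_nms_inds.append(torch.topk(scores, k)[1])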


td-anne commented Jul 10, 2023

In fact I think I may know what has happened. First, I have set the input image rescaling to at most 800 for the longest side (1333 overflows my GPU RAM when images need to be padded out to 1333x1333). Second, my image augmentation (using albumentations.BBoxSafeRandomCrop) may, rarely, produce one-pixel-wide images. If these are rescaled to produce 800x1 images, then there aren't more than 800 values in lvl_mask. Does this sound plausible?
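
A quick count seems to support this. As a sketch, assuming the usual Deformable-DETR-style feature strides of 8, 16, 32 and 64 (I have not checked DETA's exact strides):

import math

def total_locations(h, w, strides=(8, 16, 32, 64)):
    # one proposal per feature-map location, summed over all levels
    return sum(math.ceil(h / s) * math.ceil(w / s) for s in strides)

print(total_locations(800, 1))    # 100 + 50 + 25 + 13 = 188, far below pre_nms_topk = 1000
print(total_locations(800, 800))  # 13294, plenty

So a degenerate one-pixel-wide crop would easily push the total number of proposals below 1000.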

jozhang97 (Owner) commented

Yes, if you have fewer classes it makes sense to have fewer predictions, so it should be fine to reduce the class-agnostic topk. We tried a couple of values and did not see much of a difference.

Your 800x1 images could also be the problem, though there may be more proposals than you expect since we use multi-level features.

You can also try out checkpointing to avoid GPU OOM.
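
For the checkpointing, something along these lines; this is only an illustrative sketch with placeholder names (layers, src), not a drop-in patch for DETA:

from torch.utils.checkpoint import checkpoint

def forward_with_checkpointing(layers, src):
    # recompute each layer's activations during the backward pass instead of storing them,
    # trading extra compute for lower peak GPU memory
    out = src
    for layer in layers:
        out = checkpoint(layer, out, use_reentrant=False)  # use_reentrant=False is recommended on recent PyTorch
    return out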


td-anne commented Sep 18, 2023

The 800x1 images are, obviously, not of any use, so I don't care what values get returned as long as it doesn't crash. The checkpointing is interesting, though: could the model cope with 1920 by 1080 images? Or does that require changing the structure somewhat? My raw inputs are all 1920 by 1080 and I'm looking for broken wires, which might disappear when downscaled. For the moment I'm more interested in accuracy than speed.

jozhang97 (Owner) commented

I see, that makes sense for high resolution. We typically use larger images during pre-training, so I don't think 1920x1080 should be a problem.
