PatchInferer with AvgMerger and filter_fn leads to NaNs #7898
Replies: 4 comments 2 replies
-
Hi @nicholas-greig, could you please share a small piece of code so that I can reproduce the issue? Thanks.
-
@KumoLiu bump
-
Hi @nicholas-greig, sorry for the late response. After taking a look at your code, I guess the problem is related to what you set at MONAI/monai/inferers/splitter.py, line 105 (commit 15d0771). Hope it helps, thanks.
-
Describe the bug
On current master, when using the PatchInferer class with an AvgMerger (the default Merger class) and a filter_fn, the counts stay zero everywhere the filter_fn filters out a region. When AvgMerger.finalize() is called, the self.values attribute of AvgMerger is divided in place by the self.counts tensor. This is a problem because self.counts is initialised to zero, and division by zero produces NaNs. So everywhere the filter_fn successfully filters out a region, we get NaN outputs.
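The failure mode above can be reproduced in isolation. This is a minimal sketch (the tensor shapes and patch coverage are hypothetical, not taken from AvgMerger's actual internals) showing how dividing an accumulator by zero-initialised counts yields NaNs at uncovered positions:

```python
import torch

# Accumulators as AvgMerger keeps them: running sums and per-position counts.
values = torch.zeros(4)
counts = torch.zeros(4)

# Simulate patches contributing only to the first two positions;
# the last two positions were filtered out by a filter_fn.
values[:2] += torch.tensor([1.0, 2.0])
counts[:2] += 1.0

# finalize() divides values by counts; 0/0 in floating point is NaN.
out = values / counts

print(out)  # positions with counts == 0 come out as NaN
```

Running this, `out[2:]` is NaN while `out[:2]` holds the correct averages, matching the behaviour described in the report.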
A quick in-place assignment to counts (setting the zero entries to 1, for example) would turn these values into zeros after the in-place division, but if the output is supposed to be real-valued/continuous, it might be better to overwrite these positions in place with the smallest representable value (using
torch.finfo(self.values.dtype).min
or something similar). Monkey-patching the outputs of an Inferer isn't a good option either, since a network can produce NaNs due to exploding weights or overflow during training, and masking that by overwriting NaNs with zero would merely obfuscate the problem.
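The suggested fix can be sketched as a masked division. This is not AvgMerger's actual code, just one way to implement the idea: average only where counts are positive, and fill never-covered positions with the smallest representable value as an explicit sentinel instead of NaN or a silent zero:

```python
import torch

def finalize_masked(values: torch.Tensor, counts: torch.Tensor) -> torch.Tensor:
    """Average `values` by `counts`, filling uncovered positions with a sentinel."""
    sentinel = torch.finfo(values.dtype).min
    # clamp avoids 0/0 inside torch.where's eagerly-evaluated branch;
    # those positions are then replaced by the sentinel anyway.
    return torch.where(
        counts > 0,
        values / counts.clamp(min=1.0),
        torch.full_like(values, sentinel),
    )

values = torch.tensor([3.0, 0.0])
counts = torch.tensor([3.0, 0.0])
print(finalize_masked(values, counts))  # covered -> average, uncovered -> finfo min
```

The sentinel keeps filtered regions distinguishable from genuine zero outputs, which matters for real-valued/continuous predictions, while genuine NaNs produced by the network itself still propagate and remain visible.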