Clipping thresholds to input range + 1? #978
Comments
See this related issue: #875. It doesn't seem like the clipping issue itself was ever resolved.
I actually faced the same issue today when trying to build the ResNet50 from the examples repository, so the clipping issue is still present and has to be worked around in the RoundAndClipThresholds transformation.
Some new insights on this one:
So I started digging a bit and at first nothing seemed to make any sense. See here, for example, close to where the Mobilenet test fails after removing the +1 from the range: finn/src/finn/custom_op/fpgadataflow/thresholding.py, lines 136 to 137 at 766e9fb.
thresholds and threshold_tensor should contain exactly the same values and differ only in shape or by repeating those values, i.e., get_hw_compatible_threshold_tensor can be seen as a fancy reshape operation. But they actually differ in the values they contain, in particular in their maximum and minimum, i.e., in the range of values they cover. Next, the minimized DataType gets derived from thresholds but is then tested against threshold_tensor, causing the assertion to fire as the two ranges do not align. However, I do not see anything inherently wrong with the minimize_accumulator_width method itself (though it seems slightly odd to use get_hw_compatible_threshold_tensor here at all, but apparently this helps in spotting these weird types of errors).
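To make the failure mode concrete, here is a minimal sketch of the pattern as I read it - the names and the datatype derivation only approximate the FINN code, and the numbers are hypothetical:

```python
import numpy as np
from qonnx.core.datatype import DataType

# the minimized datatype is derived from one view of the thresholds...
thresholds = np.array([-100.0, 0.0, 100.0], dtype=np.float32)
tdt = DataType.get_smallest_possible(thresholds.min())  # -> INT8

# ...but validated against a second view that is supposed to hold the
# exact same values; if its range drifted (here to -129/128), the
# assertion fires with precisely the message quoted further below.
threshold_tensor = np.array([-129.0, 0.0, 128.0])
assert np.vectorize(tdt.allowed)(threshold_tensor).all(), (
    f"Thresholds can't be expressed with type {tdt.name}"
)
```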
Next step was to figure out why these two tensors end up with different values/value ranges. A first hint was the "container" data type of these tensors: normally FINN stores most parameter tensors (even the low bit-width, quantized integer ones) in float32 containers. However, somehow the thresholds initializer ended up as float64 (see https://github.com/fastmachinelearning/qonnx/blob/c5bd87f05e0b6ac6bc578d38161e7dceec1077d9/src/qonnx/util/basic.py#L135). Why should we care about this for supposedly integer thresholds? Well, large integers cannot be exactly represented in floating-point, and the largest integer (of a given bit-width) which can be represented exactly depends on the floating-point precision. Take, for example, the largest signed 32-bit integer, 2^31 - 1 = 2147483647: float64 represents it exactly, while float32 rounds it up to 2^31 = 2147483648. And this led me back to the RoundAndClipThresholds transformation.
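A quick numpy check (my own illustration, not from the original comment) shows the off-by-one appearing at exactly this boundary:

```python
import numpy as np

# float32 has a 24-bit significand, so not every integer above 2**24 is
# exactly representable; the largest signed 32-bit integer rounds up to
# the nearest representable value, which is 2**31.
print(int(np.float32(2**31 - 1)))  # 2147483648, off by one
# float64 has a 53-bit significand and represents every int32 exactly.
print(int(np.float64(2**31 - 1)))  # 2147483647
```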
For now, to not interfere with the rest of FINN and qonnx, I propose to explicitly cast the thresholds after rounding and clipping back to float32. It could also be an option to consider float64 or int64 (avoiding the representability issue) as a "container" data type for (large) integer thresholds, but this would require carefully checking all parts of FINN and qonnx that currently assume float32 here. I will soon open up a PR with a reworked RoundAndClipThresholds transformation.
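A rough sketch of that direction (my own paraphrase, not the actual PR code; idtype stands for the qonnx DataType of the input tensor):

```python
import numpy as np

def round_and_clip_thresholds(T: np.ndarray, idtype) -> np.ndarray:
    # round thresholds up to the nearest integer
    T_new = np.ceil(T)
    # clip to the exact input range, not (min - 1, max + 1)
    T_new = np.clip(T_new, idtype.min(), idtype.max())
    # cast back to FINN's usual float32 container so no float64
    # (and thus no representability mismatch) leaks into later passes
    return T_new.astype(np.float32)
```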
I am facing the

AssertionError: Thresholds can't be expressed with type INT8

(the assertion is here in the weight file generation of Thresholding_Batch), which I cannot really explain; at least I cannot see how I end up with a threshold of 128 at the upper bound, except for these lines in the RoundAndClipThresholds transformation: finn/src/finn/transformation/streamline/round_thresholds.py, lines 59 to 66 at f2424e7.

I wonder whether clipping thresholds to the range (idtype.min() - 1, idtype.max() + 1), under the condition that at least one threshold is outside of this range, still guarantees that these thresholds are outside of the range which can be represented with idtype. Why is it clipped to one below/above the valid range? Is there any reasoning which I fail to see? Maybe clipping to (idtype.min(), idtype.max()) would be correct?
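To spell out the suspicion with concrete numbers (my own illustration, assuming qonnx's DataType API): for an INT8 input the valid range is [-128, 127], so clipping to one below/above that range produces -129 and 128, neither of which INT8 can express:

```python
import numpy as np
from qonnx.core.datatype import DataType

idt = DataType["INT8"]                    # valid input range [-128, 127]
T = np.array([-300.0, 0.0, 300.0])        # thresholds outside that range
T_clip = np.clip(np.ceil(T), idt.min() - 1, idt.max() + 1)
print(T_clip)                             # [-129.    0.  128.]
# -129 and 128 are not representable as INT8, so tagging the clipped
# thresholds with the input datatype later trips the assertion above
print(np.vectorize(idt.allowed)(T_clip))  # [False  True False]
```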