2 potential bugs related to `FloatQuant` #1126

nghielme · 2024-12-11T23:21:59Z

The issue is to share with you 2 potential issues related to FloatQuant implementation:

One is related to the definition of the minimum internal scale, used to quantize the mantissa, here
I think the minimum should be 1. - self.exponent_bit_width() - self.exponent_bias() - self.mantissa_bit_width()) or a different expression but I think that self.exponent_bit_width() should be part of the expression. This can be deduced also from the way the actual internal scale is computed, here. The first part floor_ste(torch.log2(torch.abs(x) + eps)) represents in fact the exponent of the float x.
I noticed for example that 0.3125 is quantized to 0.25 using a quantizer with 4 exponent bits, 4 mantissa bits, 0 exponent bias.
The other is related to the following expression, here
I think the correct one should be -x < -max_value in order to identify the values that exceed max_value from the negative side.

I am writing a similar quantizer so any comment on the 2 listed points can also be useful for me to write it properly

The text was updated successfully, but these errors were encountered:

nickfraser · 2024-12-16T15:10:15Z

Thank you for sharing this issue - I'm currently investigating (unfortunately, while juggling some other tasks). Early days my side, but since you said you're writing similar quantization functions, I'll give you some initial feedback anyway. Please take these comments with "a grain of salt".

I noticed for example that 0.3125 is quantized to 0.25 using a quantizer with 4 exponent bits, 4 mantissa bits, 0 exponent bias.

Thanks, I am able to reproduce this and agree this is unexpected behaviour. What I don't understand is why our other tests (public and internal) haven't caught this. I plan to start addressing this by first creating a more extensive test suite.

I think the minimum should be 1. - self.exponent_bit_width() - self.exponent_bias() - self.mantissa_bit_width()) or a different expression but I think that self.exponent_bit_width() should be part of the expression.

I am not convinced of this yet, but maybe I'll change my mind as my investigation continues. ;)

nickfraser · 2024-12-16T15:17:42Z

I think the correct one should be -x < -max_value in order to identify the values that exceed max_value from the negative side.

I'm not following your logic here. The goal if the mask n_max_val_mask is identify values of x that are less than the highest magnitude representable negative value in the float representation. Your expression doesn't do this - your mask will be identical to p_max_val_mask. However, I do believe our expression should actually be x < min_value to capture the unsigned case.

nghielme · 2024-12-18T09:05:02Z

Hi Nick,
Thanks for you comments.

I think the minimum should be 1. - self.exponent_bit_width() - self.exponent_bias() - self.mantissa_bit_width()) or a different expression but I think that self.exponent_bit_width() should be part of the expression.

I am not convinced of this yet, but maybe I'll change my mind as my investigation continues. ;)

I agree, in the meantime I realised that the minimum value that exponent can represent is - self.exponent_bias().

nickfraser · 2024-12-19T16:48:29Z

I'm going to contradict myself and say that I believe our minifloat quantizers are behaving as expected.

I noticed for example that 0.3125 is quantized to 0.25 using a quantizer with 4 exponent bits, 4 mantissa bits, 0 exponent bias.

I don't believe 0.3125 can be represented with that format, so value of 0.25 seems reasonable here.

I have generated quite an extensive test to "kick the tyres" of our minifloat formats in #1136. @nghielme, if you still unsure, could you check the logic here and tell me if I'm mistaken. If you think there are values that can be represented by a format but not by Brevitas, can you specify the values of the mantissa and exponent in an unsigned format? I believe an extra mantissa bit, or an exponent bias of 1 would be required to represent that value.

the minimum value that exponent can represent is - self.exponent_bias().

The minimum effective value of the exponent is 1. - self.exponent_bias() as an exponent of 0 simply marks that the mantissa is denormalised. I believe this is exactly the source of the confusion.

nickfraser · 2024-12-20T14:48:20Z

Closing - please reopen or create a new issue if you still think this is an issue.

nghielme added the bug Something isn't working label Dec 11, 2024

nickfraser mentioned this issue Dec 16, 2024

Fix (float/clamp): Bugfix when unsigned #1132

Merged

nickfraser mentioned this issue Dec 19, 2024

test (core/float): Enhanced testing of minifloat formats #1136

Merged

8 tasks

nickfraser closed this as completed Dec 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2 potential bugs related to `FloatQuant` #1126

2 potential bugs related to `FloatQuant` #1126

nghielme commented Dec 11, 2024 •

edited

Loading

nickfraser commented Dec 16, 2024

nickfraser commented Dec 16, 2024 •

edited

Loading

nghielme commented Dec 18, 2024

nickfraser commented Dec 19, 2024

nickfraser commented Dec 20, 2024

2 potential bugs related to FloatQuant #1126

2 potential bugs related to FloatQuant #1126

Comments

nghielme commented Dec 11, 2024 • edited Loading

nickfraser commented Dec 16, 2024

nickfraser commented Dec 16, 2024 • edited Loading

nghielme commented Dec 18, 2024

nickfraser commented Dec 19, 2024

nickfraser commented Dec 20, 2024

2 potential bugs related to `FloatQuant` #1126

2 potential bugs related to `FloatQuant` #1126

nghielme commented Dec 11, 2024 •

edited

Loading

nickfraser commented Dec 16, 2024 •

edited

Loading