-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: improved quantization error #368
base: fix/improve-quantization-error
Are you sure you want to change the base?
fix: improved quantization error #368
Conversation
Bug Fixes
ContributorsCommit-Lint commandsYou can trigger Commit-Lint actions by commenting on this PR:
|
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
const float max_range_value_flt = float((1 << k_segment_range_reduction_num_bits_per_component) - 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: now 254 instead of 255
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, does it mean we can never have a 255 value?
|
||
const rtm::vector4f half = rtm::vector_set(0.5F); | ||
const rtm::vector4f half_neg = rtm::vector_set(-0.5F); | ||
const rtm::vector4f half_range_value = rtm::vector_set(float((1 << (k_segment_range_reduction_num_bits_per_component - 1)) - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: half_range_value = 127
// To get the best accuracy, we pick the values closest to the true minimum that is slightly lower, and true maximum that is slightly higher. | ||
// This is to ensure that we encompass the lowest and highest values even after quantization. | ||
const rtm::vector4f range_min = range.get_min(); | ||
const rtm::vector4f quantized_min0 = rtm::vector_clamp(rtm::vector_add(rtm::vector_floor(rtm::vector_mul_add(range_min, max_range_value, half)), half_range_value), zero, max_range_value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: range_min is already normalized in the [-0.5, 0.5] range here, we convert it back into [0.0, 254.0] range after symmetric rounding
const rtm::vector4f quantized_min1 = rtm::vector_max(rtm::vector_sub(quantized_min0, one), zero); | ||
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
const rtm::vector4f padded_range_min0 = rtm::vector_mul_add(quantized_min0, inv_max_range_value, half_neg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: convert back into [0.0, 1.0] range and offset to [-0.5, 0.5] range
|
||
// Finally, we pad the range further so the midpoint has minimal quantization error. | ||
const rtm::vector4f quantized_extent = rtm::vector_sub(quantized_max, quantized_min); | ||
union MaskVector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll create a task for myself to add and/or/xor support to masks in RTM so we can clean this up later
const float max_value_ = rtm::scalar_safe_to_float(1 << num_bits); | ||
limit = rtm::vector_set(max_value_ - 1.0F); | ||
mid_compress = rtm::vector_set(0.5F * max_value_); | ||
mid_decompress = rtm::vector_set((0.5F * max_value_) - 0.5F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: for 8 bits, limit = 255, mid_compress = 128, mid_decompress = 127.5
ACL_ASSERT(num_bits < 31, "Attempting to decay on too many bits"); | ||
|
||
const float max_value_ = rtm::scalar_safe_to_float((1 << num_bits) - 1); | ||
|
||
#endif | ||
|
||
max_value = rtm::vector_set(max_value_); | ||
inv_max_value = rtm::vector_set(1.0F / max_value_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: for 8 bits, max_value = 256, inv_max_value = 1/256
|
||
ACL_ASSERT(vector_all_greater_equal(value, rtm::vector_set(-0.5F)) && vector_all_less_equal(value, rtm::vector_set(0.5F)), "Expected normalized signed input value: %f, %f, %f, %f", (float)vector_get_x(value), (float)vector_get_y(value), (float)vector_get_z(value), (float)vector_get_w(value)); | ||
|
||
const vector4f packed_value = vector_min(vector_add(vector_floor(vector_mul(value, scales.max_value)), scales.mid_compress), scales.limit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: If our input value is -0.5 and we use 8 bits, we have:
packed_value = min(floor(-0.5 * 256) + 128, 255)
packed_value = min(floor(-128) + 128, 255)
packed_value = min(0, 255)
If our input is 0.5, we have:
packed_value = min(floor(0.5 * 256) + 128, 255)
packed_value = min(floor(128) + 128, 255)
packed_value = min(256, 255)
If our input is [0.496, 0.5], we have:
packed_value = min(floor(0.496 * 256) + 128, 255)
packed_value = min(floor(127) + 128, 255)
packed_value = min(255, 255)
It works similarly for [-0.5, -0.496]
ACL_ASSERT(vector_all_greater_equal(value, rtm::vector_set(-0.5F)) && vector_all_less_equal(value, rtm::vector_set(0.5F)), "Expected normalized signed input value: %f, %f, %f, %f", (float)vector_get_x(value), (float)vector_get_y(value), (float)vector_get_z(value), (float)vector_get_w(value)); | ||
|
||
const vector4f packed_value = vector_min(vector_add(vector_floor(vector_mul(value, scales.max_value)), scales.mid_compress), scales.limit); | ||
const vector4f decayed_value = vector_mul(vector_sub(packed_value, scales.mid_decompress), scales.inv_max_value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: for packed_value = 0, we have:
decayed_value = (0 - 127.5) / 256 = -0.498
For packed_value = 255:
decayed_value = (255 - 127.5) / 256 = 0.498
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
rotation_as_vec = _mm_mul_ps(_mm_sub_ps(xyzf, _mm_set_ps1(32767.5F)), _mm_set_ps1(1.0F / 65535.0F)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 8 bit quantization we use 1/254, but for 16 bit we use 1/65535? Why the off by 1 for 8 bits but not 16?
const float max_value = rtm::scalar_safe_to_float(num_values); | ||
const float inv_max_value = 1.0F / max_value; | ||
const float mid_value = (0.5F * max_value) - 0.5F; | ||
return (input - mid_value) * inv_max_value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here for 8 bits we'd use 256 and for 16 bits 65536 which differs from the code earlier. Why?
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
: max_value(1.0F / float(1 << num_bits_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly we'd use 256 and 65536 for 8/16 bits respectively which differs.
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
return _mm_mul_ps(_mm_sub_ps(value, _mm_set_ps1(32767.5F)), _mm_set_ps1(1.0F / 65535.0F)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here using 65535 for 16 bits
|
||
#ifdef ACL_PRECISION_BOOST | ||
|
||
return _mm_mul_ps(value, _mm_set_ps1(1.0F / 254.0F)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here 254 for 8 bits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like great stuff!
I'm just unclear as to why sometimes we use 256, 255, and 254 sometimes to decompress a value. Can you elaborate a bit?
Thanks!
Yeah, I can clear that up. Most quantization follows the (start, end) -> (-0.5 + max_error, 0.5 - max_error) style. I removed quantization that ACL doesn't use as an added bonus, and to conserve effort. The vector3_u48 family of functions is used to compress constant channels within segments. I opted to preserve the (start, end) -> (-0.5, 0.5) style there, which is what the precise_endpoints naming convention signifies. Now that I think about it, perhaps the new style would work better. I was more fixated on the bone ranges. The vector3_u24 family of functions compresses bone ranges within segments. This is the trickiest part. I wanted to preserve endpoints as well as I could, knowing that the quantization within segments introduces error there. I also wanted to compress the midpoint of ranges instead of their mins, to take maximum advantage of the signed normalized range(-0.5, 0.5). In order for this to work for any subrange, the requirements get a bit stricter. (start, middle, end) -> (-0.5, 0.0, 0.5) or (0.0, 0.5, 1.0). I sacrificed a value in the bitmask to achieve this. A similar result would be achieved by using a sign bit instead, but IMO this is simpler. Naming convention is precise_endpoints_midpoint. Stepping through its unit test should help explain the process. In any case, it didn't work as well as I'd hoped, so if you take a closer look, hopefully you find a bug. FWIW I considered eliminating segment-level quantization(raw constants and ranges) to see what would happen, but the toughest obstacle was the assumption that segment constants and segment ranges are always the same size. |
Awesome, thanks! |
Feel free to link this failed attempt to #352
I doubt it can compete with status quo unless clip ranges are manipulated further, but once you start doing that, you're just introducing floating point error somewhere else.