Use float32 instead of int64 for bucket-keys? #14

Open
speezepearson opened this issue May 20, 2020 · 1 comment

speezepearson commented May 20, 2020

Currently, doubles are "packed" into int64s by doing 3 things:

  • typecast the double's bytes to an int64
  • mask out most of the mantissa
  • bit-shift away all of the 0s that were just masked out

This turns the double into a fairly small int64 (IIUC, "small" matters because Protobuf's varint encoding uses fewer bytes for smaller values).
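For concreteness, here is a rough Python sketch of the three packing steps, plus a helper that counts varint bytes. It is not the project's actual code: `pack_double`, `varint_len`, and especially `KEPT_MANTISSA_BITS = 10` are assumptions made for illustration, since the number of mantissa bits kept isn't stated above.

```python
import struct

# Assumption: how many mantissa bits survive the mask is not stated in this
# issue; 10 is a hypothetical value chosen only for illustration.
KEPT_MANTISSA_BITS = 10
DROPPED_BITS = 52 - KEPT_MANTISSA_BITS  # a double has 52 explicit mantissa bits


def pack_double(x: float) -> int:
    """Reinterpret x as an int64, mask the low mantissa bits, shift them away."""
    # Step 1: typecast the double's bytes to a signed 64-bit integer.
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Step 2: mask out the low (least significant) mantissa bits.
    mask = ~((1 << DROPPED_BITS) - 1)
    # Step 3: shift away the zeros that the mask just produced.
    return (bits & mask) >> DROPPED_BITS


def varint_len(n: int) -> int:
    """Bytes needed to encode a non-negative integer as a Protobuf varint."""
    length = 1
    while n >= 0x80:  # each varint byte carries 7 payload bits
        n >>= 7
        length += 1
    return length


for x in (1e-6, 1.0, 3.14, 1e6):
    key = pack_double(x)
    print(x, key, varint_len(key))
```

Under that assumed bit count, each of the sample values packs to a key that needs 3 varint bytes.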

Here's another approach, which might be preferable:

  • convert the double to a float
  • mask out most of the mantissa
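A minimal sketch of what this alternative could look like in Python. The function name `truncate_to_float32_key` and the choice of 10 kept mantissa bits are hypothetical; the only point is that the key stays a float whose low mantissa bits are zeroed.

```python
import struct

# Assumption: the number of mantissa bits to keep is illustrative only.
KEPT_MANTISSA_BITS = 10
DROPPED_BITS = 23 - KEPT_MANTISSA_BITS  # a float32 has 23 explicit mantissa bits


def truncate_to_float32_key(x: float) -> float:
    """Convert x to float32, then zero the low mantissa bits."""
    # Packing with "<f" rounds x to the nearest float32
    # (and raises OverflowError if x is too large for float32).
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Mask out the low mantissa bits; sign and exponent are untouched.
    bits &= ~((1 << DROPPED_BITS) - 1) & 0xFFFFFFFF
    # Reinterpret the masked bits as a float32 again.
    (key,) = struct.unpack("<f", struct.pack("<I", bits))
    return key


for x in (1.0, 3.14, 1e6):
    print(x, truncate_to_float32_key(x))
```

The resulting key is an ordinary number close to the input (3.14 becomes 3.138671875 here), so it prints readably and always fits in 4 bytes.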

The main differences between the two approaches are:

  • Range of values expressible. float has only 8 exponent bits, not 11, so it can only represent magnitudes in roughly the range 2^(±128) instead of 2^(±1024); values outside that range overflow or collapse to 0. This seems inconsequential.
  • Size of representation. floats take 4 bytes. Varints usually take fewer, but in practice all reasonably-sized doubles (other than exactly 0) pack to ints between 2^14 and 2^21, which take 3 bytes as varints (demo). So using floats instead would only make most bucket keys 33% larger (4 bytes instead of 3).
  • Simplicity. Storing data in its natural datatype is good. The type makes it clear what's being stored, and the strings produced by languages' built-in stringification methods aren't opaque.
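To make the range and size points a bit more tangible, here is a small Python check. `fits_float32` is a hypothetical helper, not anything from the project; it just reports whether a double survives conversion to float32 without overflowing or being flushed to zero.

```python
import struct


def fits_float32(x: float) -> bool:
    """True if x converts to float32 without overflowing or becoming zero."""
    try:
        (y,) = struct.unpack("<f", struct.pack("<f", x))
    except OverflowError:
        # struct refuses to pack finite values too large for float32.
        return False
    # Tiny nonzero inputs silently underflow to 0.0, which loses the value.
    return y != 0.0 or x == 0.0


for x in (1e-300, 1e-30, 1.0, 1e30, 1e300):
    print(x, fits_float32(x))

# Size trade-off described above: a 4-byte float versus a 3-byte varint.
print(f"size increase: {(4 - 3) / 3:.0%}")
```

Only the two extreme values (1e-300 and 1e300) fail the check, which matches the guess that the narrower range rarely matters in practice.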

(spawned by discussion with @orborde)