Future specification on binary format #27
Comments
A similar question came up here: jetify-com/typeid-go#5. So definitely open to having the spec define a formal binary representation. Did you already have a particular binary representation in mind? |
I'm still experimenting with it. Currently, I have an 8-bit length indicator for the prefix, followed by the raw ASCII of the prefix, followed by the normal encoding of the UUID. It's not the most compact way of doing so, e.g. the length only needs 6 bits and each letter only 5 bits. I'm happy with what I'm doing now for my particular use case (since I don't need to squeeze out every bit of space), but it may not be very suitable as a standard defined in a spec. Another possibility (if we use 5 bits to encode each letter) is to stop encoding the length and instead fuse a separator indicator with the last letter, since 5 bits give 32 values and only 26 are used for letters, leaving 32 - 26 = 6 unused values. |
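For readers who want to see the length-prefixed layout described above, here is a minimal sketch in Python. It is not the commenter's actual implementation: the function names are hypothetical, and it assumes "the normal encoding of the UUID" means the 16 raw big-endian bytes. The 63-character limit on the prefix is taken from the remark that the length only needs 6 bits.

```python
import uuid

MAX_PREFIX_LEN = 63  # assumption: prefix length fits in 6 bits


def encode_typeid_binary(prefix: str, uid: uuid.UUID) -> bytes:
    """Hypothetical layout: 1 length byte + ASCII prefix + 16 UUID bytes."""
    raw_prefix = prefix.encode("ascii")  # raises if the prefix is not ASCII
    if len(raw_prefix) > MAX_PREFIX_LEN:
        raise ValueError("prefix too long")
    # uuid.UUID.bytes is the 16-byte big-endian representation
    return bytes([len(raw_prefix)]) + raw_prefix + uid.bytes


def decode_typeid_binary(data: bytes) -> tuple[str, uuid.UUID]:
    """Inverse of encode_typeid_binary; rejects inputs of the wrong size."""
    if not data:
        raise ValueError("empty input")
    prefix_len = data[0]
    if len(data) != 1 + prefix_len + 16:
        raise ValueError("unexpected length")
    prefix = data[1:1 + prefix_len].decode("ascii")
    uid = uuid.UUID(bytes=data[1 + prefix_len:])
    return prefix, uid
```

With a 4-letter prefix this layout takes 1 + 4 + 16 = 21 bytes, which shows the overhead mentioned above: a full byte for the length and a full byte per letter.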
For the spec I think we need to answer what we're trying to optimize for. Things running through my mind include:
Do you have any thoughts on these? |
Tagging people who have implemented typeid libraries in other languages: @cbuctok @sloanelybutsurely @fxlae @softprops @faustbrian @akhundMurad @broothie @conradludgate @johnnynotsolucky @Frizlab @ongteckwu @tensorush Do you have a need for a binary encoding specification? If so, what properties do you think are important for your use cases? |
For a binary encoding, I would expect to already have a typed binary schema. In that case, I'd personally use a big-endian 16-byte UUID encoding rather than create anything bespoke. Since my binary schema would already be typed, I would forfeit the type prefix. For an untyped binary format like CBOR, I could imagine a custom encoding though. CBOR has no byte alignment properties, so I would perhaps encode the prefix string and the 16 bytes as a CBOR array. |
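The CBOR idea above can be sketched as a two-element array holding the prefix as a text string and the UUID as a 16-byte byte string. This is only an illustration under stated assumptions: the helper names are hypothetical, no CBOR library is used, and the header bytes are written by hand following the major types in RFC 8949 (2 = byte string, 3 = text string, 4 = array).

```python
import uuid


def cbor_head(major: int, length: int) -> bytes:
    """Encode a CBOR header for the given major type and a length below 256."""
    if length < 24:
        # short lengths are stored directly in the low 5 bits
        return bytes([(major << 5) | length])
    if length < 256:
        # additional info 24 means a 1-byte length follows
        return bytes([(major << 5) | 24, length])
    raise ValueError("length too large for this sketch")


def typeid_to_cbor(prefix: str, uid: uuid.UUID) -> bytes:
    """Hypothetical encoding: CBOR array [prefix text, 16-byte UUID]."""
    text = prefix.encode("ascii")
    return (
        cbor_head(4, 2)              # array of 2 items
        + cbor_head(3, len(text))    # text string header
        + text
        + cbor_head(2, 16)           # byte string header for the UUID
        + uid.bytes                  # 16 big-endian bytes
    )
```

In a schema where the type is already known, the same value would shrink to just `uid.bytes`, which is the first option described in the comment.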
I wonder if we're better off not defining a binary encoding as part of the spec and leaving it up to the use case. The examples @conradludgate gives make me think the ideal encoding is use-case dependent. If you can already guarantee the type in your binary format, you can completely elide the prefix, and re-introduce it when decoding the binary representation. If you want to encode the type, you might be better off using the representation suggested by the format you're using (i.e. |
I created the lib just "because I could" and am not using it, so I'd be happy with whatever binary encoding specs you guys will come up with 🙂 |
IMO, it would be better to define several possible encoding options for a variety of use cases. |
Is there any plan to add into the specification how to convert a typeid to binary format?
In another personal project that uses typeid, I will need to serialise the ids. So far I'm implementing my own serialisation just for that project, but if a formal specification is defined, I can include it in the Haskell implementation as well.