Skip to content
This repository has been archived by the owner on Jul 14, 2021. It is now read-only.

Allow overriding the type unpacked from MSGPACK_OBJECT_RAW #6

Closed
bpot opened this issue Sep 10, 2010 · 12 comments
Closed

Allow overriding the type unpacked from MSGPACK_OBJECT_RAW #6

bpot opened this issue Sep 10, 2010 · 12 comments

Comments

@bpot
Copy link
Contributor

bpot commented Sep 10, 2010

Having String as the unpacking type for MSGPACK_OBJECT_RAW seems to make it impossible to correctly pack and unpack binary data. Probably because of the coercion that v8 does to turn binary string into utf8/ascii. Switching it to return a Buffer object (http://gist.github.com/573317) fixed the issue.

If you run this script on a binary file you can see how it fails: http://gist.github.com/573312

@pgriess
Copy link
Owner

pgriess commented Sep 10, 2010

Yeah, this behavior kind of sucks. Unfortunately, fixing this is a thornier problem than it might initially seem.

The root of the problem is that the MsgPack serialization format does not differentiate between strings and blobs of raw bytes (all are serialized as MSGPACK_OBJECT_RAW), so we have to pick one format to de-serialize to. Strings are far more common than Buffer objects, IMO. Particularly when inter-operating with other MsgPack-RPC clients.

Anyway, I don't think I'm willing to change the default unpacking type as you suggest, but I can provide a way to override this if needed. Does that work for you?

@bpot
Copy link
Contributor Author

bpot commented Sep 10, 2010

Yeah, I do see how it's kind of a difficult issue. I think being able to override the default unpacking type makes a lot of sense.

@tracker1
Copy link

Why not just put the UTF-8 encoded byte order marker when you are passing a string in... then when decoding, you look for that marker and decode those raw entries with the BOM to String, otherwise Buffer.

@OrangeDog
Copy link

While Strings may be more common than Buffers, if you unpack a String as a Buffer you can easily make String from it, but you cannot recreate the correct binary data if it's been unpacked to a String.

Therefore, there really really should be at least the option to decode MSGPACK_OBJ_RAW as a Buffer. Otherwise it becomes 100% impossible to use this library to decode raw binary data correctly.

@OrangeDog
Copy link

Either like 64c567a or 1c7fe3f.
The latter is my preferred choice.

@jacwright
Copy link

Personally I think strings should be the default since msgpack is designed as a faster replacement of JSON. However, going with the latter and allowing the developer to choose to work with buffers instead I think is fine.

@chakrit
Copy link

chakrit commented Aug 14, 2013

This has been fixed in the new v5 spec, please see here: https://gist.github.com/frsyuki/5432559

@chakrit
Copy link

chakrit commented Aug 14, 2013

I have also filed a separate issue: #49

@godsflaw
Copy link
Collaborator

I'm working on another project for a few weeks, but might be able to hack in support for this after that. pull requests from the impatient and motivated are welcome.

@jhiver
Copy link

jhiver commented Nov 11, 2013

Sorry if this seems silly, but would it be thinkable to just have a different type for buffers, for instance MSGPACK_OBJECT_BUFFER? This way you could serialize / deserialize it properly...

@OrangeDog
Copy link

This way you could serialize / deserialize it properly...

Not if you're trying to communicate with a different implementation.

@jacwright
Copy link

The msgpack specification now supports the differentiation between strings and buffers. It's been implemented in https://npmjs.org/package/msgpack-js-v5 (probably to avoid breaking libraries using this module).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants