Replies: 3 comments
-
Realistically I wouldn't reach for Cap'n Proto as a way to solve performance problems in an existing system unless the use case specifically calls for zero-copy. For example, if you want to be able to mmap() a huge structure from disk and read only a small subset of it, that's where zero-copy will really shine. But if you are passing messages around for an RPC-ish use case, it won't necessarily help, and the work involved in changing serialization formats is probably quite large.
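To illustrate the "read only a small subset" access pattern, here is a minimal sketch. It assumes a hypothetical fixed-width record layout (this is not Cap'n Proto's wire format) and uses an in-memory `ArrayBuffer` as a stand-in for an mmap()ed file; the names `RECORD_SIZE`, `writeRecord`, and `readValue` are made up for the example.

```typescript
// Hypothetical layout, NOT Cap'n Proto's wire format. Each record is 16 bytes:
//   bytes 0..7  : id    (float64, little-endian)
//   bytes 8..15 : value (float64, little-endian)
const RECORD_SIZE = 16;

function writeRecord(view: DataView, index: number, id: number, value: number): void {
  const base = index * RECORD_SIZE;
  view.setFloat64(base, id, true);
  view.setFloat64(base + 8, value, true);
}

// Reads one field of one record straight from the bytes.
// No other record is touched, and nothing is decoded up front.
function readValue(view: DataView, index: number): number {
  return view.getFloat64(index * RECORD_SIZE + 8, true);
}

// Stand-in for a large mmap()ed file: 1000 records in a single buffer.
const buffer = new ArrayBuffer(1000 * RECORD_SIZE);
const view = new DataView(buffer);
for (let i = 0; i < 1000; i++) {
  writeRecord(view, i, i, i * 2);
}

// Random access to a single record; cost does not depend on the buffer size.
const picked = readValue(view, 742);
console.log(picked);
```

The point of the sketch is only that the cost of a read is proportional to what is read, not to the size of the whole structure; an RPC-style message that is consumed in full gains much less from this.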
-
Thanks for chiming in, @kentonv. Yes, that's why this is a discussion and not an issue; it's a weekend thought experiment that I figured I would share, even if we aren't likely to pursue it seriously.
-
This is a discussion we've been having on and off for a while, most recently in relation to #1478. IMO the problem is not really which format/codec we use to serialize / deserialize CapData, but how it's represented in our system across boundaries. In particular, we often need to nest CapData inside other messages, which are themselves serialized / deserialized. So far our approach is that our messages don't have a specific / predefined schema, and that CapData does not have a special / recognizable type. As such, the CapData needs to follow the serialization logic of wherever it gets nested, which for JSON means stringifying or parsing it eagerly. In agoric-sdk, there are really 2 places where CapData appears:
I don't believe true zero-copy is feasible in JavaScript. A byte representation will always need to be decoded into JS values, whether at the JS layer or at the "network" layer. A JS Proxy wouldn't enable partial parsing because we aggressively harden objects.
In the other direction, you would either need marshal to produce bytes and have the nesting layers pass through the bytes as-is, or have marshal avoid any serialization and rely on the network layer to serialize.
Throughout the stack, we are often interested in partially parsing CapData across boundaries: leave the body fully or partially unparsed, but parse the slots, e.g. for clist rewrite. The supervisor level could have a predefined schema for all its messages, which would allow partial parsing of the CapData if needed.
The problematic use case is as follows: agoric-sdk vstorage values are often encoded as CapData. These values are currently marshalled by the board vat, which receives a
< insert graph here with payload for every step>
As seen above, simply "handling raw deliveries" doesn't help by itself, since each delivery payload needs to be partially wrapped / unwrapped.
I'm honestly unsure how to solve these layering constraints besides having a first-class notion of "serializable data" (hardened JS value which can be JSON stringified, maybe extended to support binary blobs), that
At that point we could imagine changing the vstorage API so that values could be serializable data instead of just strings (with the pending question of what to do with values already written that are raw string values instead of a JSON stringification).
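The "parse the slots, leave the body unparsed" idea can be sketched roughly as below. This is only an illustration under assumptions: the `CapData` type is reduced to a string `body` plus string `slots`, and the `CList` type, the `translateSlots` helper, the slot names, and the `deliver` message shape are hypothetical, not the actual agoric-sdk or marshal APIs.

```typescript
// Simplified CapData shape: a serialized body plus a list of slot references.
type CapData = { body: string; slots: string[] };

// Hypothetical clist: maps slot names on one side of a boundary to the other side.
type CList = Map<string, string>;

// Rewrites the slots without ever parsing `body`.
function translateSlots(capData: CapData, clist: CList): CapData {
  const slots = capData.slots.map((slot) => {
    const translated = clist.get(slot);
    if (translated === undefined) {
      throw new Error(`unknown slot ${slot}`);
    }
    return translated;
  });
  return { body: capData.body, slots };
}

// The body is already a string produced by some marshal-like step.
const inner: CapData = {
  body: JSON.stringify({ greeting: "hello" }),
  slots: ["o+1"],
};

const clist: CList = new Map([["o+1", "ko42"]]);
const outbound = translateSlots(inner, clist);

// Nesting in a schema-less JSON message: the outer layer stringifies everything,
// so the already-serialized body is escaped again at this hop.
const wire = JSON.stringify({ type: "deliver", payload: outbound });

// The receiver parses the envelope eagerly; the body comes back as a string
// and still needs its own parse before the values inside can be used.
const received = JSON.parse(wire) as { type: string; payload: CapData };
console.log(received.payload.slots, typeof received.payload.body);
```

The slot rewrite itself never looks inside the body, but every JSON envelope around the CapData still pays to escape and unescape the body string, which is the layering cost described above.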
-
In a rambling discussion with @erights, we were talking about medium/long-term performance approaches. I brought up...
where we find
I asked @erights if he sees any path to zero-copy marshal, and he said perhaps if we used capnproto, combined with something like my protobuf schema for passable.
zenhack sketched ocapn.capnp; it's probably pretty close.
capnproto support for JavaScript doesn't get anywhere near the sort of optimized proxy @erights and I discussed; the library by @kentonv is SLOW (his words).
OCaml support looks quite good; the mirage folks seem to use it in anger. OCaml is a good match for the way I think.
But capnp-ocaml has very little in the way of "getting started" material.
I suspect nix could help put all the tools together, but it also adds complexity.
I'm kinda stuck. Or at least out of mental energy for now.