This repository has been archived by the owner on May 6, 2024. It is now read-only.

add type ByteString for backwards compat with IPLD schema #16

Open
petar opened this issue Mar 11, 2022 · 5 comments
Labels
good first issue Good for newcomers

Comments

@petar
Collaborator

petar commented Mar 11, 2022

The Edelweiss type String, by definition, holds only valid Unicode strings. It encodes to and decodes from IPLD strings whose bytes are valid UTF-8.

However, there may be pre-existing IPLD schemas that place non-UTF8 byte sequences in IPLD string objects.
As a design requirement, Edelweiss must provide a way to work with pre-existing schemas.

One way to do this without violating Edelweiss's type semantics is to introduce a new Edelweiss type, say ByteString, which:

  • is a list of bytes on the user-facing end
  • encodes/decodes as an IPLD string of arbitrary bytes on the wire
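
To make the two bullet points concrete, here is a minimal sketch of what such a type could look like in Go. It uses only the standard library and is not Edelweiss-generated code; the names `ByteString`, `ToWireString`, `FromWireString`, and `IsValidUTF8` are hypothetical. It relies on the fact that a Go `string` is an arbitrary byte sequence, so converting between `[]byte` and `string` preserves every byte even when the content is not valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// ByteString is a hypothetical user-facing type: a plain list of bytes
// with no UTF-8 requirement.
type ByteString []byte

// ToWireString returns the bytes as a Go string, the form in which they
// would be handed to an IPLD string node. No validation or re-encoding
// takes place, so every byte is preserved.
func (b ByteString) ToWireString() string { return string(b) }

// FromWireString is the inverse: it accepts whatever bytes the IPLD
// string carried, valid UTF-8 or not.
func FromWireString(s string) ByteString { return ByteString(s) }

// IsValidUTF8 reports whether the bytes would also be acceptable to the
// stricter Edelweiss String type.
func (b ByteString) IsValidUTF8() bool { return utf8.Valid(b) }

func main() {
	raw := ByteString{0xff, 0xfe, 'h', 'i'} // 0xff can never appear in UTF-8
	wire := raw.ToWireString()
	back := FromWireString(wire)
	fmt.Println(len(wire), back.IsValidUTF8(), string(back[2:]))
	// prints: 4 false hi
}
```

Whether a given codec accepts such a string on the wire is a separate question from this in-memory round trip.
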

@petar added the "good first issue" label on Mar 11, 2022
@petar added this to the Milestone 1 (MVP) milestone on Mar 11, 2022
@vmx
Member

vmx commented Mar 11, 2022

Is this a general IPLD Schema problem (then it should really be fixed) or a Go Schema implementation detail?

@petar
Collaborator Author

petar commented Mar 21, 2022

Is this a general IPLD Schema problem (then it should really be fixed) or a Go Schema implementation detail?

@vmx I've updated the description. Maybe it was a bit confusing previously.

By the way, there is a new set of slides that documents Edelweiss in its current state (Milestone 1): https://github.com/ipld/edelweiss/tree/main/doc/slides
This may be helpful too.

@vmx
Member

vmx commented Mar 21, 2022

  • encodes/decodes as an IPLD string of arbitrary bytes on the wire

I guess one major serialization will be CBOR. Then this won't work. In CBOR, strings need to be valid UTF-8; otherwise the result is invalid, non-spec-compliant CBOR.
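
The constraint can be sketched with a toy encoder. This is an illustrative, standard-library-only example and not the API of any real CBOR or DAG-CBOR library; `cborHead`, `EncodeText`, `EncodeBytes`, and `ErrInvalidUTF8` are hypothetical names. It assumes the standard CBOR framing, where major type 3 marks a text string and major type 2 marks a byte string. A spec-compliant encoder has to refuse non-UTF-8 input for a text string, while the same bytes pass unchanged as a byte string.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf8"
)

// ErrInvalidUTF8 is returned when text-string input is not valid UTF-8.
var ErrInvalidUTF8 = errors.New("cbor: text string is not valid UTF-8")

// cborHead builds the initial byte(s) of a CBOR item: the major type in
// the top three bits, followed by the length argument.
func cborHead(major byte, n uint64) []byte {
	m := major << 5
	switch {
	case n < 24:
		return []byte{m | byte(n)}
	case n <= 0xff:
		return []byte{m | 24, byte(n)}
	case n <= 0xffff:
		out := []byte{m | 25, 0, 0}
		binary.BigEndian.PutUint16(out[1:], uint16(n))
		return out
	case n <= 0xffffffff:
		out := []byte{m | 26, 0, 0, 0, 0}
		binary.BigEndian.PutUint32(out[1:], uint32(n))
		return out
	default:
		out := make([]byte, 9)
		out[0] = m | 27
		binary.BigEndian.PutUint64(out[1:], n)
		return out
	}
}

// EncodeText encodes s as a CBOR text string (major type 3) and rejects
// input that is not valid UTF-8.
func EncodeText(s string) ([]byte, error) {
	if !utf8.ValidString(s) {
		return nil, ErrInvalidUTF8
	}
	return append(cborHead(3, uint64(len(s))), s...), nil
}

// EncodeBytes encodes b as a CBOR byte string (major type 2); any byte
// sequence is allowed.
func EncodeBytes(b []byte) []byte {
	return append(cborHead(2, uint64(len(b))), b...)
}

func main() {
	ok, err := EncodeText("hi")
	fmt.Printf("%x %v\n", ok, err) // prints: 626869 <nil>

	_, err = EncodeText("\xff\xfe")
	fmt.Println(err) // prints: cbor: text string is not valid UTF-8

	fmt.Printf("%x\n", EncodeBytes([]byte{0xff, 0xfe})) // prints: 42fffe
}
```
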

@petar
Collaborator Author

petar commented Mar 21, 2022

If this is the case, it suggests a design bug in the IPLD data model: if the IPLD data model allows arbitrary bytes in a string (which I believe it does), then this breaks the contract that IPLD values can be serialized to any backend (e.g. both DAG-JSON and DAG-CBOR).

@vmx
Member

vmx commented Mar 21, 2022

The IPLD data model is independent of the serialization, so potentially there could be serializations that support that; we just don't currently have any such serialization formats.
