[IETF-readiness] Add Prior Art and Translation section, update deprecation FAQ entry #164

Open
wants to merge 7 commits into
base: master

bumblefudge

As part of the ongoing work of getting Multiformats ready for another attempt at IETF, I wanted to add some "prior art and translation" sections to show how a CID-powered system could, say, turn CIDs into DataURIs (tracked in a separate issue), how a binary CID can be wrapped in a little CBOR outer wrapper for mixed-tooling systems, and so on. This is primarily to address what one interlocutor at IETF 118 called the "why bother reinventing wheels" question, which multiformats needs to answer in its next charter and draft specs, but also to situate it as a useful and complementary (rather than just redundant) member of the family of IETF specifications and standards.

This specific translation section seemed like the most urgent to do first, given the history and the layering (the section about "unchunked" CIDs is probably the one that needs the most massaging, honestly). I am indebted to @gobengo from web3.storage for prototyping the conversions in his great blog post, "the Secret of NIMHs". Speaking of prototypes, if there is interest I could spin up a little NPM repo that spits out ni://... and ni://mh; URLs for test vectors already in this repo or others, and/or just add an e2e example for each of the two forks in the conversion algorithm described here.

I'm test-ballooning here in multiformats/multibase a section that I will add to the IETF draft of multibase before re-applying, if it gets merged here and makes sense. Next up: dataURIs...

Member

@rvagg rvagg left a comment

fine aside from the minor link formatting issue and a quibble about ULEB128 which is material and probably shouldn't be swept aside

Member

@vmx vmx left a comment

I found only a minor typo. I agree with @rvagg that the minimal encoding of the ULEB128 should be mentioned, as that's one of those things that matters for content-addressed systems.
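
To illustrate the minimal-encoding point: plain ULEB128 allows the same integer to be written with redundant trailing groups (for example, `0x81 0x00` also decodes to 1), so one multihash could end up with several byte representations and therefore several identifiers. The following is a minimal sketch of a strict codec that rejects such padding. The function names are hypothetical and are not taken from any multiformats library.

```python
from typing import Tuple


def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as a minimal ULEB128 varint."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        group = n & 0x7F          # low 7 bits
        n >>= 7
        if n:
            out.append(group | 0x80)  # more groups follow
        else:
            out.append(group)         # final group, high bit clear
            return bytes(out)


def decode_uvarint_strict(buf: bytes) -> Tuple[int, int]:
    """Decode one varint; return (value, bytes consumed).

    Raises ValueError on truncated or non-minimal input.
    """
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            # A zero final group after earlier groups is pure padding.
            if byte == 0 and i > 0:
                raise ValueError("non-minimal varint")
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")
```

A decoder that accepted `0x81 0x00` as 1 would round-trip it to `0x01`, silently changing the bytes that were hashed or compared.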

Comment on lines +168 to +169
1. (for binary form:) prefix existing binary multihash with `0x42` to designate that what follows is a multicodec prefix followed by an ULEB128 hash value.
2. (for ASCII form:) convert the `0x42` prefix to URL format, i.e., `ni:///mh;` and then append a base64url, no-padding encoding of the entire binary multihash with prefix (and _without_ adding the additional base-64-url-no-padding prefix, `u`, if using a [multibase][] library for this base-encoding).
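
The two steps above can be sketched as follows. This is a rough illustration under stated assumptions, not a reference implementation: `0x42` is the value proposed in this PR and is treated as provisional; "the entire binary multihash with prefix" is read as the multihash including its own function-code and length bytes (not the `0x42` byte); and all function names are hypothetical.

```python
import base64
import hashlib

NI_MH_SUITE_ID = 0x42  # value proposed in this PR; assumed provisional
SHA2_256_CODE = 0x12   # multicodec code for sha2-256
NI_MH_PREFIX = "ni:///mh;"


def sha256_multihash(data: bytes) -> bytes:
    """Build a sha2-256 multihash: <code><digest length><digest>."""
    digest = hashlib.sha256(data).digest()
    # Both values are below 0x80, so each fits in one varint byte.
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def to_ni_binary(multihash: bytes) -> bytes:
    """Step 1: prefix the binary multihash with the suite ID byte."""
    return bytes([NI_MH_SUITE_ID]) + multihash


def to_ni_url(multihash: bytes) -> str:
    """Step 2: ni:///mh; plus unpadded base64url, with no multibase 'u'."""
    encoded = base64.urlsafe_b64encode(multihash).rstrip(b"=")
    return NI_MH_PREFIX + encoded.decode("ascii")


def from_ni_url(url: str) -> bytes:
    """Recover the binary multihash from an ni:///mh; URL."""
    if not url.startswith(NI_MH_PREFIX):
        raise ValueError("not an ni:///mh; URL")
    encoded = url[len(NI_MH_PREFIX):]
    # Restore the padding that the URL form omits before decoding.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(padded)


mh = sha256_multihash(b"hello world")
print(to_ni_binary(mh).hex())
print(to_ni_url(mh))
```

The sketch uses the standard library's base64 module rather than a multibase library, so there is no leading `u` to strip before appending to `ni:///mh;`.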

Is this a spec proposal? Doesn't seem to be anywhere else and seems to effectively be a separate spec.

Is the idea that if we merge this to then go to the nih registry and request 0x42?

Side note: which number we use doesn't bother me but we used 0x42 for the CID tag in dag-cbor. I don't see NIH wanting CIDs more than multihashes so probably fine, but wanted to flag so it's documented here.

Author

Is the idea that if we merge this to then go to the nih registry and request 0x42?

that's the beauty of it:
https://www.ietf.org/archive/id/draft-multiformats-multihash-07.html#name-the-mh-digest-algorithm

(I assumed the 42 was an intentional nod to that other iana registration!)

Cannot find a good standard on this. Found some _different_ IANA ones:
In IETF's corpus of normative protocols, there are two partial overlaps worth knowing about to ensure a safe implementation:

to ensure a safe implementation

What does this mean?

Author

meaning, if you're really brownfield or ingesting unknown data and you get something that isn't a multiformat, here are some other prefixes you might want to sniff for as fallback, that might have been put there by other IETF self-description conventions 😄 . any wordsmithing help appreciated!

@aschmahmann aschmahmann left a comment

@bumblefudge @rvagg I switched to Approve given that my main issues have been resolved and I don't want to block merging if it's important.

However, I'd still like some clarity on the ones I have open. In particular, this PR adds a new spec (the NIH translation) into the doc. It seems fine, but it also requires grabbing a number in the NIH registry, which I don't know much about (are they even likely to let us reserve it?). I want to at least understand the state of things in the PR, even if we need to do a dance of merge, submit to NIH, then make another PR that states 0x42 is provisional and points to the open request for the NIH registry.

@bumblefudge
Author

bumblefudge commented Aug 15, 2024

@aschmahmann I checked before opening the PR: 42 wasn't officially registered, but the provisional registration is already in datatracker and there doesn't seem to be much activity in that IANA registry anyway. I was thinking I would include the translation section in the next IETF internet-draft version of multihash and follow up with IANA officially on the basis of that I-D, but I could do it on the already-submitted one in the meantime?

4 participants