Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added CDDL as data definition syntax for parts of the Abstract Data Model of DID Document and registry items #138

Merged
merged 7 commits into from
Dec 2, 2020

Conversation

jonnycrunch
Copy link
Contributor

@jonnycrunch jonnycrunch commented Sep 29, 2020

This PR describes vanilla CBOR and succinct data definition for the entire DID document and associated DID-Spec-registries. All examples in the registry properly validate for in both cbor and json and can be found here. The main goal for this convention is to provide a unified notation for defining the DID Document and registry items in an Abstract Data Model regardless of core representation format(JSON vs CBOR). While not an explicit goal per se, but a convenient side effect of the JSON generic data model being a subset of the CBOR generic data model, is the fact that CDDL can also be used for describing JSON representation of the DID Document and associated properties in the registry. So, it just works. In the future, this CDDL can be expanded to support CBOR-LD and other tags.


Preview | Diff

@@ -2902,12 +2902,24 @@ <h2 id="x6-did-methods"><bdi class="secno">6. </bdi>DID Methods<a class="self-li



<section id="references" class="appendix"><h2 id="a-references"><bdi class="secno">A. </bdi>References<a class="self-link" aria-label="§" href="#references"></a></h2><section id="normative-references">
<section id="references" class="appendix"><h2 id="a-references"><bdi class="secno">A. </bdi>References<a class="self-link" aria-label="§" href="#references"></a></h2>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2020-06-18 fpwd is a frozen document, this file shouldn't be changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonnycrunch suggest you resolve this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I update the correct version and undid the changes to head index, but please advise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only file in "files changed" should be index.html

https://github.com/w3c/did-spec-registries/pull/138/files

It appear you are still committing changes to FPWD.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, hard reset and hopefully resolved my error. Thanks!

@peacekeeper
Copy link
Contributor

This looks interesting, but calling CDDL a "data definition syntax for the Abstract Data Model" may go to far. We had agreed that we would use INFRA for this (plus adding missing data types such as Number or DateTime).

Each property in the registry should contain representation-specific data if such data needed by a given representation. For JSON-LD, each property in the registry contains a link to a JSON-LD context where the property is defined. For CBOR, each property in the registry should contain whatever CBOR-specific data is needed (if any) to describe the property (a semantic tag? not sure).

@iherman
Copy link
Member

iherman commented Sep 29, 2020

Note that the table, as it stands today, is unbalanced. It contains

  1. a reference to the JSON-LD @context which is, largely, 'just' a mapping of the specific term to a precise URL; ie, it does not define the term, just refers to it;
  2. a CDDL snippet that defines constraints on the way the term should be used

@iherman
Copy link
Member

iherman commented Sep 29, 2020

This looks interesting, but calling CDDL a "data definition syntax for the Abstract Data Model" may go to far. We had agreed that we would use INFRA for this (plus adding missing data types such as Number or DateTime).

The term used may not be the right one, but I look at these CDDL snippets as complementary to INFRA. INFRA does not define constraints on, say, the content of a map, CDDL does that. I.e., I do not think there is a contradiction here.

What CDDL does is more or less equivalent to what I referred to in w3c/did-core#404, and subsequently did in the SHACL definition: provide a formal (and also machine-readable) specification of the complete core vocabulary, expressing all the constraints in a precise manner. Except that what @jonnycrunch did is much more detailed than the SHACL and, more importantly, the CDDL can be used to define the constraints on the Abstract Data Model (rather than on an RDF graph only, which is what SHACL is made for), making it much more useful. Big +1 to CDDL.

Actually, I think that we should exploit this further. In my view, the CDDL definitions should be part of the DID Core spec in some way or other, as a normative way of defining the DID Core vocabulary. It is not the role of the Registries document to provide this definition. (There is no problem referring to the CDDL snippets from the registry tables, of course, but that is an informative, i.e., non-normative reference only. Note my separate #138 (comment), though)

Copy link
Contributor

@OR13 OR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert changes to FPWD.

@OR13
Copy link
Contributor

OR13 commented Sep 29, 2020

I'd prefer to see the CDDL bits in DID Core, and I agree with @iherman I think CDDL can help make infra more useful, but this isn't the best place to do it... the ADM needs to be fixed in DID Core, not here.

@jonnycrunch
Copy link
Contributor Author

@OR13 🙏 for FPWD changes. I can write CDDL, CBOR and regular expression but still can't figure out reg-spec stuff.

@OR13
Copy link
Contributor

OR13 commented Sep 29, 2020

@jonnycrunch its ok, neither can we, echidna has been broken for months.

@peacekeeper
Copy link
Contributor

Thanks @iherman for the clarification, I may have misunderstood this. If CDDL is used to describe the ADM in a way that is complementary to INFRA, then I agree having a machine-readable definition of the ADM could be useful, and I think I also agree with @OR13 that this should go into DID Core.

I think what confused me is that these CDDL definitions don't just include data structures, but also data types, e.g. I found the following in updated.cddl:

did-document-metadata = {
    "updated" : text .regexp "^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$"
}

This doesn't just say that the did-document-metadata contains an updated property, it also defines the lexical format of its value, which is something that may differ depending on the representation of the DID document.

As another example, the controller.cddl defines a regex for a DID. Which one would be authoritative, the CDDL or the ABNF in DID Core?

@jonnycrunch
Copy link
Contributor Author

@peacekeeper I think the reason I like CDDL is that it is a nice mix of Type and Class declaration and lexical constraints on the data definition of the possible values, either explicitly as in the regular expression or by referencing by Types as I have attempted to do in some of the definitions. This is my first stab at it, so I appreciate any feedback and help updating the declaration and improving the constraints. For instance, should we be explicit about the length of the x value in publicKeyJwk as 32 bytes? Or, should the value of publicKeyHex start with 0x? BTW, CDDL does have Normative rules for mapping to ABNF, so it is sufficiently expressive and NOT overlapping rules, but I see it as complimentary. Perhaps I should have made that clear in the Spec?

Personally, I don't understand INFRA well enough to understand how it maps to ABNF. The question is how to best leverage the best of INFRA, CDDL and ABNF such that it the DID spec is unambiguous and we can move on with our lives.

@iherman
Copy link
Member

iherman commented Sep 30, 2020

This doesn't just say that the did-document-metadata contains an updated property, it also defines the lexical format of its value, which is something that may differ depending on the representation of the DID document.

If I look at CDDL as a constraint definition on the INFRA model then I do not think we should have a problem. We do have literals in the ADM and this definition "just" defines how that literal is constrained for us. That should be independent of the concrete syntactic representation.

As another example, the controller.cddl defines a regex for a DID. Which one would be authoritative, the CDDL or the ABNF in DID Core?

In an ideal world both of them should:-) In other words, they should be in sync: there is an English prose and there should be a CDDL definition in the spec, and if there is a contradiction then that is a bug in the spec. I am not sure how to handle that editorially, though.

@OR13
Copy link
Contributor

OR13 commented Oct 6, 2020

I am in favor of merging this, once the erroneous changes to FPWD have been removed.

Copy link
Contributor

@OR13 OR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of taking these changes.

@OR13
Copy link
Contributor

OR13 commented Oct 13, 2020

ping @msporny

@jonnycrunch
Copy link
Contributor Author

some of the CDDL examples could be improved. And as a group will need to make additional constraints. I welcome feedback from real CDDL and CBOR experts such as @cabo and @anweiss. 🙏

@cabo
Copy link

cabo commented Oct 26, 2020

Which examples should we be looking at?

@jonnycrunch
Copy link
Contributor Author

@cabo here is my attempt at modeling the entire spec and providing examples. Thanks!🙏

@cabo
Copy link

cabo commented Oct 27, 2020

How do I use this? There are 1286 lines of CDDL, mostly identical copies. Is there a "source" from which these have been compiled together?

@jonnycrunch
Copy link
Contributor Author

@cabo the did-document.cddl file is the composite document. My task was to provide examples for all properties, which is why it was broken up. Thanks!

@cabo
Copy link

cabo commented Oct 27, 2020

The regexps are a nice demonstration why we really need to move forward on the ABNF integration into CDDL...
You are using PCRE-style regexps with anchors; these are not needed for the W3C regexps employed by .regexp (and you were missing the trailing one in did-url if you want to use .pcre).
You define, but do not use method-name and publicKeyGpg.
You use, but do not define publicKeyPem and publicKeyMultiformat (I turned these into sockets $publicKeyPem and $publicKeyMultiformat for playing with the CDDL).

@anweiss
Copy link

anweiss commented Oct 28, 2020

Hey @jonnycrunch not sure if this was intentional, but it also looks like assertionMethod_example7.json doesn't validate against assertionMethod.cddl due to the fact that the id entry in the publicKeyBase58 struct is of type did-url and not relative-did-url. The JSON shows "id": "did:example:123#z6MkpzW2izkFjNwMBwwvKqmELaQcH8t54QL5xmBdJg9Xh1y4" which looks to be a relative-did-url.

@anweiss
Copy link

anweiss commented Oct 29, 2020

@jonnycrunch disregard previous comment ... error on my end ... regex is correct

some findings ... will update this comment as issues are discovered:

  • capabilityDelegation.cddl has a typo on Line 3 ... should be capabilityDelegation and not capabilityDelgation

@jonnycrunch
Copy link
Contributor Author

I updated the PR to include minor typos and unnecessary quotes and ^/$ from regex as per recommendation. many thanks to @cabo and @anweiss for their input.

The updated complete files included all examples and a composite did-document.cddl can be found here.

Please note that all examples ( in both JSON and CBOR) pass as valid using the described CDDL. This is because for the most part I used preludes that overlap with JSON ( aka vanilla CBOR ). More advanced scenarios are for future enhancements.

I want to make particular attention that NONE of the CDDL that I had submitted or am submitting to this registry as part of my PR use the much debated @context property in the data definition. The did-document.cddl is here by reference and is not being submitted to the registry, however, I welcome feedback to this composite approach for the entirety of the DID spec.

@jonnycrunch
Copy link
Contributor Author

update, I guess I did include it in the @context.cddl. So, maybe for the most part this is true.

Also for completeness, here is the list of JSON-Compatible Subset of CDDL preludes which makes this all work with vanilla cbor.

@OR13
Copy link
Contributor

OR13 commented Nov 5, 2020

@peacekeeper can we merge this now, and wait till did core is updated to change it?

@peacekeeper
Copy link
Contributor

@OR13 and @jonnycrunch based on the decisions during yesterday's DID WG TPAC meeting, I think it would be ideal if we could have two versions of the composite did-document.cddl:

  • One version (called did-document.cddl) that includes only representation-independent properties (i.e. without @context).
  • One version (called did-document.jsonld.cddl or similar) that includes the representation-specific properties for the JSON-LD representation (with @context required rather than optional).

I'm no CDDL expert, but I wonder if it would be possible for the latter to somehow "inherit" from the former?

@cabo
Copy link

cabo commented Nov 6, 2020

CDDL does not have a module system yet; that is on the agenda of the IETF CBOR WG.
Maybe having two specs is not needed; have a look at how we generate two json-schema-org style specs from one CDDL spec in SDF: https://github.com/ietf-wg-asdf/SDF/blob/master/sdf-feature.cddl

@OR13
Copy link
Contributor

OR13 commented Nov 6, 2020

@cabo @jonnycrunch @peacekeeper interesting link, thanks!

I think we can probably accomplish the same thing with one cddl file, and good comments.

In the interested of accepting good faith efforts to improve this repo, I am strongly urging us to accept this PR and refine it in subsequent PRs, with the understanding that anything in git can be reverted based on WG consensus.

@peacekeeper your request for changes is currently blocking this PR.

@jonnycrunch
Copy link
Contributor Author

@OR13 yep, I was already impressed by what CDDL can do for us. Many thanks to @cabo for your fine work! I'm even more convinced that this will help us have not only a solid foundation to facilitate a clear and unambiguous machine readable data definition of the DID spec, but one that is extensible, sound and robust!

@peacekeeper
Copy link
Contributor

peacekeeper commented Nov 6, 2020

How about simply adding a few comments to the CDDL for now, e.g. like this:

did-document = {
    ; representation-specific properties for JSON-LD
    ? @context : "https://www.w3.org/ns/did/v1" / ["https://www.w3.org/ns/did/v1", 1*~uri ]     ; This MUST be present in JSON-LD.

    ; representation-independent properties
    id : did 
    controller :  did / [1*did]
    ? verificationMethod : [ 1* verificationMethod ]  ; A public key can be used as a verification method.
    ? publicKey : [ 1* verificationMethod ]           ; A public key can be used as a verification method.
    ? authentication : [ 1* verificationMethod ]      ; The associated value MUST be an ordered set of one or more verification methods. Each verification method MAY be embedded or referenced.
    ? assertionMethod : [1* verificationMethod]       ; The associated value MUST be an ordered set of one or more verification methods. Each verification method MAY be embedded or referenced.
    ? keyAgreement : [1* verificationMethod]          ; The associated value MUST be an ordered set of one or more verification methods. Each verification method MAY be embedded or referenced.
    ? capabilityDelegation : [1* verificationMethod]  ; The associated value MUST be an ordered set of one or more verification methods. Each verification method MAY be embedded or referenced.
    ? capabilityInvocation : [1* verificationMethod]  ; The associated value MUST be an ordered set of one or more verification methods. Each verification method MAY be embedded or referenced. 
    ? service : [1* serviceEndpoint ]                 ; The optional service parameter describes one or more service endpoints. 
 }

(I.e. update this CDDL in IPFS with the above content, then update this PR with the new IPFS link).

I'm also 100% convinced by now that CDDL is a great addition and I agree to merge if we can make the above change. I hope we can then further formalize the difference between @context and the representation-independent properties in subsequent PRs.

@jonnycrunch
Copy link
Contributor Author

@peacekeeper done!

And here is the composite did-document.cddl and other files including all examples. I may update this with a script to automate testing with additional examples.

Please note that the did-document.cddl with the much debated @context isn't part of this PR, only the individual components. But I welcome feedback and support incorporating it here and in DID-core.

@msporny
Copy link
Member

msporny commented Nov 8, 2020

with the understanding that anything in git can be reverted based on WG consensus.

While that's true at it's surface, in practice, this mode of thinking has resulted in numerous problems in the WG.

The core of the problem is this: Things that go in without having consensus run the risk of staying in because there is no consensus to remove the thing that never had consensus to begin with when it went in the first time.

There is circular reasoning here that the group has gotten itself into that is problematic. Remember that the entire problem with the ADM and the producer/consumer language was that spec text got in there that didn't have consensus under a "we can always remove it later if there is consensus" and that sent us down a multi-month path of "but the spec says X, and now you're trying to change it!".

This is especially problematic because CDDL applied to all representations will lead to CBOR-LD being illegal. CBOR-LD does not encode its values per the rules in the CDDL files.

This PR needs to make it abundantly clear that the CDDL only applies to specific MIME types (and it needs to call those MIME types out by name). It is probably safe for the CDDL to apply to "application/did+json", "application/did+ld+json", and "application/did+cbor". It is definitely not acceptable for the CDDL to apply to "application/did+ld+cbor".

Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes requested:

  • Use static links to CDDL files (with IPFS links as alternates).
  • Specify CDDL per representation MIME type.
  • Minor editorial fixes.

Explanation needed on how DID Method CDDL is going to be kept up to date.

index.html Outdated
Comment on lines 78 to 85
}]
}, {
name: "Jonathan Holt, DO, MS",
url: "https://www.linkedin.com/in/jonathan-holt-do-ms/",
company: "TranSendX",
companyURL: "https://transendx.com",
w3cid: 115670
}
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Chairs have requested that PR authors do not modify the Editor/Author lists until later in the process. Please remove this change set.

/cc @brentzundel @burnburn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, however given the substantive contribution regarding the CDDL and the work that I have put in, seems only fair to consider me an author.

Comment on lines +209 to +210
Properties definitions in the DID Core Registries MUST include a Concise Data Definition Language (CDDL) [[RFC8610]] of the
Property and it's Abstract Data model structure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to make it clear that CDDL definitions must be done on a per-representation/per-MIME type basis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see issue to track this conversation: #153

index.html Outdated
<p>
Concise Data Defition Lanuage (CDDL) provides a succint data defition for representing the Abstract Data Model of
the <a href="https://www.w3.org/TR/did-core/#core-representations" taret="_blank">DID Core Specification</a> and associated <a href="https://w3c.github.io/did-spec-registries/" target="_blank">DID Spec Registeries</a>. The CDDL definition for the DID Document Specication and associated
registeries can be found <a href="https://ipfs.io/ipfs/QmNZvYAYqSgUuRwv6AhE3rMDE1nzvTfsq7xrVzXqM2x4Jw" target="_blank">here</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if we checked the representations into the repo so that they live with the DID Spec Registries. We can also point to IPFS locations. The concern with the IPFS locations is that the information may disappear if people stop replicating it. The archival policy at W3C is such that other institutions will take over if W3C fails (e.g., Library of Congress)... none of backup archival providers are staking IPFS hashes right now AFAIK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, feel free to put CDDL in the github repo and link directly to it in github for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

index.html Outdated
<tbody>
<tr>
<td>
<a href="https://w3id.org/security/#publicKeyGpg">security-vocab</a>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not exist in the security vocabulary. It needs to be added before merging this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonnycrunch just remove this for now, I have opened w3c-ccg/security-vocab#74

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

publicKeyGpg removed but left template for completion.

index.html Outdated
@@ -1634,7 +1714,7 @@ <h1>DID Methods</h1>

<p>
This table summarizes the DID method specifications currently in development.
The links will be updated as subsequent Implementer’s Drafts are produced.
The links will be updated as subsequent Implementer’s Drafts are produced. <a href="https://ipfs.io/ipfs/QmZ5eDK3k12hYWd2EPKJNkPMLokZ8gXuPASUA7D4B3YVy2" target="_blank">CDDL definitiion of DID method names</a>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is out of date... what's the plan to keep these CDDL files in sync w/ the DID Spec registries (if we don't autogenerate some of the CDDL, it will be wrong; for example, the list of DID Methods is changing on a fairly regular basis.

@jonnycrunch
Copy link
Contributor Author

This is especially problematic because CDDL applied to all representations will lead to CBOR-LD being illegal. CBOR-LD does not encode its values per the rules in the CDDL files.

This PR needs to make it abundantly clear that the CDDL only applies to specific MIME types (and it needs to call those MIME types out by name). It is probably safe for the CDDL to apply to "application/did+json", "application/did+ld+json", and "application/did+cbor". It is definitely not acceptable for the CDDL to apply to "application/did+ld+cbor".

@msporny to be clear, CBOR-LD is not a real thing yet. That said, I am extremely confident that we can cover that use-case with additional descriptive language utilizing CDDL and I am actively working on that as well as a JSON-LD constraints.

I am also confident that we can provide deterministic consumption rules into and out of other representations with CDDL as the description language to represent the ADM and in a way that supports CBOR-LD (at least my primitive understanding of your aims).

@cabo
Copy link

cabo commented Nov 9, 2020 via email

@OR13
Copy link
Contributor

OR13 commented Nov 19, 2020

@peacekeeper @msporny how can we get this through or close it? lots of changes here... we need to get this in or shut it down with clear feedback.

Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a massive change set, the Editors are going to have to fix a number of issues that remain not addressed. Let's pull this in and get to work cleaning it up.

@jonnycrunch the expectation here is that you will be maintaining all of this content (with the aid of the Editors). If these files drop out of maintenance, the expectation is that they will be removed.

Approved, with the caveats above.

Copy link
Contributor

@peacekeeper peacekeeper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with merging, I think this useful and it's definitely great work by @jonnycrunch !

But as a next step, we still need to account for the fact that the CDDL is representation-specific, i.e. it should state which representations it applies to (and potentially create multiple CDDLs for current and future representations).

@OR13
Copy link
Contributor

OR13 commented Nov 30, 2020

This PR is blocked by the merge conflict.

@jonnycrunch
Copy link
Contributor Author

updated and added language for JSON/CBOR only application of CDDL

@OR13
Copy link
Contributor

OR13 commented Dec 2, 2020

I am merging this with the understanding that we may need to make some editorial updates / structure changes, especially as some of this becomes better defined in did core.

@OR13 OR13 merged commit af6d1e9 into w3c:master Dec 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants