Add RDF Dataset Canonicalization support #71
Comments
Just FYI, I've built something a bit like this, documented at https://docs.enola.dev/use/canonicalize/, using an RFC 8785 JSON Canonicalization Scheme (JCS)-inspired (but currently not fully compliant) algorithm.
@vorburger Out of curiosity, what's the motivation for canonicalization on top of the expanded JSON-LD form? I'm asking because I'm experimenting with something similar, to explore whether there could be a simpler and safer way to canonicalize JSON-LD without going down to the RDF level or staying as high-level as JCS.
Let's take this example:

{
  "@context": "http://schema.org/",
  "ref1": {
    "@id": "http://example.com/doe",
    "@type": "Person",
    "name": "Jane Doe"
  },
  "ref2": {
    "@id": "http://example.com/doe",
    "jobTitle": "Professor"
  }
}

If you use plain JCS on top of the expanded form, it won't match this:

{
  "@context": "http://schema.org/",
  "ref1": {
    "@id": "http://example.com/doe",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Professor"
  },
  "ref2": {
    "@id": "http://example.com/doe"
  }
}

But from an RDF perspective, they are the same. RDFC will produce the same result for both examples:

<http://example.com/doe> <http://schema.org/jobTitle> "Professor" .
<http://example.com/doe> <http://schema.org/name> "Jane Doe" .
<http://example.com/doe> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
_:c14n0 <http://schema.org/ref1> <http://example.com/doe> .
_:c14n0 <http://schema.org/ref2> <http://example.com/doe> .

This makes me think there might be another JSON-LD form - something between expanded and flattened. One that extracts and merges identifiable nodes while keeping blank nodes embedded as they are, preserving the tree-like structure in both cases without needing to disintegrate the entire tree to the statement level. I wonder if anyone would be interested in something like that.
Hi! To be totally honest, my initial motivation for "hacking" (my)
I only vaguely understand what you mean... but it sounds interesting. If I were you and wanted to pursue this, I would probably start a discussion about it; https://github.com/w3c/rdf-canon/issues might be a good place, according to https://w3c.github.io/rch-wg-charter/#communication?
I don't think I currently would have a need for it. PS: TBD just FYI enola-dev/enola#1103.
Thank you for your answer. The motivation for canonicalization over JSON-LD, while maintaining the same level of granularity, could help mitigate key issues such as:
These issues can lead to potentially expensive computations and risks such as graph poisoning. The intermediate form I envision is somewhere between JCS and RDFC - faster and safer than RDFC while remaining at the semantic level like RDFC, rather than purely syntactic like JCS. It is specifically designed for JSON-LD and tree-like structures, which is another limitation, causing zero interest from the RDF community. From the feedback I've received, there seems to be interest in this approach (a canonical form is crucial for signing, verifiable credentials, etc.), but only if someone is willing to put in the effort to make it a standard. 😉
Released! Check out v0.10.0.
e.g. a new command `canonicalize` taking RDF as an input and producing canonicalized RDF as an output