Add RDF Dataset Canonicalization support #71

filip26 · 2024-01-30T12:42:12Z

e.g. a new command, canonicalize, that takes RDF as input and produces canonicalized RDF as output


vorburger · 2024-07-22T19:38:30Z

Just FYI, I've built something a bit like this into Enola (see https://docs.enola.dev/use/canonicalize/), using an algorithm inspired by the RFC 8785 JSON Canonicalization Scheme (JCS), though it is currently not fully compliant with it.

filip26 · 2025-02-20T12:10:23Z

@vorburger Please, out of curiosity, what’s the motivation for the canonicalization on top of the expanded JSON-LD form?

I’m asking because I’m experimenting with something similar to explore whether there could be a simpler and safer way to canonicalize JSON-LD without needing to go down to the RDF level or staying too high with JCS.

filip26 · 2025-02-20T12:43:02Z

Let's take this example:

{
  "@context": "http://schema.org/",
  "ref1": {
    "@id": "http://example.com/doe",
    "@type": "Person",
    "name": "Jane Doe"
  },
  "ref2": {
    "@id": "http://example.com/doe",
    "jobTitle": "Professor"
  }
}

If you use plain JCS on top of the expanded form, it won't match this:

{
  "@context": "http://schema.org/",
  "ref1": {
    "@id": "http://example.com/doe",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Professor"
  },
  "ref2": {
    "@id": "http://example.com/doe"
  }
}

But from an RDF perspective, they are the same. RDFC will produce the same result for both examples.

<http://example.com/doe> <http://schema.org/jobTitle> "Professor" .
<http://example.com/doe> <http://schema.org/name> "Jane Doe" .
<http://example.com/doe> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
_:c14n0 <http://schema.org/ref1> <http://example.com/doe> .
_:c14n0 <http://schema.org/ref2> <http://example.com/doe> .
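
To make the mismatch concrete, here is a minimal Python sketch. It rests on several assumptions: the two documents are expanded by hand (no JSON-LD processor is involved), json.dumps with sorted keys is only a rough stand-in for JCS (it is not RFC 8785 compliant), and triples() is a hypothetical toy helper with naive blank node labelling, not RDFC-1.0.

```python
import json
from itertools import count

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
S = "http://schema.org/"
DOE = "http://example.com/doe"

# Hand-expanded versions of the two documents above (assumption: expanded manually).
doc_a = [{
    S + "ref1": [{"@id": DOE, "@type": [S + "Person"],
                  S + "name": [{"@value": "Jane Doe"}]}],
    S + "ref2": [{"@id": DOE, S + "jobTitle": [{"@value": "Professor"}]}],
}]
doc_b = [{
    S + "ref1": [{"@id": DOE, "@type": [S + "Person"],
                  S + "name": [{"@value": "Jane Doe"}],
                  S + "jobTitle": [{"@value": "Professor"}]}],
    S + "ref2": [{"@id": DOE}],
}]


def jcs_like(value):
    """Rough stand-in for JCS: sorted keys, no insignificant whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def triples(nodes):
    """Toy conversion of expanded JSON-LD to a set of (s, p, o) tuples.

    Handles only @id, @type, @value and plain properties. Blank nodes get
    labels in visiting order, which is NOT a canonical labelling.
    """
    out = set()
    fresh = count()

    def visit(node):
        subject = node.get("@id") or f"_:b{next(fresh)}"
        for type_iri in node.get("@type", []):
            out.add((subject, RDF_TYPE, type_iri))
        for key, values in node.items():
            if key.startswith("@"):
                continue
            for value in values:
                if "@value" in value:
                    # Literals are stored quoted so they cannot collide with IRIs.
                    out.add((subject, key, json.dumps(value["@value"])))
                else:
                    out.add((subject, key, visit(value)))
        return subject

    for node in nodes:
        visit(node)
    return out


same_text = jcs_like(doc_a) == jcs_like(doc_b)
same_triples = triples(doc_a) == triples(doc_b)
```

With these inputs, same_text is False while same_triples is True: the serialized trees differ, but the statements they encode do not.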

This makes me think there might be another JSON-LD form - something between expanded and flattened. One that extracts and merges identifiable nodes while keeping blank nodes embedded as they are, preserving the tree-like structure in both cases without needing to disintegrate the entire tree to the statement level.

I wonder if anyone would be interested in something like that.
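
One possible reading of the intermediate form described above, sketched in Python. Everything here is an assumption made for illustration: the merge_identified() name, the {"roots", "nodes"} output layout, and the choice to replace embedded named nodes with {"@id": ...} references are hypothetical, and the sketch handles only @id, @type, @value and plain properties (no @list, @graph, @language, or labelled blank nodes).

```python
import json

S = "http://schema.org/"
DOE = "http://example.com/doe"

# Hand-expanded versions of the two example documents (assumption).
doc_a = [{
    S + "ref1": [{"@id": DOE, "@type": [S + "Person"],
                  S + "name": [{"@value": "Jane Doe"}]}],
    S + "ref2": [{"@id": DOE, S + "jobTitle": [{"@value": "Professor"}]}],
}]
doc_b = [{
    S + "ref1": [{"@id": DOE, "@type": [S + "Person"],
                  S + "name": [{"@value": "Jane Doe"}],
                  S + "jobTitle": [{"@value": "Professor"}]}],
    S + "ref2": [{"@id": DOE}],
}]


def canon(value):
    """Deterministic serialization used for ordering and comparison."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def merge_identified(expanded):
    """Merge nodes that share an @id; keep nodes without @id embedded in place."""
    named = {}  # @id -> merged node object

    def add_values(target, key, values):
        merged = target.setdefault(key, [])
        for value in values:
            if value not in merged:  # drop duplicates introduced by merging
                merged.append(value)
        merged.sort(key=canon)       # value order must not depend on input order

    def visit(node):
        if "@value" in node:
            return node              # literals are kept as they are
        node_id = node.get("@id")
        # Named nodes accumulate in the table; blank nodes are rebuilt in place.
        target = named.setdefault(node_id, {"@id": node_id}) if node_id else {}
        for key, values in node.items():
            if key == "@id":
                continue
            if key == "@type":
                add_values(target, key, values)
            else:
                add_values(target, key, [visit(v) for v in values])
        return {"@id": node_id} if node_id else target

    roots = [visit(node) for node in expanded]
    roots.sort(key=canon)
    return {"roots": roots, "nodes": [named[k] for k in sorted(named)]}


form_a = canon(merge_identified(doc_a))
form_b = canon(merge_identified(doc_b))
```

Under these assumptions form_a and form_b are identical strings, even though the tree is never broken down to individual statements: the blank root node stays a tree, and only http://example.com/doe is extracted and merged.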

vorburger · 2025-02-22T11:33:36Z

> @vorburger Please, out of curiosity, what’s the motivation for the canonicalization on top of the expanded JSON-LD form?

Hi! To be totally honest, my initial motivation for "hacking" (my) RdfCanonicalizer [which, now that I've looked at it again for this, seems to have had an obvious bug; fixed with https://github.com/enola-dev/enola/pull/1104] was simply that I wanted (needed) to use (something like) it in ModelSubject (which is a sort of "Matcher" for unit testing). The enola canonicalize CLI was just a "side effect" to "externally expose" this helper, for fun. In the future it might also be used for (H)MAC hashing for "security"-related ideas.

> This makes me think there might be another JSON-LD form - something between expanded and flattened. One that extracts and merges identifiable nodes while keeping blank nodes embedded as they are, preserving the tree-like structure in both cases without needing to disintegrate the entire tree to the statement level.

I only vaguely understand what you mean... but it sounds interesting. But if I were you and wanted to pursue this, I would probably start a discussion about it... on https://github.com/w3c/rdf-canon/issues, might be a good place, according to https://w3c.github.io/rch-wg-charter/#communication?

> I wonder if anyone would be interested in something like that.

I don't think I currently would have a need for it.

PS: TBD just FYI enola-dev/enola#1103.

filip26 · 2025-02-22T13:11:01Z

> I would probably start a discussion about it... on https://github.com/w3c/rdf-canon/issues, might be a good place, according to https://w3c.github.io/rch-wg-charter/#communication?

Thank you for your answer. The motivation for canonicalizing at the JSON-LD level, while maintaining the same level of granularity, is that it could help mitigate key issues such as:

Blank node assignment – Challenging when using RDFC, but intrinsic to a tree structure.
Graph isomorphism – A problem with no known polynomial-time algorithm in the general case.

These issues can lead to potentially expensive computations and risks such as graph poisoning.
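
As a rough illustration of why blank node labelling gets expensive on a graph, here is a deliberately naive Python sketch. It is an assumption-laden toy: the isomorphic() helper is hypothetical, terms are plain strings, and the brute-force search over all relabellings is not how RDFC-1.0 works (RDFC-1.0 uses hash-based labelling). It only shows that two sets of statements differing in blank node labels cannot be compared directly, and that the naive fix grows factorially with the number of blank nodes.

```python
from itertools import permutations


def blank_labels(triples):
    """Collect the blank node labels (terms starting with '_:') in a triple set."""
    return sorted({term for triple in triples for term in triple
                   if term.startswith("_:")})


def isomorphic(a, b):
    """Brute force: try every mapping of a's blank node labels onto b's."""
    labels_a, labels_b = blank_labels(a), blank_labels(b)
    if len(a) != len(b) or len(labels_a) != len(labels_b):
        return False
    for candidate in permutations(labels_b):  # n! candidates for n blank nodes
        mapping = dict(zip(labels_a, candidate))
        relabelled = {tuple(mapping.get(term, term) for term in triple)
                      for triple in a}
        if relabelled == b:
            return True
    return False


# Same structure, different (arbitrary) blank node labels.
g1 = {("_:x", "http://example.com/p", "_:y"),
      ("_:y", "http://example.com/p", "http://example.com/o")}
g2 = {("_:1", "http://example.com/p", "_:2"),
      ("_:2", "http://example.com/p", "http://example.com/o")}
# Different structure: the first statement points back at its own subject.
g3 = {("_:1", "http://example.com/p", "_:1"),
      ("_:2", "http://example.com/p", "http://example.com/o")}

plain_equal = g1 == g2
matched = isomorphic(g1, g2)
mismatched = isomorphic(g1, g3)
```

Here plain_equal is False, matched is True, and mismatched is False. In a tree, by contrast, an embedded blank node is already identified by its position under its parent, so no such search over labels is needed.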

The intermediate form I envision is somewhere between JCS and RDFC - faster and safer than RDFC while remaining at the semantic level like RDFC, rather than purely syntactic like JCS. It is specifically designed for JSON-LD and tree-like structures, which is another limitation and the reason for zero interest from the RDF community.

From the feedback I’ve received, there seems to be interest in this approach (a canonical form is crucial for signing, verifiable credentials, etc.), but only if someone is willing to put in the effort to make it a standard. 😉

filip26 · 2025-03-10T21:22:34Z

Released! Check out v0.10.0.

filip26 added the enhancement (New feature or request) label Mar 9, 2024

vorburger mentioned this issue Feb 22, 2025

Replace Enola "hack" with "proper" (full) RDF Dataset Canonicalization enola-dev/enola#1103

Open

vorburger mentioned this issue Feb 22, 2025

Include titanium-rdfc in report w3c/rdf-canon#221

Open

filip26 added a commit that referenced this issue Mar 10, 2025

Add rdfc command #71

5624487

filip26 closed this as completed Mar 10, 2025
