Open
Labels: enhancement (Incrementally add new feature)
Description
Version
jena-5.5.0
Feature
Description:
Implement the W3C RDF Dataset Canonicalization (RDFC-1.0) algorithm in Apache Jena, with output in canonical N-Quads format. This enables deterministic serialization of RDF datasets by assigning canonical identifiers to blank nodes.
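To illustrate what "assigning canonical identifiers to blank nodes" could look like, here is a minimal sketch of an identifier issuer in the style of the RDFC-1.0 "Issue Identifier" step. The class name CanonicalIssuerSketch and its methods are hypothetical and not part of Jena; blank nodes are represented as plain strings to keep the example free of dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Jena API: issues stable identifiers such as
// c14n0, c14n1, ... for blank node labels, in the order they are requested.
final class CanonicalIssuerSketch {
    private final String prefix;
    private int counter = 0;
    // Insertion order doubles as a record of the order of issuance.
    private final Map<String, String> issued = new LinkedHashMap<>();

    CanonicalIssuerSketch(String prefix) {
        this.prefix = prefix;
    }

    // Returns the identifier already issued for 'existing',
    // or issues the next one if none has been assigned yet.
    String issue(String existing) {
        String id = issued.get(existing);
        if (id == null) {
            id = prefix + counter;
            counter++;
            issued.put(existing, id);
        }
        return id;
    }

    // True if an identifier has already been issued for 'existing'.
    boolean hasIssued(String existing) {
        return issued.containsKey(existing);
    }

    public static void main(String[] args) {
        CanonicalIssuerSketch issuer = new CanonicalIssuerSketch("c14n");
        System.out.println("_:" + issuer.issue("b7")); // _:c14n0
        System.out.println("_:" + issuer.issue("b2")); // _:c14n1
        System.out.println("_:" + issuer.issue("b7")); // _:c14n0 (same label again)
    }
}
```

The hard part of the algorithm is deciding the order in which blank nodes are passed to the issuer; this sketch only covers the bookkeeping once that order is known.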
References:
- RDFC-1.0 Algorithm: https://www.w3.org/TR/rdf-canon/
- Canonical N-Quads Format: https://www.w3.org/TR/rdf12-n-quads/#canonical-quads
- W3C Test Suite: https://github.com/w3c/rdf-tests/tree/main/rdf/rdf12/rdf-n-quads/c14n and https://w3c.github.io/rdf-canon/tests/
Tasks:
- Create NQuadsCanonicalWriter class extending WriterDatasetRIOTBase
- Add NQUADS_CANONICAL format constant to RDFFormat
- Register canonical writer factory in RDFWriterRegistry
- Implement RDFC10Canonicalizer with complete RDFC-1.0 algorithm
- Create HashUtils for SHA-256 hash computations and lexicographic sorting
- Implement CanonicalIssuer for canonical blank node identifier assignment (_:c14n0, _:c14n1, ...)
- Add DatasetProcessor for blank node extraction and dataset processing
- Download and integrate W3C canonicalization test suite to jena-arq/testing/rdf12-wg/rdf-n-quads-c14n/
- Update Scripts_RIOT_c14n.java test factory following existing RIOT patterns
- Implement RDFCanonicalizationTest for algorithm validation leveraging https://w3c.github.io/rdf-canon/tests/
- Add writeCanonical() and canonicalizeDataset() methods to RDFDataMgr
- Add --canonical flag support to riot command line tool
- Update documentation and create usage examples
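As a rough sketch of what the proposed HashUtils task might cover, the following shows SHA-256 hex hashing and sorting by Unicode code point order using only the JDK. The class name HashUtilsSketch and its method names are assumptions for illustration, not existing Jena code. The explicit comparator is there because String.compareTo orders by UTF-16 code units, which can differ from code point order for characters outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper sketch, not Jena API.
final class HashUtilsSketch {
    private HashUtilsSketch() {}

    // Compares strings by Unicode code point rather than UTF-16 code unit.
    static final Comparator<String> CODE_POINT_ORDER = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // The string with characters left over sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    // Lowercase hex SHA-256 digest of the UTF-8 encoding of 'input'.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(Character.forDigit((b >> 4) & 0xF, 16));
                sb.append(Character.forDigit(b & 0xF, 16));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Sorts serialized lines in code point order, concatenates them, and
    // hashes the result, so the hash does not depend on the input order.
    static String hashSortedLines(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(CODE_POINT_ORDER);
        return sha256Hex(String.join("", sorted));
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
        System.out.println(hashSortedLines(List.of("b\n", "a\n")));
    }
}
```

Sorting before hashing is what makes the result independent of the order in which quads happen to be stored, which is the property a deterministic serialization depends on.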
Are you interested in contributing a solution yourself?
Yes