docs(rdf): included custom_rdf advanced setting #4641

Merged
merged 3 commits into from
Jan 27, 2025
25 changes: 23 additions & 2 deletions docs/molgenis/dev_rdf.md
@@ -4,7 +4,28 @@ The Resource Description Framework (RDF) is the W3C standard for web data.
MOLGENIS EMX2 can export data in this format by using the RDF API. This API is located at `<server>/api/rdf` where `<server>` is the location where your MOLGENIS is hosted.
For instance, if your MOLGENIS runs at `https://emx2.test.molgenis.org`, the RDF API is located at `https://emx2.test.molgenis.org/api/rdf`.

-## RDF data retrieval
## configuration

The RDF API can be fine-tuned per schema by adding an advanced setting called `custom_rdf` (some templates might already add this).
If this advanced setting is not present, the RDF API uses its default behaviour (e.g. the default list of namespaces).

**The value of the `custom_rdf` advanced setting must be valid Turtle-formatted RDF!**

When this advanced setting is set, the following applies:
* The default namespaces will be ignored (except for the schema-specific namespace which is always present).
Member

would you not want to override the default namespace?

Contributor Author

Probably depends on how big the default list is. For example, if it includes a lot of namespaces not relevant for that schema you might want to be able to exclude namespaces as well.

Contributor Author

Alternative is building in a check where we exclude unused namespaces (maybe this might be a very easy toggle in RDF4J?). Either way, I think the bigger question is if we want to rely on this at all or move to schema-specific namespaces where the default list only functions as a "fallback".

If it's only a fallback, this approach would suffice I think. If we want to "build upon a hardcoded default list" within each schema instead, then indeed we should change this behaviour.

* Namespaces defined in `custom_rdf` will be used in the RDF API's output.
* If `custom_rdf` is set without any namespaces, only the schema-specific namespace will be present in the RDF output.
* Any triples in `custom_rdf` will be added to the output of every RDF API call made on that schema.
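
As a rough illustration of the rules above, the following Python sketch models which namespaces end up in the output for a single schema. It is not MOLGENIS code: `DEFAULT_NAMESPACES` is a made-up subset, the schema prefix and IRI passed in are placeholders, and `declared_prefixes` is a simplified regex that only recognises `@prefix` lines (a real Turtle parser would be more robust). The `CUSTOM_RDF` string shows what a value for the setting could look like.

```python
import re

# Illustrative subset only; the real default list of the RDF API is not reproduced here.
DEFAULT_NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Example value for the `custom_rdf` setting: two namespaces and two extra triples.
CUSTOM_RDF = """\
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<https://example.org/my-catalog> a dcat:Catalog ;
    dcterms:title "Example catalog" .
"""

# Simplification: matches Turtle "@prefix name: <iri> ." declarations only.
PREFIX_LINE = re.compile(r"^\s*@prefix\s+([^\s:]*):\s*<([^>]*)>\s*\.", re.MULTILINE)


def declared_prefixes(turtle):
    """Return the prefix -> IRI mapping declared in a Turtle string."""
    return dict(PREFIX_LINE.findall(turtle))


def effective_namespaces(schema_prefix, schema_iri, custom_rdf=None):
    """Model the documented rules; custom_rdf=None means the setting is absent."""
    if custom_rdf is None:
        # No setting: fall back to the defaults.
        namespaces = dict(DEFAULT_NAMESPACES)
    else:
        # Setting present: the defaults are ignored, even if no prefix is declared.
        namespaces = declared_prefixes(custom_rdf)
    # The schema-specific namespace is always present.
    namespaces[schema_prefix] = schema_iri
    return namespaces
```

Calling `effective_namespaces` with `custom_rdf=""` returns only the schema-specific namespace, which mirrors the third rule in the list.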

Some notes on multi-schema API calls:
* The triples in the `custom_rdf` of all selected schemas will be combined.
* If any of the schemas does not have a `custom_rdf`, the default namespaces will be included as well.
* Conflicts in namespaces will not break the RDF output but might result in unexpected behaviour. Examples include:
  * If two different namespaces use the same prefix, only one of them will use that prefix, while the other is written out as full IRIs in the API output.
  * If two identical namespaces exist with different prefixes, only one of the prefixes will be used for all IRIs belonging to that namespace.
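
Because such conflicts do not raise an error, it may help to look for them before combining schemas. The sketch below is a hypothetical helper, not part of MOLGENIS: it takes a prefix-to-IRI mapping per schema (how those mappings are obtained is left out) and reports both kinds of conflict described above. It does not try to predict which prefix the API ends up using.

```python
from collections import defaultdict


def find_namespace_conflicts(namespaces_per_schema):
    """Report prefixes bound to several IRIs and IRIs bound to several prefixes.

    namespaces_per_schema maps a schema name to its prefix -> IRI dict.
    """
    iris_by_prefix = defaultdict(set)
    prefixes_by_iri = defaultdict(set)
    for namespaces in namespaces_per_schema.values():
        for prefix, iri in namespaces.items():
            iris_by_prefix[prefix].add(iri)
            prefixes_by_iri[iri].add(prefix)

    # Same prefix, different namespaces.
    shared_prefixes = {
        prefix: sorted(iris) for prefix, iris in iris_by_prefix.items() if len(iris) > 1
    }
    # Same namespace, different prefixes.
    duplicate_iris = {
        iri: sorted(prefixes) for iri, prefixes in prefixes_by_iri.items() if len(prefixes) > 1
    }
    return shared_prefixes, duplicate_iris


# Made-up example data for two schemas.
example = {
    "catalogue": {"dc": "http://purl.org/dc/terms/"},
    "pet store": {
        "dc": "http://purl.org/dc/elements/1.1/",
        "dcterms": "http://purl.org/dc/terms/",
    },
}
shared_prefixes, duplicate_iris = find_namespace_conflicts(example)
```

In this example `dc` is reported as a shared prefix, and `http://purl.org/dc/terms/` is reported as a namespace with two prefixes.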


## data retrieval
The RDF API can retrieve data in different scopes, ranging from broad (retrieve everything) to narrow (retrieve one row).
All data is exported as a stream, which means that the response does not include a size estimate.
Listed below are the available options.
@@ -31,7 +52,7 @@ For example: `<server>/pet%20store/api/rdf/Pet/column/name`
The rows from a table within a schema can be filtered based on a column value by adding these as a `key=value` pair to a URL that also contains schema and table name: `<server>/<schema>/api/rdf/<table>?<column-name>=<value>`.
For example: `<server>/pet%20store/api/rdf/Pet?category=cat`
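
The URL patterns shown in this section can be assembled programmatically. The following is a minimal Python sketch using only the standard library; `rdf_url` and its parameters are hypothetical names chosen for illustration, and it only covers the patterns that appear in this section (whole server, schema, table, column, and column filter).

```python
from urllib.parse import quote, urlencode


def rdf_url(server, schema=None, table=None, column=None, filters=None):
    """Build an RDF API URL; names are percent-encoded (e.g. 'pet store' -> 'pet%20store')."""
    parts = [server.rstrip("/")]
    if schema is not None:
        parts.append(quote(schema, safe=""))
    parts += ["api", "rdf"]
    if table is not None:
        parts.append(quote(table, safe=""))
        if column is not None:
            parts += ["column", quote(column, safe="")]
    url = "/".join(parts)
    if filters:
        # quote_via=quote encodes spaces as %20 instead of '+'.
        url += "?" + urlencode(filters, quote_via=quote)
    return url


server = "https://emx2.test.molgenis.org"
filtered = rdf_url(server, "pet store", "Pet", filters={"category": "cat"})
```

Here `filtered` is the same URL as the `category=cat` example above, with `<server>` filled in.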

-## RDF data formats
## data formats
Using content negotiation, RDF can be exported in one of many available formats. For example, the following curl command will download the pet store in JSON-LD:

`curl -H 'Accept: application/ld+json' <server>/pet%20store/api/rdf`
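
The same request can be prepared from Python. This is a small sketch using the standard library only; `build_rdf_request` is a hypothetical helper name, and the server URL is the example host used earlier in this document. The request is only constructed here, not sent.

```python
from urllib.request import Request


def build_rdf_request(url, media_type="application/ld+json"):
    """Create a request whose Accept header asks for the given RDF format."""
    return Request(url, headers={"Accept": media_type})


request = build_rdf_request("https://emx2.test.molgenis.org/pet%20store/api/rdf")

# To download the data (requires network access), pass the request to
# urllib.request.urlopen and read the response body.
```

Passing a different media type as the second argument asks the server for another format, provided the server offers it.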