Export to rdf/xml #53
Merged
15 commits
fe2125a Added id as optional string to definitions (JosePizarro3)
17c32f6 Added rdflib to dependencies (JosePizarro3)
7e9e95f Adding referenceTo and fixing properties of data type OBJECT (JosePizarro3)
9780764 Ignoring typing in to_rdf function (JosePizarro3)
48b9c3a Fix pydantic versioning problem for _base_attrs (JosePizarro3)
7fc6eb8 Added dataType and propertyLabel annotations (JosePizarro3)
0a6f19a Fix descriptions of annotated properties (JosePizarro3)
fbe9ba3 Adding back the placeholders for object, collections and datasets (JosePizarro3)
a82d839 Renamed model_to_rdf for BaseEntity (JosePizarro3)
02ab05e Fixed code_to_class_name when the code does not exist (JosePizarro3)
66cd1b4 Added docstrings to entities_to_rdf.py (JosePizarro3)
c6f078d Fix duplicated property problems (JosePizarro3)
aa69f61 Moved duplicated_property_types to utils and added tests (JosePizarro3)
11383d6 Added testing for entities_to_rdf (JosePizarro3)
8c342d3 Fix imports (JosePizarro3)
@@ -0,0 +1,236 @@
import inspect
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from rdflib import Graph
    from structlog._config import BoundLoggerLazyProxy

import click
from rdflib import BNode, Literal, Namespace
from rdflib.namespace import DC, OWL, RDF, RDFS

from bam_masterdata.utils import code_to_class_name, import_module

BAM = Namespace("https://bamresearch.github.io/bam-masterdata/")
PROV = Namespace("http://www.w3.org/ns/prov#")


def rdf_graph_init(g: "Graph") -> None:
    """
    Initialize the RDF graph with base namespaces, annotation properties, and internal BAM properties. This
    function also creates placeholders for PropertyType and other entity types. The graph is to be printed out
    in RDF/XML format in the `entities_to_rdf` function.

    Args:
        g (Graph): The RDF graph to be initialized.
    """
    # Adding base namespaces
    g.bind("dc", DC)
    g.bind("owl", OWL)
    g.bind("rdf", RDF)
    g.bind("rdfs", RDFS)
    g.bind("bam", BAM)
    g.bind("prov", PROV)

    # Adding annotation properties from base namespaces
    annotation_props = [
        RDFS.label,
        RDFS.comment,
        DC.identifier,
    ]
    for prop in annotation_props:
        g.add((prop, RDF.type, OWL.AnnotationProperty))

    # Custom annotation properties from openBIS: `dataType`, `propertyLabel`
    custom_annotation_props = {
        BAM[
            "dataType"
        ]: """Represents the data type of a property as defined in the openBIS platform.
        This annotation is used to ensure alignment with the native data types in openBIS,
        facilitating seamless integration and data exchange.

        The allowed values for this annotation correspond directly to the openBIS type system,
        including BOOLEAN, CONTROLLEDVOCABULARY, DATE, HYPERLINK, INTEGER, MULTILINE_VARCHAR, OBJECT,
        REAL, TIMESTAMP, VARCHAR, and XML.

        While `bam:dataType` is primarily intended for internal usage with openBIS, mappings to
        standard vocabularies such as `xsd` (e.g., `xsd:boolean`, `xsd:string`) are possible to use and documented to
        enhance external interoperability. The full mapping is:
        - BOOLEAN: xsd:boolean
        - CONTROLLEDVOCABULARY: xsd:string
        - DATE: xsd:date
        - HYPERLINK: xsd:anyURI
        - INTEGER: xsd:integer
        - MULTILINE_VARCHAR: xsd:string
        - OBJECT: bam:ObjectType
        - REAL: xsd:decimal
        - TIMESTAMP: xsd:dateTime
        - VARCHAR: xsd:string
        - XML: xsd:string""",
        BAM[
            "propertyLabel"
        ]: """A UI-specific annotation used in openBIS to provide an alternative label for a property
        displayed in the frontend. Not intended for semantic reasoning or interoperability beyond openBIS.""",
    }
    for custom_prop, custom_prop_def in custom_annotation_props.items():
        g.add((custom_prop, RDF.type, OWL.AnnotationProperty))
        g.add(
            (
                custom_prop,
                RDFS.label,
                Literal(f"bam:{custom_prop.split('/')[-1]}", lang="en"),
            )
        )
        g.add((custom_prop, RDFS.comment, Literal(custom_prop_def, lang="en")))

    # Internal BAM properties
    # ? `section`, `ordinal`, `show_in_edit_views`?
    bam_props_uri = {
        BAM["hasMandatoryProperty"]: [
            (RDF.type, OWL.ObjectProperty),
            # (RDFS.domain, OWL.Class),
            (RDFS.range, BAM.PropertyType),
            (RDFS.label, Literal("hasMandatoryProperty", lang="en")),
            (
                RDFS.comment,
                Literal(
                    "The property must be mandatorily filled when creating the object in openBIS.",
                    lang="en",
                ),
            ),
        ],
        BAM["hasOptionalProperty"]: [
            (RDF.type, OWL.ObjectProperty),
            # (RDFS.domain, OWL.Class),
            (RDFS.range, BAM.PropertyType),
            (RDFS.label, Literal("hasOptionalProperty", lang="en")),
            (
                RDFS.comment,
                Literal(
                    "The property is optionally filled when creating the object in openBIS.",
                    lang="en",
                ),
            ),
        ],
        BAM["referenceTo"]: [
            (RDF.type, OWL.ObjectProperty),
            (RDFS.domain, BAM.PropertyType),  # Restricting domain to PropertyType
            # (RDFS.range, OWL.Class),  # Explicitly setting range to ObjectType
            (RDFS.label, Literal("referenceTo", lang="en")),
            (
                RDFS.comment,
                Literal(
                    "The property is referencing an object existing in openBIS.",
                    lang="en",
                ),
            ),
        ],
    }
    for prop_uri, obj_properties in bam_props_uri.items():
        for prop in obj_properties:  # type: ignore
            g.add((prop_uri, prop[0], prop[1]))  # type: ignore

    # Adding base PropertyType and other objects as placeholders
    # ! add only PropertyType
    prop_type_description = """A conceptual placeholder used to define and organize properties as first-class entities.
    PropertyType is used to place properties and define their metadata, separating properties from the
    entities they describe.

    In integration scenarios:
    - PropertyType can align with `BFO:Quality` for inherent attributes.
    - PropertyType can represent `BFO:Role` if properties serve functional purposes.
    - PropertyType can be treated as a `prov:Entity` when properties participate in provenance relationships."""
    for entity in ["PropertyType", "ObjectType", "CollectionType", "DatasetType"]:
        entity_uri = BAM[entity]
        g.add((entity_uri, RDF.type, OWL.Thing))
        g.add((entity_uri, RDFS.label, Literal(entity, lang="en")))
        if entity == "PropertyType":
            g.add((entity_uri, RDFS.comment, Literal(prop_type_description, lang="en")))


def entities_to_rdf(
    graph: "Graph", module_path: str, logger: "BoundLoggerLazyProxy"
) -> None:
    """
    Convert the entities defined in the specified module to RDF triples and add them to the graph. The function
    uses the `model_to_rdf` method defined in each class to convert the class attributes to RDF triples. The
    function also adds the PropertyType and other entity types as placeholders in the graph.

    Args:
        graph (Graph): The RDF graph to which the entities are added.
        module_path (str): The path to the module containing the entities to be converted.
        logger (BoundLoggerLazyProxy): The logger to log messages.
    """
    rdf_graph_init(graph)

    module = import_module(module_path=module_path)

    # Special case of `PropertyTypeDef` in `property_types.py`
    # PROPERTY TYPES
    # rdfs:label used for `id`
    # rdfs:comment used for `description` (en, de)
    # bam:propertyLabel used for `property_label`
    # dc:identifier used for `code`  # ! only defined for internal codes with $ symbol
    # bam:dataType used for `data_type`
    if "property_types.py" in module_path:
        for name, obj in inspect.getmembers(module):
            if name.startswith("_") or name == "PropertyTypeDef":
                continue
            prop_uri = BAM[obj.id]

            # Define the property as an OWL class inheriting from PropertyType
            graph.add((prop_uri, RDF.type, OWL.Thing))
            graph.add((prop_uri, RDFS.subClassOf, BAM.PropertyType))

            # Add attributes like id, code, description in English and German, property_label, data_type
            graph.add((prop_uri, RDFS.label, Literal(obj.id, lang="en")))
            graph.add((prop_uri, DC.identifier, Literal(obj.code)))
            descriptions = obj.description.split("//")
            if len(descriptions) > 1:
                graph.add((prop_uri, RDFS.comment, Literal(descriptions[0], lang="en")))
                graph.add((prop_uri, RDFS.comment, Literal(descriptions[1], lang="de")))
            else:
                graph.add((prop_uri, RDFS.comment, Literal(obj.description, lang="en")))
            graph.add(
                (prop_uri, BAM.propertyLabel, Literal(obj.property_label, lang="en"))
            )
            graph.add((prop_uri, BAM.dataType, Literal(obj.data_type.value)))
            if obj.data_type.value == "OBJECT":
                # entity_ref_uri = BAM[code_to_class_name(obj.object_code)]
                # graph.add((prop_uri, BAM.referenceTo, entity_ref_uri))
                # Resolve the class name once and reuse the result
                class_name = code_to_class_name(obj.object_code, logger)
                if not class_name:
                    logger.error(
                        f"Failed to identify the `object_code` for the property {obj.id}"
                    )
                    continue
                entity_ref_uri = BAM[class_name]

                # Create a restriction with referenceTo
                restriction = BNode()
                graph.add((restriction, RDF.type, OWL.Restriction))
                graph.add((restriction, OWL.onProperty, BAM["referenceTo"]))
                graph.add((restriction, OWL.someValuesFrom, entity_ref_uri))

                # Make the property a subclass of the restriction
                graph.add((prop_uri, RDFS.subClassOf, restriction))
        return None

    # All other datamodel modules
    # OBJECT/DATASET/COLLECTION TYPES
    # skos:prefLabel used for class names
    # skos:definition used for `description` (en, de)
    # dc:identifier used for `code`  # ! only defined for internal codes with $ symbol
    # parents defined from `code`
    # assigned properties can be Mandatory or Optional, can be PropertyType or ObjectType
    # ? For OBJECT TYPES
    # ? `generated_code_prefix`, `auto_generated_codes`?
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Ensure the class has `defs` and a callable `model_to_rdf` method
        if not hasattr(obj, "defs") or not callable(
            getattr(obj, "model_to_rdf", None)
        ):
            continue
        try:
            # Instantiate the class and call the method
            entity = obj()
            entity.model_to_rdf(namespace=BAM, graph=graph)
        except Exception as err:
            click.echo(f"Failed to process class {name} in {module_path}: {err}")
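The `bam:dataType` annotation text in the diff documents a mapping from openBIS data types to `xsd` types. As a minimal sketch, that mapping could also be kept as a lookup table. The names `OPENBIS_TO_XSD` and `xsd_for` are hypothetical and are not part of `bam_masterdata`; only the eleven pairs themselves come from the annotation text.

```python
# Hypothetical helper, not part of bam_masterdata. The pairs are copied from
# the `bam:dataType` annotation text in the diff.
OPENBIS_TO_XSD = {
    "BOOLEAN": "xsd:boolean",
    "CONTROLLEDVOCABULARY": "xsd:string",
    "DATE": "xsd:date",
    "HYPERLINK": "xsd:anyURI",
    "INTEGER": "xsd:integer",
    "MULTILINE_VARCHAR": "xsd:string",
    "OBJECT": "bam:ObjectType",
    "REAL": "xsd:decimal",
    "TIMESTAMP": "xsd:dateTime",
    "VARCHAR": "xsd:string",
    "XML": "xsd:string",
}


def xsd_for(data_type: str) -> str:
    """Return the documented target type for an openBIS data type name."""
    try:
        return OPENBIS_TO_XSD[data_type.upper()]
    except KeyError:
        raise ValueError(f"Unknown openBIS data type: {data_type!r}") from None


print(xsd_for("REAL"))       # xsd:decimal
print(xsd_for("timestamp"))  # xsd:dateTime
```

Raising on an unknown name, rather than falling back to `xsd:string`, is a design choice of this sketch: it surfaces new openBIS types instead of silently mislabelling them.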
How does the "Graph" class work? Does it insert everything into the file already nested?
`Graph` is the class that holds the triples of an ontology, so when it is printed to RDF/XML it already has the format we are looking for.
"Triples" are normally 2 nodes connected via a relationship, something like `(node_1, relationship, node_2)`; you can see some examples wherever `Graph.add()` is used. Basically, it is a way of defining directed graphs.
Here the complication is not printing to RDF/XML or creating the `Graph`, but actually mapping the openBIS info into triples.