CMO has multiple syntactic issues #11

Open
cmungall opened this issue Jan 20, 2023 · 0 comments

cmungall commented Jan 20, 2023

There are a number of major syntactic issues with the CMO obo and owl releases. These cause many parsers to break, as was reported in #5.

Even when parsers don't break, the parse gives unintended results.

For example, virtually all ontology browsers and tools fail to interpret the definitions correctly, because they are encoded in OBO as a property_value:

id: CMO:0003022
name: hemoglobin distribution width
is_a: CMO:0000508 ! hemoglobin measurement
property_value: created:by "sjwang" xsd:string
property_value: creation:date 2019-01-29T15:15:44Z xsd:string
property_value: hasExactSynonym "HDW" xsd:string
property_value: http://purl.obolibrary.org/obo/def "The distribution width of erythrocytes by their cellular (individual) hemoglobin concentrations. It is a measurement of the heterogeneity of the red cell hemoglobin concentration." xsd:string {http://www.geneontology.org/formats/oboInOWL#xref="PMID:PMID\\:3411196"}

This should be:

id: CMO:0003022
name: hemoglobin distribution width
is_a: CMO:0000508 ! hemoglobin measurement
created_by: "sjwang"
creation_date: 2019-01-29T15:15:44Z
synonym: "HDW" EXACT []
def: "The distribution width of erythrocytes by their cellular (individual) hemoglobin concentrations. It is a measurement of the heterogeneity of the red cell hemoglobin concentration." [PMID:3411196]
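The rewrite from the first stanza to the second can be sketched in a few lines of Python. This is only an illustration, not the fix itself: the function names `extract_xrefs` and `convert_line` are hypothetical, the script handles only the four predicates that appear in the example above, and it assumes every affected line ends in `xsd:string` with an optional `{...}` qualifier block. A real fix would be better done with a proper OBO parser.

```python
import re

# Matches: property_value: <predicate> <value> xsd:string [{qualifiers}]
PROPERTY_VALUE = re.compile(
    r'^property_value:\s+(\S+)\s+(.*?)\s+xsd:string(?:\s+\{(.*)\})?\s*$'
)


def extract_xrefs(qualifiers):
    """Pull xref values out of a qualifier block and undo the doubled prefix."""
    xrefs = []
    for raw in re.findall(r'xref="([^"]*)"', qualifiers):
        cleaned = raw.replace("\\", "")       # drop the escaping backslashes
        parts = cleaned.split(":")
        if len(parts) >= 3 and parts[0] == parts[1]:
            parts = parts[1:]                 # PMID:PMID:123 -> PMID:123
        xrefs.append(":".join(parts))
    return xrefs


def convert_line(line):
    """Rewrite one misencoded property_value line; return other lines unchanged."""
    match = PROPERTY_VALUE.match(line)
    if not match:
        return line
    predicate, value, qualifiers = match.groups()
    if predicate == "created:by":
        return f"created_by: {value}"
    if predicate == "creation:date":
        return f"creation_date: {value}"
    if predicate == "hasExactSynonym":
        return f"synonym: {value} EXACT []"
    if predicate == "http://purl.obolibrary.org/obo/def":
        xrefs = extract_xrefs(qualifiers or "")
        return f"def: {value} [{', '.join(xrefs)}]"
    return line  # unknown predicate: leave it for manual review
```

Applying `convert_line` to each line of a stanza leaves `id:`, `name:` and `is_a:` lines untouched and rewrites only the four known predicates; anything else is passed through so it can be checked by hand.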

The OWL generated from the current obo is syntactically incorrect, and even when it parses, information is lost. For example, on OLS:

https://www.ebi.ac.uk/ols/ontologies/cmo/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCMO_0003022

The definition doesn't show up where it should:

[image: screenshot of the OLS term page]

It looks like at some point in the past you passed the obo through a round trip with a very old obo-to-owl converter.

If you like, I can provide a PR that fixes CMO. I assume this is the source file: https://github.com/rat-genome-database/CMO-Clinical-Measurement-Ontology/blob/master/clinical_measurement.obo

Alternatively, I can support a developer in fixing this.
