Ensure that our provenance data model is easy to populate consistently #79

cmungall · 2021-11-03T23:46:36Z

Following on from #60

As a general comment on the overall OMO work, I am excited to see these properties standardized and a rich data model that is enforced. But I think it's very important to think about something that is realistic for editors to populate with consistent depth and breadth.

It's detrimental to have something partially populated, either within or across ontologies, as users will get inconsistent results or assume a closed world interpretation. We already have very spotty use of many ambitious IAO properties such as curation-status.

Automation could help here. For example, if someone comments on a GitHub ticket that is about a term, that comment could be propagated to a contributor annotation, but there would have to be guards against false positives.
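As a rough illustration of what such guards might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `propose_contributors`, the rule that a ticket must mention exactly one term ID in its title, the allowlist of known editors, and the minimum comment length are hypothetical heuristics, not an agreed OMO or OBO policy.

```python
import re

# Assumed pattern for OBO-style CURIEs such as "GO:0008150" (prefix plus 7 digits).
TERM_ID = re.compile(r"\b[A-Za-z]+:\d{7}\b")


def propose_contributors(issue_title, comments, known_editors, min_length=40):
    """Propose contributor annotations from the comments on one ticket.

    issue_title   -- title of the ticket
    comments      -- list of (login, body) tuples
    known_editors -- dict mapping a login to a stable identifier (e.g. an ORCID)
    min_length    -- comments shorter than this are ignored (hypothetical guard)

    Returns {term_id: [identifier, ...]}, or {} when any guard fails.
    """
    # Guard 1: the ticket must be unambiguously about a single term.
    terms = set(TERM_ID.findall(issue_title))
    if len(terms) != 1:
        return {}
    term = terms.pop()

    # Guard 2: only count commenters we can map to a known identifier.
    # Guard 3: skip trivial comments such as "+1" or "thanks".
    contributors = {
        known_editors[login]
        for login, body in comments
        if login in known_editors and len(body.strip()) >= min_length
    }
    return {term: sorted(contributors)} if contributors else {}
```

A proposal produced this way would probably still need human review before being written into the ontology; the guards only reduce the number of obviously wrong suggestions.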

This issue can be closed as it's not very actionable, but I just wanted to state the position that it may be better to go for a data model that is dumber and less granular with fewer nuanced distinctions that has a better hope of consistent population.

bpeters42 · 2021-11-03T23:56:52Z

I was surprised to see you raising this issue just minutes after saying in #60 that it is important to capture "provenance for a definition some mix of primary, secondary, tertiary sources, individuals, and groups of people", and wanting to do that at the axiom level.

I would much prefer what you recommend in the last paragraph here: "a data model that is dumber and less granular with fewer nuanced distinctions".

bpeters42 · 2021-11-03T23:59:42Z

So at a minimum, if some projects want to make more nuanced distinctions, it needs to be possible to map those down to a compatible, simple format. E.g. making distinctions between types of contributors for a project, but having a uniform way of reporting all contributors. Hope this is still making sense.

matentzn · 2021-11-05T11:55:56Z

One way to get the best of both worlds:

We decide on a dumbed-down, standardised OBO metadata model which we seek to apply to all ontologies.
We leave everyone to curate at the granularity they wish, but provide a transformation process that can turn, for example, something like IAO:117 into dc:contributor at release time.
We deliver this through a new method in ROBOT.
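To make the release-time transformation concrete, here is a minimal Python sketch that operates on plain (subject, property, value) tuples. It is not a ROBOT method and does not reflect any existing ROBOT command; the function name `simplify_annotations`, the `keep_original` flag, and the assumption that "IAO:117" abbreviates IAO:0000117 (term editor) are all illustrative.

```python
# Mapping from fine-grained curation properties to the simple release property.
# Only this one pair comes from the discussion; projects would add their own.
SIMPLIFY = {
    "IAO:0000117": "dc:contributor",  # assumed full form of "IAO:117" (term editor)
}


def simplify_annotations(triples, mapping=SIMPLIFY, keep_original=True):
    """Return triples with a simplified annotation for each mapped property.

    triples       -- iterable of (subject, property, value) tuples
    mapping       -- dict from granular property to simple property
    keep_original -- if True, the granular triple is kept alongside the simple one
    """
    out = []
    seen = set()
    for s, p, o in triples:
        if p in mapping:
            candidates = [(s, mapping[p], o)]
            if keep_original:
                candidates.append((s, p, o))
        else:
            candidates = [(s, p, o)]
        # De-duplicate while preserving order, so that a value already
        # curated as dc:contributor is not reported twice.
        for triple in candidates:
            if triple not in seen:
                seen.add(triple)
                out.append(triple)
    return out
```

Keeping the granular triple by default means editors lose nothing, while consumers can rely on the simple property being present.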

So what this ticket should do is decide which properties we want to populate consistently across OBO. This is my first suggestion, ordered by what I perceive to be their importance:

rdfs:label (label)
IAO:0000115 (definition)
dc:contributor (attribution use case)
rdfs:isDefinedBy (provenance)
rdfs:seeAlso (link to related GitHub issues)
dc:source as an AP on IAO:115 (definition)
...large gap of importance...
dc:created (date the term came into being)
skos:*Match (for all occurrences of matches)
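One way to see how consistently such a list is populated is to measure, per property, the fraction of terms that carry at least one value. The Python sketch below assumes terms are already loaded into a plain dictionary; the function name `coverage` and the input shape are hypothetical, and the property list is just the top of the list above.

```python
# The properties from the list above that sit before the "large gap".
CORE_PROPERTIES = [
    "rdfs:label",
    "IAO:0000115",
    "dc:contributor",
    "rdfs:isDefinedBy",
    "rdfs:seeAlso",
]


def coverage(terms, properties=CORE_PROPERTIES):
    """Return {property: fraction of terms with at least one value}.

    terms -- dict mapping a term ID to a dict of {property: [values]}
    """
    total = len(terms)
    if total == 0:
        return {p: 0.0 for p in properties}
    return {
        p: sum(1 for annotations in terms.values() if annotations.get(p)) / total
        for p in properties
    }
```

A property that sits well below 1.0 across many ontologies is a candidate either for automation or for dropping from the shared model.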

matentzn · 2022-02-22T10:28:35Z

Note to self:

consider Contributor role model? CRO/Credit ontology
add prov:wasDerivedFrom to list above
