Ensure that our provenance data model is easy to populate consistently #79

Open
cmungall opened this issue Nov 3, 2021 · 4 comments

@cmungall
Contributor

cmungall commented Nov 3, 2021

Following on from #60

As a general comment on the overall OMO work, I am excited to see these properties standardized and a rich data model that is enforced. But I think it's very important to think about something that is realistic for editors to populate with consistent depth and breadth.

It's detrimental to have something partially populated, either within or across ontologies, as users will get inconsistent results or assume a closed-world interpretation. We already have very spotty use of many ambitious IAO properties such as curation-status.

Automation could help here - for example, if someone comments on a GitHub ticket that is about a term, the comment could be propagated to a contributor annotation, but there would have to be guards against false positives.

This issue can be closed as it's not very actionable, but I just wanted to state the position that it may be better to go for a data model that is dumber and less granular with fewer nuanced distinctions that has a better hope of consistent population.

@bpeters42

I was surprised to see you raising this issue just minutes after saying in #60 that it is important to capture "provenance for a definition some mix of primary, secondary, tertiary sources, individuals, and groups of people", and wanting to do that at the axiom level.

I would much prefer what you recommend in the last paragraph here: "a data model that is dumber and less granular with fewer nuanced distinctions".

@bpeters42

So at a minimum, if some projects want to make more nuanced distinctions, it needs to be possible to map those down to a compatible, simple format. E.g. making distinctions between types of contributors for a project, but having a uniform way of reporting all contributors. Hope this is still making sense.

@matentzn
Contributor

matentzn commented Nov 5, 2021

One way to get the best of both worlds:

  1. We decide on a dumbed-down, standardised OBO metadata model which we seek to apply to all ontologies.
  2. We leave everyone to curate to the granularity they wish, but provide a transformation process that can turn, for example, something like IAO:117 into dc:contributor at release time.
  3. We deliver this through a new method in ROBOT.
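
The release-time transformation in step 2 could look roughly like the sketch below. It is plain Python over (subject, property, value) tuples, not ROBOT code; no such ROBOT method exists in this discussion yet. It assumes IAO:117 is shorthand for IAO:0000117, and it assumes the fine-grained annotation is kept and a simplified copy is added next to it, which is a design choice the thread does not settle. The term IDs and names are made up for illustration.

```python
# Hypothetical mapping from fine-grained annotation properties to the
# simple, uniform property reported at release time. Only the
# IAO:0000117 -> dc:contributor pair is taken from the discussion
# (written there as IAO:117); other entries would be project choices.
COLLAPSE_MAP = {
    "IAO:0000117": "dc:contributor",
}


def collapse_annotations(triples, mapping=COLLAPSE_MAP):
    """Return release triples: the originals plus a simplified copy of
    every annotation whose property appears in `mapping`.
    Order is preserved and exact duplicates are dropped."""
    out = []
    seen = set()
    for triple in triples:
        subject, prop, value = triple
        candidates = [triple]
        if prop in mapping:
            # Assumption: keep the detailed annotation and add the simple one.
            candidates.append((subject, mapping[prop], value))
        for candidate in candidates:
            if candidate not in seen:
                seen.add(candidate)
                out.append(candidate)
    return out


# Made-up edit-time annotations for two example terms.
edit_triples = [
    ("EX:0000001", "rdfs:label", "example term"),
    ("EX:0000001", "IAO:0000117", "Jane Doe"),
    ("EX:0000001", "dc:contributor", "Jane Doe"),  # already simplified by hand
    ("EX:0000002", "IAO:0000117", "John Roe"),
]
release_triples = collapse_annotations(edit_triples)
```

Because the mapping is a plain table, a project that distinguishes several contributor types could add its own rows while every release still reports contributors the same way.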

So what this ticket should do is decide which properties we want to populate consistently across OBO. This is my first suggestion, ordered by what I perceive as importance:

  • rdfs:label (label)
  • IAO:0000115 (definition)
  • dc:contributor (attribution use case)
  • rdfs:isDefinedBy (provenance)
  • rdfs:seeAlso (link related Github issues)
  • dc:source as an AP on IAO:115 (definition)
  • ...large gap of importance...
  • dc:created (date the term came into being)
  • skos:*Match (for all occurrences of matches)
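
To see how consistently such a list is actually populated, a coverage report per property might help. The sketch below is only an illustration: the choice of which properties count as required, the input shape (a dict from term ID to the set of properties used on that term), and the example terms are all assumptions, not something agreed in this ticket.

```python
# Hypothetical subset of the list above treated as "required".
REQUIRED = ["rdfs:label", "IAO:0000115", "dc:contributor"]


def coverage(terms, required=REQUIRED):
    """Return, for each required property, the fraction of terms that
    carry at least one annotation with that property."""
    total = len(terms)
    report = {}
    for prop in required:
        populated = sum(1 for props in terms.values() if prop in props)
        report[prop] = populated / total if total else 0.0
    return report


# Made-up terms: every term has a label, half have a definition,
# half have a contributor.
example_terms = {
    "EX:0000001": {"rdfs:label", "IAO:0000115", "dc:contributor"},
    "EX:0000002": {"rdfs:label", "IAO:0000115"},
    "EX:0000003": {"rdfs:label"},
    "EX:0000004": {"rdfs:label", "dc:contributor"},
}
report = coverage(example_terms)
```

A property with coverage well below 1.0 is the partially populated case described at the top of this issue, where users cannot tell missing data from a negative answer.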

@matentzn
Contributor

Note to self:

  • consider Contributor role model? CRO/Credit ontology
  • add prov:wasDerivedFrom to list above


3 participants