Define when it is OK to subclass terms in another ontology #1991

Open
cmungall opened this issue Jul 18, 2022 · 10 comments
Labels
external resource Issues related to interactions with external (non-Foundry) resources policy Issues and discussion related to OBO Foundry policies

Comments

@cmungall
Contributor

cmungall commented Jul 18, 2022

This is a companion issue to:

But that issue focuses on injection, which I define as adding axioms about terms in another ontology (this is clearly defined in that issue; don't bring that discussion back here).

This issue is about when it is OK to make axioms that are not about terms in another ontology but that reference them in subClassOf axioms, in particular subClassOf between named classes.

On the surface this should be OK: I am not altering the target ontology's axioms in any way. Indeed some ontologies such as COB, BFO, and CARO are designed expressly with the intention that they are subclassed. To a certain extent Uberon is too, although only for species-specific subclasses.

However, subclassing other ontologies is rampant in OBO, and this is actually harmful. It is poor modularity and it leads to confusion about scope. Users are not clear which ontology to go to in order to get a term or to request a term.

It is also terrible for maintainability. Suppose I maintain an ontology O1 containing class C1, and another ontology O2 starts making subclasses C1a, C1b, and so on. If I later need to introduce subclasses in O1, I first need to scan all of OBO to see who has made subclasses and coordinate with these ontologies. This is a large impediment to maintainability.

Here is an example of what I call a heavily chequered inter-ontology subclass pattern, where there is a lack of clarity (to an external user) about what belongs in STATO, OBI, or IAO:

| subject | predicate | object | subject_label | predicate_label | object_label |
|---|---|---|---|---|---|
| STATO:0000002 | rdfs:subClassOf | IAO:0000030 | digital file | subClassOf | information content entity |
| STATO:0000003 | rdfs:subClassOf | OBI:0500000 | balanced design | subClassOf | study design |
| STATO:0000005 | rdfs:subClassOf | OBI:0500000 | single factor design | subClassOf | study design |
| STATO:0000007 | rdfs:subClassOf | IAO:0000573 | axis | subClassOf | line graph |
| STATO:0000010 | rdfs:subClassOf | IAO:0000030 | coordinate system | subClassOf | information content entity |
| STATO:0000026 | rdfs:subClassOf | IAO:0000400 | cartesian spatial coordinate origin | subClassOf | cartesian spatial coordinate datum |
| STATO:0000027 | rdfs:subClassOf | OBI:0000673 | test of association between categorical variables | subClassOf | statistical hypothesis test |
| STATO:0000028 | rdfs:subClassOf | IAO:0000109 | measure of variation | subClassOf | measurement datum |
| STATO:0000029 | rdfs:subClassOf | IAO:0000109 | measure of central tendency | subClassOf | measurement datum |
| STATO:0000031 | rdfs:subClassOf | OBI:0200000 | binary classification | subClassOf | data transformation |
| STATO:0000034 | rdfs:subClassOf | IAO:0000027 | model parameter | subClassOf | data item |
| STATO:0000036 | rdfs:subClassOf | IAO:0000027 | outlier | subClassOf | data item |
| STATO:0000038 | rdfs:subClassOf | OBI:0000181 | matched pair of subjects | subClassOf | population |
| STATO:0000039 | rdfs:subClassOf | IAO:0000109 | statistic | subClassOf | measurement datum |
| STATO:0000040 | rdfs:subClassOf | IAO:0000184 | MA plot | subClassOf | scatter plot |
| STATO:0000044 | rdfs:subClassOf | OBI:0200201 | one-way ANOVA | subClassOf | ANOVA |
| STATO:0000045 | rdfs:subClassOf | OBI:0200201 | two-way ANOVA | subClassOf | ANOVA |
| STATO:0000046 | rdfs:subClassOf | OBI:0500000 | block design | subClassOf | study design |
| STATO:0000047 | rdfs:subClassOf | IAO:0000109 | count | subClassOf | measurement datum |
| STATO:0000048 | rdfs:subClassOf | OBI:0200201 | multiway ANOVA | subClassOf | ANOVA |
| STATO:0000063 | rdfs:subClassOf | IAO:0000027 | genomic coordinate datum | subClassOf | data item |
| STATO:0000065 | rdfs:subClassOf | IAO:0000030 | hypothesis | subClassOf | information content entity |
| STATO:0000066 | rdfs:subClassOf | IAO:0000037 | Cleveland dot plot | subClassOf | dot plot |
| STATO:0000068 | rdfs:subClassOf | IAO:0000027 | skewness | subClassOf | data item |

(truncated)

To replicate with OAK:

stato roots -p i --id-prefix STATO | stato relationships - -p i
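
For readers who want to post-process output like the listing above, here is a minimal Python sketch, not part of OAK, that takes (subject, predicate, object) rows and counts how many subClassOf edges cross into each foreign ontology. The function names `prefix`, `cross_ontology_edges`, and `count_by_parent_ontology` are made up for illustration, and the four rows are copied from the listing; real input would come from the command above.

```python
from collections import Counter

# A few (subject, predicate, object) rows copied from the listing above.
EDGES = [
    ("STATO:0000002", "rdfs:subClassOf", "IAO:0000030"),
    ("STATO:0000003", "rdfs:subClassOf", "OBI:0500000"),
    ("STATO:0000007", "rdfs:subClassOf", "IAO:0000573"),
    ("STATO:0000044", "rdfs:subClassOf", "OBI:0200201"),
]


def prefix(curie: str) -> str:
    """Return the ontology prefix of a CURIE, e.g. 'STATO' for 'STATO:0000002'."""
    return curie.split(":", 1)[0]


def cross_ontology_edges(edges):
    """Keep only subClassOf edges whose child and parent have different prefixes."""
    return [
        (s, p, o)
        for s, p, o in edges
        if p == "rdfs:subClassOf" and prefix(s) != prefix(o)
    ]


def count_by_parent_ontology(edges):
    """Count cross-ontology subClassOf edges per parent ontology prefix."""
    return Counter(prefix(o) for _, _, o in cross_ontology_edges(edges))


counts = count_by_parent_ontology(EDGES)
print(dict(counts))
```

A per-ontology count like this is one rough way to see how "chequered" a hierarchy is; it says nothing about whether any individual edge is justified.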

Proposal:

Ontologies MUST NOT create is-a children of classes in other ontologies in their own ontology, unless permission is explicitly granted, on a per-term, per-branch, or per-ontology basis. This would be recorded in OBO metadata, e.g. for COB, BFO, CARO. OBI could choose to grant permission in this way, preferably with a link to some kind of documentation that states the relative scope of the two ontologies.
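
To make the three granularities concrete, here is a rough Python sketch of how such a permission check might work. Everything in it is an assumption for illustration: no such OBO metadata format exists in this proposal yet, the `SubclassPermissions` structure and `is_allowed` function are hypothetical, and the example grant for an OBI branch is invented, not something OBI has agreed to.

```python
from dataclasses import dataclass, field


@dataclass
class SubclassPermissions:
    """Hypothetical record of where external is-a children are permitted."""
    open_ontologies: set = field(default_factory=set)  # per-ontology, by prefix
    open_branches: set = field(default_factory=set)    # per-branch, by branch root CURIE
    open_terms: set = field(default_factory=set)       # per-term, exact parent CURIE


def prefix(curie: str) -> str:
    return curie.split(":", 1)[0]


def is_allowed(child, parent, parent_ancestors, perms):
    """Return True if `child subClassOf parent` is permitted under `perms`.

    `parent_ancestors` is the collection of is-a ancestors of `parent`
    within its own ontology; the caller is assumed to supply it.
    """
    if prefix(child) == prefix(parent):
        return True  # same ontology: the rule does not apply
    if prefix(parent) in perms.open_ontologies:
        return True  # whole ontology is open for subclassing
    if parent in perms.open_terms:
        return True  # this exact term is open
    # Branch grant: the parent is, or sits under, an opened branch root.
    return bool(({parent} | set(parent_ancestors)) & perms.open_branches)


# Invented example grants (assumption, for illustration only).
perms = SubclassPermissions(
    open_ontologies={"COB", "BFO", "CARO"},
    open_branches={"OBI:0500000"},
)

print(is_allowed("STATO:0000003", "OBI:0500000", [], perms))
print(is_allowed("STATO:0000044", "OBI:0200201", ["OBI:0200000"], perms))
```

The check is deliberately one-directional: permission is declared by the ontology being subclassed, which matches the proposal that the parent ontology records the grant.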

@cmungall
Contributor Author

Here is a visual illustration of the problem:

[Image: stato-obi-iao]

I'm not sure how IAO/STATO/OBI coordinate which term goes where, but this is very confusing for a user who needs to select terms, even more so if they need to figure out which issue tracker to go to in order to request new terms.

@matentzn
Contributor

I not only like this, I think it is very necessary and already reflected by the "Scope" principle (which is not very well fleshed out right now, https://obofoundry.org/principles/fp-005-delineated-content.html). This is how I would like to attack it:

  1. All major branches are reflected in COB (data transformation, study design, measurement datum, disease, anatomical entity etc). COB metadata points (maps) to all branches in active OBO ontologies, which establishes the ontologies which have theoretical permission to host terms. For example, DO, NCIT, Mondo disease branches point to COB:disease and can all serve as hosts for new terms for now. (We probably have to document all current violations as exceptions for the time being and work them out one by one; think OMIT/BTO classes and application ontologies.)
  2. We implement the rule you suggest (MUST NOT subclass), and add it to OBO dashboard.
  3. From that point on, subclassing a term from a different namespace (other than COB, RO, BFO) can only happen with a specific annotation property (like exclusion reason, but "subclass permission") which points to a resolvable issue tracker item that explains the exception.
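
Steps 2 and 3 could be sketched as a dashboard-style check along the following lines. This is only an illustration under stated assumptions: the "subclass permission" annotation does not exist yet, the tuple layout and the `find_violations` function are hypothetical, the tracker URL is a placeholder, and the code checks only that the link is syntactically an http(s) URL, not that it actually resolves.

```python
from urllib.parse import urlparse

# Namespaces named in step 3 as always open for subclassing.
EXEMPT_PARENT_PREFIXES = {"COB", "RO", "BFO"}


def prefix(curie: str) -> str:
    return curie.split(":", 1)[0]


def looks_like_url(value) -> bool:
    """Syntactic check only; a real dashboard would also try to resolve the link."""
    if not value:
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def find_violations(axioms):
    """Return (child, parent) pairs that break the MUST NOT subclass rule.

    `axioms` is an iterable of (child, parent, permission_link) tuples, where
    permission_link is the value of the hypothetical "subclass permission"
    annotation on the subClassOf axiom, or None if absent.
    """
    violations = []
    for child, parent, permission_link in axioms:
        if prefix(child) == prefix(parent):
            continue  # same namespace
        if prefix(parent) in EXEMPT_PARENT_PREFIXES:
            continue  # COB, RO, BFO are exempt
        if looks_like_url(permission_link):
            continue  # documented exception
        violations.append((child, parent))
    return violations


axioms = [
    ("STATO:0000039", "IAO:0000109", None),
    ("STATO:0000003", "OBI:0500000", "https://example.org/tracker/issues/1"),  # placeholder URL
    ("EX:0000001", "BFO:0000040", None),
]
print(find_violations(axioms))
```

Existing violations would show up in the output of such a check from day one, which fits the suggestion to document them as exceptions and work through them one by one.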

It is important to implement this rule independent of all existing violations. We have to improve this moving forward and not forever point to existing violations as reasons for not moving on.

@dosumis
Contributor

dosumis commented Jul 18, 2022

> Unless permission explicitly granted, on a per-term, per-branch, or per-ontology basis.

This is critical. PCL defines subclasses of CL terms. Single species AOs subclass Uberon and CL...

Big ask to require this for every subClassOf axiom that breaks the rule:

> From that point on, subclassing a term from a different namespace (other than COB, RO, BFO) can only happen with a specific annotation property (like exclusion reason, but "subclass permission") which points to a resolvable issue tracker item that explains the exception.

@dosumis
Contributor

dosumis commented Jul 18, 2022

And why are we folding COB into this issue? Isn't point 1 above more aspiration than reality for many ontology branches? (e.g. see issues around anatomical entities)

@nlharris nlharris added policy Issues and discussion related to OBO Foundry policies external resource Issues related to interactions with external (non-Foundry) resources labels Jul 18, 2022
@hoganwr
Contributor

hoganwr commented Jul 19, 2022 via email

@cmungall
Contributor Author

> Unless permission explicitly granted, on a per-term, per-branch, or per-ontology basis.
>
> This is critical. PCL defines subclasses of CL terms. Single species AOs subclass Uberon and CL...

Yes, I mentioned the Uberon case in the original comment. There would be an agreement that species-specific subclasses are OK by a blanket rule, but if you want to make a species-neutral subclass this should be agreed first. PCL and CL are a good example: there is obviously close coordination and clear scoping rules between these two ontologies. So there would be a pairwise agreement. But I don't think CL wants extra-ontology subclasses that are neither data-driven classifications nor species-specific, until new situations arise.

@hoganwr:

> IAO would either have to be very permissive

There is nothing inherently wrong with this provided there is a simple process for adding new terms, for example, template-based with clear design patterns, and many people able to merge PRs. But see below for alternatives.

> If the policy had been in place prior to STATO, how would things be better?

There would be clear delineation between the two ontologies. There are lots of ways to do this:

  • OBI has physical entities, IAO has information
  • IAO could itself be modularized into different domains:
    1. core upper level, with guidelines on how to subclass
    2. ICEs that shadow physical properties
    3. statistical and mathematical concepts
    4. bibliographic entities
    5. legal and social entities

But simply having IAO have all information coupled with a simple process for adding new terms would be better than the current situation, with the striping between ontologies.

I am aware of some reasons why the current situation arose, I am not criticizing past decisions, but we need to move beyond these and implement clear modularity and scoping.

@alanruttenberg
Member

Just became aware of this issue. I'll register a strong objection. Let's quit proposing rules that limit what developers can put in their ontology.

@addiehl

addiehl commented May 25, 2023

Have to agree with Alan, as most of my group's ontologies build off other ontologies. In some cases we have requested new classes from appropriate ontologies, but in other cases our classes are probably too specific for inclusion in a higher level domain ontology.

@cmungall
Contributor Author

@addiehl can you describe some of the processes you have put in place to avoid some of the issues highlighted here? It would be great to have documentation and SOPs on this and very much in the spirit of my original request!

@addiehl

addiehl commented May 26, 2023

I have a number of examples to describe, but don't have time until next week to write this up.


7 participants