Question about reasoning over relationship chains in the ontology to produce annotations sets in AmiGO & MouseMine #465

Open
krchristie opened this issue Dec 4, 2023 · 3 comments

krchristie commented Dec 4, 2023

I, with a lot of help from @ukemi, have a question about how reasoning over a specific relationship chain works in the "AmiGO (with regulates)" option and in MouseMine.

The question arose from an MGI user who repeatedly saw discrepancies between AmiGO and MGI in the results of queries for gene sets annotated to a GO term and its children, with one specific example being the GO term “embryonic morphogenesis”.

To investigate, I obtained lists of GO annotations to get the gene lists associated with the GO term “embryonic morphogenesis” from each of these sources:

Here is the summary of my comparison of the GO terms that are present within each of the above 4 sets of annotations to the term “embryonic morphogenesis”

  • Two of these options, ii. = MGI MouseMine and iv. = AmiGO with “includes regulates” produced identical sets of GO terms.

  • The third option, iii. = AmiGO default option (without “includes regulates”), produces a smaller set of GO terms lacking the 10 regulation terms present in options ii. & iv. as expected.

  • However, option i., downloading an Excel file from MGI’s Gene Ontology Annotations page for “embryonic morphogenesis”, produced a set of GO terms with 17 additional regulates terms NOT included in options ii. (MouseMine) or iv. (AmiGO with regulates).

Picture showing all differences between the 4 sets. Note that the list of terms present in all 4 sets is truncated.
20231128-embryonicMorph-multiSourceComp-colored-v2

Our question: What causes the difference between option i. and options ii. & iv.? The 17 terms included in option i. that are NOT included in options ii. & iv. are all regulates terms. David and I have looked carefully at two representative terms and think we may have an explanation of what is going on.

Example 1 – representative of regulation terms present in options i., ii., AND iv.

Example 2 – representative of regulation terms present ONLY in option i.

David’s recollection is that there was a conscious decision not to reason over the relationship chain “regulates-over-part_of”, because the first term does NOT necessarily "regulate" the third term everywhere a “regulates-over-part_of” chain occurs in the ontology. We looked at the RO term “regulates (RO:0002211)” in Protégé (see attached picture), and this confirms David’s recollection that this chain is not asserted for the term “regulates”.

20231128-userQuestion-regulates-inProtege

Getting back to our question: we would like input on whether this explanation (NOT reasoning over “regulates over part_of” relationship chains) is consistent with the reasoning over relationships that is applied in AmiGO with regulates (option iv.) and in MouseMine (option ii.), and is thus a possible explanation for this discrepancy. Note that we are also assuming that the MGI Gene Ontology Annotations page (option i.) is NOT reasoning over relationships, but is simply following the chains of relationships downward and including ALL terms.
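To make the hypothesis concrete, here is a minimal Python sketch contrasting the two behaviours on a toy graph. Everything in it is an assumption for illustration: the term names (`A`, `B`, `root`), the rule table, and both functions are hypothetical and are not taken from AmiGO, MouseMine, or the MGI page. It only shows how a traversal that ignores relation semantics can pick up a regulation term that chain-aware reasoning leaves out when no “regulates over part_of” rule is present.

```python
from collections import deque

# Hypothetical toy edges: (subject, relation, object). Placeholder terms only.
EDGES = [
    ("regulation of A", "regulates", "A"),
    ("A", "is_a", "root"),
    ("regulation of B", "regulates", "B"),
    ("B", "part_of", "root"),
]


def naive_descendants(root, edges):
    """Collect every term with any path to root, ignoring relation types."""
    found, queue = set(), deque([root])
    while queue:
        target = queue.popleft()
        for subj, _rel, obj in edges:
            if obj == target and subj not in found:
                found.add(subj)
                queue.append(subj)
    return found


# Illustrative composition rules: (first, second) -> inferred relation.
# Note the deliberate absence of ("regulates", "part_of").
BASE_CHAINS = {
    ("is_a", "is_a"): "is_a",
    ("is_a", "part_of"): "part_of",
    ("part_of", "is_a"): "part_of",
    ("part_of", "part_of"): "part_of",
    ("regulates", "is_a"): "regulates",
}

# Same rules plus the chain under discussion, for comparison.
WITH_PART_OF = {**BASE_CHAINS, ("regulates", "part_of"): "regulates"}


def reasoned_descendants(root, edges, chains):
    """Collect terms whose path to root composes to a relation under `chains`."""
    seen, queue = set(), deque()
    for subj, rel, obj in edges:
        if obj == root:
            seen.add((subj, rel))
            queue.append((subj, rel))
    while queue:
        term, rel_to_root = queue.popleft()
        for subj, rel, obj in edges:
            if obj != term:
                continue
            inferred = chains.get((rel, rel_to_root))
            if inferred is not None and (subj, inferred) not in seen:
                seen.add((subj, inferred))
                queue.append((subj, inferred))
    return {term for term, _rel in seen}


if __name__ == "__main__":
    print(sorted(naive_descendants("root", EDGES)))
    print(sorted(reasoned_descendants("root", EDGES, BASE_CHAINS)))
    print(sorted(reasoned_descendants("root", EDGES, WITH_PART_OF)))
```

In this toy case the naive traversal and the `WITH_PART_OF` rule set both return “regulation of B”, while `BASE_CHAINS` does not. That is the pattern the explanation above would predict for the 17 extra terms, if the assumption about each resource holds.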

FAO: @balhoff @kltm

Member

kltm commented Dec 5, 2023

Noting that for AmiGO/GOlr, the closures used are:

  - id: isa_partof_closure
    property:
      - "getRelationIDClosure"
      - "BFO:0000050"

and

  - id: regulates_closure
    property:
      - "getRelationIDClosure"
      - "BFO:0000050"
      - "BFO:0000066"
      - "RO:0002211"
      - "RO:0002212"
      - "RO:0002213"
      - "RO:0002215"
      - "RO:0002216"
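For readers unfamiliar with closures, the sketch below shows one plausible reading of a closure over a set of relation IDs: all ancestors reachable by following subclass edges plus any edge whose relation is in the configured set. This reading is an assumption, not the actual `getRelationIDClosure` implementation, and the `EX:` terms, the `is_a` edge label, and the `relation_closure` function are hypothetical. The relation IDs are the ones listed in the two definitions above (BFO:0000050 is part of, BFO:0000066 is occurs in, RO:0002211 is regulates, RO:0002212 and RO:0002213 are negatively and positively regulates, RO:0002215 is capable of, RO:0002216 is capable of part of).

```python
# Relation IDs copied from the two closure definitions quoted in this comment.
ISA_PARTOF_RELATIONS = {"BFO:0000050"}
REGULATES_RELATIONS = {
    "BFO:0000050", "BFO:0000066", "RO:0002211", "RO:0002212",
    "RO:0002213", "RO:0002215", "RO:0002216",
}

# Hypothetical toy edges: (child, relation, parent). "EX:" IDs are placeholders.
EDGES = [
    ("EX:regulation_of_B", "RO:0002211", "EX:B"),
    ("EX:B", "BFO:0000050", "EX:root"),
    ("EX:A", "is_a", "EX:root"),
]


def relation_closure(term, edges, relation_ids):
    """Return term plus all ancestors reachable via is_a or the given relations.

    Assumption: edges are followed purely by membership in the allowed set,
    with no check on whether a particular chain of relations is valid.
    """
    allowed = set(relation_ids) | {"is_a"}
    closure, stack = {term}, [term]
    while stack:
        current = stack.pop()
        for child, rel, parent in edges:
            if child == current and rel in allowed and parent not in closure:
                closure.add(parent)
                stack.append(parent)
    return closure


if __name__ == "__main__":
    print(sorted(relation_closure("EX:regulation_of_B", EDGES, ISA_PARTOF_RELATIONS)))
    print(sorted(relation_closure("EX:regulation_of_B", EDGES, REGULATES_RELATIONS)))
```

Under this reading, a regulates edge followed by a part_of edge would put the top term into the regulates closure, because both relation IDs are in the list. Whether the real closure code behaves this way is exactly what would need checking.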

ukemi commented Dec 6, 2023

Hmmm. So it looks like the closure is supposed to happen over part_of. So it is a mystery why the top terms in the spreadsheet are missing from the closure.

@suzialeksander
Collaborator

@krchristie I'm bumping this for attention, maybe @balhoff sees something.

Also, is MGI still running MouseMine or are they switching to AllianceMine soon-ish? Let me know if I need to loop in an InterMine voice.
