flip flopping of annotation numbers #17171
Comments
Actually I came across this, tidying our trackers |
OK this is what we had late last week: the changes are RADICAL. The only number that stays the same is
495 | (regulation of transcription, DNA-templated (GO:0006355) OR transcription, DNA-templated (GO:0006351))
81 | ((regulation of transcription, DNA-templated (GO:0006355) OR transcription, DNA-templated (GO:0006351)) NOT transcription, DNA-templated (GO:0006351))
414 | transcription, DNA-templated (GO:0006351)
305 | (regulation of transcription, DNA-templated (GO:0006355) AND transcription, DNA-templated (GO:0006351))
We really need to find the cause of this urgently. Imagine the knock-on effect in analyses... Note that, as far as I am aware, we did not make any changes in annotations to transcription during this time. These are the only 3 papers we approved: |
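The four counts quoted in the comment above can be cross-checked with basic set identities. This is only a sketch: it assumes each number counts distinct annotated items (the thread does not say whether these are genes or annotations), and the variable names are made up for illustration.

```python
# Counts quoted in the comment above; variable names are illustrative only.
either = 495           # regulation (GO:0006355) OR transcription (GO:0006351)
regulation_only = 81   # (regulation OR transcription) NOT transcription
transcription = 414    # transcription (GO:0006351)
both = 305             # regulation AND transcription

# |A or B| = |A not B| + |B|, so the four numbers are internally consistent.
assert regulation_only + transcription == either

# |A| = |A not B| + |A and B| gives the implied total for regulation.
regulation = regulation_only + both
print(regulation)
```

Under that assumption the implied regulation total is 386. If the inferred "regulates" links between the two terms come and go, the split between the 81 and the 305 would be expected to shift even when no annotation changes.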
Hi Val, I think you are using the go-basic snapshot for this, correct? Are the differences connected to different snapshots? Or is it possible a different release of owltools is being used at different times? Are we talking about output from the |
Hi @balhoff I think so, but @kimrutherford would need to confirm. I believe owlcollab/owltools#256 is the same issue. |
Hi Jim.
We're using http://purl.obolibrary.org/obo/go/snapshot/go-basic.obo
We haven't changed our version of OWLTools since December. Would it help to upgrade to a newer version?
Yep! Please let me know if I can help track down the problem. I'm about to head to bed (I'm in New Zealand) but I'll do some investigating tomorrow to see if I can narrow things down. |
I don't think so. I'm just trying to understand what would be different between the runs with different output. I.e. could ontology snapshots be differing so widely from release to release? |
I don't think it's anything to do with the ontology snapshots. This is only a hunch, but in some branches where I see this effect I know there have been no changes. We have also seen the effect when we have run twice on the same ontology version (I think). There is a random element to the observations. I think it must be something to do with incomplete or arbitrary path following. |
pombase/pombase-chado#678 |
Then should we move this ticket to the annotation tracker if it's not an ontology problem? |
It isn't an annotation problem though. So it is probably better on this tracker until the cause is known. It might be an owl tools problem but at the moment we don't know quite what causes it... |
I've compared the output of The output using go-basic.obo from 2019-04-13 has these inferred relations:
Those relations are missing from the output for the OBO file from 2019-04-17 then they re-appear in the 2019-04-23 output. Maybe that would explain the flip-flop that Val saw? The two terms are:
RO:0002211 is regulates and RO:0002212 is negatively regulates. |
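A comparison like the one described above can be done by treating each closure output as a set of lines and reporting what appears in only one run. This is a minimal sketch, not the method actually used in the thread: the row layout below is a hypothetical placeholder (the real closure file format is not shown here), and `closure_diff` is an invented helper name.

```python
def closure_diff(old_lines, new_lines):
    """Return (removed, added): lines present in only one of two closure outputs."""
    old = {line.strip() for line in old_lines if line.strip()}
    new = {line.strip() for line in new_lines if line.strip()}
    return sorted(old - new), sorted(new - old)


# Hypothetical placeholder rows; the real file layout is an assumption.
run_a = [
    "TERM:1\tRO:0002211\tTERM:2",
    "TERM:1\tRO:0002212\tTERM:2",
    "TERM:3\tis_a\tTERM:4",
]
run_b = ["TERM:3\tis_a\tTERM:4"]

removed, added = closure_diff(run_a, run_b)
for line in removed:
    print("-", line)
for line in added:
    print("+", line)
```

Because the lines are compared as sets, differences in output ordering between runs are ignored and only relations that are actually missing or new get reported. For real files, the two arguments can be open file handles.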
I did some more digging and noticed that I get different results from the same version of owltools when run on a different machine. One machine has OpenJDK 11.0.2, the other has 1.8.0. The flip-flopping between go-basic.obo versions happens on both machines but in opposite directions. On one machine (with v11.0.2) go-basic.obo from 2019-04-13 has the inferred relations above but on the other (with 1.8.0) it doesn't.
I think so. |
This is getting interesting! And that is with the same input ontology? I wonder if some owltools dependencies are out of date and not reliable on Java 11. In the past there have been mysterious classpath loading issues related to OWL API which affected parsers. Could you try using go-basic.owl instead of .obo? I would expect it to be more robust. |
I tried today's go-basic.obo using Java 8 on both my Mac laptop and a Linux server—different output! The lines you mentioned are missing from the Linux version. |
It's the same owltools and the same go-basic.obo with different versions of Java.
That wouldn't explain why the inferred relations are flip-flopping when using JDK 1.8.0.
I just grabbed go-basic.obo and go-basic.owl from here: http://skyhook.berkeleybop.org/release/ontology/ I get a different output from the two files and from the two Java versions. With OpenJDK 1.8.0 and the OBO file the inferred relations are in the output, with the OWL file the two relations are missing from the output. With OpenJDK 11.0.2, the output for the OBO file doesn't contain the 2 inferred relations. The output for the OWL file does include them. |
If I run owltools differently, using the same machine (laptop) and the same Java, I get different output:
Correction—owltools-runner-all.jar was an old artifact and was just confusing the issue; running from the current jar seems to work on both Mac and Linux
These are two different packaged forms built by the owltools Update—if I save the ontology in OWL functional syntax, I get the opposite result with the two ways of running. Furthermore the relations go back to the old label style in the output instead of IDs... 😑 |
Thank you! This has been driving me crazy for about a year! |
Sounds like something we should document somewhere ? |
@pgaudet this is definitely an owltools bug. It's pretty mysterious but I'm trying to figure it out. @kimrutherford can you please test all your scenarios with this owltools: https://build.berkeleybop.org/job/owltools/1423/artifact/OWLTools-Runner/target/owltools I made a change to the way the Java code is executed. |
Hi Jim. I've tried that owltools and the results are still inconsistent. For example the output for go-basic.owl is different to that for go-basic.obo - the two inferred relations above are in the OBO file output but not in the OWL file output. And the output from the OBO files changes if I change OpenJDK version and it flip-flops between go-basic snapshots. |
@kimrutherford thanks for trying. It had worked consistently in the environments I was trying. I will send you one or two more configurations later today, if you don't mind some more testing. |
@kimrutherford @ValWood this seems to be more complicated than I thought. I am continuing to work on it but not sure how long it will take. |
@kimrutherford can you test another version of owltools? (sorry for disappearing for a little while) This version has the |
I tested the same go-basic.obo files as above, from 2019-04-16, 2019-04-17 and 2019-04-18. The output from processing the 2019-04-17 OBO file has these inferred relations:
but the output for the files from 2019-04-16 and 2019-04-18 doesn't have those lines. That was with Java v11. I also tried v8 and in that case, 2019-04-16 and 2019-04-17 are the same as v11 but the output for 2019-04-18 was different: it did have the two inferred relations. |
@ValWood do you still have problems? |
The inferred relations still come and go. It last changed on Aug 15th. |
This is still a problem. It looks really bad, and it introduces a lot of noise into the annotation and potentially any analyses. I'm assuming the same bug affects GO annotation numbers internally too? I'm mentioning it again because it is quite disconcerting to see the numbers bouncing around on a daily basis when you made no annotation changes. It seems quite important to address? |
@ValWood this functionality is implemented in owltools, with a sort of ad hoc reasoning method, and no one is actively working on owltools anymore. The approach used in owltools is fine, but it would be more reliable and understandable to me if it was redone with a more modern approach using an OWL reasoner. Would you be able to define the general requirements for this, in case it would be simpler to just rewrite? |
@kimrutherford will need to describe how/what we use and why. I can only describe the problem I see in the output. I don't even know if the issue has been fully traced. My feeling is that different paths are followed arbitrarily in some instances if there are multiple choices, but this only occurs when there are 'regulation of regulation' terms. Note that this issue does not affect every term, but it always affects
These two particular "slim" term annotation numbers oscillate all the time. Most others don't - I suspect this is because we do not have any "regulation of regulation" type annotations for these terms. |
Should we be using different tools? |
We use the output of owltools --save-closure-for-chado to populate the cvtermpath table in Chado. We then use that table downstream anywhere in PomBase where we need all the ancestors or descendants of a term. eg. on our query page and on term pages: https://www.pombase.org/term/GO:0006351 Hope that helps. We'd be happy to move to another tool. |
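To illustrate why a missing inferred relation changes downstream results, here is a minimal in-memory sketch of an ancestor/descendant lookup built from closure rows. It does not reproduce the Chado cvtermpath schema or PomBase's loading code; the `(subject, relation, object)` row shape, the `index_closure` helper, and the sample rows (including the placeholder `TERM:child`) are assumptions made for illustration.

```python
from collections import defaultdict


def index_closure(rows):
    """Index (subject, relation, object) closure rows for lookup in both directions."""
    ancestors = defaultdict(set)
    descendants = defaultdict(set)
    for subject, relation, obj in rows:
        ancestors[subject].add((relation, obj))
        descendants[obj].add((relation, subject))
    return ancestors, descendants


# Illustrative rows only; TERM:child is a made-up identifier.
rows = [
    ("GO:0006355", "RO:0002211", "GO:0006351"),  # assumed "regulates" link
    ("TERM:child", "is_a", "GO:0006351"),
]

_, with_regulates = index_closure(rows)
_, without_regulates = index_closure(rows[1:])  # same input minus the inferred link

print(sorted(with_regulates["GO:0006351"]))
print(sorted(without_regulates["GO:0006351"]))
```

Any query that collects annotations via the descendants of GO:0006351 would include the regulation term in the first case and not in the second, which is the kind of change that would show up as a different annotation count.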
This seems to be a clear example of the 'flip-flopping' issue: Oscillation of annotation numbers in the absence of annotation changes isn't a great feature of GO :( |
Is there any news on this ticket? This matters because it will make any analyses done with GO irreproducible: the annotation numbers depend arbitrarily on the day that the analysis was performed. |
We'd be happy to use a different tool. We've been using owltools because it has the convenient (Perhaps this isn't the right issue tracker for this? It's not a GO problem) |
But doesn't GO also use owltools for inferences? If so it is a GO problem... |
@ValWood we have been trying to eliminate use of owltools, although it continues to be used here and there. I think it may be possible to create a |
Should have done it first, but now that I've looked into the owltools output myself, I think you want all the redundant inferences. Stay tuned. |
I'm not sure. We would need to wait for Kim to comment and he is on holiday this week... |
Okay @ValWood @kimrutherford here is a replacement you can use: https://github.com/balhoff/relation-graph It should always return the same results for the same input. :-) Download this zip: https://github.com/balhoff/relation-graph/releases/download/v1.1/relation-graph-1.1.tgz In there is a Download Run it like this (requires Java):
The output file you want is the one specified in I can follow up to help with a transformation shell script if needed. |
Thanks so much for this Jim, it's very much appreciated! I'm sure that Kim will let you know if he has any questions. I don't even understand what we need, or how you know what we need. Have a good weekend. |
No problem! For Kim's info, here is how the output will look:
<http://purl.obolibrary.org/obo/GO_0018235> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://purl.obolibrary.org/obo/GO_0008152> .
<http://purl.obolibrary.org/obo/GO_0018235> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://purl.obolibrary.org/obo/GO_0008150> .
<http://purl.obolibrary.org/obo/GO_0018235> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://purl.obolibrary.org/obo/GO_0018205> .
<http://purl.obolibrary.org/obo/GO_0018235> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://purl.obolibrary.org/obo/GO_0006807> .
<http://purl.obolibrary.org/obo/GO_0018235> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://purl.obolibrary.org/obo/GO_0043170> .
<http://purl.obolibrary.org/obo/GO_0030291> <http://purl.obolibrary.org/obo/BFO_0000051> <http://purl.obolibrary.org/obo/GO_0019901> .
<http://purl.obolibrary.org/obo/GO_0008427> <http://purl.obolibrary.org/obo/BFO_0000051> <http://purl.obolibrary.org/obo/GO_0019901> .
<http://purl.obolibrary.org/obo/GO_0042556> <http://purl.obolibrary.org/obo/BFO_0000051> <http://purl.obolibrary.org/obo/GO_0019901> . |
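Lines in the form shown above can be turned into compact (subject, relation, object) tuples with a small parser. This is a rough sketch of one possible transformation, not the script offered in the thread or the loading code PomBase ended up using: it assumes every line is a simple triple of three IRIs like the samples, and the `curie` and `parse_line` helpers are invented names.

```python
import re

# Matches a line made of exactly three <IRI> terms followed by a full stop.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")
OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def curie(iri):
    """Shorten an OBO PURL such as .../GO_0019901 to GO:0019901; leave other IRIs as-is."""
    if iri.startswith(OBO_PREFIX):
        return iri[len(OBO_PREFIX):].replace("_", ":", 1)
    return iri


def parse_line(line):
    """Return (subject, relation, object), or None if the line is not a simple IRI triple."""
    match = TRIPLE.match(line.strip())
    if match is None:
        return None
    return tuple(curie(part) for part in match.groups())


sample = (
    "<http://purl.obolibrary.org/obo/GO_0030291> "
    "<http://purl.obolibrary.org/obo/BFO_0000051> "
    "<http://purl.obolibrary.org/obo/GO_0019901> ."
)
print(parse_line(sample))  # ('GO:0030291', 'BFO:0000051', 'GO:0019901')
```

Non-OBO predicates such as the rdfs:subClassOf IRI are passed through unchanged, so a loader can decide separately how to map them.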
That's great Jim. Thanks very much! I've installed relation-graph and it works for all the ontologies we use. So I've modified our loading code to understand the output and we'll be doing a full load test tonight (UK time). I'll leave this issue open until Val and Midori have had a look at the results. Thanks again. |
Sounds good. |
Thanks very much Jim. It's working very well!: |
Great news! Thanks. |
pombase/pombase-chado#722
There is something very sinister going on here:
We didn't change anything. Look at the totals for last week, vs this week.
This is an extreme example of the random flip-flopping I keep 'going on' about. I'm convinced it is some arbitrary effect of owltools traversing different regulates paths. Beyond that I don't know what I am talking about. But what I am seeing is a big issue.
compare
how it was
and how it is today
Note that, despite the number of annotations to "regulation of transcription" being identical, the number of annotations to transcription changes radically.