inferred gaf (F-P) links, correct location and possible bugs #524
Doug's comments on geneontology/go-annotation#1683 make it pretty clear that the world is meant to switch to files with "prediction.gaf" names, e.g. the pombase-prediction.gaf files linked in his first comment there. In another comment Doug also says that "prediction.gaf" files are intended to replace ".inf.gaf" files, and should have the same contents. But there are some big problems:
In geneontology/go-annotation#1683 (comment) Doug asked for examples of annotations we would expect to see inferred. So here are a couple:
|
So just to be clear, the inf and prediction gaf files are the same file, just renamed. Owltools is for some reason no longer producing the other predicted lines. We're getting to the bottom of why that is. |
Thanks for your patience. Here is where we are.
filenames and locations
file names: you are correct about the filename transition from old to new, .inf.gaf -> -prediction.gaf
locations: as Eric stated in geneontology/go-annotation#1683 (comment)
The http://build.berkeleybop.org jobs are the old pipeline. Apologies if this is confusing while we're in transition.
lack of inferences
The aspect in the generated GAFs may not be correct, see separate ticket: #524
We believe the other issues are to do with a recent ontology change; we hope to have this figured out soon.
UPDATE
I am still a bit baffled. There is some change in the ontology that caused us to drop inferences. However, when I try to reduce it to a small test case, the inferences always succeed. Further details here: owlcollab/owltools#238, but we will update this ticket when we have resolved the underlying problem. |
OK GOT IT
We have two part_ofs in the GO import chain: the official BFO:0000050 part-of that is used directly in GO, and so#part_of, which comes from the so_import module. While these have unambiguous URIs as far as the OWLAPI is concerned, in c16 we made the decision to use labels like
This means that the c16 expressions were being mapped to the wrong URI, so they weren't matching definitions in the ontology. This explains the lack of deepening 'IC' inferences.
This was hard to trace: it only manifests in some contexts.
This doesn't necessarily explain everything, but I suspect the lack of F->P is related.
No fixes until Monday, but I can rest easier now that we have found this. |
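To illustrate the failure mode described above, here is a minimal sketch of how resolving a relation by label can silently pick the wrong URI when two imported properties share that label. This is not the owltools code. The SO URI is a placeholder, the shared label `part_of` is assumed for illustration, and all function names are hypothetical.

```python
# Two distinct properties that end up with the same label (assumed: "part_of").
BFO_PART_OF = "http://purl.obolibrary.org/obo/BFO_0000050"
SO_PART_OF = "http://example.org/so#part_of"  # placeholder URI, not the real one

# (uri, label) pairs in the order the imports are loaded (assumed order).
properties = [
    (BFO_PART_OF, "part_of"),
    (SO_PART_OF, "part_of"),
]


def naive_label_index(props):
    """Map label -> URI; a later import silently shadows an earlier one."""
    index = {}
    for uri, label in props:
        index[label] = uri
    return index


def strict_label_index(props):
    """Map label -> set of URIs so that ambiguity stays visible."""
    index = {}
    for uri, label in props:
        index.setdefault(label, set()).add(uri)
    return index


def resolve(label, index):
    """Return the single URI for a label, or raise if it is ambiguous or unknown."""
    uris = index.get(label, set())
    if len(uris) != 1:
        raise ValueError(f"label {label!r} maps to {len(uris)} URIs: {sorted(uris)}")
    return next(iter(uris))
```

With the naive index, `part_of` resolves to whichever property was loaded last, so an expression meant for BFO:0000050 no longer matches definitions in the ontology. The strict variant refuses to guess and raises instead.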
Yep, this explains lack of F->P too - see BasicAnnotationPropagator.getDirectLinkedClasses |
This is because SO has fake part-ofs that are confusing both to curators and, unfortunately, to software (this will be fixed); see owlcollab/owltools#238. This is a partial fix to geneontology/go-site#524; however, the software needs to be more robust.
Fabulous! Let me know when it all settles down and I'll run through my checks. |
The fix should have propagated and you should have had a shiny new report. But the pipeline encountered a new class of error in one of the GAFs: geneontology/go-annotation#1797. The pipeline 'fails fast' in these scenarios. For this class of aspect error we will instead strip the offending line and report it, rather than hard failing. @dougli1sqrd We should have this resolved soon. |
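As a rough sketch of the "strip the line and report" behaviour described above (not the actual pipeline code), the filter below drops rows whose Aspect column is not one of F, P or C and records them in a report instead of aborting. It assumes the error class is an invalid Aspect value and that the input is tab-separated GAF; `filter_gaf_lines` is a hypothetical name.

```python
VALID_ASPECTS = {"F", "P", "C"}
ASPECT_COL = 8  # zero-based index of GAF column 9 (Aspect)


def filter_gaf_lines(lines):
    """Return (kept_lines, report); bad rows are dropped and reported, not fatal."""
    kept, report = [], []
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("!"):
            kept.append(line)  # header and comment lines pass through untouched
            continue
        cols = line.rstrip("\n").split("\t")
        aspect = cols[ASPECT_COL] if len(cols) > ASPECT_COL else ""
        if aspect in VALID_ASPECTS:
            kept.append(line)
        else:
            report.append((lineno, f"invalid aspect {aspect!r}"))
    return kept, report
```

The report keeps the original line numbers, so a stripped row can be traced back to the submitted file.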
When all the inferred reports are bright and shiny, please send out a message to the whole GOC with their new location. Then everyone who is still pointing their loading scripts at any other location can update accordingly. Thanks. |
OK what is the current status? |
Sorry for the late response; we were at a meeting for the last 2 days. @ValWood can you check
The changes to the ontology have percolated through now. I have not thoroughly checked the file, but all flavors of inference appear to be present. Don't bookmark this URL yet. We are still finalizing the URLs for the reporting part of the new pipeline. @dougli1sqrd will provide full details when he is back on Monday. |
Sort of - the Aspect column has "C" in a lot of cases where the term isn't a component term. Like:
GO:0006412 is a process term. |
Is the file ready to be checked? We are still not sure whether we should be using the file from the new location yet. Once this is clear I can check. |
@ValWood This should be the forever home of the prediction GAF in the new pipeline: |
@ValWood When you've had a chance to proof this file: please report back. :) I'll switch our pull of MFBP data to the analogous location for the TAIR file if yours passes your stringent tests. http://snapshot.geneontology.org/products/annotations/tair-prediction.gaf |
Note to self: cronjob 14 is the one that needs updating |
OK, we will switch to use the new file location since it has the same contents as the old file that we are currently using. I will check the file contents and report any issues clearly in new tickets. As I do this I'll close off any duplicate issues, because there are often multiple issues per ticket, and long threads. For me, this ticket can close if people are happy that the new location is the correct file to use. There are still data issues, more soon.... |
OK @tberardini here is my summary:
There still appears to be an issue of missing annotation and reduced file size: Midori’s ticket
There is a shiny new ticket for redundant annotations:
There are problems with evidence code transfer, discussed in Rachel’s ticket here:
Some standard GO syntax/consistency checks are required.
We are currently filtering the component annotations for PomBase because we think the evidence codes are misleading: geneontology/go-annotation#1487.
Otherwise I think the files are good to use, but the redundancy is annoying and the file does not contain all of the annotations we would expect. We can close this ticket because the issues are all covered in clearer tickets.
|
Can this be closed...only need to let everyone know about the file location move |
New file location is? |
Maybe this can be closed? |
@ValWood what was the action ? You want the file location of which file ? |
Actually, maybe there is an action required, reading the thread.
http://build.berkeleybop.org/job/gaf-check-pombase/lastSuccessfulBuild/artifact/gene_association.pombase.inf.gaf
Seth said: monthly is
I think all of the problems with contents are in other tickets. |
I don't have permissions to do this. Could you, @pgaudet? |
Chris covered new file locations at the meeting. It might be good to let people know about these changes via the mailing list. At least if the old files are removed it will force people to update..... |
OK, I think we can close this? |
We are currently getting the file from here:
http://build.berkeleybop.org/job/gaf-check-pombase/lastSuccessfulBuild/artifact/gene_association.pombase.inf.gaf
I don't know if this is the correct place. Can you confirm?
I notice a number of issues with this file.
The major one so far is that it contains many IC annotations to process terms with an F in the aspect column.
See, for example:
PomBase SPBC2F12.13 klp5 GO:1990758 PMID:21664573 IC GO:0005515 F kinesin-8 family plus-end microtubule motor Klp5 sot1 protein taxon:4896 20161018 GOC-OWL part_of(GO:1990758)
(there are many more)
I'll document the other problems once we have established that we are using the correct file.
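The aspect problem reported above can be checked mechanically. The following is a minimal sketch, assuming a tab-separated GAF file and a lookup from GO ID to ontology namespace. The `TERM_NAMESPACE` table is a hypothetical stand-in for a real ontology lookup, and the tab positions in `EXAMPLE` (including the split between object name and synonym) are a reconstruction of the flattened row quoted above.

```python
ASPECT_FOR_NAMESPACE = {
    "molecular_function": "F",
    "biological_process": "P",
    "cellular_component": "C",
}

# Hypothetical lookup table; a real check would read namespaces from the ontology.
TERM_NAMESPACE = {
    "GO:1990758": "biological_process",  # a process term, per the report above
}


def aspect_mismatches(gaf_lines, term_namespace):
    """Yield (line_number, go_id, found_aspect, expected_aspect) for bad rows."""
    for lineno, line in enumerate(gaf_lines, start=1):
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # too short to carry an aspect; out of scope here
        go_id, aspect = cols[4], cols[8]  # GAF columns 5 (GO ID) and 9 (Aspect)
        namespace = term_namespace.get(go_id)
        if namespace is None:
            continue  # unknown term: nothing to compare against
        expected = ASPECT_FOR_NAMESPACE[namespace]
        if aspect != expected:
            yield lineno, go_id, aspect, expected


# The row quoted above, with tabs restored (column boundaries are assumed).
EXAMPLE = "\t".join([
    "PomBase", "SPBC2F12.13", "klp5", "", "GO:1990758", "PMID:21664573",
    "IC", "GO:0005515", "F",
    "kinesin-8 family plus-end microtubule motor Klp5", "sot1", "protein",
    "taxon:4896", "20161018", "GOC-OWL", "part_of(GO:1990758)", "",
])
```

Running `list(aspect_mismatches([EXAMPLE], TERM_NAMESPACE))` flags the row because the aspect column holds F while the lookup table marks the term as a process term.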