GFF annotations are not where tripal expects #8

Closed
bradfordcondon opened this issue Nov 6, 2018 · 4 comments
@bradfordcondon

When working on #5 we found that i5k stores feature annotations loaded via GFF in featureprop.

However, Tripal stores annotations in feature_cvterm, at least when they are loaded via the direct annotation importers (tripal_analysis_interpro, tripal_analysis_kegg, tripal_analysis_go).

Furthermore, Tripal's GFF importer does try to load annotations into feature_cvterm, but only if they are given with the Ontology_term attribute. See:
https://github.com/tripal/tripal/blob/a2f3c1e3414011adfced99e98b15365326904792/tripal_chado/includes/TripalImporter/GFF3Importer.inc#L1432
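To check which attribute our annotations actually sit under (point a below), something like the following could be run over column 9 of the GFF. It is only a sketch: `parse_gff3_attributes` is a made-up helper name, the example attribute string is invented for illustration, and the parsing follows the general GFF3 convention (`tag=value` pairs separated by `;`, multiple values separated by `,`, URL-escaped), not Tripal's importer code.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split a GFF3 attributes column into a {tag: [values]} dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # GFF3 separates multiple values with commas and URL-escapes them
        attributes[tag] = [unquote(v) for v in value.split(",")]
    return attributes


# Hypothetical attribute column, for illustration only
example = (
    "ID=gene0001;Name=example;"
    "Ontology_term=GO:0005524,GO:0016301;"
    "Dbxref=InterPro:IPR000719"
)
attrs = parse_gff3_attributes(example)
print("Ontology_term:", attrs.get("Ontology_term", []))
print("Dbxref:", attrs.get("Dbxref", []))
```

Features that have annotation IDs only under Dbxref (or some other tag) and nothing under Ontology_term are the ones that would not reach feature_cvterm through the GFF importer.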

One good argument for storing annotations in feature_cvterm is the existence of the feature_cvtermprop table, which is designed to store evidence codes for annotations. The Chado table description makes this clear.

Let's:

a) look at our GFF and see what the annotations look like
b) consider counterarguments to the points I lay out above. I wouldn't be surprised if there's documentation out there supporting annotations in featureprop instead.
c) think about whether we would want to convert
d) think about whether Tripal's GFF importer should be smarter in how it searches for annotations
e) look at what generates the GFF and whether it should be printing the tags differently

@bradfordcondon bradfordcondon added question Further information is requested discussion labels Nov 6, 2018
@bradfordcondon bradfordcondon removed their assignment Apr 9, 2019
@mpoelchau
Contributor

mpoelchau commented May 3, 2019

See also GMOD/Chado#74

I've updated our GFF loading documentation to specify loading GO, KEGG and InterPro as both Ontology_term and Dbxref attributes. (The latter because 1) our current Tripal2 gene pages still expect them in featureprop, and 2) there is no OBO file for InterPro, so technically Tripal can't add these to feature_cvterm via the GFF loader.)
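As a rough illustration of what "both attributes" means for a single feature, the sketch below builds a GFF3 attribute column that repeats the same annotation IDs under Ontology_term and Dbxref. The function name, the feature ID and the `DB:accession` strings are all hypothetical; the exact format we ask for is whatever the loading documentation specifies.

```python
def dual_annotation_attributes(feature_id, annotation_ids):
    """Build a GFF3 attribute column that lists the same annotation IDs
    under both Ontology_term and Dbxref (hypothetical helper)."""
    parts = [f"ID={feature_id}"]
    if annotation_ids:
        joined = ",".join(annotation_ids)
        parts.append(f"Ontology_term={joined}")
        parts.append(f"Dbxref={joined}")
    return ";".join(parts)


# Made-up feature ID and annotation IDs, for illustration only
column9 = dual_annotation_attributes(
    "mrna0001", ["GO:0005524", "InterPro:IPR000719"]
)
print(column9)
```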

@bradfordcondon
Author

> there is no obo file for InterPro so technically Tripal can't add these to feature_cvterm

Tripal inserts the InterPro terms it encounters into chado.cvterm when loading InterPro results.

See:

https://github.com/tripal/tripal_analysis_interpro/blob/dc70e60775b41790d30d46fe574c2cf63e05ebcc/includes/TripalImporter/InterProImporter.inc#L585-L594

@mpoelchau
Contributor

Right - but the Tripal InterPro importer works only with IPRscan XML files, right (not with GFF)? Unfortunately we don't always have these. We could certainly ask users to submit these along with GFFs if they have them.

@bradfordcondon
Author

bradfordcondon commented May 3, 2019

Yes, that's definitely true. With a little work you could simply download the latest IPR definitions and batch-load them in non-OBO form (since IPR has almost no relationships anyway).

OR, run the importer just once with XML, and you'll get most of the InterPro terms loaded in as cvterms.
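For the first option, a minimal sketch of the parsing step might look like this. It assumes the definitions come as a tab-delimited list with a header row of `ENTRY_AC`, `ENTRY_TYPE` and `ENTRY_NAME`; that layout is an assumption and should be checked against the file actually downloaded. The two rows are sample data, `read_ipr_entries` is a hypothetical helper, and the actual insert into chado.cvterm is deliberately left out.

```python
import csv
import io

# Sample data in the assumed tab-delimited layout (not a real download)
SAMPLE_ENTRY_LIST = (
    "ENTRY_AC\tENTRY_TYPE\tENTRY_NAME\n"
    "IPR000719\tDomain\tProtein kinase domain\n"
    "IPR011009\tHomologous_superfamily\tProtein kinase-like domain superfamily\n"
)


def read_ipr_entries(handle):
    """Return one flat record per InterPro entry: accession, name, type.

    No relationships are read, which is why an OBO file is not needed
    for this step.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "accession": row["ENTRY_AC"],
            "name": row["ENTRY_NAME"],
            "type": row["ENTRY_TYPE"],
        }
        for row in reader
    ]


entries = read_ipr_entries(io.StringIO(SAMPLE_ENTRY_LIST))
for entry in entries:
    print(entry["accession"], entry["name"], sep="\t")
```

Each record would then become one dbxref/cvterm pair in whatever bulk loader is used.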
