RefSeq regular import of publications from WormBase #175
I think I've found a fairly easy way: go to the WormBase FTP site and get this file:

ftp://ftp.wormbase.org/pub/wormbase/releases/WS280/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.WS280.reuters_citation_index.xml.gz

For C. elegans you can get the organism, PubMed ID and gene like this (it is an XML file, so simple XML parsing should work fine in a script):

zcat wormbase/releases/current-development-release/species/c_elegans/PRJNA13758/annotation/c_elegans.PRJNA13758.*.reuters_citation_index.xml.gz | grep -e '' -e 'citation pubmed_id' -e 'record record_id='

At the moment we only have this file for C. elegans, but we might be able to extend it to the other species too. For each gene, the file lists the publications associated with that gene. Would you be happy with this data?
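For anyone who would rather parse the XML than grep it, here is a minimal Python sketch. It rests on assumptions I have not checked against the actual file: that each gene is a `record` element with a `record_id` attribute, that its publications are `citation` elements with a `pubmed_id` attribute nested somewhere inside that record, and that the file uses no XML namespace. Those element and attribute names come only from the grep patterns above. The function names `iter_gene_pmid_pairs` and `read_pairs` are made up for this example.

```python
import gzip
import xml.etree.ElementTree as ET


def iter_gene_pmid_pairs(xml_stream):
    """Yield (record_id, pubmed_id) pairs from a citation-index XML stream.

    Assumed layout (not verified against the real file):
    <record record_id="..."> ... <citation pubmed_id="..."/> ... </record>
    """
    # Stream the document so a large file is not held in memory at once.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "record":
            continue
        record_id = elem.get("record_id")
        for citation in elem.iter("citation"):
            pmid = citation.get("pubmed_id")
            # Skip citations without a PubMed ID.
            if record_id and pmid:
                yield record_id, pmid
        # Drop the finished record's children to keep memory use low.
        elem.clear()


def read_pairs(path):
    """Return sorted, de-duplicated pairs from a gzipped XML file at `path`."""
    with gzip.open(path, "rb") as handle:
        return sorted(set(iter_gene_pmid_pairs(handle)))
```

Calling `read_pairs` on a locally downloaded copy of the `.xml.gz` file would give a list of gene:PMID tuples that can be written out as a two-column table. If the real file turns out to use different element names or a namespace, the tag checks need adjusting.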
RefSeq would like to set up a regular import of publications from WormBase that can be used to better link gene and publication records.
We already have a process in place with MGI, RGD, and ZFIN that pulls in publication links off their respective FTP sites.
Is there a way we could do that from WormBase to get gene:publication pairs, with publication identifiers that we can convert to PMIDs?
We looked around in your downloads, and didn't find anything, but may not have been looking in the right spot.
I do see that references can be downloaded for individual genes but was looking for a bulk download option.