First, this report requires raw data to work with. Some of it is exported from our zfin database and some is downloaded from external sources. The prerequisites are partially compiled using the export.sh bash script.
The tables we export are: db_link, marker, feature_marker_relationship, external_note (some are limited to only certain entries by SQL query clauses).
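As a rough sketch of what one export step looks like (export.sh holds the real queries; the database name here is an assumption, as is the use of a local PostgreSQL instance):

```bash
# Hypothetical export step; export.sh contains the actual queries.
# Assumes a local PostgreSQL database named "zfin".
psql zfin -c "\copy (SELECT * FROM db_link) TO 'db_link.csv' WITH CSV HEADER"
# Tables like external_note are exported with an additional WHERE clause
# to limit them to the relevant entries (see export.sh).
```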
The files we download are:
- uniprot2ensembl.csv (https://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/DANRE_7955_idmapping.dat.gz)
- to_keep.csv (provided by dushi at https://ftp.ebi.ac.uk/pub/contrib/dushi/zfin/to_keep.dat)
- to_delete.csv (provided by dushi at https://ftp.ebi.ac.uk/pub/contrib/dushi/zfin/to_delete.dat)
- ncbi_map (https://ftp.ncbi.nlm.nih.gov/gene/DATA/GENE_INFO/Non-mammalian_vertebrates/Danio_rerio.gene_info.gz)
- refseq2ncbi.csv (created using gradle -DapiKey=3b0a6a579791d9b134b44c6559fbb4434708 -DncbiLoadInput=./to_keep_primary_ids.txt -DncbiLoadOutput=./refseq2ncbi batchNCBIFetchByRefSeqTask; see the zfin-8395 branch in rtaylorzfin)
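Fetching the downloads is plain curl plus gunzip; here is one file as a sketch, with the others following the same pattern (the rename to the local uniprot2ensembl.csv name is an assumption about how the report expects the file to be named):

```bash
# Fetch and unpack the UniProt id-mapping file for Danio rerio.
curl -sO https://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/DANRE_7955_idmapping.dat.gz
gunzip DANRE_7955_idmapping.dat.gz
# Assumed local naming: the report expects this file as uniprot2ensembl.csv.
mv DANRE_7955_idmapping.dat uniprot2ensembl.csv
```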
In total, you should start out with the following tables: db_link, refseq2ncbi, extnote_note, to_delete, feature_marker_relationship, to_keep, marker, uniprot2ensembl, ncbi_map. All of them are included in the file zfin-db-slice.db.gz.
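If you start from the prebuilt slice instead of rebuilding it, you can unpack it and verify that all nine tables are present:

```bash
# Unpack the prebuilt SQLite slice and list its tables.
gunzip zfin-db-slice.db.gz
sqlite3 zfin-db-slice.db '.tables'
```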
The above data files are imported into an SQLite database with csv-to-sqlite. The export.sh file shows some examples of doing that.
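A minimal sketch of the import, assuming csv-to-sqlite's -f/-o flags (export.sh shows the real invocations):

```bash
# Install the importer and load the CSVs into a single SQLite database.
pip3 install csv-to-sqlite
csv-to-sqlite -f db_link.csv -f marker.csv -f to_keep.csv -o zfin-db-slice.db
```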
These reports can be created dynamically using Google's Colaboratory at: https://colab.research.google.com/drive/1-ublRmMiadE1kX-3T9x4NkebqUqWw3LY#scrollTo=m4OQhvxYta5S
The notebook can also be run locally using Jupyter. Install the prerequisites:
- pip3 install jupyterlab
- pip3 install notebook
- pip3 install pandas sqlalchemy ipython-sql
- pip3 install openpyxl
- pip3 install 'nbconvert[qtpdf]'
Then run "jupyter notebook" in this directory. Running locally has the advantage of better exports. I have had success with exporting to html, then opening that html with MS Word and saving as docx.