
GOAT: GO Annotation with the Transformer model

Libraries needed

pytorch, pytorch-transformers, nvidia-apex

Where are the pre-trained models?

We adapt the Transformer neural network model to predict GO labels for protein sequences. We trained our method on the DeepGO datasets, which we used as a baseline in our paper. You can download our trained models here.

During training, we saved the model at each checkpoint. Once training finished, we kept only the checkpoint that performed best on the dev dataset. You will see these saved files named in the format checkpoint-number.

The config.json file shows how the Transformer model was trained. Please see this demo script, which shows how to use a trained model to evaluate a test set and how to explore some of the model properties.
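
As a rough sketch, a saved checkpoint directory can be loaded with the standard pytorch-transformers from_pretrained machinery. The snippet below is illustrative only: it assumes the checkpoint follows the usual layout (config.json plus pytorch_model.bin) and uses BertModel as a stand-in for this repository's own model class; the demo script above shows the exact class and evaluation loop.

```python
# Illustrative sketch only: BertModel stands in for the repository's own model
# class, and the checkpoint path is hypothetical. See the demo script for the
# exact loading and evaluation code.
import torch
from pytorch_transformers import BertConfig, BertModel

checkpoint_dir = "path/to/checkpoint-100000"  # one of the saved checkpoint-number folders

config = BertConfig.from_pretrained(checkpoint_dir)   # records how the model was trained
model = BertModel.from_pretrained(checkpoint_dir, config=config)
model.eval()

with torch.no_grad():
    # Placeholder batch; in practice this is a tokenized amino-acid sequence.
    input_ids = torch.zeros((1, 16), dtype=torch.long)
    outputs = model(input_ids)
```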

How to train your model?

You can train your own model. Your input data must match the format shown here. The high-level format is

protein_name \t sequence \t label \t protein_vector_from_external_source \t domain_motif_in_sequence
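
For illustration, the sketch below (a hypothetical helper, not part of the repository) shows how one such tab-separated line splits into its five fields.

```python
# Hypothetical helper, not part of the repository: split one training example
# into the five tab-separated fields described above.
def parse_line(line):
    protein_name, sequence, labels, protein_vector, domains = line.rstrip("\n").split("\t")
    return {
        "protein_name": protein_name,
        "sequence": sequence,                  # amino-acid sequence
        "labels": labels,                      # GO labels to predict
        "protein_vector": protein_vector,      # embedding from an external source
        "domains": domains,                    # domains / motifs found in the sequence
    }
```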

We support 4 training options:

  1. Base Transformer
  2. Domain data (like motifs, compositional bias, etc.)
  3. External protein data (like 3D structure, protein-protein interaction network)
  4. Any combination of the above.
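
As a hypothetical illustration of how these options relate to the input columns above (the flag and field names below are not from the repository), the base Transformer consumes only the sequence and labels, options 2 and 3 additionally consume the domain column and the external protein vector, and option 4 consumes both.

```python
# Hypothetical illustration, not repository code: which input columns feed
# each of the training options listed above.
def build_model_inputs(example, use_domains=False, use_protein_vector=False):
    inputs = {"sequence": example["sequence"], "labels": example["labels"]}  # option 1
    if use_domains:                # option 2: motifs, compositional bias, ...
        inputs["domains"] = example["domains"]
    if use_protein_vector:         # option 3: 3D-structure or PPI-network embedding
        inputs["protein_vector"] = example["protein_vector"]
    return inputs                  # enabling both flags corresponds to option 4
```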

You can download the most recent manually annotated data from Uniprot.org. The site also provides all known motifs and domains for a given sequence. You may have to do a custom download from Uniprot for this extra information.

We do not have the pre-trained encoder used in DeepGO that provides embeddings for proteins in a protein-protein interaction network.

We do have the pre-trained encoder that provides embeddings representing 3D structures of proteins.
