PETRA

Pretrained Evolutionary Transformer for SARS-CoV-2.

Inference: http://cpredict.cn

Codebase for PETRA, built on Megatron-LM.

"data_preprocess" contains codes for data preprocessing. "model_training" contains codes for model training and some supplementary codes. "results" contains experimental results. "inference" for inferencing the model "petra_article.pdf", paper introducing petra

Model weights are available on Hugging Face.

Separate `Readme.txt` files describing usage of the data-preprocessing and model-training code can be found in their respective directories.

Performance

Baseline: the Bloom mutation-fitness estimator (https://github.com/jbloomlab/SARS2-mut-fitness).

Nucleotide mutation prediction results for PETRA

We report average and weighted recall @1, @10, and @100. In the weighted measure, sequences are weighted by their representativeness.

| Method | Average Recall @1 | @10 | @100 | Weighted Recall @1 | @10 | @100 |
| --- | --- | --- | --- | --- | --- | --- |
| Random Guess | 0.00% | 0.01% | 0.08% | 0.00% | 0.01% | 0.08% |
| Bloom | 0.45% | 1.50% | 9.15% | 0.49% | 1.48% | 9.41% |
| PETRA | 11.34% | 16.92% | 22.64% | 9.45% | 14.20% | 19.72% |
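For clarity, the metrics above can be sketched as follows. This is a minimal, hypothetical illustration (the function names and the example `batch` are not from the PETRA codebase): for each sequence the model ranks candidate mutations, recall@k is the fraction of that sequence's true mutations appearing in the top-k predictions, and the weighted variant averages per-sequence recall using sequence weights such as representativeness.

```python
def recall_at_k(ranked_predictions, true_mutations, k):
    """Fraction of true mutations found in the top-k ranked predictions."""
    if not true_mutations:
        return 0.0
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for m in true_mutations if m in top_k)
    return hits / len(true_mutations)

def average_recall_at_k(batch, k):
    """Unweighted mean of per-sequence recall@k."""
    scores = [recall_at_k(preds, truth, k) for preds, truth, _ in batch]
    return sum(scores) / len(scores)

def weighted_recall_at_k(batch, k):
    """Per-sequence recall@k averaged with per-sequence weights."""
    num = sum(w * recall_at_k(preds, truth, k) for preds, truth, w in batch)
    den = sum(w for _, _, w in batch)
    return num / den

# Each entry: (ranked predicted mutations, true mutations, sequence weight).
# Mutation labels here are illustrative placeholders.
batch = [
    (["C241T", "A23403G", "G28883C"], ["C241T"], 2.0),
    (["G28883C", "C241T", "A23403G"], ["A23403G", "C241T"], 1.0),
]
print(average_recall_at_k(batch, 1))   # mean of 1.0 and 0.0 -> 0.5
print(weighted_recall_at_k(batch, 1))  # (2*1.0 + 1*0.0) / 3
```

In a real evaluation the candidate list would cover all possible nucleotide (or spike amino-acid) mutations, which is why random guessing scores near zero at small k.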

Spike amino-acid mutation prediction results for PETRA

We report average and weighted recall @1 and @10. In the weighted measure, sequences are weighted by their representativeness.

| Method | Average Recall @1 | @10 | Weighted Recall @1 | @10 |
| --- | --- | --- | --- | --- |
| Random Guess | 0.01% | 0.13% | 0.01% | 0.13% |
| Bloom | 6.26% | 12.63% | 6.64% | 13.08% |
| PETRA | 17.84% | 25.69% | 17.10% | 25.58% |
