[New Model] ProSST (NeurIPS2024) #49

Open. Wants to merge 5 commits into base: main

@tyang816 tyang816 commented Oct 3, 2024

Hi Pascal,

My VS Code seems to have automatically closed the previous PR, but that is not a big problem.

As suggested by @fteufel, I added structure tokenization to the new code. Also, our paper "ProSST: Protein Language Modeling with Quantized Structure and Disentangled Attention" has been accepted at NeurIPS 2024.

My previous PR:

We recently developed a new structure-sequence model called **ProSST**, which achieves a weighted average score of **0.504** at K=2048 on the substitution benchmark.

1. Added the *prosst* scoring code in `proteingym/baselines/prosst`.
2. Added *prosst* entries to `config.json` and `constants.json`.
3. Added the scoring script, including the [Hugging Face path](https://huggingface.co/AI4Protein), at `scripts/scoring_DMS_zero_shot/scoring_ProSST_substitutions.sh`.

It is worth noting:
1. A new conda environment is required. The configuration file is `proteingym/baselines/prosst/environment.yaml` (the same as for **ProtSSN**).
2. The original PDB files are the same as those used for **ProtSSN**; you can [download](https://lianglab.sjtu.edu.cn/files/ProtSSN-2024/proteingym_v1_pdb.tar.gz) them here. The processed sequence and structure files can be [downloaded](https://drive.google.com/file/d/1lSckfPlx7FhzK1FX7EtmmXUOrdiMRerY/view?usp=sharing) here.
3. More information about the model is available in the [ProSST repository](https://github.com/ai4protein/ProSST).
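
For anyone reproducing the setup, here is a minimal sketch of fetching and unpacking the original PDB archive linked in note 2. It is only an illustration and not part of the PR: the helper names `download_archive` and `extract_archive` and the local `data/prosst` directory are hypothetical, and the internal layout of the archive is not assumed.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

# URL copied from note 2 above.
PDB_ARCHIVE_URL = "https://lianglab.sjtu.edu.cn/files/ProtSSN-2024/proteingym_v1_pdb.tar.gz"


def download_archive(url: str, dest: Path) -> Path:
    """Download `url` to `dest`, skipping the download if the file already exists."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_archive(archive: Path, out_dir: Path) -> List[str]:
    """Extract a .tar.gz archive into `out_dir` and return its member names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    root = out_dir.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        # Refuse members that would be written outside the target directory.
        for name in names:
            target = (root / name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {name}")
        tar.extractall(root)
    return names


# Example usage (hypothetical local paths; uncomment to run the download):
# archive = download_archive(PDB_ARCHIVE_URL, Path("data/prosst/proteingym_v1_pdb.tar.gz"))
# members = extract_archive(archive, Path("data/prosst/pdb"))
# print(f"extracted {len(members)} entries")
```

The conda environment from note 1 can be created with the standard `conda env create -f proteingym/baselines/prosst/environment.yaml`; the environment name is whatever that file defines.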

Thanks again for your community efforts. We will continue to follow your group's work.

@pascalnotin

@tyang816 -- thank you for the updates, and congrats on your acceptance at NeurIPS!
@fteufel has been looking into your PR (thanks again!) and we should be able to merge it into the main branch soon.
Best regards,
Pascal
