This GitHub repository contains the designs, and their evaluations, that I created for the EGFR binder design competition at Adaptyv Bio. I'm also sharing some software and blog posts that were helpful during the competition.
- Submitted and wet-lab validated design, visualized with hgbrian/pdb2png: EGFR_ECOD_l110_s478327_mpnn7_model2
:: 👀 Adaptyv Bio Protein Design Competition home
:: 👀 EGFR Binders (Round2)
:: 👀 All Results by AdaptyvBio
:: 👀 My Bind Design
- EGFR_l88_s11832_mpnn1_model2 AdaptyvBio results
Two-Stage Screening considering different architectures between AlphaFold2 and AlphaFold3 in De Novo Design
I generated over 350 de novo designs using BindCraft and RSO (Corey's biomodal implementation). For the BindCraft designs, I explored various conditions, including targeting different regions of Domain III and generating designs without specified hotspots. The designs underwent a two-stage screening process using ColabFold and the AlphaFold3 server. Some designs achieved ColabFold scores predicted to place within the top 100 of the leaderboard, yet scored lower in AlphaFold3 predictions. (For example, design "EGFR_l170_s387368_mpnn7_model2" shows an AF2 ipTM of 0.89 but an AF3 ipTM of only 0.17, and some designs show a decrease from AF2 ipTM values to AF3 ipTM values ranging from -1.0 to -2.0).
This discrepancy may be attributed to AlphaFold3's diffusion module and modifications to its loss function. AlphaFold2 predicts residue-based coordinates and penalizes structural violations. At binding interfaces, AlphaFold2's residue-based evaluation focuses on overall backbone placement, which might not fully capture local atomic interactions such as side-chain packing and electrostatic interactions that are important for binding affinity. These interface issues, which are crucial for actual protein-protein interaction and binding specificity, might go undetected in residue-based evaluation.
In contrast, AlphaFold3's all-atom diffusion model directly predicts atomic coordinates without such constraints, potentially revealing these subtle instabilities at the interface that could affect binding. Designs that score well in both systems have passed two different types of evaluation criteria - AlphaFold2's residue-level assessment and AlphaFold3's atomic-level prediction - which may increase confidence in their structural predictions.
Although AlphaFold3 is not currently open-source, making this analysis somewhat speculative, if Figures 1-3 in the AlphaFold3 paper are accurate, it would be excellent for evaluating de novo designs. Consequently, designs showing strong scores in both ColabFold and AlphaFold3 may indicate greater stability and binding potential. While AlphaFold3 limits submissions to 20 jobs per day, this was sufficient for conducting two-stage screening (except for EGFR_l61_s456546_mpnn1_model1 and EGFR_l62_s814650_mpnn14_model1).
Note: The AlphaFold3 inference source code has since been released here.
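The two-stage screening described above can be sketched as a simple filter. This is a minimal illustration, not the exact procedure I ran: the `Design` class, the `two_stage_screen` function, the 0.85 thresholds, and the two `hypothetical_design_*` entries are assumptions made for the example. Only the ipTM values for EGFR_l170_s387368_mpnn7_model2 come from the text.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    af2_iptm: float  # ipTM from ColabFold (AlphaFold2)
    af3_iptm: float  # ipTM from the AlphaFold3 server


def two_stage_screen(designs, af2_min=0.85, af3_min=0.85):
    """Keep designs that pass both ipTM cutoffs (cutoff values are assumed)."""
    # Stage 1: cheap ColabFold filter, so only survivors use the AF3 job quota.
    stage1 = [d for d in designs if d.af2_iptm >= af2_min]
    # Stage 2: AlphaFold3 filter on the stage 1 survivors.
    stage2 = [d for d in stage1 if d.af3_iptm >= af3_min]
    # Rank by the weaker of the two scores, best first.
    return sorted(stage2, key=lambda d: min(d.af2_iptm, d.af3_iptm), reverse=True)


designs = [
    Design("EGFR_l170_s387368_mpnn7_model2", 0.89, 0.17),  # values quoted above
    Design("hypothetical_design_A", 0.91, 0.90),  # made-up example
    Design("hypothetical_design_B", 0.70, 0.88),  # made-up example
]

passed = two_stage_screen(designs)
for d in passed:
    print(d.name, d.af2_iptm, d.af3_iptm)
```

Running stage 1 first keeps the number of AlphaFold3 server jobs small, which matters given the daily submission limit.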
📊 two-stage screening
(ColabFold + AlphaFold3, AF2 ipTM 0.89~0.93 → AF3 ipTM 0.89~0.93)
- EGFR_l114_s689302_mpnn1_model1
- EGFR_l137_s922133_mpnn6_model2
- EGFR_l147_s449248_mpnn1_model2
- EGFR_l110_s478327_mpnn7_model2
- EGFR_l164_s996609_mpnn9_model2
- EGFR_l110_s478327_mpnn6_model1
- EGFR_l84_s528582_mpnn7_model1
- EGFR_l88_s11832_mpnn1_model2
---
📊 ColabFold-only (AlphaFold2 + MMseqs2) screening
(AF2 ipTM 0.91~0.93 → AF3 ipTM 0.82~0.85)
- EGFR_l61_s456546_mpnn1_model1
- EGFR_l62_s814650_mpnn14_model1
---
📊 wet-lab validation (3 designs selected)
- EGFR_l61_s456546_mpnn1_model1
- EGFR_l110_s478327_mpnn7_model2
- EGFR_l88_s11832_mpnn1_model2
All my designs (BindCraft) and their evaluation results can be accessed at:
https://wdmr2dwp.nocodb.com/#/nc/view/48433702-4717-474f-8c66-49846d71e4b8
:: Overview of all my designs and their integrated scores (iPAE, ipTM, and ESM2 log-likelihood).
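How the three metrics are combined into one integrated score is not spelled out here. One plausible approach, assumed purely for illustration, is to rank the designs on each metric and average the ranks. The function names and all numeric values below are hypothetical.

```python
def rank(values, higher_is_better=True):
    """Return 1-based ranks (1 = best); ties are broken by input order."""
    order = sorted(
        range(len(values)), key=lambda i: values[i], reverse=higher_is_better
    )
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def integrated_rank(ipae, iptm, esm2_ll):
    """Average the per-metric ranks (an assumed scheme); lower is better."""
    r_ipae = rank(ipae, higher_is_better=False)  # lower iPAE is better
    r_iptm = rank(iptm)  # higher ipTM is better
    r_esm2 = rank(esm2_ll)  # higher (less negative) log-likelihood is better
    return [(a + b + c) / 3 for a, b, c in zip(r_ipae, r_iptm, r_esm2)]


# Hypothetical values for three designs, in the same order in each list.
ipae = [6.5, 9.0, 7.2]
iptm = [0.92, 0.80, 0.89]
esm2_ll = [-120.0, -95.0, -110.0]

scores = integrated_rank(ipae, iptm, esm2_ll)
print(scores)
```

Rank averaging avoids having to put iPAE, ipTM, and log-likelihood on a common scale, at the cost of ignoring how large the gaps between designs are.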
Here is helpful information about Adaptyv Bio's protein design competition and protein design in general. For those new to protein binder design, I recommend checking out the starred (⭐️) GitHub repositories and blog posts first.
- Methods for binder designs
- ⭐️ BindCraft, User friendly and accurate binder design pipeline
- biomodals, bioinformatics tools running on modal
- RSO, coreyhowe999's modal implementation
- chai-1, SOTA model for biomolecular structure prediction
- Boltz-1: Democratizing Biomolecular Interaction Modeling
- ESM-3, frontier generative model for biology
- ESM-2, Evolutionary Scale Modeling
- ⭐️ ColabFold, Making Protein folding accessible to all!
- RFDiffusion, OSS structure generation methods
- dl binder design, de novo binder designs with DL
- ProteinMPNN, structure to sequence
- pLMs and pLLMs
- Competition Results
- Jude Wells@_judewells Chai1 optimization
- Corey@design_proteins tried BindCraft
- Anthony Gitter@anthonygitter shared additional wet validation results
- Martin Pacesa@MartinPacesa discussed round 2 competition metrics
- GitHub Repo
- ⭐️ Official Blog