How to embed MHC #2

JingqiZhang1102 · 2024-05-24T19:46:26Z

Hello, I have a question regarding the embeddings of MHC. I tried the following command

python3 esm/scripts/extract.py esm1v_t33_650M_UR90S_1 /path/to/fasta_file.fasta /path/to/pt_files --repr_layers 33 --include mean

and got KeyError: '*01'.

I assume that the model esm1v_t33_650M_UR90S_1 may not be able to handle characters such as *. But based on 4_VDJDB_trainESMmodel.ipynb, mhclist contains MHC A information, and you were able to compute the embeddings of MHC. Could you elaborate on which model(s) have been used? Thank you in advance!
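One way to narrow down an error like this is to check whether the sequence lines of the FASTA file contain anything other than amino-acid letters, for example an allele name such as `A*01:01` sitting where a protein sequence should be. The sketch below is only an illustration: the `ALLOWED` set (20 standard residues plus `X`) is an assumption and the model's real vocabulary may be larger, the helper names `parse_fasta` and `find_invalid_records` are hypothetical, and the example records are placeholders rather than real MHC sequences.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption for illustration: 20 standard amino acids plus X (unknown).
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_invalid_records(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each header to the sorted unexpected characters in its sequence."""
    bad = {}
    for header, seq in parse_fasta(lines):
        unexpected = sorted(set(seq.upper()) - ALLOWED)
        if unexpected:
            bad[header] = unexpected
    return bad


# Placeholder records: the first has an allele name in place of a sequence.
example = [">mhc_1", "A*01:01", ">mhc_2", "ACDEFGHIK"]
print(find_invalid_records(example))
```

To run the same check on a real file, pass an open file handle, e.g. `find_invalid_records(open("/path/to/fasta_file.fasta"))`.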


JingqiZhang1102 · 2024-06-05T23:14:24Z

We found a website that provides protein sequences for MHC alleles. Is this how you obtained the MHC sequences to embed?
https://www.ebi.ac.uk/ipd/imgt/hla/alleles/

xinformatics · 2024-06-11T04:57:41Z

Hi @JingqiZhang1102, you are correct. We used the MHC sequences from EBI and prepared a cleaned-up version of the fasta files with correct HLA nomenclature. For embeddings the ESM1v model was used.

Hope it helps.
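For readers who want to try something similar, here is a minimal sketch of turning an allele-to-sequence mapping into FASTA text, with the allele name kept in the header and only residues on the sequence line. It is not the preprocessing used in this repository: the helper `to_fasta` is hypothetical, the validation rule (letters only) is an assumption, and the sequence shown is a placeholder rather than a real HLA protein sequence.

```python
import re
from typing import Dict


def to_fasta(allele_to_seq: Dict[str, str]) -> str:
    """Build FASTA text with the allele name as header and a letters-only sequence."""
    records = []
    for allele, seq in allele_to_seq.items():
        # Drop whitespace and line breaks, then normalise to upper case.
        cleaned = re.sub(r"\s+", "", seq).upper()
        # Reject empty sequences and anything that is not purely alphabetic,
        # e.g. an allele name such as "A*01:01" pasted in as a sequence.
        if not cleaned.isalpha():
            raise ValueError(f"{allele}: sequence is empty or has non-letter characters")
        records.append(f">{allele}\n{cleaned}")
    return "\n".join(records) + "\n"


# Placeholder sequence for illustration only.
print(to_fasta({"HLA-A*01:01": "acd efg\nhik"}))
```

The resulting text can be written to a `.fasta` file and passed to the extraction command from the original question.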
