Homo-oligomer prediction? #8
Hi Heejong, We focused on heteromeric assemblies in this release, since homomers pose a different challenge. Nevertheless, you can predict homomers. We cannot distinguish intra- and inter-protein links in this case, so you would simply define them as self-links.
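As a rough illustration of what self-links could look like, the sketch below writes links whose two ends both carry chain ID A. The whitespace-separated column order (residueFrom chain1 residueTo chain2 FDR) is an assumption that should be checked against the AlphaLink2 README, `format_links` is a hypothetical helper, and the residue numbers and FDR values are made up.

```python
def format_links(links):
    """Render (res_from, chain_from, res_to, chain_to, fdr) tuples as text rows."""
    return "\n".join(
        f"{res_from} {chain_from} {res_to} {chain_to} {fdr}"
        for res_from, chain_from, res_to, chain_to, fdr in links
    )


# Hypothetical homomer links: both ends are labelled A (self-links).
self_links = [(10, "A", 45, "A", 0.05), (23, "A", 101, "A", 0.05)]
link_table = format_links(self_links)
print(link_table)
```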
If, however, you would like to include them only as inter-chain links, it gets a little more complicated. One option is to replicate the features. Say you have a homo-dimer: AlphaLink will generate A.feature.pkl.gz and A.uniprot.pkl.gz. You could copy them to B.feature.pkl.gz and B.uniprot.pkl.gz and change chains.txt from A A to A B. You can then include the inter-chain links as links between chains A and B.
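The copy-and-rename step for a homo-dimer can be sketched with the standard library alone. This is a minimal sketch, assuming the feature files and chains.txt sit in the same directory; `replicate_chain` is a hypothetical helper, not part of AlphaLink or Uni-Fold.

```python
import shutil
from pathlib import Path

# File suffixes named in the comment above.
SUFFIXES = (".feature.pkl.gz", ".uniprot.pkl.gz")


def replicate_chain(feature_dir, src="A", dst="B"):
    """Copy the src chain's feature files to dst and list both in chains.txt."""
    feature_dir = Path(feature_dir)
    for suffix in SUFFIXES:
        shutil.copyfile(feature_dir / f"{src}{suffix}", feature_dir / f"{dst}{suffix}")
    # Homo-dimer case only: chains.txt changes from "A A" to "A B".
    # Assumption: chains.txt lives next to the feature files.
    (feature_dir / "chains.txt").write_text(f"{src} {dst}\n")
```

Calling `replicate_chain` on the feature directory of the target leaves the original A files untouched and adds the B copies, so links can afterwards be written between A and B.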
Alternatively, you can ignore intra-chain links altogether by modifying the code here: https://github.com/Rappsilber-Laboratory/AlphaLink2/blob/main/unifold/dataset.py#L153
Note that you would need to run python setup.py install again afterwards to propagate the changes.
At the moment, you need to adhere to the A, B, C, ... naming scheme. Uni-Fold internally maps the sequences, in order, to A, B, C, ... The final mapping can be found in "chain_id_map.json" in the output directory. What went wrong in your case? What would you prefer, just using the sequence ID from the FASTA? The generic naming scheme makes things easier, especially for homo-multimeric targets. Hope this helps. Edit: updated the code snippet to conform with the recent update.
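To check which sequence ended up under which letter, the mapping file can be read back with a few lines of Python. This is a sketch that assumes only that "chain_id_map.json" is a JSON object keyed by chain ID; the structure of each entry is not assumed, and `load_chain_id_map` and `describe_chain_map` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_chain_id_map(output_dir):
    """Return the parsed contents of chain_id_map.json from an output directory."""
    with open(Path(output_dir) / "chain_id_map.json") as handle:
        return json.load(handle)


def describe_chain_map(chain_map):
    """Build one 'chain_id: entry' line per chain, sorted by chain ID."""
    # Assumption: the top level is a JSON object keyed by chain ID.
    return [f"{chain_id}: {chain_map[chain_id]}" for chain_id in sorted(chain_map)]
```

Printing the lines returned by `describe_chain_map(load_chain_id_map(output_dir))` shows the assignment for a given run.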
Hi Kolja, The error I got was more about the naming scheme in the filename. Best,
I fixed the handling of FASTA filenames with multiple underscores, which hopefully also resolves your issue.
Awesome. Much appreciated.
Hi Kolja, I'm finally circling back to this matter. I'm actively testing the homodimer case right now but, in the meantime, I've run into a more complex situation. What if you have a 5-subunit complex consisting of a homodimer and a homotrimer that all interact with each other? Thank you so much. Best,
How many links do you have per interaction? I usually just keep them; the network seems to be able to deal with them fairly well. If the results are bad, remove the homomeric links as suggested here: #8 (comment)
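One way to remove homomeric links before building the input is to filter the link table itself. The sketch below assumes links are held as (residueFrom, chain1, residueTo, chain2, FDR) tuples and that copies of the same protein share a chain ID, so a homomeric link has the same ID at both ends. It is a standalone pre-filter with a hypothetical helper name and made-up values, not the code change referenced in the linked comment.

```python
def drop_homomeric_links(links):
    """Keep only links whose two ends carry different chain IDs."""
    return [link for link in links if link[1] != link[3]]


# Hypothetical links for a complex with chain IDs A and B.
links = [
    (10, "A", 45, "A", 0.05),  # homomeric, dropped
    (12, "A", 80, "B", 0.05),  # heteromeric, kept
    (33, "B", 7, "B", 0.1),    # homomeric, dropped
]
kept = drop_homomeric_links(links)
print(kept)
```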
Hi,
Thanks for releasing a fantastic package to the scientific community.
I just started testing with the example inputs to understand the input requirements and formats.
Here's my primary question:
During the test, I got stuck on how to format the input files (FASTA and crosslinking data) for homodimer or homo-oligomer prediction. Have you tried this type of case, or designed the package for it?
And my side question is:
For the input, do I have to follow the "A", "B", "C" naming scheme, or can I be flexible on that? I tested a few different ways, but none worked very well.
Thank you very much.
best,
heejong