homodimeric protein complexes #34

Open
jimfeng9705 opened this issue Nov 19, 2024 · 24 comments

Comments

@jimfeng9705

Regarding AlphaLink 2 modeling of a homodimeric protein and its complex with another protein: the inherent ambiguity of self-links in homo-multimeric assemblies makes it hard to distinguish inter-chain from intra-chain restraints. For example, for an A2B2 protein complex (protein A2 bound to another protein, B2), the crosslink file effectively contains four times the observed crosslinks. Is there any development to address this situation?
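
To see where the factor of four comes from, here is a minimal sketch that expands one protein-level crosslink into every chain-level assignment it could correspond to. It assumes the A2B2 complex is written as four chains A, B, C, D (two copies of each protein); the function name and data layout are illustrative assumptions, not part of AlphaLink 2.

```python
from itertools import product

def expand_link(link, copies):
    """Enumerate every chain-level pair a protein-level crosslink could map to.

    link:   (protein1, residue1, protein2, residue2)
    copies: protein name -> list of chain IDs for its identical copies
    """
    prot1, res1, prot2, res2 = link
    return [(res1, c1, res2, c2)
            for c1, c2 in product(copies[prot1], copies[prot2])]

# Hypothetical A2B2 complex: protein "A" is chains A and B,
# protein "B" is chains C and D.
copies = {"A": ["A", "B"], "B": ["C", "D"]}

self_link = expand_link(("A", 10, "A", 55), copies)   # link within protein A
hetero_link = expand_link(("A", 10, "B", 7), copies)  # link between A and B

print(len(self_link), len(hetero_link))
```

Both the self-link and the heteromeric link expand to four chain-level candidates, and only some of them can be physically correct at once.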

@lhatsk
Collaborator

lhatsk commented Nov 19, 2024

Not on the model level, but you can now specify directly which crosslinks you want to use for modelling, e.g., only inter-chain restraints, by omitting the intra-chain links. This is what I usually do in practice; actual intra-chain restraints are just additional noise in this scenario.

@jimfeng9705
Author

How do I omit the intra-chain links in a local AlphaLink 2 run?

Thanks,

@lhatsk
Collaborator

lhatsk commented Nov 19, 2024

The newest version requires that you specify links for each chain. Homomeric chains are now handled separately, e.g., A2B2 becomes A B C D. If you only include links between A C, A D, B C, B D you omit homomeric links.

@jimfeng9705
Author

How can we export the crosslinks within XiView to specify the chains separately? Do we need to update the local AlphaLink 2?

@lhatsk
Collaborator

lhatsk commented Nov 19, 2024

It should work if you deselect self-links prior to exporting, but I'm not super familiar with XiView.

@jimfeng9705
Author

jimfeng9705 commented Nov 19, 2024 via email

@lhatsk
Collaborator

lhatsk commented Nov 19, 2024

Yes, if you built it before June this year.

@jimfeng9705
Author

Regarding "If you only include links between A C, A D, B C, B D you omit homomeric links" in an A2B2 complex:

Should I omit A B and C D? Can you clarify what homomeric links are?

@lhatsk
Collaborator

lhatsk commented Nov 21, 2024

Sorry, that was actually wrong, I confused myself.

So homomeric links are the links you mentioned in the beginning, between homomers where inter- and intra-chain crosslinks are hard to distinguish. Before, we always included inter- and intra-chain links. Now with the recent version, you have more control.

The A2 example is split into two distinct chains A B. If you want the old behaviour, you need to include crosslinks between A A, B B (both intra-chain) and A B. If you only want to have inter-chain links, you omit the links between A A and B B. Hope this helps. Sorry for the confusion. You should double check that deselecting self-links in XiView removes the A A links.
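
A minimal sketch of that filtering step, assuming a whitespace-separated crosslink file with one link per line in the order `residueFrom chain1 residueTo chain2 FDR`. That layout is an assumption to check against the AlphaLink 2 README for your version; the function name and example values are hypothetical.

```python
def keep_inter_chain(lines):
    """Drop links whose two ends sit on the same chain (e.g. A A, B B)."""
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        res_from, chain1, res_to, chain2, fdr = fields
        if chain1 != chain2:
            kept.append(line.strip())
    return kept

# Hypothetical A2 homodimer split into chains A and B.
example = [
    "10 A 55 A 0.2",  # intra-chain, dropped
    "10 B 55 B 0.2",  # intra-chain, dropped
    "10 A 55 B 0.2",  # inter-chain, kept
]
print(keep_inter_chain(example))
```

Running the same check on a XiView export is one way to confirm that deselecting self-links really removed the A A entries.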

@jimfeng9705
Author

jimfeng9705 commented Nov 21, 2024 via email

@lhatsk
Collaborator

lhatsk commented Nov 21, 2024

Probably doesn't hurt

@jimfeng9705
Author

Is it better to omit the A A and B B links (to have less noise)?

How do I use templates in AlphaLink 2?

Thanks!

@lhatsk
Collaborator

lhatsk commented Nov 21, 2024

Templates will be used by default. I usually omit intra-chain links.

@jimfeng9705
Author

Would the FDR value (0.05 vs. 0.2, or 0.01) affect the modelling results?

@lhatsk
Collaborator

lhatsk commented Nov 22, 2024

Somewhat, but probably not too much. I would start with 0.2.

@jimfeng9705
Author

Is the FDR value the actual one used for crosslink identification in, e.g., XLink? Or is it artificial?

@lhatsk
Collaborator

lhatsk commented Nov 25, 2024

It relates to the actual number.

@jimfeng9705
Author

jimfeng9705 commented Nov 25, 2024

In XLink, a typical FDR is 1%, i.e., 0.01. The FDR value in your sulfo-SDA sample data is 0.2, which seems rather loose. Am I missing anything?

@lhatsk
Collaborator

lhatsk commented Nov 25, 2024

We usually work with 5-20% for SDA data. This doesn't pose a problem for the network, since it generally handles noise well. The network was mostly trained on 20% data. Due to the ambiguity of the inter-/intra-chain links, your data will effectively be much noisier.

@grandrea

Hi,

I would like to add that this refers to the FDR at the residue pair level, not the spectral match level that some software provides instead (see https://github.com/Rappsilber-Laboratory/xiFDR and the articles cited therein for more details).

The other point to consider is that this value represents not only the chance of a false identification but also the "FDR" as it appears in the modeling, i.e., the chance that a link is overlength. So even if your data is at 5%, protein side-chain and backbone flexibility means you tend to see more than 5% distance violations when the links are plotted on a structure. Hence the looser values. The program will tend to pay more "attention" to enforcing restraints for links with a lower FDR. So a way to simulate an OR restraint between homomers is indeed to increase the FDR, as advised above.

When using AL2, I typically do 10% for data identified at a 5% residue pair level FDR, although I have not benchmarked this.
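
One way to apply a looser value is to overwrite the FDR column before building the restraint input. A minimal sketch, assuming a whitespace-separated crosslink file with lines of the form `residueFrom chain1 residueTo chain2 FDR` (an assumption to check against your AlphaLink 2 version); `set_fdr` is a hypothetical helper, not part of AlphaLink 2.

```python
def set_fdr(lines, fdr):
    """Return crosslink lines with the last (FDR) column replaced by `fdr`."""
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("FDR must be between 0 and 1")
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        out.append(" ".join(fields[:4] + [str(fdr)]))
    return out

# Hypothetical links identified at a 5% residue-pair-level FDR,
# passed to the model at 10%.
links = ["10 A 55 B 0.05", "23 A 101 B 0.05"]
print(set_fdr(links, 0.1))
```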

@jimfeng9705
Author

For a homodimeric protein and its complex, which optional parameters work best?

The default parameters take a long time, while the set 3 1 42 1 seems to give even better results. Any advice?

@lhatsk
Collaborator

lhatsk commented Nov 27, 2024

If the results are good enough with 3 recycling iterations, I would stick with it and only generate more samples, i.e., increasing the second number from 1 to 10 or 25. For more diversity, you can also lower the third number, e.g., to 25 and see how well it works.

@jimfeng9705
Author

For the 4th parameter, should I use full MSAs for the crosslinked residues (the default value of -1) or remove the MSA information for the crosslinked residues (with 1)? What are the benefits and purposes of each?

@lhatsk
Collaborator

lhatsk commented Nov 27, 2024

Start with -1, and if the model is not good enough, not what you expect, or crosslink satisfaction is too low, try activating the option. It might boost the crosslinks if the MSA otherwise overpowers them. These options are just different ways to introduce diversity, increase exploration, and shift weights.
