"Long time without activity "crash out on running mgf files from TIMS, and diagnostic peak inquiry #119

Open
derffff2022 opened this issue Oct 22, 2024 · 6 comments

Comments

@derffff2022

Dear xiSEARCH team,

I have really enjoyed using your software and the entire XiView platform for crosslinking analysis over the past few years, and I very much appreciate your continuous updates.

However, I recently ran into a problem analyzing mgf files from TIMS. At about 15% processed (9000 reads), the search freezes with a "long time without activity" warning. This never happened in my previous searches of Orbitrap raw data. The data conversion workflow follows a paper published this year: https://pubs.acs.org/doi/10.1021/acs.analchem.4c00829?goto=supporting-info
The supporting information shows the entire workflow, but instead of MeroX I would love to try xiSEARCH. I have noticed that the major difference in the mgf files generated from TIMS is that they include extra information on peak charges. I am wondering whether that eventually leads to the error.

I also have an additional question about the diagnostic peak search as a "hidden" function: it seems like our algorithm disregards the intensity of the diagnostic peak. I am wondering whether, in the future, a threshold could be added so that only peaks with an intensity above a specific value are considered.

Thank you again for your precious time and help!

Best,
Peter

@grandrea
Contributor

grandrea commented Oct 24, 2024

Hello,
You may try increasing the WATCHDOG parameter in your config or the memory allocated to the search and, if possible, reducing the number of proteins in the database, to see if this is a memory-related issue (which is what it sounds like to me). But @lutzfischer may have a better idea.

On the diagnostic peak, I am not sure what you mean by "our algorithm"... did you mean "your algorithm"? In any case, there are 2 ways to look for diagnostic peaks in xiSEARCH. One is the diagnostic peak mining utility described at the bottom of the README. The other, for when you know which peak you are looking for, is the "reporterion:XXX" config entry, which annotates the reporter ions that you specify. In that case, using the --peaksout flag, you should get a dataframe that allows you to filter.

Alternatively, you can consider denoising prior to starting the search, or during the search, so as to keep only the top N peaks in windows Y m/z wide.
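
As an illustration of the windowed denoising idea, here is a minimal Python sketch that keeps only the most intense peaks in each m/z window of a single spectrum. It is not xiSEARCH's own implementation: the function name `top_n_per_window` and the default values for `n` and `window` are assumptions chosen for the example.

```python
from collections import defaultdict


def top_n_per_window(peaks, n=20, window=100.0):
    """Keep the n most intense peaks in each m/z window of the given width.

    peaks: iterable of (mz, intensity) tuples for one spectrum.
    Returns the retained peaks sorted by m/z.
    """
    bins = defaultdict(list)
    for mz, intensity in peaks:
        # Integer index of the window this peak falls into.
        bins[int(mz // window)].append((mz, intensity))

    kept = []
    for window_peaks in bins.values():
        # Sort by intensity, highest first, and keep the first n.
        window_peaks.sort(key=lambda p: p[1], reverse=True)
        kept.extend(window_peaks[:n])
    return sorted(kept)
```

For example, calling `top_n_per_window(peaks, n=2, window=100)` on a spectrum with three peaks between 100 and 200 m/z would drop the weakest of the three and leave other windows untouched.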

@derffff2022
Author

Hi,

Memory does not seem to be an issue. I allocated 30G for the analysis with only 1 protein listed in the FASTA file, so I don't think it is a memory issue (I do experience that when my FASTA contains more than 30 proteins). But I do notice that their mgf file contains extra charge annotations for every single peak, which might cause the error.

For the reporter ion part, I appreciate your suggestions. To add some detail, there are actually 2 separate ions, each representing a different crosslinker (yes, we are trying to develop a new type of multiplexed crosslinker). Though xiSEARCH allows searching multiple crosslinkers, it seems like I can't annotate which m/z ion indicates which specific crosslinker.

Thanks again for your quick response!

@grandrea
Contributor

grandrea commented Oct 25, 2024

exciting!

I would then suggest

reporterions:123.45;67.90

in your config, with the m/z values you expect. Check this table in the README: https://github.com/Rappsilber-Laboratory/XiSearch?tab=readme-ov-file#full-options-for-configuration-in-text-config .

I have no experience with timsTOF, sorry, but you can remove charge annotations in msconvert or with pyteomics if you think that is the issue. It seems strange to me that this would be the cause if the files are read and the crash only comes later. Agreed that with a single protein... I would still increase WATCHDOG.
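
A minimal sketch of stripping per-peak charge annotations from an mgf file is given below. It uses plain text processing rather than the msconvert or pyteomics routes mentioned above, and it assumes that peak lines have the form `m/z intensity [charge]` while header lines have the form `KEY=value`. The function names `strip_peak_charges` and `strip_file` are made up for this example.

```python
def strip_peak_charges(lines):
    """Yield mgf lines with any column after m/z and intensity removed.

    Header lines (KEY=value), including the precursor CHARGE= line,
    and the BEGIN IONS / END IONS markers are passed through unchanged.
    """
    for line in lines:
        stripped = line.strip()
        # Assumption: peak lines start with a digit and contain no "=".
        if stripped and stripped[0].isdigit() and "=" not in stripped:
            fields = stripped.split()
            yield " ".join(fields[:2])
        else:
            yield stripped


def strip_file(src_path, dst_path):
    """Write a copy of the mgf at src_path without per-peak charges."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in strip_peak_charges(src):
            dst.write(line + "\n")
```

Writing to a new file rather than editing in place keeps the original conversion output available for comparison.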

You can then get the table with the annotated peaks and intensities by running xiSEARCH with --peaksout.

@derffff2022
Author

For the TIMS-converted data, I tried my best, including erasing all the charge-related info, but it is still not working. I think the best way might be for an expert to take a look at it directly...
@lutzfischer Do you have an email or something? I cannot attach my mgf file directly here.

@lutzfischer
Member

Can you send me the whole log (lutz dot fischer at tu minus berlin dot de)? If you can send me the mgf, config and fasta, I can try to reproduce your problem.
As Andrea suggested, you can forward any peak intensity that you want by specifying the m/z as

reporterions:mz1;mz2

These will then be reported in the xiSEARCH result and should also automatically be forwarded by xiFDR to the CSM list.
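
Once reporter ion intensities are available, deciding which crosslinker a spectrum points to, with an intensity threshold, could be sketched as follows in Python. This is only an illustration working on a plain peak list: the function names, the labels `linker_A` and `linker_B`, the 20 ppm tolerance and the threshold are all assumptions, and the m/z values are the placeholders used earlier in this thread, not real reporter ions. It does not reflect the column layout of any xiSEARCH or xiFDR output.

```python
def find_reporter_intensity(peaks, target_mz, tol_ppm=20.0):
    """Return the highest intensity within tol_ppm of target_mz, or 0.0."""
    tol = target_mz * tol_ppm / 1e6
    matches = [inten for mz, inten in peaks if abs(mz - target_mz) <= tol]
    return max(matches, default=0.0)


def assign_crosslinkers(peaks, reporters, min_intensity, tol_ppm=20.0):
    """List the crosslinker labels whose reporter ion passes the threshold.

    peaks: list of (mz, intensity) tuples for one spectrum.
    reporters: dict mapping a crosslinker label to its reporter m/z.
    """
    return [
        label
        for label, mz in reporters.items()
        if find_reporter_intensity(peaks, mz, tol_ppm) >= min_intensity
    ]


# Placeholder m/z values from this thread, hypothetical labels.
reporters = {"linker_A": 123.45, "linker_B": 67.90}
spectrum = [(67.9001, 500.0), (123.45, 20.0), (300.0, 1000.0)]
labels = assign_crosslinkers(spectrum, reporters, min_intensity=100.0)
```

Here the reporter at 123.45 is present but below the threshold, so only the label for 67.90 is returned. An absolute threshold is used for simplicity; a threshold relative to the base peak would be an equally reasonable choice.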

@derffff2022
Author

Hey Lutz,
I am more than willing to share my log, config and mgf file here, but the zip is too big. Do you have an email address, so I can create a Google Drive or Box folder shared specifically with you?
