Why does the identified PrSM decrease a lot after using the modified file? #18

Open
sunyusui opened this issue Jan 26, 2019 · 1 comment

@sunyusui

The histone H3.1 dataset I used contains a total of 3460 spectra, and the database human_proteome_database.fasta contains 20410 entries.
When I did not use the modification file, the search reported 1310 PrSMs. The parameters were as follows:
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0

When I use the modification file, only 59 PrSMs are reported. The parameters were as follows:
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 4
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho

The modification file I am using is as follows:

# This file is used to specify modifications for MSPathFinder

# Max Number of Modifications per peptide

NumMods=4

# Static mods

None

# Dynamic mods

C2H2O1,RK,opt,any,Acetyl # Acetylation RK
CH2,RK,opt,any,Methyl # Methylation RK
C2H4,RK,opt,any,Dimethyl
C3H6,R,opt,any,Trimethyl
HO3P,STY,opt,any,Phospho # Phosphorylation STY

Is there a problem with my parameter settings that leads to this result?
Or is it normal that only 59 PrSMs can be identified for this input?

@alchemistmatt
Member

I suspect the issue is too many dynamic modifications on the same residues, which leads to too many possible peptides to score. I suggest searching for those modifications separately.

Search 1: C2H2O1,RK,opt,any,Acetyl # Acetylation RK
Search 2: CH2,RK,opt,any,Methyl # Methylation RK
Search 3: C2H4,RK,opt,any,Dimethyl
Search 4: C3H6,R,opt,any,Trimethyl
Search 5: HO3P,STY,opt,any,Phospho # Phosphorylation STY
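
To give a feel for how quickly the combined list grows the search space, here is a rough Python sketch that counts the ways to place up to NumMods dynamic modifications on a sequence. It is only a counting illustration, not a description of how MSPathFinder enumerates or scores candidates. The toy sequence, the function name, and the rule of at most one modification per residue are assumptions made for the example.

```python
# Hypothetical 10-residue toy sequence, used only for illustration.
TOY_SEQUENCE = "ARTKQTARKS"

# Number of dynamic modifications allowed on each residue.
# Full list from the question: R has Acetyl, Methyl, Dimethyl, Trimethyl;
# K has Acetyl, Methyl, Dimethyl; S, T and Y have Phospho.
FULL_SEARCH = {"R": 4, "K": 3, "S": 1, "T": 1, "Y": 1}
# "Search 1" from the reply: Acetyl on R and K only.
ACETYL_ONLY = {"R": 1, "K": 1}


def count_modified_forms(sequence, options_per_residue, max_mods):
    """Count the ways to place 0..max_mods modifications on a sequence.

    Assumes each residue carries at most one modification.
    """
    # coeffs[k] = number of ways to modify exactly k residues seen so far.
    coeffs = [1] + [0] * max_mods
    for residue in sequence:
        choices = options_per_residue.get(residue, 0)
        if choices == 0:
            continue
        # Multiply the running polynomial by (1 + choices * x),
        # truncated at degree max_mods. Iterate downwards so that
        # coeffs[k - 1] still holds the value from before this residue.
        for k in range(max_mods, 0, -1):
            coeffs[k] += choices * coeffs[k - 1]
    return sum(coeffs)


if __name__ == "__main__":
    print("full list  :", count_modified_forms(TOY_SEQUENCE, FULL_SEARCH, 4))
    print("acetyl only:", count_modified_forms(TOY_SEQUENCE, ACETYL_ONLY, 4))
```

For this 10-residue toy sequence with NumMods=4, the full list gives 1447 modified forms, against 16 for acetylation alone. The gap widens on full-length proteins, which is the reason for splitting the searches.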

If you get lots of results from Search 3 and Search 5 (for example), you could try combining them to give Search 6:
C2H4,RK,opt,any,Dimethyl
HO3P,STY,opt,any,Phospho # Phosphorylation STY
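
If it helps, the separate modification files could be generated with a small script rather than edited by hand. This is a minimal sketch: the output file names, the `write_mods_files` helper, and the assumption that `#` starts a comment line (as in the inline comments of the file quoted in the question) are mine, and the modification lines are copied from the searches listed above.

```python
from pathlib import Path

# Header shared by every generated file; NumMods is filled in per call.
HEADER = (
    "# Max Number of Modifications per peptide\n"
    "NumMods={num_mods}\n"
    "\n"
    "# Dynamic mods\n"
)

# One entry per search; the keys are hypothetical file name stems.
SEARCHES = {
    "search1_acetyl": ["C2H2O1,RK,opt,any,Acetyl"],
    "search2_methyl": ["CH2,RK,opt,any,Methyl"],
    "search3_dimethyl": ["C2H4,RK,opt,any,Dimethyl"],
    "search4_trimethyl": ["C3H6,R,opt,any,Trimethyl"],
    "search5_phospho": ["HO3P,STY,opt,any,Phospho"],
    # Combined search, worth running only if searches 3 and 5 both do well.
    "search6_dimethyl_phospho": [
        "C2H4,RK,opt,any,Dimethyl",
        "HO3P,STY,opt,any,Phospho",
    ],
}


def render_mods_file(mod_lines, num_mods=4):
    """Return the text of one modification file."""
    return HEADER.format(num_mods=num_mods) + "\n".join(mod_lines) + "\n"


def write_mods_files(out_dir, searches=SEARCHES, num_mods=4):
    """Write one modification file per search and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, mod_lines in searches.items():
        path = out_dir / f"{name}_mods.txt"
        path.write_text(render_mods_file(mod_lines, num_mods))
        paths.append(path)
    return paths
```

Calling `write_mods_files("mods")` would create six files under a `mods` directory, and each one can then be passed to its own MSPathFinder run with the other parameters left unchanged.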
