Merge pull request #5433 from galaxyproject/warthog-pomano

Add metaproteomics per request
galaxyproject · Oct 12, 2024 · 65ae7df
2 parents f71ea7d + d5c2004

Showing 11 changed files with 18 additions and 9 deletions.
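
Note on the image-path changes in the diff: the tutorials are symlinked into `topics/microbiome/tutorials/`, so a relative target such as `../../images/clinical-mp/...` would presumably resolve under `topics/microbiome/` when the page is built from the symlinked location. The commit replaces each relative target with a Jekyll `{% link %}` tag rooted at `topics/proteomics/`. The sketch below illustrates that rewrite only; the function name `to_link_tag` and the regular expression are hypothetical and not part of this PR, and nothing here indicates how the authors actually made the change.

```python
import re

# Matches Markdown image links whose target starts with the relative
# prefix "../../images/" (the form removed by this commit).
# Hypothetical pattern, written for illustration only.
RELATIVE_IMAGE = re.compile(r"(!\[[^\]]*\]\()\.\./\.\./images/([^)\s]+)\)")


def to_link_tag(markdown: str, topic: str = "proteomics") -> str:
    """Rewrite relative image targets to Jekyll {% link %} tags.

    `topic` is the topic directory that owns the images; the default
    mirrors the paths visible in the diff.
    """
    def replace(match):
        prefix, rest = match.group(1), match.group(2)
        # Doubled braces emit literal "{" and "}" inside the f-string.
        return f"{prefix}{{% link topics/{topic}/images/{rest} %}})"

    return RELATIVE_IMAGE.sub(replace, markdown)


if __name__ == "__main__":
    line = "![Discovery Workflow](../../images/clinical-mp/clinical-mp-discovery.JPG)"
    print(to_link_tag(line))
```

Lines that contain no matching image link pass through unchanged, so a helper like this could be run over a whole `tutorial.md` without touching other content.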
diff --git a/topics/microbiome/metadata.yaml b/topics/microbiome/metadata.yaml
@@ -44,3 +44,6 @@ subtopics:
   - id: metatranscriptomics
     title: "Metatranscriptomics"
     description: "Taxonomic and functional characterisation of mixed samples using transcriptome data."
+  - id: clinical-metaproteomics
+    title: "Metaproteomics"
+    description: "These tutorials are step by step analysis from database generation to the discovery of peptides to verification, quantitation, and interpretation of the results."
diff --git a/topics/microbiome/tutorials/clinical-mp-1-database-generation b/topics/microbiome/tutorials/clinical-mp-1-database-generation
@@ -0,0 +1 @@
+../../proteomics/tutorials/clinical-mp-1-database-generation
diff --git a/topics/microbiome/tutorials/clinical-mp-2-discovery b/topics/microbiome/tutorials/clinical-mp-2-discovery
@@ -0,0 +1 @@
+../../proteomics/tutorials/clinical-mp-2-discovery/
diff --git a/topics/microbiome/tutorials/clinical-mp-3-verification b/topics/microbiome/tutorials/clinical-mp-3-verification
@@ -0,0 +1 @@
+../../proteomics/tutorials/clinical-mp-3-verification
diff --git a/topics/microbiome/tutorials/clinical-mp-4-quantitation b/topics/microbiome/tutorials/clinical-mp-4-quantitation
@@ -0,0 +1 @@
+../../proteomics/tutorials/clinical-mp-4-quantitation
diff --git a/topics/microbiome/tutorials/clinical-mp-5-data-interpretation b/topics/microbiome/tutorials/clinical-mp-5-data-interpretation
@@ -0,0 +1 @@
+../../proteomics/tutorials/clinical-mp-5-data-interpretation
diff --git a/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.md b/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.md
@@ -52,13 +52,13 @@ Metaproteomics is the large-scale characterization of the entire complement of p
 
 To address this, we used tandem mass spectrometry (MS/MS) and bioinformatics tools on the Galaxy platform to develop a metaproteomics workflow to characterize the metaproteomes of clinical samples. This clinical metaproteomics workflow holds potential for general clinical applications such as potential secondary infections during COVID-19 infection, microbiome changes during cystic fibrosis as well as broad research questions regarding host-microbe interactions.
 
-![Clinical Metaproteomics workflow](../../images/clinical-mp/clinical-mp-overview.JPG)
+![Clinical Metaproteomics workflow]({% link topics/proteomics/images/clinical-mp/clinical-mp-overview.JPG %})
 
 
 The first workflow for the clinical metaproteomics data analysis is the Database generation workflow. The Galaxy-P team has developed a workflow wherein a large database is generated by downloading protein sequences of known disease-causing microorganisms and then generating a compact database from the comprehensive database using the Metanovo tool.
 
 
-![Database Generation Workflow](../../images/clinical-mp/clinical-mp-database-generation.JPG)
+![Database Generation Workflow]({% link topics/proteomics/images/clinical-mp/clinical-mp-database-generation.JPG %})
 
 
 

diff --git a/topics/proteomics/tutorials/clinical-mp-2-discovery/tutorial.md b/topics/proteomics/tutorials/clinical-mp-2-discovery/tutorial.md
@@ -58,7 +58,7 @@ This tutorial can be followed with any user-defined database but would work best
 The MSMS data will be searched against the compact database `Human UniProt Microbial Proteins (from MetaNovo) and cRAP` to identify peptide and protein sequences via sequence database searching. For this tutorial, two peptide identification programs will be used: SearchGUI/PeptideShaker and MaxQuant. However, you could use other software too, such as Fragpipe or Scribe. For the purpose of this tutorial, a dataset of the 4 RAW/MGF files will be used as the MS/MS input.
 
 
-![Discovery Workflow](../../images/clinical-mp/clinical-mp-discovery.JPG)
+![Discovery Workflow]({% link topics/proteomics/images/clinical-mp/clinical-mp-discovery.JPG %})
 
 
 > <agenda-title></agenda-title>

diff --git a/topics/proteomics/tutorials/clinical-mp-3-verification/tutorial.md b/topics/proteomics/tutorials/clinical-mp-3-verification/tutorial.md
@@ -56,9 +56,9 @@ The PepQuery tool is used to validate the identified microbial peptides from Sea
 
 Interestingly, the PepQuery tool does not rely on searching peptides against a reference protein sequence database as “traditional” shotgun proteomics does, which enables it to identify novel, disease-specific sequences with sensitivity and specificity in its protein validation (Figure A). Then we extract microbial protein sequences that are assigned to the PepQuery verified peptides. To this, we again add the Human UniProt Reference proteome (with Isoforms) and cRAP databases for creating a database for quantitation purposes (Figure B).
 
-![Peptide Verification](../../images/clinical-mp/clinical-mp-verification-1.JPG)
+![Peptide Verification]({% link topics/proteomics/images/clinical-mp/clinical-mp-verification-1.JPG %})
 
-![Database generation from verified peptides](../../images/clinical-mp/clinical-mp-verification-2.JPG)
+![Database generation from verified peptides]({% link topics/proteomics/images/clinical-mp/clinical-mp-verification-2.JPG %})
 
 
 > <agenda-title></agenda-title>

diff --git a/topics/proteomics/tutorials/clinical-mp-4-quantitation/tutorial.md b/topics/proteomics/tutorials/clinical-mp-4-quantitation/tutorial.md
@@ -51,7 +51,7 @@ The next step of the clinical metaproteomics workflow is the quantification work
 
 In this current workflow, we perform Quantification using the MaxQuant tool and the output will be interpreted in our next module.
 
-![Quantitation workflow](../../images/clinical-mp/clinical-mp-quantification.JPG)
+![Quantitation workflow]({% link topics/proteomics/images/clinical-mp/clinical-mp-quantification.JPG %})
 
 
 

diff --git a/topics/proteomics/tutorials/clinical-mp-5-data-interpretation/tutorial.md b/topics/proteomics/tutorials/clinical-mp-5-data-interpretation/tutorial.md
@@ -50,7 +50,8 @@ recordings:
 
 The final workflow in the array of clinical metaproteomics tutorials is the data interpretation workflow. Interpreting MaxQuant data using MSstats involves applying a rigorous statistical framework to glean meaningful insights from quantitative proteomic datasets. The MaxQuant output is explored to understand data distribution and variability. Subsequent normalization helps account for systematic variations. MSstats allows the user to define the experimental design, including sample groups and conditions, to perform statistical analysis. The output provides valuable information about differential protein expression across conditions, estimates of fold changes, and associated p-values, aiding in the identification of biologically significant proteins. Furthermore, MSstats enables quality control and data visualization, ultimately enhancing our ability to draw meaningful conclusions from complex proteomic datasets. Additional tutorial material for using MaxQuant and MSstatTMT for TMT data analysis can be found at [MaxQuant and MSstats for the analysis of TMT data](https://gxy.io/GTN:T00220).
 
-![Data-Interpretation-workflow](../../images/clinical-mp/clinical-mp-data-interpretation.JPG)
+![Data-Interpretation-workflow]({% link topics/proteomics/images/clinical-mp/clinical-mp-data-interpretation.JPG %})
+
 > <agenda-title></agenda-title>
 >
 > In this tutorial, we will cover:
@@ -131,7 +132,7 @@ Unipept serves as a vital bioinformatics platform for the analysis of mass spect
 >
 {: .hands_on}
 
-![Data-Interpretation with Unipept](../../images/clinical-mp/clinical-mp-data-interpretation-figure2.jpg)
+![Data-Interpretation with Unipept]({% link topics/proteomics/images/clinical-mp/clinical-mp-data-interpretation-figure2.jpg %})
 
 ## Extraction of Microbial Sequences
 
@@ -222,7 +223,7 @@ MSstats TMT(Tandem Mass Tag) is a computational tool designed for the robust sta
 The MSstats output typically includes essential information such as estimated fold changes, p-values, and other statistical measures that help identify differentially expressed proteins across experimental conditions or sample groups. It provides a clear picture of the variations in protein expression levels, aiding in the prioritization of biologically relevant targets. MSstats output also often includes visualizations and quality control metrics, making it a valuable resource for researchers in their quest to extract meaningful insights from complex proteomic datasets and understand the underlying biology of their experiments.
 Example of our data interpretation:
 
-![Data-Interpretation results with MSstats](../../images/clinical-mp/clinical-mp-data-interpretation-figure3.jpg)
+![Data-Interpretation results with MSstats]({% link topics/proteomics/images/clinical-mp/clinical-mp-data-interpretation-figure3.jpg %})
 
 
 # Conclusion