MS MetFrag XCMS Workflow

c-ruttkies edited this page Jul 27, 2017 · 19 revisions

To try the workflow, first download this data set to your own machine (click the download button at the upper right of your browser window) and unzip it. Once the data is unzipped, follow the video tutorial MetFrag MS Workflow, starting from a fresh VRE deployment. The tutorial explains how each piece of the data is used in the workflow.

General

Metabolite identification is a crucial step in clinical studies, for example when trying to understand the course of a disease at the metabolomic level. The MetFrag workflow takes a first step in this direction: it annotates MS/MS (tandem mass spectrometry) spectra with molecules from compound (metabolite) databases. The annotation is based on mapping in silico generated fragments to the experimental spectra and scoring these mappings according to different criteria.

The workflow consists of several steps. Pre-processing uses XCMS, MSnbase and CAMERA to read the data from a given mzML file and to detect and annotate features. From this annotation, MetFrag parameter sets are generated and passed to the MetFrag CLI Batch tool, which performs the actual processing, including the annotation of molecular structures to the data. The individual steps are described in detail below.

Pre-Processing

XCMS/MSnbase

Pre-processing starts with reading the peak information from an mzML file uploaded to the Galaxy history. Two modules are involved. The module xcms-find-peaks generates an rdata file storing an XCMS-Set object with the peak data, which usually consists of retention time, mass-to-charge (m/z) ratio and intensity. The module msnbase-read-msms retrieves the MS/MS spectra from a given mzML file. This can be a second mzML file, separate from the one used for the XCMS node, or the same file if it contains both MS and MS/MS information. Further steps could be applied, e.g. peak grouping and retention time correction across several mzML files from different samples. As this workflow only aims at processing one mzML file from a single experiment, these steps are not needed for the moment, although the modules are already available (xcms-group-peaks, xcms-correct-rt).
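To make the shape of the peak data more concrete, the following Python sketch models a detected peak as a record of retention time, m/z and intensity, and shows one plausible way to associate an MS/MS precursor with such a peak. It is an illustration only and not the code used by xcms-find-peaks or msnbase-read-msms: the names `Feature`, `ppm_error` and `match_precursor`, the tolerance defaults, and the rule of picking the most intense candidate are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    """One detected MS peak (hypothetical record, not an XCMS class)."""
    rt: float         # retention time in seconds
    mz: float         # mass-to-charge ratio
    intensity: float  # peak intensity


def ppm_error(observed: float, reference: float) -> float:
    """Relative deviation of observed from reference in parts per million."""
    return (observed - reference) / reference * 1e6


def match_precursor(features: List[Feature], precursor_mz: float,
                    precursor_rt: float, ppm_tol: float = 10.0,
                    rt_tol: float = 10.0) -> Optional[Feature]:
    """Return the most intense feature within both tolerances, or None.

    The tolerance defaults are assumptions chosen for illustration.
    """
    candidates = [
        f for f in features
        if abs(ppm_error(precursor_mz, f.mz)) <= ppm_tol
        and abs(f.rt - precursor_rt) <= rt_tol
    ]
    return max(candidates, key=lambda f: f.intensity, default=None)


features = [
    Feature(rt=120.0, mz=181.0707, intensity=5.0e5),
    Feature(rt=300.0, mz=181.0707, intensity=9.0e5),  # same m/z, elutes later
    Feature(rt=121.0, mz=203.0526, intensity=2.0e5),
]
hit = match_precursor(features, precursor_mz=181.0712, precursor_rt=122.5)
```

In this example only the first feature lies within both the m/z and the retention time window of the precursor, so it is the one returned.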

CAMERA

A second pre-processing step uses CAMERA, which groups peaks within a sample based on their retention time and intensity profile. The grouping also takes isotopologues and adducts into account, information that is usually acquired via mass spectrometry. The CAMERA annotation results in so-called pseudo spectra, where ideally each spectrum contains peaks from one single metabolite. For the subsequent MetFrag processing, the adduct annotation step using camera-find-adducts is important, because it is used to determine the monoisotopic masses of the precursor m/z features, which in turn are used to query molecules from compound databases.
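The step from an adduct annotation to a monoisotopic mass is simple arithmetic, sketched below in Python for singly charged adducts: the neutral mass is the precursor m/z minus the mass shift of the annotated adduct. This is an illustration and not CAMERA's implementation. The table `ADDUCT_SHIFT` and the function `neutral_mass` are hypothetical names, and only three common adducts are listed.

```python
# Mass shifts in Da for a few common singly charged adducts.
# The electron mass is already accounted for in these values.
# Illustrative subset only; this is not CAMERA's rule table.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,    # gain of a proton
    "[M+Na]+": 22.989218,  # gain of a sodium cation
    "[M-H]-": -1.007276,   # loss of a proton
}


def neutral_mass(precursor_mz: float, adduct: str) -> float:
    """Monoisotopic neutral mass for a singly charged adduct.

    Hypothetical helper: for charge 1, m/z equals the ion mass, so the
    neutral mass is the precursor m/z minus the adduct mass shift.
    """
    return precursor_mz - ADDUCT_SHIFT[adduct]


# The same neutral molecule observed as two different adducts
mass_from_h = neutral_mass(181.070664, "[M+H]+")
mass_from_na = neutral_mass(203.052606, "[M+Na]+")
```

Both calls yield a neutral mass of about 180.063388 Da, which shows why a correct adduct annotation matters: reading the sodium adduct as a protonated ion would shift the queried mass by roughly 22 Da and retrieve the wrong candidates from the compound database.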
