This repository contains the code for the analysis of the expression profile of the transportome in cancer, based on the MTP-DB.
Note
Read the preprint here: Profiling the Expression of Transportome Genes in cancer: A systematic approach
It's potentially out of date.
The project follows the Kerblam! standard.
You can run the analysis pipelines with Kerblam! and Docker:
# Clone the repo
git clone git@github.com:TCP-Lab/transportome_profiler.git
cd ./transportome_profiler
kerblam data fetch # Fetch the input data not present in the repository
kerblam run <pipeline>
Kerblam! will build docker containers and run the analysis locally. To run without docker, read below.
The project currently encompasses the following pipelines:
- heatmaps: Create large heatmaps from the expression matrices by using GSEA on computed gene rankings, testing all possible gene lists that can be made from the MTP-DB.
  - The test profile makes this pipeline much faster by running on smaller (i.e. sampled) input data (~75% reduction in sample number, only 5000 random genes).
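As a rough sketch of how a pipeline and its profile might be selected from the command line, the helper below assembles a `kerblam run` invocation. The function name `run_pipeline` and the `DRY_RUN` variable are hypothetical, and the `--profile` flag is an assumption about the Kerblam! CLI that should be checked against `kerblam run --help`. By default the command is only printed, not executed.

```shell
# Hypothetical helper: build (and optionally execute) a `kerblam run` command.
# Assumption: Kerblam! selects profiles with `--profile <name>`.
run_pipeline() {
    pipeline="$1"
    profile="${2:-}"

    # Assemble the argument list in the positional parameters.
    set -- run "$pipeline"
    if [ -n "$profile" ]; then
        set -- "$@" --profile "$profile"
    fi

    # DRY_RUN defaults to 1, so nothing is executed unless it is set to 0.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "kerblam $*"
    else
        kerblam "$@"
    fi
}

run_pipeline heatmaps test   # prints: kerblam run heatmaps --profile test
```

Setting `DRY_RUN=0` before the call would run the printed command for real, which requires Kerblam! and the fetched input data.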
You need some requirements to be installed before you can run the analysis locally:
- R version 4.3.0+.
  - Install R requirements with ./src/helper_scripts/install_r_pkgs.R.
- Python version 3.11+.
  - Install Python requirements with pip install -r ./src/requirements.txt.
- The jq utility (that you can find here).
- The xsv program, required by metasplit (sudo pacman -Syu xsv on Arch; not packaged by Debian, but this guide might be useful. If you have cargo installed, you can simply run cargo install xsv).
- Follow the extra installation guide for generanker (namely installing fast-cohen).
- The xls2csv utility (on Arch, yay -Syu perl-xls2csv).
- A series of R packages that can be installed with Rscript ./src/helper_scripts/install_R_pkgs.R.
- Quite a bit of RAM (some steps require > 50 GB of RAM) and time.
Override N_THREADS (with export N_THREADS=...) to run with fewer threads.
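Before attempting a local run, it may help to check the requirements listed above in one go. The sketch below is not part of the repository: the helper names `version_ge` and `missing_tools` are made up for illustration, the executable names (`R`, `python3`, `jq`, `xsv`, `xls2csv`) are assumptions about how the tools are installed on your system, and the version comparison relies on `sort -V` as provided by GNU coreutils.

```shell
# Succeed if version $1 is at least version $2 (uses `sort -V`, GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print every tool given as argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names are assumptions; adjust them to your installation.
missing="$(missing_tools R python3 jq xsv xls2csv)"
[ -z "$missing" ] || echo "Missing tools:" $missing

# Compare interpreter versions against the minimums stated above.
if command -v python3 >/dev/null 2>&1; then
    py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    version_ge "$py_version" 3.11 || echo "Python $py_version is older than 3.11"
fi
if command -v Rscript >/dev/null 2>&1; then
    r_version="$(Rscript -e 'cat(as.character(getRversion()))')"
    version_ge "$r_version" 4.3.0 || echo "R $r_version is older than 4.3.0"
fi
```

The script only reports problems and never installs anything; it does not check the R or Python packages, the generanker extras, or the available RAM.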
If you have all the requirements, you can simply run:
kerblam run <pipeline> --local
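For example, a local run with a reduced thread count could be prepared as follows. This is only a sketch: using half of the available cores is an arbitrary choice made here for illustration, `half_threads` is a hypothetical helper, and `getconf _NPROCESSORS_ONLN` is assumed to be available (it is on most Linux and macOS systems). The final `kerblam` call is left commented out so that the snippet can be tried without starting the analysis.

```shell
# Number of online CPU cores; fall back to 1 if it cannot be determined.
cores="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
case "$cores" in
    ''|*[!0-9]*) cores=1 ;;
esac

# Hypothetical helper: half of the given core count, but never below 1.
half_threads() {
    n=$(( $1 / 2 ))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

export N_THREADS="$(half_threads "$cores")"
echo "Running with N_THREADS=$N_THREADS"

# Uncomment to start the local run of the heatmaps pipeline:
# kerblam run heatmaps --local
```

Since some steps need more than 50 GB of RAM, lowering the thread count does not remove the memory requirement; it only limits how many threads the pipeline uses.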
Important
The manuscript for this project is also available online in this repository.