- Cristian Gonzalez-Colin ([email protected])
- Vijayanand Lab (https://www.lji.org/labs/vijayanand/)
- La Jolla Institute for Immunology (LJI)
- La Jolla, CA USA
- Current version: (07/17/2023)
This pipeline was developed for eQTL calling in the DICE Tissue project (unpublished). It is implemented with the Snakemake v7.14.0 workflow manager. The cluster configuration file (cluster.json) must be adapted to your cluster or cloud environment for the pipeline to work properly.
Linear models are fitted with Matrix eQTL, and two multiple-testing correction methods are included: eigenMT and a permutation-based method. The permutation-based method requires more computational resources and time, since it evaluates 1,000 permutations by default.
Three environment files are provided to run the pipeline. Snakemake automatically sets up the appropriate environment for each step:
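To illustrate why the permutation-based correction is the more expensive option, here is a minimal sketch of a permutation scheme for a single gene. It is not the pipeline's implementation: the function names, the use of the absolute Pearson correlation as the test statistic, and the toy data are all assumptions made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Empirical p-value for the strongest SNP association of one gene.

    genotypes:  dict mapping SNP id -> list of dosages, one per donor
    expression: list of expression values in the same donor order
    (hypothetical inputs; the pipeline itself reads Matrix eQTL files)
    """
    rng = random.Random(seed)

    def best_stat(expr):
        # Strongest association across all SNPs tested for this gene.
        return max(abs(pearson_r(g, expr)) for g in genotypes.values())

    observed = best_stat(expression)
    shuffled = list(expression)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling donor labels breaks any real genotype-expression link.
        rng.shuffle(shuffled)
        if best_stat(shuffled) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Toy data: one SNP whose dosage tracks expression closely.
toy_genotypes = {"snp1": [0, 1, 2, 0, 1, 2, 0, 1, 2, 1]}
toy_expression = [1.0, 2.1, 3.2, 0.9, 2.0, 3.1, 1.1, 1.9, 3.0, 2.2]
p_value = permutation_pvalue(toy_genotypes, toy_expression)
```

Every permutation repeats the full scan over all SNPs of the gene, so the cost grows roughly linearly with the number of permutations, which is consistent with the extra time and resources noted above.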
- DLCP.yaml
- pyEigenMT.yaml
- bcftools.yaml
The configuration file snake_conf.yaml must be in the same folder as the Snakefile. Edit it to match your data. The required input files are described below.
Tab-separated file with donors in columns and covariates in rows. This is the general covariates file; PEER factors and PCs are added within the pipeline.
Tab-separated file with the locations of the genotype files, which must be in Matrix eQTL format:
CHR | SNP | SNP_LOC |
---|---|---|
1 | SNPFILE.txt | SNPLOC.txt |
2 | SNPFILE.txt | SNPLOC.txt |
Both files (SNPFILE.txt and SNPLOC.txt) need to be prepared following the Matrix eQTL toy dataset.
The pipeline runs multiple clusters per cell type in parallel. To do so, it needs a dataset .csv file with the following columns:
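Before launching the pipeline, it can help to check that the genotype location file has the expected layout. The following is a minimal sketch, assuming the file is tab-separated with the header CHR, SNP, SNP_LOC and one row per chromosome; the function name and the example file names are hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ("CHR", "SNP", "SNP_LOC")


def read_genotype_table(handle):
    """Parse the genotype location file into {chromosome: (snp_file, snp_loc_file)}."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    table = {}
    for row in reader:
        chrom = row["CHR"].strip()
        # Assumption: each chromosome appears on exactly one row.
        if chrom in table:
            raise ValueError(f"chromosome listed twice: {chrom}")
        table[chrom] = (row["SNP"].strip(), row["SNP_LOC"].strip())
    return table


# In-memory stand-in for the real file; the file names are made up.
example = io.StringIO(
    "CHR\tSNP\tSNP_LOC\n"
    "1\tchr1_snps.txt\tchr1_snploc.txt\n"
    "2\tchr2_snps.txt\tchr2_snploc.txt\n"
)
genotype_table = read_genotype_table(example)
```

With a real file, replace the `io.StringIO` object with `open(path, newline="")`.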
cell | tissue | subset | expFile | donorFile |
---|---|---|---|---|
CD4 | Lung | 0 | file1.txt | donors1.txt |
CD4 | Lung | 1 | file2.txt | donors2.txt |
CD8 | Lung | 0 | file3.txt | donors3.txt |
- expFile: Tab-separated file with the expression data to use for a given cluster in a cell type, with donors in columns and genes in rows. Donor and gene names are required.
- donorFile: List of donors to use in the analysis. Needs to be a set of all donors.
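The dataset file and its per-row inputs can be checked with a short script before submitting jobs. This is a sketch under stated assumptions, not part of the pipeline: the helper names are hypothetical, and it assumes that the first field of the expFile header labels the gene column (as in the Matrix eQTL toy dataset) and that every donor in donorFile should have a column in expFile.

```python
import csv
import io

DATASET_COLUMNS = ["cell", "tissue", "subset", "expFile", "donorFile"]


def read_dataset_rows(handle):
    """Read the dataset .csv and verify that all expected columns are present."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in DATASET_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def missing_donors(expression_header, donors):
    """Return the donors that have no column in the expression file header."""
    # Assumption: the first header field labels the gene column.
    available = set(expression_header.rstrip("\n").split("\t")[1:])
    return [d for d in donors if d not in available]


# In-memory stand-ins for the real files; donor names are made up.
dataset_rows = read_dataset_rows(io.StringIO(
    "cell,tissue,subset,expFile,donorFile\n"
    "CD4,Lung,0,file1.txt,donors1.txt\n"
    "CD8,Lung,0,file3.txt,donors3.txt\n"
))
absent = missing_donors("geneid\tdonorA\tdonorB\n", ["donorA", "donorC"])
```

An empty list from `missing_donors` means every requested donor has expression data; any names it returns point to a mismatch between the two files.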
This file was prepared following the Matrix eQTL toy dataset.
The bin/ folder needs to be copied to the working directory.
The pbs_submit.sh file shows an example of how the pipeline was run.
Please cite the following manuscript if you are using this repository:
Please email Cristian Gonzalez-Colin ([email protected]) and/or Vijayanand Pandurangan ([email protected]).