Preparing the input files
Running the RNA-seq workflow requires two types of input files: the (compressed) FASTQ files containing the reads, and a metadata text file describing the samples.
The `Snakefile` assumes that the FASTQ files are named according to the pattern `<sample-name>.fastq.gz` (or `<sample-name>_R1.fastq.gz` and `<sample-name>_R2.fastq.gz` for paired-end data). If this is not the case, you need to rename the files or modify the `Snakefile` accordingly.
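To make the naming pattern concrete, the following is a minimal Python sketch that builds the file names the pattern implies for a given sample. The function name `expected_fastq_names` and the sample name `sampleA` are hypothetical and used only for illustration; this is not code taken from the `Snakefile` itself.

```python
def expected_fastq_names(sample_name, paired_end=False):
    """Return the FASTQ file names implied by the naming pattern.

    Hypothetical helper for illustration; not part of the workflow.
    """
    if paired_end:
        # Paired-end data: one file per read, suffixed _R1 and _R2.
        return [
            f"{sample_name}_R1.fastq.gz",
            f"{sample_name}_R2.fastq.gz",
        ]
    # Single-end data: a single file named after the sample.
    return [f"{sample_name}.fastq.gz"]


# Example with a made-up sample name.
single_end_files = expected_fastq_names("sampleA")
paired_end_files = expected_fastq_names("sampleA", paired_end=True)
```

Comparing such a list against the actual contents of the FASTQ directory is one way to spot files that need renaming before the workflow is run.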
The metadata file should be a tab-separated text file with at least two columns: one named `names`, which contains all the values of `<sample-name>` from the FASTQ files, and one named `type`, which is either `SE` or `PE` depending on whether the samples were obtained with a single-end or paired-end protocol. Any number of additional columns can be included and used later in the analysis. All variables required for the differential expression analysis should be included as columns in the metadata text file.
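The requirements on the metadata file can be checked before starting the workflow. The sketch below is one possible check, written with the Python standard library only. The function name `read_metadata`, the sample names, and the extra `condition` column are assumptions made for illustration; the workflow itself may read and validate the file differently.

```python
import csv
import io

REQUIRED_COLUMNS = {"names", "type"}
VALID_TYPES = {"SE", "PE"}


def read_metadata(handle):
    """Read a tab-separated metadata file and check the required columns.

    Hypothetical helper for illustration; not part of the workflow.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    rows = list(reader)
    for row in rows:
        # Each sample must be marked as single-end (SE) or paired-end (PE).
        if row["type"] not in VALID_TYPES:
            raise ValueError(
                f"sample {row['names']!r}: type must be SE or PE, "
                f"got {row['type']!r}"
            )
    return rows


# Made-up example content; 'condition' stands in for any extra variable
# needed by the differential expression analysis.
example = io.StringIO(
    "names\ttype\tcondition\n"
    "sampleA\tPE\tcontrol\n"
    "sampleB\tPE\ttreated\n"
)
rows = read_metadata(example)
sample_names = [row["names"] for row in rows]
```

For a file on disk, the same function can be called with an open file handle, for example `read_metadata(open(path, newline=""))`, where `path` is the location of your metadata file.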