finding ORIs in stranded SNS-seq

This repository contains scripts and files used to analyze stranded paired-end SNS-Seq experiments for Stanojcic et al in prep.

The scripts are made to run on a HPC under slurm (Roscoff Bioinformatics platform ABiMS (http://abims.sb-roscoff.fr)).

The first script qc_mapping.sh is for read quality control and mapping if you have already mapped reads you can directly start with the second script finding_ORIs.sh, that will used aligned reads from stranded SNS-seq as input.

The first script qc_mqpping.sh uses raw paired-end stranded sequencing reads from SNS-seq experiments and performs quality check, adapter and quality trimming and read mapping.

quality control of the provided fastq files (fastqc 0.11.9)
trimming of tail and adapter sequences (cutadapt 4.0)
quality trimming with q20 threshold (trimmomatic 0.39)
alignment against provided indexed genome (bowtie2 2.4.1)
sorting and conversion fom sam to bam of aligned reads (samtools 1.13)
check for insert size (picard 2.23.5)
removal of duplicated reads (picard 2.23.5)
bam file indexing (samtools 1.13)
generation of Multiqc report (multiqc 1.13)

The second script finding_ORIs.sh will use paired end aligend reads (sorted bam) as input and outputs bed files of the detected ORIs

seperation of mapped reads in minus and plus strand (samtools 1.13)
strand-seperated peak calling (macs2 2.2.7.1)
peak filtering (bedtools 2.2.0.0 and awk) based on three criteria:
- distance of minus and plus strand peaks
- no complete overlap between plus and minus strand peaks
- right strand orientation (minus followed by plus)

Parameters

Parameters for the qc_mqpping.sh script need to be provided in the config_mapping.txt file, that has to be in the same working directory as the script :

output directory name. The directory will be generated by the sript in the working directory.

output_dir=alignedReads
list of sample names as found in the fastq files e.g. $sample\L001_R1_001.fastq.gz

input_list=("x_S1_" "y_S2_" "z_S5_")
path to fastq directory

fastq_directory=path/to/fastqfiles
path and genome prefix of bowtie2 index and genome.fasta

genome=path/to/genome/myfavorite_Genome
genome prefix for filenames of mapped reads

genome_prefix=myfavorite_Genome

Parameters for the finding_ORIs.sh script need to be provided in the config_finding-ORIs.txt file, that has to be in the same working directory as the script:

path to aligned reads

read_directory=path/to/aligned_reads
output directory name (generated by the sript in the working directory)

output_dir=results/myresults
prefix of mapped reads ($sample$prefix.bam)

file_prefix=aln-pe_genomeprefix_sorted_reheadered_dups-removed
sample name as found in bam files

input_list=("x_S1_" "y_S2_" "z_S5_")
peak overlap to be excluded in percent

overlap_list=(50)
window size for peak pair selection in nucleotides

window_list=(500)

Run

The scripts are written to be run on an HPC cluster under slurm

  sbatch findingORIs/qc_mapping.sh

  sbatch findingORIs/finding_ORIS.sh

Authors

@bridlinbarckmann

Name		Name	Last commit message	Last commit date
Latest commit History 176 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config_finding-ORIs.txt		config_finding-ORIs.txt
config_mapping.txt		config_mapping.txt
finding_ORIs.sh		finding_ORIs.sh
qc_mapping.sh		qc_mapping.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

finding ORIs in stranded SNS-seq

The first script qc_mqpping.sh uses raw paired-end stranded sequencing reads from SNS-seq experiments and performs quality check, adapter and quality trimming and read mapping.

The second script finding_ORIs.sh will use paired end aligend reads (sorted bam) as input and outputs bed files of the detected ORIs

Parameters

Parameters for the qc_mqpping.sh script need to be provided in the config_mapping.txt file, that has to be in the same working directory as the script :

Parameters for the finding_ORIs.sh script need to be provided in the config_finding-ORIs.txt file, that has to be in the same working directory as the script:

Run

Authors

About

Releases 1

Packages

Languages

License

bridlin/finding_ORIs

Folders and files

Latest commit

History

Repository files navigation

finding ORIs in stranded SNS-seq

The first script qc_mqpping.sh uses raw paired-end stranded sequencing reads from SNS-seq experiments and performs quality check, adapter and quality trimming and read mapping.

The second script finding_ORIs.sh will use paired end aligend reads (sorted bam) as input and outputs bed files of the detected ORIs

Parameters

Parameters for the qc_mqpping.sh script need to be provided in the config_mapping.txt file, that has to be in the same working directory as the script :

Parameters for the finding_ORIs.sh script need to be provided in the config_finding-ORIs.txt file, that has to be in the same working directory as the script:

Run

Authors

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages