DpnII-seq is an experimental assay that measures the genome-wide cutting frequency of the restriction enzyme DpnII. DpnII-seq was originally developed alongside Liquid Chromatin Hi-C to control for biases in cutting frequency when estimating chromatin contact stability in K562 cells. However, the assay and analysis pipeline can be adapted to analyze cutting frequency for other restriction enzymes and other cell types. The DpnII-seq workflow performs mapping, artifact filtering, multi-resolution binning, copy number correction, and plotting of quality metrics.
This workflow has options for digesting with the following restriction enzymes: DpnII, FatI, and HindIII.
Because K562 cells have a primarily triploid karyotype with regions of variable copy number, the analysis workflow corrects coverage tracks to a diploid state genome-wide. For users applying DpnII-seq to other cells with variable copy number states, we provide scripts that correct for this bias using a Gaussian mixture model approach.
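As a rough illustration of what correcting coverage to a diploid state can mean, the sketch below rescales each bin's coverage by 2 divided by that bin's copy number. It is a plain rescaling, not the Gaussian mixture model approach, and the workflow's own scripts may do this differently. It assumes a BED-like input with the copy number state in the 4th column and a per-bin coverage dictionary; the function names `read_copy_number` and `correct_to_diploid` and the toy values are hypothetical.

```python
import io


def read_copy_number(handle):
    """Parse BED-like lines: chrom, start, end, copy number (4th column)."""
    states = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        key = (fields[0], int(fields[1]), int(fields[2]))
        states[key] = float(fields[3])
    return states


def correct_to_diploid(coverage, copy_number, target=2.0):
    """Scale each bin's coverage by target / copy number.

    Bins with a missing or zero copy number are dropped; that choice is an
    assumption of this sketch, not documented workflow behavior.
    """
    corrected = {}
    for bin_key, value in coverage.items():
        cn = copy_number.get(bin_key)
        if not cn:
            continue
        corrected[bin_key] = value * target / cn
    return corrected


# Toy data (hypothetical): three 40 kb bins with copy number 3, 2, and 4.
bed = io.StringIO(
    "chr1\t0\t40000\t3\n"
    "chr1\t40000\t80000\t2\n"
    "chr1\t80000\t120000\t4\n"
)
copy_number = read_copy_number(bed)
coverage = {
    ("chr1", 0, 40000): 300.0,
    ("chr1", 40000, 80000): 200.0,
    ("chr1", 80000, 120000): 400.0,
}
corrected = correct_to_diploid(coverage, copy_number)
```

In this toy case the raw coverage is proportional to copy number, so the three corrected bins end up with the same value.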
Install Snakemake via Miniconda, then add the required channels and create the workflow environment:
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --add channels r
conda env create --file envs/DpnII-seq_env.yaml
source activate DpnII-seq
.
├── data
│ ├── binning
│ ├── copy_number
│ ├── dpnII_sites
│ ├── fastq
│ ├── fatI_sites
│ ├── hindIII_sites
│ └── indexes
├── DpnII-seq
The DpnII-seq workflow requires a data directory with the following files:
- binning/
  - BED files denoting the binned genome (e.g., hg19_40kb.bed)
- copy_number/
  - BED files denoting the binned genome with the copy number state in the 4th column
- [X]_sites/
  - BED files denoting the restriction sites of restriction enzyme X for each chromosome (e.g., dpnII_sites_chr1.bed)
- fastq/
  - paired-end FASTQ files (*_R1.fastq, *_R2.fastq)
- indexes/
  - bowtie-build index files for the reference genome (*.ebwt)
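Before launching the workflow, it can help to confirm that the data directory matches the layout listed above. The following is a minimal sketch of such a check, not part of the workflow itself. The function name `check_data_dir`, the `enzyme` argument, and the default path `data` are hypothetical; the file patterns are taken from the list above.

```python
from pathlib import Path


def check_data_dir(data_dir, enzyme="dpnII"):
    """Return a list of problems found in the data directory layout."""
    data_dir = Path(data_dir)
    # Subdirectory -> glob pattern that should match at least one file.
    expected = {
        "binning": "*.bed",
        "copy_number": "*.bed",
        f"{enzyme}_sites": "*.bed",
        "fastq": "*_R1.fastq",
        "indexes": "*.ebwt",
    }
    problems = []
    for subdir, pattern in expected.items():
        folder = data_dir / subdir
        if not folder.is_dir():
            problems.append(f"missing directory: {subdir}/")
        elif not any(folder.glob(pattern)):
            problems.append(f"no files matching {pattern} in {subdir}/")

    # Every *_R1.fastq file should have a matching *_R2.fastq mate.
    fastq = data_dir / "fastq"
    if fastq.is_dir():
        suffix = "_R1.fastq"
        for r1 in sorted(fastq.glob("*" + suffix)):
            r2 = r1.with_name(r1.name[: -len(suffix)] + "_R2.fastq")
            if not r2.exists():
                problems.append(f"missing mate for {r1.name}")
    return problems


if __name__ == "__main__":
    for problem in check_data_dir("data"):
        print(problem)
```

An empty list means every expected subdirectory exists, contains at least one matching file, and every R1 file has its R2 mate.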
Currently, the DpnII-seq workflow has only been tested in an LSF cluster environment:
snakemake -j 10 --latency-wait 60 --cluster-config cluster.json --cluster "bsub -q {cluster.queue} -W {cluster.time} -R {cluster.memory} -n {cluster.cores} -o {cluster.output} -e {cluster.error}" -p
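The command above reads six per-job settings (queue, time, memory, cores, output, error) from cluster.json. The sketch below shows one possible shape for that file and how the `{cluster.<key>}` placeholders would expand into a `bsub` call. All values are placeholders to be replaced with site-specific settings. The expansion uses Python's `str.format` to imitate the substitution and is not Snakemake's own code. `__default__` is the entry Snakemake's cluster configuration uses for settings shared by all rules.

```python
import json
from types import SimpleNamespace

# Hypothetical values; only the six key names come from the bsub template.
cluster_config = {
    "__default__": {
        "queue": "short",
        "time": "4:00",
        "memory": "rusage[mem=4000]",
        "cores": 1,
        "output": "logs/job.out",
        "error": "logs/job.err",
    }
}

BSUB_TEMPLATE = (
    "bsub -q {cluster.queue} -W {cluster.time} -R {cluster.memory} "
    "-n {cluster.cores} -o {cluster.output} -e {cluster.error}"
)


def render_submit_command(template, settings):
    """Fill the {cluster.<key>} placeholders from one settings dictionary."""
    return template.format(cluster=SimpleNamespace(**settings))


command = render_submit_command(BSUB_TEMPLATE, cluster_config["__default__"])

if __name__ == "__main__":
    # Candidate contents for cluster.json, followed by the expanded command.
    print(json.dumps(cluster_config, indent=4))
    print(command)
```

Rendering the command this way before submitting makes it easy to spot a missing key, which would otherwise surface as a failed job submission.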
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!
Houda Belaghzal*, Tyler Borrman*, Andrew D. Stephens, Denis L. Lafontaine, Sergey V. Venev, Zhiping Weng, John F. Marko, Job Dekker. (2019). Compartment-dependent chromatin interaction dynamics revealed by liquid chromatin Hi-C. bioRxiv