diff --git a/DESCRIPTION b/DESCRIPTION
index 7fcf7fa..d74db8a 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: ConsensusPeak
 Title: Call consensus peaks from multiple biological replicates
-Version: 0.0.0.9100
+Version: 0.0.0.9000
 Authors@R: person(given = "Thomas",
                   family = "Roberts",
                   email = "tomroberts.work15@gmail.com",
diff --git a/R/idr_analysis.R b/R/idr_analysis.R
index 1c884aa..56c96b8 100644
--- a/R/idr_analysis.R
+++ b/R/idr_analysis.R
@@ -1,7 +1,8 @@
 #' Call consensus peaks using IDR
 #'
-#' \code{idr_analysis()} calls consensus peaks using IDR thresholding. IDR
-#' analysis can be used to generate a "conservative" or "optimal" set of peaks.
+#' \code{idr_analysis()} generates consensus peaks using IDR thresholding. The
+#' function calls peaks with MACSr and then filters these peaks using IDR. IDR
+#' can be used to generate a "conservative" or "optimal" set of peaks.
 #'
 #' @param treat_files Character vector containing paths to the treatment BAM
 #' files.
@@ -35,6 +36,7 @@
 #'              control_files = NULL,
 #'              type = "all",
 #'              is_paired = FALSE,
+#'              keep_original = FALSE,
 #'              out_dir = tempdir()
 #'              )
 #' }
diff --git a/R/multiple_replicates_chipr.R b/R/multiple_replicates_chipr.R
index 1041dad..4b9b064 100644
--- a/R/multiple_replicates_chipr.R
+++ b/R/multiple_replicates_chipr.R
@@ -1,11 +1,12 @@
-#' Call peaks with MACSr and generate consensus set with ChIP-R
+#' Multiple replicate peak calling with ChIP-R
 #'
 #' \code{multiple_replicates_chipr()} is a wrapper of the Python package ChIP-R.
-#' ChIP-R handles an arbitrary number of replicates.
+#' The function calls peaks with MACSr and then filters these peaks to generate
+#' a consensus set using the Python package ChIP-R.
 #'
 #' @inheritParams idr_analysis
-#' @param subdir_name Character specifying the name of the subdirectory that the
-#' output files will be placed.
+#' @param subdir_name The name of the subdirectory that the output files will be
+#' placed.
 #' @inheritParams run_chipr
 #' @inheritDotParams MACSr::callpeak -tfile -cfile -outdir -name -format -log
 #' -tempdir
@@ -20,8 +21,7 @@
 #' input3 <- testthat::test_path("testdata", "r3_test_creb.bam")
 #'
 #' multiple_replicates_chipr(treat_files = c(input1, input2, input3),
-#'                           out_dir = tempdir(),
-#'                           ...
+#'                           out_dir = tempdir()
 #'                           )
 #' }
 #'
diff --git a/R/multiple_replicates_mspc.R b/R/multiple_replicates_mspc.R
index afab22b..7ff1da2 100644
--- a/R/multiple_replicates_mspc.R
+++ b/R/multiple_replicates_mspc.R
@@ -1,12 +1,12 @@
 #' Multiple replicate peak calling with MSPC
 #'
 #' \code{multiple_replicates_mspc()} runs multiple sample peak calling from the
-#' rmspc package. The function handles an arbitrary number of biological
-#' replicates.
+#' rmspc package. The function calls peaks with MACSr and then filters these
+#' peaks to generate a consensus set using the mspc function from rmspc.
 #'
 #' @inheritParams idr_analysis
-#' @param subdir_name Character specifying the name of the subdirectory that the
-#' output files will be placed.
+#' @param subdir_name The name of the subdirectory that the output files will be
+#' placed.
 #' @inheritParams rmspc::mspc
 #' @inheritDotParams MACSr::callpeak -tfile -cfile -outdir -name -format -log
 #' -tempdir
@@ -27,8 +27,7 @@
 #'                          replicateType = "Biological",
 #'                          stringencyThreshold = 1e-8,
 #'                          weakThreshold = 1e-4,
-#'                          c = 3,
-#'                          ...
+#'                          c = 3
 #'                          )
 #' }
 #'
diff --git a/man/idr_analysis.Rd b/man/idr_analysis.Rd
index cb9af12..46f8915 100644
--- a/man/idr_analysis.Rd
+++ b/man/idr_analysis.Rd
@@ -114,8 +114,9 @@ A list containing a summary of the IDR analysis along with the path to the
 output files.
 }
 \description{
-\code{idr_analysis()} calls consensus peaks using IDR thresholding. IDR
-analysis can be used to generate a "conservative" or "optimal" set of peaks.
+\code{idr_analysis()} generates consensus peaks using IDR thresholding. The
+function calls peaks with MACSr and then filters these peaks using IDR. IDR
+can be used to generate a "conservative" or "optimal" set of peaks.
 }
 \examples{
 \dontrun{
@@ -126,6 +127,7 @@ idr_analysis(treat_files = c(input1, input2),
              control_files = NULL,
              type = "all",
              is_paired = FALSE,
+             keep_original = FALSE,
              out_dir = tempdir()
              )
 }
diff --git a/man/multiple_replicates_chipr.Rd b/man/multiple_replicates_chipr.Rd
index 1106806..5d46f32 100644
--- a/man/multiple_replicates_chipr.Rd
+++ b/man/multiple_replicates_chipr.Rd
@@ -2,7 +2,7 @@
 % Please edit documentation in R/multiple_replicates_chipr.R
 \name{multiple_replicates_chipr}
 \alias{multiple_replicates_chipr}
-\title{Call peaks with MACSr and generate consensus set with ChIP-R}
+\title{Multiple replicate peak calling with ChIP-R}
 \usage{
 multiple_replicates_chipr(
   treat_files,
@@ -34,8 +34,8 @@ paired-end. The default is `FALSE`.}
 will be created. By default, the results directories are created in
 tempdir().}
 
-\item{subdir_name}{Character specifying the name of the subdirectory that the
-output files will be placed.}
+\item{subdir_name}{The name of the subdirectory that the output files will be
+placed.}
 
 \item{minentries}{The minimum number of intersections a given peak must
 satisfy.}
@@ -136,7 +136,8 @@ output files.
 }
 \description{
 \code{multiple_replicates_chipr()} is a wrapper of the Python package ChIP-R.
-ChIP-R handles an arbitrary number of replicates.
+The function calls peaks with MACSr and then filters these peaks to generate
+a consensus set using the Python package ChIP-R.
 }
 \examples{
 \dontrun{
@@ -145,8 +146,7 @@ input2 <- testthat::test_path("testdata", "r2_test_creb.bam")
 input3 <- testthat::test_path("testdata", "r3_test_creb.bam")
 
 multiple_replicates_chipr(treat_files = c(input1, input2, input3),
-                          out_dir = tempdir(),
-                          ...
+                          out_dir = tempdir()
                           )
 }
 
diff --git a/man/multiple_replicates_mspc.Rd b/man/multiple_replicates_mspc.Rd
index 1bf3546..be35c01 100644
--- a/man/multiple_replicates_mspc.Rd
+++ b/man/multiple_replicates_mspc.Rd
@@ -34,8 +34,8 @@ paired-end. The default is `FALSE`.}
 will be created. By default, the results directories are created in
 tempdir().}
 
-\item{subdir_name}{Character specifying the name of the subdirectory that the
-output files will be placed.}
+\item{subdir_name}{The name of the subdirectory that the output files will be
+placed.}
 
 \item{replicateType}{Character string. This argument defines the replicate type.
 Possible values of the argument :
@@ -147,8 +147,8 @@ output files.
 }
 \description{
 \code{multiple_replicates_mspc()} runs multiple sample peak calling from the
-rmspc package. The function handles an arbitrary number of biological
-replicates.
+rmspc package. The function calls peaks with MACSr and then filters these
+peaks to generate a consensus set using the mspc function from rmspc.
 }
 \examples{
 \dontrun{
@@ -163,8 +163,7 @@ multiple_replicates_mspc(treat_files = c(input1, input2, input3),
                          replicateType = "Biological",
                          stringencyThreshold = 1e-8,
                          weakThreshold = 1e-4,
-                         c = 3,
-                         ...
+                         c = 3
                          )
 }
 
diff --git a/vignettes/ConsensusPeak.Rmd b/vignettes/ConsensusPeak.Rmd
index bba1af6..27bbdb4 100644
--- a/vignettes/ConsensusPeak.Rmd
+++ b/vignettes/ConsensusPeak.Rmd
@@ -25,27 +25,6 @@ challenge in terms of ease-of-use. To address this issue, we introduce
 *ConsensusPeak*, an R package that aggregates several consensus peak calling
 metrics within a cohesive and straightforward interface.
 
-The user simply has to supply BAM files. Peak calling is performed internally
-using MACS3 and the output is fed directly into the thresholding methods.
-What other thresholding methods could we try? Or what other features could we
-include?
-
-We could (1) merge bam (2) call merged peaks (3) call peaks on individual
-replicates (4) consensus peak is peak found in all or n number?
-
-- Irreproducible Discovery Rate (IDR) (n = 2)
-  - Optimal
-  - Conservative
-- Multiple Sample Peak Calling (MSPC)
-- Overlapping peak calls using findOverlapsOfPeaks (n < 5)
-- Majority-vote. Peaks in more than half of the replicates. Lets use something
-- other than findoverlapsofpeaks as we are limited to 4 replicates. We can pick
-a reference and then calculate overlap from the perspective of this reference.
-- CHiP-R.
-
-Run all methods and then compare optimal peaks
-using EpiCompare.
-
 
 
 
@@ -64,19 +43,67 @@ knitr::opts_chunk$set(
 ```
 
 ```{r setup}
+# ConsensusPeak can currently only be installed from GitHub
+#if(!require("remotes")) install.packages("remotes")
+#remotes::install_github("neurogenomics/ConsensusPeak")
+
 library(ConsensusPeak)
 ```
 
 
 
 
-```{r eval=FALSE}
-# code here
-input1 <- system.file("extdata", "r1_test_creb.bam", package = "ConsensusPeak")
-input2 <- system.file("extdata", "r2_test_creb.bam", package = "ConsensusPeak")
+The package is distributed with small example data sets. These can be loaded
+with the following commands:
+```{r eval=TRUE}
+input1 <- system.file("extdata", "r1_creb_chr22.bam", package = "ConsensusPeak")
+input2 <- system.file("extdata", "r2_creb_chr22.bam", package = "ConsensusPeak")
+input3 <- system.file("extdata", "r3_creb_chr22.bam", package = "ConsensusPeak")
+```
+A popular way to generate consensus peaks is via the Irreproducible Discovery
+Rate (IDR). This method is a measure of peak rank consistency between
+two replicates, and is used in the ENCODE project. Here, we implement two
+variants of IDR: conservative and optimal. Despite its popularity, IDR analysis
+can only be performed with two replicates.
+```{r eval=TRUE}
+
+idr_analysis(treat_files = c(input1, input2), # BAM files
+             control_files = NULL,
+             type = "all", # all runs both the conservative and optimal methods.
+             is_paired = FALSE,
+             out_dir = ".",
+             nomodel = TRUE
+             )
+```
+We have implemented two methods that can handle more than two replicates. These
+are MSPC and ChIP-R. Note that MSPC requires .NET 6.0 or higher to be installed
+on your system.
+
+```{r eval=TRUE}
+
+multiple_replicates_mspc(treat_files = c(input1, input2, input3),
+                         out_dir = ".",
+                         subdir_name = "mspc_analysis",
+                         is_paired = FALSE,
+                         replicateType = "Biological",
+                         stringencyThreshold = 1e-8,
+                         weakThreshold = 1e-4,
+                         c = 3,
+                         nomodel = TRUE
+                         )
+```
+```{r eval=TRUE}
+multiple_replicates_chipr(treat_files = c(input1, input2, input3),
+                          is_paired = FALSE,
+                          out_dir = ".",
+                          nomodel = TRUE
+                          )
 ```
 
 
+
+
+
 
 
 