Commit

Edit documentation and work on vignette.
Tomrrr1 committed Mar 7, 2024
1 parent 3d10ebf commit 9c10553
Showing 8 changed files with 83 additions and 54 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: ConsensusPeak
Title: Call consensus peaks from multiple biological replicates
Version: 0.0.0.9100
Version: 0.0.0.9000
Authors@R:
person(given = "Thomas", family = "Roberts",
email = "[email protected]",
6 changes: 4 additions & 2 deletions R/idr_analysis.R
@@ -1,7 +1,8 @@
#' Call consensus peaks using IDR
#'
#' \code{idr_analysis()} calls consensus peaks using IDR thresholding. IDR
#' analysis can be used to generate a "conservative" or "optimal" set of peaks.
#' \code{idr_analysis()} generates consensus peaks using IDR thresholding. The
#' function calls peaks with MACSr and then filters these peaks using IDR. IDR
#' can be used to generate a "conservative" or "optimal" set of peaks.
#'
#' @param treat_files Character vector containing paths to the treatment BAM
#' files.
@@ -35,6 +36,7 @@
#' control_files = NULL,
#' type = "all",
#' is_paired = FALSE,
#' keep_original = FALSE,
#' out_dir = tempdir()
#' )
#' }
12 changes: 6 additions & 6 deletions R/multiple_replicates_chipr.R
@@ -1,11 +1,12 @@
#' Call peaks with MACSr and generate consensus set with ChIP-R
#' Multiple replicate peak calling with ChIP-R
#'
#' \code{multiple_replicates_chipr()} is a wrapper of the Python package ChIP-R.
#' ChIP-R handles an arbitrary number of replicates.
#' The function calls peaks with MACSr and then filters these peaks to generate
#' a consensus set using the Python package ChIP-R.
#'
#' @inheritParams idr_analysis
#' @param subdir_name Character specifying the name of the subdirectory that the
#' output files will be placed.
#' @param subdir_name The name of the subdirectory in which the output files
#' will be placed.
#' @inheritParams run_chipr
#' @inheritDotParams MACSr::callpeak -tfile -cfile -outdir -name -format -log
#' -tempdir
@@ -20,8 +21,7 @@
#' input3 <- testthat::test_path("testdata", "r3_test_creb.bam")
#'
#' multiple_replicates_chipr(treat_files = c(input1, input2, input3),
#' out_dir = tempdir(),
#' ...
#' out_dir = tempdir()
#' )
#' }
#'
11 changes: 5 additions & 6 deletions R/multiple_replicates_mspc.R
@@ -1,12 +1,12 @@
#' Multiple replicate peak calling with MSPC
#'
#' \code{multiple_replicates_mspc()} runs multiple sample peak calling from the
#' rmspc package. The function handles an arbitrary number of biological
#' replicates.
#' rmspc package. The function calls peaks with MACSr and then filters these
#' peaks to generate a consensus set using the mspc function from rmspc.
#'
#' @inheritParams idr_analysis
#' @param subdir_name Character specifying the name of the subdirectory that the
#' output files will be placed.
#' @param subdir_name The name of the subdirectory in which the output files
#' will be placed.
#' @inheritParams rmspc::mspc
#' @inheritDotParams MACSr::callpeak -tfile -cfile -outdir -name -format -log
#' -tempdir
@@ -27,8 +27,7 @@
#' replicateType = "Biological",
#' stringencyThreshold = 1e-8,
#' weakThreshold = 1e-4,
#' c = 3,
#' ...
#' c = 3
#' )
#' }
#'
6 changes: 4 additions & 2 deletions man/idr_analysis.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 6 additions & 6 deletions man/multiple_replicates_chipr.Rd


11 changes: 5 additions & 6 deletions man/multiple_replicates_mspc.Rd


77 changes: 52 additions & 25 deletions vignettes/ConsensusPeak.Rmd
@@ -25,27 +25,6 @@ challenge in terms of ease-of-use. To address this issue, we
introduce *ConsensusPeak*, an R package that aggregates several consensus peak
calling metrics within a cohesive and straightforward interface.

The user simply has to supply BAM files. Peak calling is performed internally
using MACS3 and the output is fed directly into the thresholding methods.
What other thresholding methods could we try? Or what other features could we
include?

We could (1) merge bam (2) call merged peaks (3) call peaks on individual
replicates (4) consensus peak is peak found in all or n number?

- Irreproducible Discovery Rate (IDR) (n = 2)
- Optimal
- Conservative
- Multiple Sample Peak Calling (MSPC)
- Overlapping peak calls using findOverlapsOfPeaks (n < 5)
- Majority-vote. Peaks in more than half of the replicates. Lets use something
- other than findoverlapsofpeaks as we are limited to 4 replicates. We can pick
a reference and then calculate overlap from the perspective of this reference.
- CHiP-R.

Run all methods and then compare optimal peaks
using EpiCompare.

<!-- \begin{itemize} -->
<!-- \item Irreproducible Discovery Rate (IDR) (n = 2) -->
<!-- \begin{enumerate} -->
@@ -64,19 +43,67 @@ knitr::opts_chunk$set(
```

```{r setup}
# ConsensusPeak can currently only be installed from GitHub
#if(!require("remotes")) install.packages("remotes")
#remotes::install_github("neurogenomics/ConsensusPeak")
library(ConsensusPeak)
```

```{r eval=FALSE}
# code here
input1 <- system.file("extdata", "r1_test_creb.bam", package = "ConsensusPeak")
input2 <- system.file("extdata", "r2_test_creb.bam", package = "ConsensusPeak")
The package is distributed with small example data sets. The paths to these
files can be retrieved with the following commands:
```{r eval=TRUE}
input1 <- system.file("extdata", "r1_creb_chr22.bam", package = "ConsensusPeak")
input2 <- system.file("extdata", "r2_creb_chr22.bam", package = "ConsensusPeak")
input3 <- system.file("extdata", "r3_creb_chr22.bam", package = "ConsensusPeak")
```
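
`system.file()` returns an empty string when a file cannot be found, so it may
be worth confirming the paths before calling peaks. The following is a minimal
sketch in base R; `check_bam_paths()` is a hypothetical helper written for
illustration and is not part of ConsensusPeak.

```r
# Hypothetical helper (not part of ConsensusPeak): stop early if any path
# is an empty string or does not point to an existing file.
check_bam_paths <- function(paths) {
  bad <- !nzchar(paths) | !file.exists(paths)
  if (any(bad)) {
    stop("Missing input file(s) at position(s): ",
         paste(which(bad), collapse = ", "))
  }
  # Return the paths invisibly so the call can be used inside a pipeline.
  invisible(paths)
}
```

With the example data above, this could be called as
`check_bam_paths(c(input1, input2, input3))`.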

A popular way to generate consensus peaks is via the Irreproducible Discovery
Rate (IDR). IDR measures the consistency of peak ranks between two replicates
and is used by the ENCODE project. Here, we implement two variants of IDR:
conservative and optimal. Despite its popularity, IDR analysis can only be
performed with two replicates.
```{r eval=TRUE}
idr_analysis(treat_files = c(input1, input2), # BAM files
control_files = NULL,
type = "all", # "all" runs both the conservative and optimal methods
is_paired = FALSE,
out_dir = ".",
nomodel = TRUE
)
```

We have implemented two methods that can handle more than two replicates. These
are MSPC and ChIP-R. Note that MSPC requires .NET 6.0 or higher to be installed
on your system.

```{r eval=TRUE}
multiple_replicates_mspc(treat_files = c(input1, input2, input3),
out_dir = ".",
subdir_name = "mspc_analysis",
is_paired = FALSE,
replicateType = "Biological",
stringencyThreshold = 1e-8,
weakThreshold = 1e-4,
c = 3,
nomodel = TRUE
)
```
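
Because MSPC depends on .NET, it can help to check for the runtime before
running the chunk above. This is a rough sketch that assumes the `dotnet`
executable is expected on the `PATH`; how rmspc itself locates .NET is not
covered here.

```r
# Sys.which() returns "" when the executable is not found on the PATH.
dotnet_path <- Sys.which("dotnet")

if (!nzchar(dotnet_path)) {
  message("dotnet not found on PATH; MSPC may not run.")
} else {
  # `dotnet --list-runtimes` prints the installed runtimes and their versions,
  # which can be inspected to confirm that version 6.0 or higher is present.
  runtimes <- system2(dotnet_path, "--list-runtimes", stdout = TRUE)
  print(runtimes)
}
```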

```{r eval=TRUE}
multiple_replicates_chipr(treat_files = c(input1, input2, input3),
is_paired = FALSE,
out_dir = ".",
nomodel = TRUE
)
```
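
The replicate-count constraints described in this vignette can be summarised in
a small sketch. `applicable_methods()` is a hypothetical helper, not a
ConsensusPeak function, and it assumes that MSPC and ChIP-R also accept exactly
two replicates.

```r
# Hypothetical helper (not part of ConsensusPeak): list the functions from
# this vignette that could be applied to a given number of replicates.
applicable_methods <- function(n_replicates) {
  stopifnot(is.numeric(n_replicates), length(n_replicates) == 1)
  if (n_replicates < 2) {
    stop("Consensus peak calling needs at least two replicates.")
  }
  # Assumption: MSPC and ChIP-R handle two or more replicates.
  methods <- c("multiple_replicates_mspc", "multiple_replicates_chipr")
  # IDR analysis can only be performed with exactly two replicates.
  if (n_replicates == 2) {
    methods <- c("idr_analysis", methods)
  }
  methods
}
```

For the three example BAM files, `applicable_methods(3)` leaves out
`idr_analysis`, which matches the calls shown above.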