Commit 3aa2d4d

Merge pull request #33 from m-jahn/dev
various fixes
2 parents 3984adf + 27e4b94 commit 3aa2d4d

14 files changed: +190 -147 lines changed

DESCRIPTION

Lines changed: 8 additions & 8 deletions
@@ -8,14 +8,14 @@ Authors@R: c(
 comment = c(ORCID = "0000-0002-3913-153X"))
 )
 Maintainer: Michael Jahn <[email protected]>
-Description: The goal of 'ggcoverage' is to simplify the process of
-visualizing genome/protein coverage. It contains functions to load
-data from BAM, BigWig, BedGraph or txt/xlsx files, create
-genome/protein coverage plots, add various annotations to the coverage
-plot, including base and amino acid annotation, GC annotation, gene
-annotation, transcript annotation, ideogram annotation, peak
-annotation, contact map annotation, link annotation and protein
-feature annotation.
+Description: The goal of `ggcoverage` is to visualize coverage tracks from
+genomics, transcriptomics or proteomics data. It contains functions to
+load data from BAM, BigWig, BedGraph, txt, or xlsx files, create
+genome/protein coverage plots, and add various annotations including
+base and amino acid composition, GC content, copy number variation
+(CNV), genes, transcripts, ideograms, peak highlights, HiC contact
+maps, contact links and protein features. It is based on and
+integrates well with `ggplot2`.
 License: MIT + file LICENSE
 URL: https://showteeth.github.io/ggcoverage/,
 https://github.com/showteeth/ggcoverage

NAMESPACE

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ importFrom(Biostrings,readAAStringSet)
 importFrom(Biostrings,readDNAStringSet)
 importFrom(Biostrings,translate)
 importFrom(GenomeInfoDb,"seqlengths<-")
+importFrom(GenomeInfoDb,genome)
 importFrom(GenomeInfoDb,seqlengths)
 importFrom(GenomeInfoDb,seqnames)
 importFrom(GenomicAlignments,alphabetFrequencyFromBam)

R/geom_ideogram.R

Lines changed: 2 additions & 18 deletions
@@ -31,7 +31,7 @@
 #' @importFrom utils menu
 #' @importFrom rtracklayer ucscGenomes ucscTableQuery tableName getTable
 #' GRangesForUCSCGenome browserSession
-#' @importFrom GenomeInfoDb seqlengths seqlengths<- seqnames
+#' @importFrom GenomeInfoDb seqlengths genome seqlengths<- seqnames
 #' @export
 #'
 #' @examples
@@ -95,22 +95,6 @@ geom_ideogram <- function(genome = "hg19", mark.color = "red", mark.alpha = 0.7,
 
 #' @export
 ggplot_add.ideogram <- function(object, plot, object_name) {
-# if (length(plot$layers) == 0) {
-# # geom_base
-# # get plot data
-# plot.data <- plot[[1]]$layers[[1]]$data
-# # prepare plot range
-# plot.chr <- as.character(plot.data[1, "seqnames"])
-# plot.region.start <- plot.data[1, "start"]
-# plot.region.end <- plot.data[nrow(plot.data), "end"]
-# } else {
-# # get plot data
-# plot.data <- plot$layers[[1]]$data
-# # prepare plot range
-# plot.chr <- as.character(plot.data[1, "seqnames"])
-# plot.region.start <- plot$coordinates$limits$x[1]
-# plot.region.end <- plot$coordinates$limits$x[2]
-# }
 # get plot data, plot data should contain bins
 if ("patchwork" %in% class(plot)) {
 plot.data <- plot[[1]]$layers[[1]]$data
@@ -146,7 +130,7 @@ ggplot_add.ideogram <- function(object, plot, object_name) {
 plot.height <- object$plot.height
 
 # get genome and chr ideogram
-genome.info <- suppressWarnings(getIdeogram(genome = genome, subchr = plot.chr, cytobands = TRUE))
+genome.info <- suppressWarnings(getIdeogram(genomes = genome, subchr = plot.chr, cytobands = TRUE))
 genome.info.df <- genome.info %>% as.data.frame()
 # get genome length
 genome.length <- genome.info.df[nrow(genome.info.df), "end"]

R/utils.R

Lines changed: 5 additions & 5 deletions
@@ -242,9 +242,8 @@ SplitTxExonUTR <- function(exon.df, utr.df) {
 # From: https://github.com/jorainer/biovizBase/blob/master/R/ideogram.R
 # Fix bug: the names on the supplied 'seqlengths' vector must be
 # identical to the seqnames
-getIdeogram <- function(genome, subchr = NULL, cytobands = TRUE) {
-.gnm <- genome
-lst <- lapply(.gnm, function(genome) {
+getIdeogram <- function(genomes, subchr = NULL, cytobands = TRUE) {
+lst <- lapply(genomes, function(genome) {
 if (!(exists("session") && extends(class(session), "BrowserSession"))) {
 session <- rtracklayer::browserSession()
 }
@@ -255,8 +254,9 @@ getIdeogram <- function(genome, subchr = NULL, cytobands = TRUE) {
 }
 if (cytobands) {
 message("Loading ideogram...")
+GenomeInfoDb::genome(session) <- genome
 tryres <- try(query <-
-rtracklayer::ucscTableQuery(session, "cytoBand", genome))
+rtracklayer::ucscTableQuery(session, table = "cytoBand", genome = genome))
 if (!inherits(tryres, "try-error")) {
 rtracklayer::tableName(query) <- "cytoBand"
 df <- rtracklayer::getTable(query)
@@ -292,7 +292,7 @@ getIdeogram <- function(genome, subchr = NULL, cytobands = TRUE) {
 gr <- sort(gr)
 gr
 })
-names(lst) <- .gnm
+names(lst) <- genomes
 if (length(lst) == 1) {
 res <- lst[[1]]
 } else {

README.Rmd

Lines changed: 17 additions & 19 deletions
@@ -28,7 +28,9 @@ knitr::opts_chunk$set(
 
 ## Introduction
 
-The goal of `ggcoverage` is simplify the process of visualizing omics coverage. It contains three main parts:
+The goal of `ggcoverage` is to visualize coverage tracks from genomics, transcriptomics or proteomics data. It contains functions to load data from BAM, BigWig, BedGraph, txt, or xlsx files, create genome/protein coverage plots, and add various annotations including base and amino acid composition, GC content, copy number variation (CNV), genes, transcripts, ideograms, peak highlights, HiC contact maps, contact links and protein features. It is based on and integrates well with `ggplot2`.
+
+It contains three main parts:
 
 * **Load the data**: `ggcoverage` can load BAM, BigWig (.bw), BedGraph, txt/xlsx files from various omics data, including WGS, RNA-seq, ChIP-seq, ATAC-seq, proteomics, et al.
 * **Create omics coverage plot**
@@ -44,12 +46,9 @@ The goal of `ggcoverage` is simplify the process of visualizing omics coverage.
 * **link annotation**: Visualize genome coverage with contacts
 * **peotein feature annotation**: Visualize protein coverage with features
 
-`ggcoverage` utilizes `ggplot2` plotting system, so its usage is **ggplot2-style**!
-
-
 ## Installation
 
-`ggcoverage` is an R package distributed as part of the [CRAN](https://cran.r-project.org/).
+`ggcoverage` is an R package distributed as part of the [CRAN repository](https://cran.r-project.org/).
 To install the package, start R and enter one of the following commands:
 
 ```{r install, eval = FALSE}
@@ -61,9 +60,9 @@ install.package("remotes")
 remotes::install_github("showteeth/ggcoverage")
 ```
 
-In general, it is **recommended** to install from [Github repository](https://github.com/showteeth/ggcoverage) (update more timely).
+In general, it is **recommended** to install from the [Github repository](https://github.com/showteeth/ggcoverage) (updated more regularly).
 
-Once `ggcoverage` is installed, it can be loaded as every other package:
+Once `ggcoverage` is installed, it can be loaded like every other package:
 
 ```{r library, message = FALSE, warning = FALSE}
 library("ggcoverage")
@@ -74,14 +73,14 @@ library("ggcoverage")
 `ggcoverage` provides two [vignettes](https://showteeth.github.io/ggcoverage/):
 
 * **detailed manual**: step-by-step usage
-* **customize the plot**: customize the plot and add additional layer
+* **customize the plot**: customize the plot and add additional layers
 
 
 ## RNA-seq data
 
 ### Load the data
 
-The RNA-seq data used here are from [Transcription profiling by high throughput sequencing of HNRNPC knockdown and control HeLa cells](https://bioconductor.org/packages/release/data/experiment/html/RNAseqData.HNRNPC.bam.chr14.html), we select four sample to use as example: ERR127307_chr14, ERR127306_chr14, ERR127303_chr14, ERR127302_chr14, and all bam files are converted to bigwig file with [deeptools](https://deeptools.readthedocs.io/en/develop/).
+The RNA-seq data used here is from [Transcription profiling by high throughput sequencing of HNRNPC knockdown and control HeLa cells](https://bioconductor.org/packages/release/data/experiment/html/RNAseqData.HNRNPC.bam.chr14.html). We select four samples to use as example: `ERR127307_chr14`, `ERR127306_chr14`, `ERR127303_chr14`, `ERR127302_chr14`, and all bam files were converted to bigwig files with [deeptools](https://deeptools.readthedocs.io/en/develop/).
 
 Load metadata:
 
@@ -125,7 +124,7 @@ mark_region
 
 ### Load GTF
 
-To add **gene annotation**, the gtf file should contain **gene_type** and **gene_name** attributes in **column 9**; to add **transcript annotation**, the gtf file should contain **transcript_name** attribute in **column 9**.
+To add **gene annotation**, the gtf file should contain **gene_type** and **gene_name** attributes in **column 9**; to add **transcript annotation**, the gtf file should contain a **transcript_name** attribute in **column 9**.
 
 ```{r load_gtf}
 gtf_file <-
@@ -230,14 +229,14 @@ basic_coverage +
 
 ### Add transcript annotation
 
-**In "loose" stype (default style; each transcript occupies one line)**:
+**In "loose" style (default style; each transcript occupies one line)**:
 
 ```{r transcript_coverage, warning = FALSE, fig.height = 12, fig.width = 12, fig.align = "center"}
 basic_coverage +
 geom_transcript(gtf.gr = gtf_gr, label.vjust = 1.5)
 ```
 
-**In "tight" style (place non-overlap transcripts in one line)**:
+**In "tight" style (attempted to place non-overlapping transcripts in one line)**:
 
 ```{r transcript_coverage_tight, warning = FALSE, fig.height = 12, fig.width = 12, fig.align = "center"}
 basic_coverage +
@@ -436,9 +435,9 @@ head(track_df)
 
 #### Default color scheme
 
-For base and amino acid annotation, we have following default color schemes, you can change with `nuc.color` and `aa.color` parameters.
+For base and amino acid annotation, the package comes with the following default color schemes. Color schemes can be changed with `nuc.color` and `aa.color` parameters.
 
-Default color scheme for base annotation is `Clustal-style`, more popular color schemes is available [here](https://www.biostars.org/p/171056/).
+THe default color scheme for base annotation is `Clustal-style`, more popular color schemes are available [here](https://www.biostars.org/p/171056/).
 
 ```{r base_color_scheme, warning = FALSE, fig.height = 2, fig.width = 6, fig.align = "center"}
 # color scheme
@@ -587,7 +586,7 @@ ggcoverage(
 
 ## ChIP-seq data
 
-The ChIP-seq data used here are from [DiffBind](https://bioconductor.org/packages/release/bioc/html/DiffBind.html), I select four sample to use as example: Chr18_MCF7_input, Chr18_MCF7_ER_1, Chr18_MCF7_ER_3, Chr18_MCF7_ER_2, and all bam files are converted to bigwig file with [deeptools](https://deeptools.readthedocs.io/en/develop/).
+The ChIP-seq data used here is from [DiffBind](https://bioconductor.org/packages/release/bioc/html/DiffBind.html). Four samples are selected as examples: `Chr18_MCF7_input`, `Chr18_MCF7_ER_1`, `Chr18_MCF7_ER_3`, `Chr18_MCF7_ER_2`, and all bam files were converted to bigwig files with [deeptools](https://deeptools.readthedocs.io/en/develop/).
 
 Create metadata:
 
@@ -679,7 +678,7 @@ The Hi-C method maps chromosome contacts in eukaryotic cells.
 For this purpose, DNA and protein complexes are cross-linked and DNA fragments then purified.
 As a result, even distant chromatin fragments can be found to interact due to the spatial organization of the DNA and histones in the cell. Hi-C data shows these interactions for example as a contact map.
 
-The Hi-C data are from [pyGenomeTracks: reproducible plots for multivariate genomic datasets](https://academic.oup.com/bioinformatics/article/37/3/422/5879987?login=false).
+The Hi-C data is taken from [pyGenomeTracks: reproducible plots for multivariate genomic datasets](https://academic.oup.com/bioinformatics/article/37/3/422/5879987?login=false).
 
 The Hi-C matrix visualization is implemented by [`HiCBricks`](https://github.com/koustav-pal/HiCBricks).
 This package needs to be installed separately (it is only 'Suggested' by `ggcoverage`).
@@ -785,7 +784,7 @@ basic_coverage +
 
 ## Mass spectrometry protein coverage
 
-[Mass spectrometry (MS) is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instrumentations have been developed for its many uses](https://en.wikipedia.org/wiki/Protein_mass_spectrometry). After MS, we can check the coverage of protein to check the quality of the data and find the reason why the segment did not appear and improve the experiment.
+[Mass spectrometry](https://en.wikipedia.org/wiki/Protein_mass_spectrometry) (MS) is an important method for the accurate mass determination and characterization of proteins, and a variety of methods and instruments have been developed for its many uses. With `ggcoverage`, we can easily inspect the peptide coverage of a protein in order to learn about the quality of the data.
 
 ### Load coverage
 
@@ -855,6 +854,5 @@ protein_coverage +
 ```
 
 ## Code of Conduct
-
-Please note that the `ggcoverage` project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
 
+Please note that the `ggcoverage` project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
