-
Notifications
You must be signed in to change notification settings - Fork 6
/
Copy pathREADME.Rmd
113 lines (77 loc) · 3.03 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# mmsig
<!-- badges: start -->
<!-- badges: end -->
The goal of mmsig is to provide a flexible and easily interpretable mutational signature analysis tool. mmsig was developed for hematological malignancies, but can be extended to any cancer with a well-known mutational signature landscape.
mmsig is based on an expectation maximization algorithm for mutational signature fitting and applies cosine similarities for dynamic error suppression as well as bootstrapping-based confidence intervals and assessment of transcriptional strand bias.
Citation:
Rustad, E.H., Nadeu, F., Angelopoulos, N. et al. mmsig: a fitting approach to accurately identify somatic mutational signatures in hematological malignancies. Commun Biol 4, 424 (2021). https://doi.org/10.1038/s42003-021-01938-0
## Installation
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("evenrus/mmsig")
```
# Example
This is a basic example which shows mmsig usage
```{r}
library(mmsig)
data(mm_5_col)
data(signature_ref)
```
## setting up the mutational signature reference
```{r}
# remove canonical AID (SBS84) for genome-wide analysis
# remove the platinum signature (SBS35) because the patients are not platinum exposed
sig_ref <- signature_ref[c("sub", "tri", "SBS1", "SBS2", "SBS5", "SBS8",
"SBS9", "SBS13", "SBS18", "SBS-MM1")]
head(sig_ref)
```
## Setting up the mutation data
```{r}
# subset three samples to reduce run time
mm_5_col_subset <- mm_5_col[mm_5_col$sample %in% c("MEL_PD26412a", "MEL_PD26411c", "PD26414a"),]
head(mm_5_col_subset)
```
## Perform mutational signature analysis
```{r}
# Bootstrapping large datasets with many iterations can significantly increase runtime.
set.seed(1)
sig_out <- mm_fit_signatures(muts.input=mm_5_col_subset,
sig.input=sig_ref,
input.format = "vcf",
sample.sigt.profs = NULL,
strandbias = TRUE,
bootstrap = TRUE,
iterations = 20, # 1000 iterations recommended for stable results
refcheck=TRUE,
cos_sim_threshold = 0.01,
force_include = c("SBS1", "SBS5"),
dbg=FALSE)
```
## Plot signature estimates
```{r}
plot_signatures(sig_out$estimate,
samples = T,
sig_order = c("SBS1", "SBS2", "SBS13", "SBS5", "SBS8", "SBS9",
"SBS18", "SBS-MM1", "SBS35"))
```
## Plot bootstraping confidence intervals
```{r}
bootSigsPlot(sig_out$bootstrap)
```
## Transcriptional strand bias for SBS-MM1
```{r}
head(sig_out$strand_bias_mm1)
```