-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathminimal_workflow.Rmd
109 lines (74 loc) · 2.24 KB
/
minimal_workflow.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
---
title: "Minimal Workflow"
author: "Mark"
date: "15/03/2022"
output:
word_document: default
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE,warning = FALSE,message = FALSE)
```
## Introduction
Introduce the biological conditions being investigated...
## Data Import
Describe how the data are imported. Adjust any file paths etc as required...
```{r}
library(readr)
library(tximport)
library(DESeq2)
library(dplyr)
dirs <- list.files("salmon_quant/")
quant_files <- list.files("salmon_quant/",pattern="quant.sf.gz",recursive = TRUE,full.names = TRUE)
names(quant_files) <- dirs
tx2gene <- read_csv("tx2gene.csv",col_names = FALSE)
txi <- tximport(quant_files,
type="salmon",
tx2gene = tx2gene,
ignoreTxVersion = TRUE)
sampleInfo <- read_tsv("meta_data/sampleInfo_corrected.txt")
dds <- DESeqDataSetFromTximport(txi,
colData = sampleInfo,
design = ~Treated)
```
## Quality Assessment
Check the number of reads...
```{r}
```
Verify Sample Relationships
```{r}
vsd <- vst(dds)
plotPCA(vsd, intgroup = "Treated")
```
Describe and interpret the plot
## Differential Expression
This will do the differential expression for the design chosen when you created the dds object. The results function may need to change to perform different comparisons.
```{r}
de <- DESeq(dds)
results <- results(de, tidy=TRUE)
```
## Annotation
Use the `org.Hs.eg.db` package to add extra columns to the results. If your dataset is not human, you will need to change the name of this package
```{r}
library(org.Hs.eg.db)
anno <- AnnotationDbi::select(org.Hs.eg.db, keys = rownames(dds),
keytype = "ENSEMBL",
columns = c("SYMBOL","GENENAME","ENTREZID"))
```
## Pathways analysis
Use the clusterProfiler package to identify biologically-relevant pathways
```{r}
library(clusterProfiler)
universe <- results %>% pull(row)
sigGenes <- results %>%
filter(padj < 0.05, !is.na(row)) %>% pull(row)
enrich_go <- enrichGO(
gene= sigGenes,
OrgDb = org.Hs.eg.db,
keyType = "ENSEMBL",
ont = "BP",
universe = universe,
qvalueCutoff = 0.05,
readable=TRUE
)
```