-
Notifications
You must be signed in to change notification settings - Fork 0
/
03-import.Rmd
64 lines (41 loc) · 2.29 KB
/
03-import.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
# Importing microbiome data
This section demonstrates how to import microbiome profiling data in R.
## Data access
**Premade data set**
```{r warning = FALSE, message = FALSE}
# Check the path of your file
tse <- readRDS("./projectfolder/datafolder/filename.rds") # change to correct path
print(tse)
```
[**Import data from other sources**](https://microbiome.github.io/OMA/containers.html#s-workflow)
*Openly available data set*, see e.g.:
- [Bioconductor microbiomeDataSets](https://bioconductor.org/packages/release/data/experiment/html/microbiomeDataSets.html)
## Importing microbiome data in R
**Import example data** by modifying the examples in the online book
section on [data exploration and
manipulation](https://microbiome.github.io/OMA/data-introduction.html#loading-experimental-microbiome-data). The
data files in our example are in _biom_ format, which is a standard
file format for microbiome data. Other file formats exist as well, and
import details vary by platform.
Here, we import _biom_ data files into a specific data container (structure)
in R, _TreeSummarizedExperiment_ (TSE) [Huang et
al. (2020)](https://f1000research.com/articles/9-1246). This provides
the basis for downstream data analysis in the _miaverse_ data science
framework.
In this course, we focus on downstream analysis of taxonomic profiling
data, and assume that the data has already been appropriately
preprocessed and available in the TSE format. In addition to our
example data, further demonstration data sets are readily available in
the TSE format through
[microbiomeDataSets](https://bioconductor.org/packages/release/data/experiment/html/microbiomeDataSets.html).
<img src="https://raw.githubusercontent.com/FelixErnst/TreeSummarizedExperiment
/2293440c6e70ae4d6e978b6fdf2c42fdea7fb36a/vignettes/tse2.png" width="100%"/>
**Figure sources:**
**Original article**
- Huang R _et al_. (2021) [TreeSummarizedExperiment: a S4 class
for data with hierarchical structure](https://doi.org/10.12688/
f1000research.26669.2). F1000Research 9:1246.
**Reference Sequence slot extension**
- Lahti L _et al_. (2020) [Upgrading the R/Bioconductor ecosystem for microbiome
research](https://doi.org/10.7490/
f1000research.1118447.1) F1000Research 9:1464 (slides).