microbiome · TuomasBorman · Dec 20, 2024
diff --git a/inst/assets/bibliography.bib b/inst/assets/bibliography.bib
@@ -2321,3 +2321,16 @@ @article{mangiola2023
     eprint = {https://www.pnas.org/doi/pdf/10.1073/pnas.2203828120},
     }
 
+@article{Marchesi2015,
+  title = {The vocabulary of microbiome research: a proposal},
+  volume = {3},
+  ISSN = {2049-2618},
+  url = {http://dx.doi.org/10.1186/s40168-015-0094-5},
+  DOI = {10.1186/s40168-015-0094-5},
+  number = {1},
+  journal = {Microbiome},
+  publisher = {Springer Science and Business Media LLC},
+  author = {Marchesi, Julian R. and Ravel, Jacques},
+  year = {2015},
+  month = jul
+}
diff --git a/inst/pages/alpha_diversity.qmd b/inst/pages/alpha_diversity.qmd
@@ -123,7 +123,7 @@ For example:  `index = c("observed", "shannon")`
 Let's visualize the results against selected `colData` variables (sample
 type and final barcode).
 
-```{r plot-div-obs, message=FALSE, fig.cap="Shannon diversity estimates plotted grouped by sample type with colour-labeled barcode.", cache=TRUE}
+```{r plot-div-obs, message=FALSE, fig.cap="Species richness grouped by sample type, with points coloured by barcode.", cache=TRUE}
 library(scater)
 plotColData(
     tse,

diff --git a/inst/pages/cross_correlation.qmd b/inst/pages/cross_correlation.qmd
@@ -79,9 +79,6 @@ mae[[1]] <- agglomerateByPrevalence(mae[[1]], rank = "Family", na.rm = TRUE)
 # Does log10 transform for microbiome data
 mae[[1]] <- transformAssay(mae[[1]], method = "log10", pseudocount = TRUE)
 
-# Give unique names, so that we do not have problems when we are creating a plot
-rownames(mae[[1]]) <- getTaxonomyLabels(mae[[1]])
-
 # Cross correlates data sets
 res <- getCrossAssociation(
     mae,

diff --git a/inst/pages/exercises.qmd b/inst/pages/exercises.qmd
@@ -1033,7 +1033,7 @@ that by default only the first two dimensions are shown.
 4. Check which information is stored in the ColData of the TreeSE. What would
 be worth visualizing in our coordination plot?
 5. Make the same plot again, but this time colour the observations by
-Enterotype. You can do that by setting `colour.by` to the appropriate colname
+Enterotype. You can do that by setting `colour_by` to the appropriate colname
 in the colData of the TreeSE.
 6. **Extra**: Plot all three dimensions of PCA with `scater::plotReducedDim`
 and the optional argument `ncomponents`. Colour observations by Enterotype.
@@ -1054,7 +1054,7 @@ assay in terms of Bray-Curtis dissimilarity. You can use `scater::runMDS`
 with the compulsory argument `FUN = vegan::vegdist`.
 4. Plot the first two dimensions of PCA with `plotReducedDim`, to which you
 should give the appropriate reducedDim name as the second argument. Colour
-the observations by Enterotype with `colour.by`.
+the observations by Enterotype with `colour_by`.
 5. **Extra**: Perform MDS again with `scater::runMDS`, but this time use Jaccard
 dissimilarity. The distance metric to use can be defined with the optional
 argument `method`, choosing from the methods in `?vegan::vegdist`. If you

diff --git a/inst/pages/integrated_learner.qmd b/inst/pages/integrated_learner.qmd
@@ -92,7 +92,7 @@ mae[[2]] <- transformAssay(
 ```
 
 Ultimately, `r nrow(mae[[1]])+nrow(mae[[1]])` features are retained, consisting
-of `r nrow(mae[[1]])` pathways and `r nrow(mae[[2]])` species.
+of `r nrow(mae[[1]])` species and `r nrow(mae[[2]])` pathways.
 
 ## Fit model
 

diff --git a/inst/pages/intro.qmd b/inst/pages/intro.qmd
@@ -56,6 +56,11 @@ The Bioconductor microbiome data science framework consists of:
 
 ## Microbiome data science in Bioconductor {#sec-microbiome-bioc}
 
+While microbiota refers to the micro-organisms within a well-specified area,
+microbiome means the microbiota and their genetic material [@Marchesi2015].
+Because of the complex nature of microbiome data, computational methods are
+essential in microbiome research.
+
 The `phyloseq` data container has been dominant in the microbiome field within
 Bioconductor over the past decade [@McMurdie2013]. However, there has been a
 growing popularity of tools based on the `SummarizedExperiment` framework.

diff --git a/inst/pages/multiassay_ordination.qmd b/inst/pages/multiassay_ordination.qmd
@@ -110,7 +110,7 @@ train_opts |> head()
 
 The model is then prepared  with `prepare_mofa()` and trained with `run_mofa()`:
 
-```{r, message=FALSE, warning=FALSE}
+```{r, results=FALSE}
 #| label: mofa6
 
 model <- prepare_mofa(
@@ -123,7 +123,8 @@ model <- prepare_mofa(
 model <- run_mofa(model, use_basilisk = TRUE)
 ```
 
-The explained variance is visualized with the `plot_variance_explained()` function.
+The explained variance is visualized with the `plot_variance_explained()`
+function.
 
 ```{r, message=FALSE, warning=FALSE, fig.height=8, fig.width=10}
 #| label: mofa7

diff --git a/inst/pages/transformation.qmd b/inst/pages/transformation.qmd
@@ -101,17 +101,19 @@ available in the function
 
 ::: {.callout-important}
 
-`Pseudocount` is a small non-negative value added to the normalized data to avoid 
-taking the logarithm of zero. It's value can have a significant impact on the results when applying 
-a logarithm transformation to normalized data, as the logarithm transformation 
-is a nonlinear operation that can fundamentally change the data distribution [@Costea2014].
+`Pseudocount` is a small non-negative value added to the normalized data to
+avoid taking the logarithm of zero. Its value can have a significant impact
+on the results when applying
+a logarithm transformation to normalized data, as the logarithm transformation
+is a nonlinear operation that can fundamentally change the data distribution
+[@Costea2014].
 
 
-`Pseudocount` should be chosen consistently across all normalization methods being 
-compared, for example, by setting it to a value smaller than the minimum abundance 
-value before transformation. Some tools, like ancombc2, take into account the effect 
-of the `pseudocount` by performing sensitivity tests using multiple pseudocount 
-values. See [@sec-differential-abundance].
+`Pseudocount` should be chosen consistently across all normalization methods
+being compared, for example, by setting it to a value smaller than the minimum
+abundance value before transformation. Some tools, like ancombc2, take into
+account the effect of the `pseudocount` by performing sensitivity tests using
+multiple pseudocount values. See [@sec-differential-abundance].
 
 :::
 
@@ -123,15 +125,13 @@ library(mia)
 data("GlobalPatterns", package = "mia")
 tse <- GlobalPatterns
 
-# Transform "counts" assay to relative abundances ("relabundance"), with
-# pseudocount 1
-tse <- transformAssay(
-     tse, assay.type = "counts", method = "relabundance", pseudocount = 1)
+# Transform "counts" assay to relative abundances ("relabundance")
+tse <- transformAssay(tse, assay.type = "counts", method = "relabundance")
 
-# Transform relative abundance assay ("relabundance") to "clr", using
-# pseudocount if necessary; name the resulting assay to "clr"
+# Transform "counts" to "clr", using pseudocount if necessary; name
+# the resulting assay "clr".
 tse <- transformAssay(
-    x = tse, assay.type = "relabundance", method = "clr", pseudocount = TRUE,
+    x = tse, assay.type = "counts", method = "clr", pseudocount = TRUE,
     name = "clr")
 
 ```