From 51b1667e6517303df0b5976fe8709a1a3a587107 Mon Sep 17 00:00:00 2001 From: gthopkins <54715703+gthopkins@users.noreply.github.com> Date: Tue, 3 Sep 2024 10:13:28 -0400 Subject: [PATCH] Describe plotting functions in more detail I am responding to issue #144 to describe the plotting functions in more detail, and also describing how to access data from the functions so users can create custom plots. --- vignettes/corncob-intro-no-phyloseq.Rmd | 16 ++++++++++++++- vignettes/corncob-intro.Rmd | 26 +++++++++++++++++++++++-- 2 files changed, 39 insertions(+), 3 deletions(-) diff --git a/vignettes/corncob-intro-no-phyloseq.Rmd b/vignettes/corncob-intro-no-phyloseq.Rmd index bfb2807..8b885aa 100644 --- a/vignettes/corncob-intro-no-phyloseq.Rmd +++ b/vignettes/corncob-intro-no-phyloseq.Rmd @@ -118,6 +118,7 @@ First, let's plot the data with our model fit on the relative abundance scale. T plot(corncob, B = 50) ``` +You can access the documentation for this plotting function by typing `?plot.bbdml` into the console. The points represent the relative abundances. The points represent the relative abundances. The bars represent the 95\% prediction intervals for the observed relative abundance by sample. The parameter `B` determines the number of bootstrap simulations used to approximate the prediction intervals. For purposes of this tutorial, we use a small value `B = 50` for computational purposes, but recommend a higher setting for more accurate intervals, such as the default `B = 1000`. @@ -258,7 +259,7 @@ da_analysis$p_fdr[1:5] where the values are now adjusted to control the false discovery rate at 0.05. -We can also plot the model coefficients of our results: +Now we can use the built-in plotting function in corncob to view the the model coefficients used in testing differential abundance across `DayAmdmt` via the `plot()` function. You can access the documentation for this plotting function by typing `?plot.differentialTest` into the console. ```{r } plot(da_analysis) @@ -266,6 +267,19 @@ plot(da_analysis) Here, we can see that for `Bacteria_Armatimonadetes`, the effect of `DayAmdmt21` is positive compared to the baseline (in this case, `DayAmdmt11`). +If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`: + +```{r} +df <- plot(da_analysis, data_only = TRUE) + +# we can easily remove special characters used in our formatting steps +df <- df %>% + mutate(variable = gsub("\nDifferential Abundance", "", + variable, fixed = TRUE)) + +head(df) +``` + Finally, we can see a list of any taxa for which we were not able to fit a model using: ```{r } diff --git a/vignettes/corncob-intro.Rmd b/vignettes/corncob-intro.Rmd index f933992..1889ce1 100644 --- a/vignettes/corncob-intro.Rmd +++ b/vignettes/corncob-intro.Rmd @@ -142,7 +142,8 @@ First, let's plot the data with our model fit on the relative abundance scale. T plot(corncob, B = 50) ``` -The points represent the relative abundances. The bars represent the 95\% prediction intervals for the observed relative abundance by sample. +You can access the documentation for this plotting function by typing `?plot.bbdml` into the console. The points represent the relative abundances. +The bars represent the 95\% prediction intervals for the observed relative abundance by sample. The parameter `B` determines the number of bootstrap simulations used to approximate the prediction intervals. For purposes of this tutorial, we use a small value `B = 50` for computational purposes, but recommend a higher setting for more accurate intervals, such as the default `B = 1000`. Now let's look at the same plot, but on the counts scale with 95\% prediction intervals (since counts is not a parameter). To do this, we add the option `total = TRUE` to our plotting code. @@ -166,6 +167,14 @@ Notice that this plot also reorders our samples so that groups appear together s We can observe on this plot that it might be of interest to distinguish between the two groups with covariates. The average empirical relative abundance for the samples with `DayAmdmt = 21` tends to be lower and less variable than the samples with `DayAmdmt = 11`. +If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`: + +```{r} +df <- plot(corncob, color = "DayAmdmt", B = 50, data_only = TRUE) +head(df) +``` + + # Adding covariates Let's try modeling the expected relative abundance and the variability of the counts with `DayAmdmt` as a covariate. We do this by modifying `formula` and `phi.formula` as: @@ -288,7 +297,7 @@ da_analysis$p_fdr[1:5] where the values are now adjusted to control the false discovery rate at 0.05. -We can also plot the model coefficients of our results: +Now we can use the built-in plotting function in corncob to view the the model coefficients used in testing differential abundance across `DayAmdmt` via the `plot()` function. You can access the documentation for this plotting function by typing `?plot.differentialTest` into the console. ```{r, eval = phy} plot(da_analysis) @@ -296,6 +305,19 @@ plot(da_analysis) Here, we can see that for `Bacteria_Armatimonadetes`, the effect of `DayAmdmt21` is positive compared to the baseline (in this case, `DayAmdmt11`). +If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`: + +```{r} +df <- plot(da_analysis, data_only = TRUE) + +# we can easily remove special characters used in our formatting steps +df <- df %>% + mutate(variable = gsub("\nDifferential Abundance", "", + variable, fixed = TRUE)) + +head(df) +``` + Finally, we can see a list of any taxa for which we were not able to fit a model using: ```{r, eval = phy}