From 51b1667e6517303df0b5976fe8709a1a3a587107 Mon Sep 17 00:00:00 2001
From: gthopkins <54715703+gthopkins@users.noreply.github.com>
Date: Tue, 3 Sep 2024 10:13:28 -0400
Subject: [PATCH] Describe plotting functions in more detail

I am responding to issue #144 to describe the plotting functions in more detail, and also describing how to access data from the functions so users can create custom plots.
---
 vignettes/corncob-intro-no-phyloseq.Rmd | 16 ++++++++++++++-
 vignettes/corncob-intro.Rmd             | 26 +++++++++++++++++++++++--
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/vignettes/corncob-intro-no-phyloseq.Rmd b/vignettes/corncob-intro-no-phyloseq.Rmd
index bfb2807..8b885aa 100644
--- a/vignettes/corncob-intro-no-phyloseq.Rmd
+++ b/vignettes/corncob-intro-no-phyloseq.Rmd
@@ -118,6 +118,7 @@ First, let's plot the data with our model fit on the relative abundance scale. T
 plot(corncob, B = 50)
 ```
 
+You can access the documentation for this plotting function by typing `?plot.bbdml` into the console. The points represent the relative abundances.
 The points represent the relative abundances. The bars represent the 95\% prediction intervals for the observed relative abundance by sample. 
 The parameter `B` determines the number of bootstrap simulations used to approximate the prediction intervals. For purposes of this tutorial, we use a small value `B = 50` for computational purposes, but recommend a higher setting for more accurate intervals, such as the default `B = 1000`.
 
@@ -258,7 +259,7 @@ da_analysis$p_fdr[1:5]
 
 where the values are now adjusted to control the false discovery rate at 0.05.
 
-We can also plot the model coefficients of our results:
+Now we can use the built-in plotting function in corncob to view the the model coefficients used in testing differential abundance across `DayAmdmt` via the `plot()` function. You can access the documentation for this plotting function by typing `?plot.differentialTest` into the console.
 
 ```{r }
 plot(da_analysis)
@@ -266,6 +267,19 @@ plot(da_analysis)
 
 Here, we can see that for `Bacteria_Armatimonadetes`, the effect of `DayAmdmt21` is positive compared to the baseline (in this case, `DayAmdmt11`).
 
+If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`:
+
+```{r}
+df <- plot(da_analysis, data_only = TRUE)
+
+# we can easily remove special characters used in our formatting steps
+df <- df %>%
+  mutate(variable = gsub("\nDifferential Abundance", "",
+                         variable, fixed = TRUE))
+
+head(df)
+```
+
 Finally, we can see a list of any taxa for which we were not able to fit a model using:
 
 ```{r }
diff --git a/vignettes/corncob-intro.Rmd b/vignettes/corncob-intro.Rmd
index f933992..1889ce1 100644
--- a/vignettes/corncob-intro.Rmd
+++ b/vignettes/corncob-intro.Rmd
@@ -142,7 +142,8 @@ First, let's plot the data with our model fit on the relative abundance scale. T
 plot(corncob, B = 50)
 ```
 
-The points represent the relative abundances. The bars represent the 95\% prediction intervals for the observed relative abundance by sample. 
+You can access the documentation for this plotting function by typing `?plot.bbdml` into the console. The points represent the relative abundances.
+The bars represent the 95\% prediction intervals for the observed relative abundance by sample. 
 The parameter `B` determines the number of bootstrap simulations used to approximate the prediction intervals. For purposes of this tutorial, we use a small value `B = 50` for computational purposes, but recommend a higher setting for more accurate intervals, such as the default `B = 1000`.
 
 Now let's look at the same plot, but on the counts scale with 95\% prediction intervals (since counts is not a parameter). To do this, we add the option `total = TRUE` to our plotting code.
@@ -166,6 +167,14 @@ Notice that this plot also reorders our samples so that groups appear together s
 
 We can observe on this plot that it might be of interest to distinguish between the two groups with covariates. The average empirical relative abundance for the samples with `DayAmdmt = 21` tends to be lower and less variable than the samples with `DayAmdmt = 11`.
 
+If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`:
+
+```{r}
+df <- plot(corncob, color = "DayAmdmt", B = 50, data_only = TRUE)
+head(df)
+```
+
+
 # Adding covariates
 
 Let's try modeling the expected relative abundance and the variability of the counts with `DayAmdmt` as a covariate. We do this by modifying `formula` and `phi.formula` as:
@@ -288,7 +297,7 @@ da_analysis$p_fdr[1:5]
 
 where the values are now adjusted to control the false discovery rate at 0.05.
 
-We can also plot the model coefficients of our results:
+Now we can use the built-in plotting function in corncob to view the the model coefficients used in testing differential abundance across `DayAmdmt` via the `plot()` function. You can access the documentation for this plotting function by typing `?plot.differentialTest` into the console.
 
 ```{r, eval = phy}
 plot(da_analysis)
@@ -296,6 +305,19 @@ plot(da_analysis)
 
 Here, we can see that for `Bacteria_Armatimonadetes`, the effect of `DayAmdmt21` is positive compared to the baseline (in this case, `DayAmdmt11`).
 
+If you wish to instead make your own custom plots with these same coefficients, you can easily extract the data used to generate the plots above from the `plot` function by setting `data_only = TRUE`:
+
+```{r}
+df <- plot(da_analysis, data_only = TRUE)
+
+# we can easily remove special characters used in our formatting steps
+df <- df %>%
+  mutate(variable = gsub("\nDifferential Abundance", "",
+                         variable, fixed = TRUE))
+
+head(df)
+```
+
 Finally, we can see a list of any taxa for which we were not able to fit a model using:
 
 ```{r, eval = phy}