diff --git a/NEWS.md b/NEWS.md index 40e35b56..dfb6a2b9 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,7 @@ # infer (development version) +* Added missing commas and addressed formatting issues throughout the vignettes and articles. Backticks for package names were removed and missing parentheses for functions were added (@Joscelinrocha). + # infer 1.0.7 * The aliases `p_value()` and `conf_int()`, first deprecated 6 years ago, now diff --git a/README.Rmd b/README.Rmd index 411a2bd3..2331c452 100755 --- a/README.Rmd +++ b/README.Rmd @@ -16,7 +16,7 @@ output: github_document [![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/infer)](https://cran.r-project.org/package=infer) [![Coverage Status](https://img.shields.io/codecov/c/github/tidymodels/infer/main.svg)](https://app.codecov.io/github/tidymodels/infer/?branch=main) -The objective of this package is to perform statistical inference using an expressive statistical grammar that coheres with the `tidyverse` design framework. The package is centered around 4 main verbs, supplemented with many utilities to visualize and extract value from their outputs. +The objective of this package is to perform statistical inference using an expressive statistical grammar that coheres with the tidyverse design framework. The package is centered around 4 main verbs, supplemented with many utilities to visualize and extract value from their outputs. + `specify()` allows you to specify the variable, or relationship between variables, that you're interested in. + `hypothesize()` allows you to declare the null hypothesis. @@ -39,13 +39,13 @@ If you're interested in learning more about randomization-based statistical infe ------------------------------------------------------------------------ -To install the current stable version of `infer` from CRAN: +To install the current stable version of infer from CRAN: ```{r, eval = FALSE} install.packages("infer") ``` -To install the developmental stable version of `infer`, make sure to install `remotes` first. The `pkgdown` website for this version is at [infer.tidymodels.org](https://infer.tidymodels.org/). +To install the developmental stable version of infer, make sure to install remotes first. The pkgdown website for this version is at [infer.tidymodels.org](https://infer.tidymodels.org/). ```{r, eval = FALSE} # install.packages("pak") @@ -113,6 +113,6 @@ null_dist %>% ``` -Note that the formula and non-formula interfaces (i.e. `age ~ partyid` vs. `response = age, explanatory = partyid`) work for all implemented inference procedures in `infer`. Use whatever is more natural for you. If you will be doing modeling using functions like `lm()` and `glm()`, though, we recommend you begin to use the formula `y ~ x` notation as soon as possible. +Note that the formula and non-formula interfaces (i.e., `age ~ partyid` vs. `response = age, explanatory = partyid`) work for all implemented inference procedures in `infer`. Use whatever is more natural for you. If you will be doing modeling using functions like `lm()` and `glm()`, though, we recommend you begin to use the formula `y ~ x` notation as soon as possible. Other resources are available in the package vignettes! See `vignette("observed_stat_examples")` for more examples like the one above, and `vignette("infer")` for discussion of the underlying principles of the package design. 
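To make the four-verb flow described in the README concrete, here is a minimal sketch of how the verbs chain together on the package's bundled `gss` data. It is an illustration only; the choice of `hours` as the response and the point null of 40 are assumptions for the example, not part of the README.

```r
library(infer)

# specify the variable of interest, declare a point null, simulate
# 1000 samples under that null, and reduce each to a sample mean
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")
```

The resulting tibble of simulated means is what utility functions such as `visualize()`, `get_p_value()`, and `get_confidence_interval()` consume.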
diff --git a/vignettes/anova.Rmd b/vignettes/anova.Rmd index 99695f55..cf063eb5 100644 --- a/vignettes/anova.Rmd +++ b/vignettes/anova.Rmd @@ -19,9 +19,9 @@ library(dplyr) library(infer) ``` -In this vignette, we'll walk through conducting an analysis of variance (ANOVA) test using `infer`. ANOVAs are used to analyze differences in group means. +In this vignette, we'll walk through conducting an analysis of variance (ANOVA) test using infer. ANOVAs are used to analyze differences in group means. -Throughout this vignette, we'll make use of the `gss` dataset supplied by `infer`, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: +Throughout this vignette, we'll make use of the `gss` dataset supplied by infer, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: ```{r glimpse-gss-actual, warning = FALSE, message = FALSE} dplyr::glimpse(gss) @@ -57,7 +57,7 @@ observed_f_statistic <- gss %>% The observed $F$ statistic is `r observed_f_statistic`. Now, we want to compare this statistic to a null distribution, generated under the assumption that age and political party affiliation are not actually related, to get a sense of how likely it would be for us to see this observed statistic if there were actually no association between the two variables. -We can `generate` an approximation of the null distribution using randomization. The randomization approach permutes the response and explanatory variables, so that each person's party affiliation is matched up with a random age from the sample in order to break up any association between the two. +We can `generate()` an approximation of the null distribution using randomization. The randomization approach permutes the response and explanatory variables, so that each person's party affiliation is matched up with a random age from the sample in order to break up any association between the two. ```{r generate-null-f, warning = FALSE, message = FALSE} # generate the null distribution using randomization @@ -116,7 +116,7 @@ p_value Thus, if there were really no relationship between age and political party affiliation, our approximation of the probability that we would see a statistic as or more extreme than `r observed_f_statistic` is approximately `r p_value`. -To calculate the p-value using the true $F$ distribution, we can use the `pf` function from base R. This function allows us to situate the test statistic we calculated previously in the $F$ distribution with the appropriate degrees of freedom. +To calculate the p-value using the true $F$ distribution, we can use the `pf()` function from base R. This function allows us to situate the test statistic we calculated previously in the $F$ distribution with the appropriate degrees of freedom. 
```{r} pf(observed_f_statistic$stat, 3, 496, lower.tail = FALSE) diff --git a/vignettes/chi_squared.Rmd b/vignettes/chi_squared.Rmd index 38ea8fe4..5dfb9f49 100644 --- a/vignettes/chi_squared.Rmd +++ b/vignettes/chi_squared.Rmd @@ -21,9 +21,9 @@ library(infer) ### Introduction -In this vignette, we'll walk through conducting a $\chi^2$ (chi-squared) test of independence and a chi-squared goodness of fit test using `infer`. We'll start out with a chi-squared test of independence, which can be used to test the association between two categorical variables. Then, we'll move on to a chi-squared goodness of fit test, which tests how well the distribution of one categorical variable can be approximated by some theoretical distribution. +In this vignette, we'll walk through conducting a $\chi^2$ (chi-squared) test of independence and a chi-squared goodness of fit test using infer. We'll start out with a chi-squared test of independence, which can be used to test the association between two categorical variables. Then, we'll move on to a chi-squared goodness of fit test, which tests how well the distribution of one categorical variable can be approximated by some theoretical distribution. -Throughout this vignette, we'll make use of the `gss` dataset supplied by `infer`, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: +Throughout this vignette, we'll make use of the `gss` dataset supplied by infer, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: ```{r glimpse-gss-actual, warning = FALSE, message = FALSE} dplyr::glimpse(gss) @@ -41,10 +41,14 @@ gss %>% ggplot2::aes(x = finrela, fill = college) + ggplot2::geom_bar(position = "fill") + ggplot2::scale_fill_brewer(type = "qual") + - ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 45, - vjust = .5)) + - ggplot2::labs(x = "finrela: Self-Identification of Income Class", - y = "Proportion") + ggplot2::theme(axis.text.x = ggplot2::element_text( + angle = 45, + vjust = .5 + )) + + ggplot2::labs( + x = "finrela: Self-Identification of Income Class", + y = "Proportion" + ) ``` If there were no relationship, we would expect to see the purple bars reaching to the same height, regardless of income class. Are the differences we see here, though, just due to random noise? @@ -61,7 +65,7 @@ observed_indep_statistic <- gss %>% The observed $\chi^2$ statistic is `r observed_indep_statistic`. Now, we want to compare this statistic to a null distribution, generated under the assumption that these variables are not actually related, to get a sense of how likely it would be for us to see this observed statistic if there were actually no association between education and income. 
-We can `generate` the null distribution in one of two ways---using randomization or theory-based methods. The randomization approach approximates the null distribution by permuting the response and explanatory variables, so that each person's educational attainment is matched up with a random income from the sample in order to break up any association between the two. +We can `generate()` the null distribution in one of two ways---using randomization or theory-based methods. The randomization approach approximates the null distribution by permuting the response and explanatory variables, so that each person's educational attainment is matched up with a random income from the sample in order to break up any association between the two. ```{r generate-null-indep, warning = FALSE, message = FALSE} # generate the null distribution using randomization @@ -86,9 +90,10 @@ To get a sense for what these distributions look like, and where our observed st ```{r visualize-indep, warning = FALSE, message = FALSE} # visualize the null distribution and test statistic! null_dist_sim %>% - visualize() + + visualize() + shade_p_value(observed_indep_statistic, - direction = "greater") + direction = "greater" + ) ``` We could also visualize the observed statistic against the theoretical null distribution. To do so, use the `assume()` verb to define a theoretical null distribution and then pass it to `visualize()` like a null distribution outputted from `generate()` and `calculate()`. @@ -98,9 +103,10 @@ We could also visualize the observed statistic against the theoretical null dist gss %>% specify(college ~ finrela) %>% assume(distribution = "Chisq") %>% - visualize() + + visualize() + shade_p_value(observed_indep_statistic, - direction = "greater") + direction = "greater" + ) ``` To visualize both the randomization-based and theoretical null distributions to get a sense of how the two relate, we can pipe the randomization-based null distribution into `visualize()`, and further provide `method = "both"`. @@ -108,9 +114,10 @@ To visualize both the randomization-based and theoretical null distributions to ```{r visualize-indep-both, warning = FALSE, message = FALSE} # visualize both null distributions and the test statistic! null_dist_sim %>% - visualize(method = "both") + + visualize(method = "both") + shade_p_value(observed_indep_statistic, - direction = "greater") + direction = "greater" + ) ``` Either way, it looks like our observed test statistic would be quite unlikely if there were actually no association between education and income. 
-More exactly, we can approximate the p-value with `get_p_value`: +More exactly, we can approximate the p-value with `get_p_value()`: @@ -118,8 +125,10 @@ Either way, it looks like our observed test statistic would be quite unlikely if ```{r p-value-indep, warning = FALSE, message = FALSE} # calculate the p value from the observed statistic and null distribution p_value_independence <- null_dist_sim %>% -  get_p_value(obs_stat = observed_indep_statistic, -              direction = "greater") +  get_p_value( +    obs_stat = observed_indep_statistic, +    direction = "greater" +  ) p_value_independence ``` @@ -149,8 +158,10 @@ gss %>% ggplot2::aes(x = finrela) + ggplot2::geom_bar() + ggplot2::geom_hline(yintercept = 466.3, col = "red") + -  ggplot2::labs(x = "finrela: Self-Identification of Income Class", -               y = "Number of Responses") +  ggplot2::labs( +    x = "finrela: Self-Identification of Income Class", +    y = "Number of Responses" +  ) ``` -It seems like a uniform distribution may not be the most appropriate description of the data--many more people describe their income as average than than any of the other options. Lets now test whether this difference in distributions is statistically significant. +It seems like a uniform distribution may not be the most appropriate description of the data--many more people describe their income as average than any of the other options. Let's now test whether this difference in distributions is statistically significant. @@ -161,13 +172,17 @@ First, to carry out this hypothesis test, we would calculate our observed statis # calculating the null distribution observed_gof_statistic <- gss %>% specify(response = finrela) %>% -  hypothesize(null = "point", -              p = c("far below average" = 1/6, -                    "below average" = 1/6, -                    "average" = 1/6, -                    "above average" = 1/6, -                    "far above average" = 1/6, -                    "DK" = 1/6)) %>% +  hypothesize( +    null = "point", +    p = c( +      "far below average" = 1 / 6, +      "below average" = 1 / 6, +      "average" = 1 / 6, +      "above average" = 1 / 6, +      "far above average" = 1 / 6, +      "DK" = 1 / 6 +    ) +  ) %>% calculate(stat = "Chisq") ``` @@ -178,13 +193,17 @@ The observed statistic is `r observed_gof_statistic`. Now, generating a null dis # generating a null distribution, assuming each income class is equally likely null_dist_gof <- gss %>% specify(response = finrela) %>% -  hypothesize(null = "point", -              p = c("far below average" = 1/6, -                    "below average" = 1/6, -                    "average" = 1/6, -                    "above average" = 1/6, -                    "far above average" = 1/6, -                    "DK" = 1/6)) %>% +  hypothesize( +    null = "point", +    p = c( +      "far below average" = 1 / 6, +      "below average" = 1 / 6, +      "average" = 1 / 6, +      "above average" = 1 / 6, +      "far above average" = 1 / 6, +      "DK" = 1 / 6 +    ) +  ) %>% generate(reps = 1000, type = "draw") %>% calculate(stat = "Chisq") ``` @@ -194,9 +213,10 @@ Again, to get a sense for what these distributions look like, and where our obse ```{r visualize-indep-gof, warning = FALSE, message = FALSE} # visualize the null distribution and test statistic! null_dist_gof %>% -  visualize() + +  visualize() + shade_p_value(observed_gof_statistic, -                direction = "greater") +    direction = "greater" +  ) ``` This statistic seems like it would be quite unlikely if income class self-identification actually followed a uniform distribution! How unlikely, though? 
Calculating the p-value: @@ -204,8 +224,10 @@ This statistic seems like it would be quite unlikely if income class self-identi ```{r get-p-value-gof, warning = FALSE, message = FALSE} # calculate the p-value p_value_gof <- null_dist_gof %>% -  get_p_value(observed_gof_statistic, -              direction = "greater") +  get_p_value( +    observed_gof_statistic, +    direction = "greater" +  ) p_value_gof ``` @@ -218,17 +240,21 @@ To calculate the p-value using the true $\chi^2$ distribution, we can use the `p pchisq(observed_gof_statistic$stat, 5, lower.tail = FALSE) ``` -Again, equivalently to the theory-based approach shown above, the package supplies a wrapper function, `chisq_test`, to carry out Chi-Squared goodness of fit tests on tidy data. The syntax goes like this: +Again, equivalently to the theory-based approach shown above, the package supplies a wrapper function, `chisq_test()`, to carry out chi-squared goodness of fit tests on tidy data. The syntax goes like this: ```{r chisq-gof-wrapper, message = FALSE, warning = FALSE} -chisq_test(gss, -           response = finrela, -           p = c("far below average" = 1/6, -                 "below average" = 1/6, -                 "average" = 1/6, -                 "above average" = 1/6, -                 "far above average" = 1/6, -                 "DK" = 1/6)) +chisq_test( +  gss, +  response = finrela, +  p = c( +    "far below average" = 1 / 6, +    "below average" = 1 / 6, +    "average" = 1 / 6, +    "above average" = 1 / 6, +    "far above average" = 1 / 6, +    "DK" = 1 / 6 +  ) +) ``` diff --git a/vignettes/infer.Rmd b/vignettes/infer.Rmd index f87cb048..74df0de9 100644 --- a/vignettes/infer.Rmd +++ b/vignettes/infer.Rmd @@ -17,7 +17,7 @@ options(digits = 4) ### Introduction -`infer` implements an expressive grammar to perform statistical inference that coheres with the `tidyverse` design framework. Rather than providing methods for specific statistical tests, this package consolidates the principles that are shared among common hypothesis tests into a set of 4 main verbs (functions), supplemented with many utilities to visualize and extract value from their outputs. +infer implements an expressive grammar to perform statistical inference that coheres with the tidyverse design framework. Rather than providing methods for specific statistical tests, this package consolidates the principles that are shared among common hypothesis tests into a set of 4 main verbs (functions), supplemented with many utilities to visualize and extract value from their outputs. Regardless of which hypothesis test we're using, we're still asking the same kind of question: is the effect/difference in our observed data real, or due to chance? To answer this question, we start by assuming that the observed data came from some world where "nothing is going on" (i.e. the observed effect was simply due to random chance), and call this assumption our *null hypothesis*. (In reality, we might not believe in the null hypothesis at all---the null hypothesis is in opposition to the *alternate hypothesis*, which supposes that the effect present in the observed data is actually due to the fact that "something is going on.") We then calculate a *test statistic* from our data that describes the observed effect. We can use this test statistic to calculate a *p-value*, giving the probability that our observed data could come about if the null hypothesis was true. If this probability is below some pre-defined *significance level* $\alpha$, then we can reject our null hypothesis. 
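As a toy illustration of the p-value logic in the paragraph above (the numbers here are invented for the example, not taken from the vignette), a simulation-based p-value is just the share of statistics simulated under the null that are at least as extreme as the observed one:

```r
set.seed(1)

# stand-in values: 1000 statistics simulated under the null, and an
# observed statistic of 2.1
null_stats <- rnorm(1000)
observed <- 2.1

# two-sided p-value estimate: proportion of null draws as (or more) extreme
mean(abs(null_stats) >= abs(observed))
```

This closely mirrors what `get_p_value()` computes from the `replicate`/`stat` tibbles produced by the verbs described below.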
@@ -48,14 +48,14 @@ Each row is an individual survey response, containing some basic demographic inf ### specify(): Specifying Response (and Explanatory) Variables -The `specify` function can be used to specify which of the variables in the dataset you're interested in. If you're only interested in, say, the `age` of the respondents, you might write: +The `specify()` function can be used to specify which of the variables in the dataset you're interested in. If you're only interested in, say, the `age` of the respondents, you might write: ```{r specify-example, warning = FALSE, message = FALSE} gss %>% specify(response = age) ``` -On the front-end, the output of `specify` just looks like it selects off the columns in the dataframe that you've specified. Checking the class of this object, though: +On the front-end, the output of `specify()` just looks like it selects off the columns in the dataframe that you've specified. Checking the class of this object, though: ```{r specify-one, warning = FALSE, message = FALSE} gss %>% @@ -65,7 +65,7 @@ gss %>% We can see that the `infer` class has been appended on top of the dataframe classes--this new class stores some extra metadata. -If you're interested in two variables--`age` and `partyid`, for example--you can `specify` their relationship in one of two (equivalent) ways: +If you're interested in two variables--`age` and `partyid`, for example--you can `specify()` their relationship in one of two (equivalent) ways: ```{r specify-two, warning = FALSE, message = FALSE} # as a formula @@ -87,7 +87,7 @@ gss %>% ### hypothesize(): Declaring the Null Hypothesis -The next step in the `infer` pipeline is often to declare a null hypothesis using `hypothesize()`. The first step is to supply one of "independence" or "point" to the `null` argument. If your null hypothesis assumes independence between two variables, then this is all you need to supply to `hypothesize()`: +The next step in the infer pipeline is often to declare a null hypothesis using `hypothesize()`. The first step is to supply one of "independence" or "point" to the `null` argument. If your null hypothesis assumes independence between two variables, then this is all you need to supply to `hypothesize()`: ```{r hypothesize-independence, warning = FALSE, message = FALSE} gss %>% @@ -103,7 +103,7 @@ gss %>% hypothesize(null = "point", mu = 40) ``` -Again, from the front-end, the dataframe outputted from `hypothesize()` looks almost exactly the same as it did when it came out of `specify()`, but `infer` now "knows" your null hypothesis. +Again, from the front-end, the dataframe outputted from `hypothesize()` looks almost exactly the same as it did when it came out of `specify()`, but infer now "knows" your null hypothesis. ### generate(): Generating the Null Distribution @@ -161,7 +161,7 @@ gss %>% ### Other Utilities -`infer` also offers several utilities to extract the meaning out of summary statistics and distributions---the package provides functions to visualize where a statistic is relative to a distribution (with `visualize()`), calculate p-values (with `get_p_value()`), and calculate confidence intervals (with `get_confidence_interval()`). +infer also offers several utilities to extract the meaning out of summary statistics and distributions---the package provides functions to visualize where a statistic is relative to a distribution (with `visualize()`), calculate p-values (with `get_p_value()`), and calculate confidence intervals (with `get_confidence_interval()`). 
To illustrate, we'll go back to the example of determining whether the mean number of hours worked per week is 40 hours. @@ -196,7 +196,7 @@ null_dist %>% shade_p_value(obs_stat = obs_mean, direction = "two-sided") ``` -Notice that `infer` has also shaded the regions of the null distribution that are as (or more) extreme than our observed statistic. (Also, note that we now use the `+` operator to apply the `shade_p_value` function. This is because `visualize` outputs a plot object from `ggplot2` instead of a data frame, and the `+` operator is needed to add the p-value layer to the plot object.) The red bar looks like it's slightly far out on the right tail of the null distribution, so observing a sample mean of `r obs_mean` hours would be somewhat unlikely if the mean was actually 40 hours. How unlikely, though? +Notice that infer has also shaded the regions of the null distribution that are as (or more) extreme than our observed statistic. (Also, note that we now use the `+` operator to apply the `shade_p_value()` function. This is because `visualize()` outputs a plot object from ggplot2 instead of a data frame, and the `+` operator is needed to add the p-value layer to the plot object.) The red bar looks like it's slightly far out on the right tail of the null distribution, so observing a sample mean of `r obs_mean` hours would be somewhat unlikely if the mean was actually 40 hours. How unlikely, though? ```{r get_p_value, warning = FALSE, message = FALSE} # get a two-tailed p-value @@ -221,11 +221,13 @@ boot_dist <- gss %>% # start with the bootstrap distribution ci <- boot_dist %>% # calculate the confidence interval around the point estimate -  get_confidence_interval(point_estimate = obs_mean, -                          # at the 95% confidence level -                          level = .95, -                          # using the standard error -                          type = "se") +  get_confidence_interval( +    point_estimate = obs_mean, +    # at the 95% confidence level +    level = .95, +    # using the standard error +    type = "se" +  ) ci ``` @@ -240,7 +242,7 @@ boot_dist %>% ### Theoretical Methods -{infer} also provides functionality to use theoretical methods for `"Chisq"`, `"F"`, `"t"` and `"z"` distributions. +infer also provides functionality to use theoretical methods for `"Chisq"`, `"F"`, `"t"`, and `"z"` distributions. Generally, to find a null distribution using theory-based methods, use the same code that you would use to find the observed statistic elsewhere, replacing calls to `calculate()` with `assume()`. For example, to calculate the observed $t$ statistic (a standardized mean): @@ -306,7 +308,7 @@ observed_fit <- gss %>% fit() ``` -Now, to generate null distributions for each of these terms, we can fit 1000 models to resamples of the `gss` dataset, where the response `hours` is permuted in each. Note that this code is the same as the above except for the addition of the `hypothesize` and `generate` step. +Now, to generate null distributions for each of these terms, we can fit 1000 models to resamples of the `gss` dataset, where the response `hours` is permuted in each. Note that this code is the same as the above except for the addition of the `hypothesize()` and `generate()` steps. 
```{r} null_fits <- gss %>% diff --git a/vignettes/observed_stat_examples.Rmd b/vignettes/observed_stat_examples.Rmd index bff1d82d..a8dabe58 100644 --- a/vignettes/observed_stat_examples.Rmd +++ b/vignettes/observed_stat_examples.Rmd @@ -19,9 +19,9 @@ knitr::opts_chunk$set(fig.width = 6, fig.height = 4.5, options(digits = 4) ``` -This vignette is intended to provide a set of examples that nearly exhaustively demonstrate the functionalities provided by `infer`. Commentary on these examples is limited---for more discussion of the intuition behind the package, see the "Getting to Know infer" vignette, accessible by calling `vignette("infer")`. +This vignette is intended to provide a set of examples that nearly exhaustively demonstrate the functionalities provided by infer. Commentary on these examples is limited---for more discussion of the intuition behind the package, see the "Getting to Know infer" vignette, accessible by calling `vignette("infer")`. -Throughout this vignette, we'll make use of the `gss` dataset supplied by `infer`, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: +Throughout this vignette, we'll make use of the `gss` dataset supplied by infer, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: ```{r load-packages, echo = FALSE} library(dplyr) @@ -145,7 +145,7 @@ null_dist %>% get_p_value(obs_stat = t_bar, direction = "two-sided") ``` -Alternatively, using the `t_test` wrapper: +Alternatively, using the `t_test()` wrapper: ```{r} gss %>% @@ -353,7 +353,7 @@ null_dist %>% get_p_value(obs_stat = p_hat, direction = "two-sided") ``` -The package also supplies a wrapper around `prop.test` for tests of a single proportion on tidy data. +The package also supplies a wrapper around `prop.test()` for tests of a single proportion on tidy data. ```{r prop_test_1_grp} prop_test(gss, @@ -361,7 +361,7 @@ prop_test(gss, p = .2) ``` -`infer` does not support testing two means via the `z` distribution. +infer does not support testing two means via the `z` distribution. 
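Relatedly, for the two-group comparisons of proportions coming up in the next section, `prop_test()` also covers the two-sample case as a theory-based counterpart to the permutation workflow. A hedged sketch, with argument choices assumed to mirror the examples that follow:

```r
# z-based test for a difference in proportions between females and males,
# analogous to the simulation-based "diff in props" workflow below
prop_test(gss,
          college ~ sex,
          order = c("female", "male"))
```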
### Two categorical (2 level) variables @@ -378,9 +381,12 @@ d_hat <- gss %>% Alternatively, using the `observe()` wrapper to calculate the observed statistic, ```{r} -d_hat <- gss %>% -  observe(college ~ sex, success = "no degree", -         stat = "diff in props", order = c("female", "male")) +d_hat <- gss %>% +  observe( +    college ~ sex, +    success = "no degree", +    stat = "diff in props", order = c("female", "male") +  ) ``` Then, generating the null distribution, @@ -388,8 +391,8 @@ ```{r} null_dist <- gss %>% specify(college ~ sex, success = "no degree") %>% -  hypothesize(null = "independence") %>% -  generate(reps = 1000) %>% +  hypothesize(null = "independence") %>% +  generate(reps = 1000) %>% calculate(stat = "diff in props", order = c("female", "male")) ``` @@ -407,7 +410,7 @@ null_dist %>% get_p_value(obs_stat = d_hat, direction = "two-sided") -`infer` also provides functionality to calculate ratios of proportions. The workflow looks similar to that for `diff in props`. +infer also provides functionality to calculate ratios of proportions. The workflow looks similar to that for `diff in props`. Calculating the observed statistic, @@ -569,13 +572,17 @@ Note the need to add in the hypothesized values here to compute the observed sta ```{r} Chisq_hat <- gss %>% specify(response = finrela) %>% -  hypothesize(null = "point", -              p = c("far below average" = 1/6, -                    "below average" = 1/6, -                    "average" = 1/6, -                    "above average" = 1/6, -                    "far above average" = 1/6, -                    "DK" = 1/6)) %>% +  hypothesize( +    null = "point", +    p = c( +      "far below average" = 1 / 6, +      "below average" = 1 / 6, +      "average" = 1 / 6, +      "above average" = 1 / 6, +      "far above average" = 1 / 6, +      "DK" = 1 / 6 +    ) +  ) %>% calculate(stat = "Chisq") ``` @@ -583,15 +590,19 @@ Alternatively, using the `observe()` wrapper to calculate the observed statistic ```{r} Chisq_hat <- gss %>% -  observe(response = finrela, -          null = "point", -          p = c("far below average" = 1/6, -                "below average" = 1/6, -                "average" = 1/6, -                "above average" = 1/6, -                "far above average" = 1/6, -                "DK" = 1/6), -          stat = "Chisq") +  observe( +    response = finrela, +    null = "point", +    p = c( +      "far below average" = 1 / 6, +      "below average" = 1 / 6, +      "average" = 1 / 6, +      "above average" = 1 / 6, +      "far above average" = 1 / 6, +      "DK" = 1 / 6 +    ), +    stat = "Chisq" +  ) ``` Then, generating the null distribution, @@ -599,13 +610,17 @@ ```{r} null_dist <- gss %>% specify(response = finrela) %>% -  hypothesize(null = "point", -              p = c("far below average" = 1/6, -                    "below average" = 1/6, -                    "average" = 1/6, -                    "above average" = 1/6, -                    "far above average" = 1/6, -                    "DK" = 1/6)) %>% +  hypothesize( +    null = "point", +    p = c( +      "far below average" = 1 / 6, +      "below average" = 1 / 6, +      "average" = 1 / 6, +      "above average" = 1 / 6, +      "far above average" = 1 / 6, +      "DK" = 1 / 6 +    ) +  ) %>% generate(reps = 1000, type = "draw") %>% calculate(stat = "Chisq") ``` @@ -651,14 +666,18 @@ null_dist %>% -Alternatively, using the `chisq_test` wrapper: +Alternatively, using the `chisq_test()` wrapper: ```{r} -chisq_test(gss, -           response = finrela, -           p = c("far below average" = 1/6, -                 "below average" = 1/6, -                 "average" = 1/6, -                 "above average" = 1/6, -                 "far above average" = 1/6, -                 "DK" = 1/6)) +chisq_test( +  gss, +  response = finrela, +  p = c( +    "far below average" = 1 / 6, +    "below average" = 1 / 6, +    "average" = 1 / 6, +    "above average" = 1 / 6, +    "far above average" = 1 / 6, +    "DK" = 1 / 6 +  ) +) ``` ### Two categorical (\>2 level): Chi-squared test of independence @@ -1154,7 
@@ visualize(sampling_dist) + shade_confidence_interval(endpoints = theor_ci) ``` -Note that the `t` distribution is recentered and rescaled to lie on the scale of the observed data. `infer` does not support confidence intervals on means via the `z` distribution. +Note that the `t` distribution is recentered and rescaled to lie on the scale of the observed data. infer does not support confidence intervals on means via the `z` distribution. ### One numerical (one mean - standardized) @@ -1208,7 +1227,7 @@ visualize(boot_dist) + shade_confidence_interval(endpoints = standard_error_ci) ``` -See the above subsection (one mean) for a theory-based approach. Note that `infer` does not support confidence intervals on means via the `z` distribution. +See the above subsection (one mean) for a theory-based approach. Note that infer does not support confidence intervals on means via the `z` distribution. ### One categorical (one proportion) diff --git a/vignettes/paired.Rmd b/vignettes/paired.Rmd index 36a05055..f247126a 100644 --- a/vignettes/paired.Rmd +++ b/vignettes/paired.Rmd @@ -23,7 +23,7 @@ library(infer) In this vignette, we'll walk through conducting a randomization-based paired test of independence with infer. -Throughout this vignette, we'll make use of the `gss` dataset supplied by `infer`, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: +Throughout this vignette, we'll make use of the `gss` dataset supplied by infer, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: ```{r glimpse-gss-actual, warning = FALSE, message = FALSE} dplyr::glimpse(gss) @@ -56,8 +56,10 @@ gss_paired %>% ggplot2::ggplot() + ggplot2::aes(x = diff) + ggplot2::geom_histogram(bins = diff(range(unique_diff))) + - ggplot2::labs(x = "diff: Difference in Number of Hours Worked", - y = "Number of Responses") + + ggplot2::labs( + x = "diff: Difference in Number of Hours Worked", + y = "Number of Responses" + ) + ggplot2::scale_x_continuous(breaks = c(range(unique_diff), 0)) ``` diff --git a/vignettes/t_test.Rmd b/vignettes/t_test.Rmd index 8ced9211..5ba6f05f 100644 --- a/vignettes/t_test.Rmd +++ b/vignettes/t_test.Rmd @@ -21,9 +21,9 @@ library(infer) ### Introduction -In this vignette, we'll walk through conducting $t$-tests and their randomization-based analogue using `infer`. We'll start out with a 1-sample $t$-test, which compares a sample mean to a hypothesized true mean value. Then, we'll discuss paired $t$-tests, which are a special use case of 1-sample $t$-tests, and evaluate whether differences in paired values (e.g. some measure taken of a person before and after an experiment) differ from 0. 
Finally, we'll wrap up with 2-sample $t$-tests, testing the difference in means of two populations using a sample of data drawn from them. +In this vignette, we'll walk through conducting $t$-tests and their randomization-based analogue using infer. We'll start out with a 1-sample $t$-test, which compares a sample mean to a hypothesized true mean value. Then, we'll discuss paired $t$-tests, which are a special use case of 1-sample $t$-tests, and evaluate whether differences in paired values (e.g., some measure taken of a person before and after an experiment) differ from 0. Finally, we'll wrap up with 2-sample $t$-tests, testing the difference in means of two populations using a sample of data drawn from them. -Throughout this vignette, we'll make use of the `gss` dataset supplied by `infer`, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: +Throughout this vignette, we'll make use of the `gss` dataset supplied by infer, which contains a sample of data from the General Social Survey. See `?gss` for more information on the variables included and their source. Note that this data (and our examples on it) are for demonstration purposes only, and will not necessarily provide accurate estimates unless weighted properly. For these examples, let's suppose that this dataset is a representative sample of a population we want to learn about: American adults. The data looks like this: ```{r glimpse-gss-actual, warning = FALSE, message = FALSE} dplyr::glimpse(gss) @@ -40,8 +40,10 @@ gss %>% ggplot2::ggplot() + ggplot2::aes(x = hours) + ggplot2::geom_histogram(bins = 20) + -  ggplot2::labs(x = "hours: Number of Hours Worked", -       y = "Number of Responses") + +  ggplot2::labs( +    x = "hours: Number of Hours Worked", +    y = "Number of Responses" +  ) + ggplot2::scale_x_continuous(breaks = seq(0, 90, 10)) ``` @@ -60,7 +62,7 @@ observed_statistic <- gss %>% The observed statistic is `r observed_statistic`. Now, we want to compare this statistic to a null distribution, generated under the assumption that the mean was actually 40, to get a sense of how likely it would be for us to see this observed mean if the true number of hours worked per week in the population was really 40. -We can `generate` the null distribution using the bootstrap. In the bootstrap, for each replicate, a sample of size equal to the input sample size is drawn (with replacement) from the input sample data. This allows us to get a sense of how much variability we'd expect to see in the entire population so that we can then understand how unlikely our sample mean would be. +We can `generate()` the null distribution using the bootstrap. In the bootstrap, for each replicate, a sample of size equal to the input sample size is drawn (with replacement) from the input sample data. This allows us to get a sense of how much variability we'd expect to see in the entire population so that we can then understand how unlikely our sample mean would be. 
```{r generate-null-1-sample, warning = FALSE, message = FALSE} # generate the null distribution @@ -76,9 +78,10 @@ To get a sense for what these distributions look like, and where our observed st ```{r visualize-1-sample, warning = FALSE, message = FALSE} # visualize the null distribution and test statistic! null_dist_1_sample %>% - visualize() + + visualize() + shade_p_value(observed_statistic, - direction = "two-sided") + direction = "two-sided" + ) ``` It looks like our observed mean of `r observed_statistic` would be relatively unlikely if the true mean was actually 40 hours a week. More exactly, we can calculate the p-value: @@ -157,7 +160,7 @@ The `order` argument in that `calculate` line gives the order to subtract the me Now, we want to compare this difference in means to a null distribution, generated under the assumption that the number of hours worked a week has no relationship with whether or not one has a college degree, to get a sense of how likely it would be for us to see this observed difference in means if there were really no relationship between these two variables. -We can `generate` the null distribution using permutation, where, for each replicate, each value of degree status will be randomly reassigned (without replacement) to a new number of hours worked per week in the sample in order to break any association between the two. +We can `generate()` the null distribution using permutation, where, for each replicate, each value of degree status will be randomly reassigned (without replacement) to a new number of hours worked per week in the sample in order to break any association between the two. ```{r generate-null-2-sample, warning = FALSE, message = FALSE} # generate the null distribution with randomization @@ -194,7 +197,7 @@ p_value_2_sample Thus, if there were really no relationship between the number of hours worked a week and whether one has a college degree, the probability that we would see a statistic as or more extreme than `r observed_statistic` is approximately `r p_value_2_sample`. -Note that, similarly to the steps shown above, the package supplies a wrapper function, `t_test`, to carry out 2-sample $t$-tests on tidy data. The syntax looks like this: +Note that, similarly to the steps shown above, the package supplies a wrapper function, `t_test()`, to carry out 2-sample $t$-tests on tidy data. The syntax looks like this: ```{r 2-sample-t-test-wrapper, message = FALSE, warning = FALSE} t_test(x = gss, @@ -220,7 +223,7 @@ observed_statistic Note that this pipeline to calculate an observed statistic includes `hypothesize()` since the $t$ statistic requires a hypothesized mean value. -Then, juxtaposing that $t$ statistic with its associated distribution using the `pt` function: +Then, juxtaposing that $t$ statistic with its associated distribution using the `pt()` function: ```{r} pt(unname(observed_statistic), df = nrow(gss) - 2, lower.tail = FALSE)*2
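For reference, the doubled `pt()` tail probability above corresponds to the two-sided result that the `t_test()` wrapper reports directly. A hedged sketch, with arguments assumed to match the two-sample example earlier in this vignette; the reported p-value may differ slightly from the doubled `pt()` value because `t_test()` wraps `t.test()`, which defaults to Welch degrees of freedom rather than the pooled `nrow(gss) - 2`:

```r
# wrapper form of the same two-sample comparison
t_test(x = gss,
       formula = hours ~ college,
       order = c("degree", "no degree"),
       alternative = "two-sided")
```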