update final key to a new problem set for #3 (t-test)
bleds22e committed Nov 29, 2023
1 parent 40be17b commit 55c8c3a
Showing 3 changed files with 48 additions and 23 deletions.
71 changes: 48 additions & 23 deletions modules/final_project/FinalProject_key.Rmd
@@ -9,7 +9,7 @@ output: pdf_document
knitr::opts_chunk$set(echo = TRUE)
```

# Final Project Details
# Final Details

### Purpose

@@ -33,9 +33,9 @@ Write R code which produces the correct data, summaries, plots and analyses. Cor

Dec 11 at midnight MST

# Final Project
# Final

For your final "project" this semester, I am presenting you with 3 problem sets, totaling up to 60 points.
For your final this semester, I am presenting you with 3 problem sets, totaling up to 60 points.

I'm expecting you to be able to filter and summarize the data in ways you need, choose the appropriate visualization, choose the appropriate analysis, and correctly interpret the analysis for the question I've asked you.

@@ -69,6 +69,8 @@ Once you've run this line of code, you should see the `penguins` data frame pop
penguins <- penguins %>% drop_na()
```

![](lter_penguins.png){width="50%"}

------------------------------------------------------------------------

## Structure & Guidelines
@@ -313,29 +315,52 @@ summary(lm(data = biscoe, bill_depth_mm ~ bill_length_mm * species))

## Problem Set 3 (15 points)

#### Question: Is there a difference in the average flipper length between penguin species on Dream island?
#### Question: Is there a difference in the average flipper length between male and female Chinstrap penguins?

1. Our first step is to create a new data frame that includes only the individuals found on Biscoe island. Call this new data frame `dream`. (1 point)
1. Our first step is to create a new data frame that includes only Chinstrap penguins. Call this new data frame `chinstrap`. (1 point)

```{r}
dream <- penguins %>%
  filter(island == "Dream")
chinstrap <- penguins %>%
  filter(species == "Chinstrap")
```

We will be using the `dream` data frame for the rest of this problem set.
We will be using the `chinstrap` data frame for the rest of this problem set.

3. Let's summarize our data. Calculate one measure of central tendancy and one (complete) measure of variability of the flipper length column for *each* species. (2 points)
2. Let's summarize our data. Calculate one measure of central tendency and one (complete) measure of variability of the flipper length column for *each* sex: male and female. Save this dataframe as `chinstrap_summary`. (2 points)

```{r}
dream %>%
  group_by(species) %>%
chinstrap_summary <- chinstrap %>%
  group_by(sex) %>%
  summarise(mean_flipper = mean(flipper_length_mm),
            sd_flipper = sd(flipper_length_mm))
```

Ok, we have summarized the flipper length data for our two groups.
3. What if we wanted our summary data in centimeters instead of millimeters?

<!-- -->

a. First, create a function that will convert a number from millimeters to centimeters. (1 point)

```{r}
mm_to_cm <- function(mm) {
  cm <- mm / 10
  return(cm)
}
```
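
A quick check of the conversion, as a sketch: the input values below are made up for illustration and are not taken from the `penguins` data. Because division in R is vectorized, the same function works on a whole column, which is what part b relies on.

```r
# Same definition as in the key above, repeated so this chunk runs on its own
mm_to_cm <- function(mm) {
  cm <- mm / 10
  return(cm)
}

# Hypothetical flipper lengths in millimeters (illustrative values only)
example_mm <- c(180, 195, 210)

# Division is vectorized, so the whole vector is converted in one call
example_cm <- mm_to_cm(example_mm)
example_cm
```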

b. Now, use the same code from question 2 above, but add one line (in the correct location) that uses your newly created `mm_to_cm` function and produces the same data frame with the summary values in cm instead of mm. Make sure to edit the summary functions accordingly as well. (1 point)

```{r}
chinstrap_summary <- chinstrap %>%
  mutate(flipper_length_cm = mm_to_cm(flipper_length_mm)) %>%
  group_by(sex) %>%
  summarise(mean_flipper = mean(flipper_length_cm),
            sd_flipper = sd(flipper_length_cm))
```

Ok, we have summarized the flipper length data for our two groups! Let's get back to the `chinstrap` dataframe (not the summary data frame), and keep working.

4. Determine which variable is dependent and which is independent. Also determine if each variable is continuous or categorical. (2 points)
4. Determine which variable is dependent and which is independent. Also determine if each variable is continuous or categorical. (1 point)

- **flipper length**: dependent, continuous
- **sex**: independent, categorical
@@ -345,40 +370,40 @@ Let's plot the body mass data for the two groups.
5. Choose an appropriate plot for data with one continuous variable and one categorical variable (there are a few options). Be sure to adjust the x- and y-axis labels appropriately. (2 points)

```{r}
ggplot(dream, aes(species, flipper_length_mm)) +
ggplot(chinstrap, aes(sex, flipper_length_mm)) +
  geom_boxplot() +
  geom_jitter(width = 0.1, alpha = 0.5) +
  labs(x = "Species",
  labs(x = "Sex",
       y = "Flipper Length (mm)") +
  theme_bw()
# OR
ggplot(dream, aes(flipper_length_mm, fill = species)) +
ggplot(chinstrap, aes(flipper_length_mm, fill = sex)) +
  geom_histogram(alpha = 0.5, position = "identity") +
  labs(x = "Flipper Length (mm)",
       y = "Frequency",
       fill = "Species") +
       fill = "Sex") +
  theme_bw()
# OR
ggplot(dream, aes(flipper_length_mm, fill = species)) +
ggplot(chinstrap, aes(flipper_length_mm, fill = sex)) +
  geom_density(alpha = 0.5) +
  labs(x = "Flipper Length (mm)",
       y = "Density",
       fill = "Species") +
       fill = "Sex") +
  theme_bw()
```

6. Write the pair of statistical hypotheses for our question. (2 points)
6. Write the pair of statistical hypotheses for our question. (1 point)

**Null**: no difference in the mean body mass between penguins with short flippers and long flippers **Alternative**: true difference in the mean body mass between penguins with short flippers and long flippers
**Null**: no difference in the mean flipper length between male and female chinstrap penguins **Alternative**: true difference in the mean flipper length between male and female chinstrap penguins
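
In symbols, the same pair of hypotheses can be written as follows, where $\mu_{\text{male}}$ and $\mu_{\text{female}}$ denote the mean flipper lengths of male and female Chinstrap penguins (this notation is introduced here for illustration and is not part of the original key):

$$
H_0: \mu_{\text{male}} = \mu_{\text{female}} \qquad H_A: \mu_{\text{male}} \neq \mu_{\text{female}}
$$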

7. Perform the appropriate analysis to compare the flipper lengths of each sex. (2 points)

```{r}
t.test(data = dream, flipper_length_mm ~ species)
t.test(data = chinstrap, flipper_length_mm ~ sex)
```
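
For the interpretation step, it can help to store the test result and pull out individual pieces. The following is a minimal sketch using a small made-up data frame (the `example` name and its values are hypothetical, not the `penguins` data), so the numbers it produces are not the answer to this problem set. Note that `t.test()` runs Welch's version of the test by default, which does not assume equal variances.

```r
# Made-up measurements for two groups (illustrative only, not the penguins data)
example <- data.frame(
  sex = rep(c("female", "male"), each = 5),
  flipper_length_mm = c(188, 190, 192, 191, 189, 198, 201, 199, 202, 200)
)

# Formula interface: numeric response on the left, two-level grouping variable on the right
result <- t.test(flipper_length_mm ~ sex, data = example)

result$method     # which test was run
result$estimate   # the two group means
result$p.value    # p-value used to decide whether to reject the null
result$conf.int   # 95% confidence interval for the difference in means
```

Comparing `result$p.value` to the chosen significance level (commonly 0.05) gives the reject / fail-to-reject decision asked for in the next question.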

8. Interpret the results of this test. (2 points)
@@ -387,7 +412,7 @@ t.test(data = dream, flipper_length_mm ~ species)
- what does that significant difference mean?
- should we reject the null hypothesis?

*Answer: yes, there is a significant difference (p = 4.724 x 10\^-6); reject null*
*Answer: yes, there is a significant difference (p = 2.535 x 10\^-7); reject null*

9. Should we run pairwise comparisons? If no, explain why not. If yes, do so and interpret the results. (2 points)

Binary file modified modules/final_project/FinalProject_key.pdf
Binary file not shown.
File renamed without changes
