AlexsLemonade · cansavvy · Mar 4, 2021 · Mar 2, 2021 · Mar 2, 2021 · Mar 2, 2021
diff --git a/components/dictionary.txt b/components/dictionary.txt
@@ -1,3 +1,4 @@
+Adelie
 aes
 al
 Alboukadel

diff --git a/intro-to-R-tidyverse/01-intro_to_base_R.Rmd b/intro-to-R-tidyverse/01-intro_to_base_R.Rmd
@@ -383,12 +383,19 @@ question_values %in% values_1_to_20
 
 ## Data frames
 
-_Data frames are the most fundamental unit of data analysis in R._ 
+_Data frames are one of the most useful tools for data analysis in R._ 
 They are tables which consist of rows and columns, much like a _spreadsheet_. 
 Each column is a variable which behaves as a _vector_, and each row is an observation. 
-We will begin our exploration with the old trusted dataset `iris`, which comes with R. 
-Learn about this dataset using the standard help approach of `?iris`.
+We will begin our exploration with a dataset about penguins from the [`palmerpenguins` package](https://allisonhorst.github.io/palmerpenguins/). 
+To use this dataset, we will load it from the `palmerpenguins` package using `::` (more on this later) and assign it to a variable named `penguins` in our current environment.
 
+```{r penguin-library}
+penguins <- palmerpenguins::penguins
+```
+
+`penguins` is a data frame with measurements and information on penguins of three different species.
+
+![](diagrams/lter_penguins.png)
 ### Exploring data frames
 
 The first step to using any data is to look at it!!! 
@@ -407,54 +414,54 @@ We can additionally explore _overall properties_ of the data frame with two diff
 
 This provides summary statistics for each column:
 
-```{r iris-summary}
-summary(iris)
+```{r penguins-summary}
+summary(penguins)
 ```
 
 This provides a short view of the **str**ucture and contents of the data frame.
 
-```{r iris-str}
-str(iris)
+```{r penguins-str}
+str(penguins)
 ```
 
-You'll notice that the column `Species` is a _factor_: This is a special type of character variable that represents distinct categories known as "levels". 
-We have learned here that there are three levels in the `Species` column: setosa, versicolor, and virginica. 
+You'll notice that the column `species` is a _factor_: This is a special type of character variable that represents distinct categories known as "levels". 
+We have learned here that there are three levels in the `species` column: Adelie, Chinstrap, and Gentoo.
 We might want to explore individual columns of the data frame more in-depth. 
 We can examine individual columns using the dollar sign `$` to select one by name:
 
-```{r iris-subset}
-# Extract Sepal.Length as a vector
-iris$Sepal.Length
+```{r penguins-subset}
+# Extract bill_length_mm as a vector
+penguins$bill_length_mm
 
 # indexing operators can be used too
-iris$Sepal.Width[1:10]
+penguins$bill_depth_mm[1:10]
 ```
 
 We can perform our regular vector operations on columns directly.
 
-```{r iris-col-mean, live = TRUE}
-# calculate the mean of the Sepal.Length column
-mean(iris$Sepal.Length)
+```{r penguins-col-mean, live = TRUE}
+# calculate the mean of the bill_length_mm column, ignoring missing values (NA)
+mean(penguins$bill_length_mm, na.rm = TRUE)
 ```
 
 We can also calculate the full summary statistics for a single column directly. 
 
-```{r iris-col-summary, live = TRUE}
-# show a summary of the Sepal.Length column
-summary(iris$Sepal.Length)
+```{r penguins-col-summary, live = TRUE}
+# show a summary of the bill_length_mm column
+summary(penguins$bill_length_mm)
 ```
 
-Extract `Species` as a vector and subset it to see a preview.
+Extract `species` as a vector and subset it to see a preview.
 
-```{r iris-col-subset, live = TRUE}
+```{r penguins-col-subset, live = TRUE}
-# get the first 10 values of the Species column
+# get the first 10 values of the species column
-iris$Species[1:10]
+penguins$species[1:10]
 ```
 
 And view its _levels_ with the `levels()` function.
 
 ```{r}
-levels(iris$Species)
+levels(penguins$species)
 ```
 
 ## Files and directories
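
A note on the `penguins-col-mean` chunk: unlike `iris`, the `bill_length_mm` column contains missing values, and `mean()` returns `NA` whenever its input includes an `NA` unless `na.rm = TRUE` is set. The sketch below illustrates that behavior on a small made-up vector; `bill_lengths` and its values are hypothetical stand-ins chosen for illustration, not data taken from `penguins`.

```r
# Hypothetical stand-in for a measurement column that has one missing value
bill_lengths <- c(39.1, 39.5, NA, 36.7)

# Without na.rm, a single NA makes the whole result NA
mean(bill_lengths)

# With na.rm = TRUE, missing values are dropped before averaging
mean(bill_lengths, na.rm = TRUE)
```

`summary()` does not need this argument because it reports the number of `NA` values separately from the other statistics.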