---
title: "Pair Programming - Data Science Lifecycle"
format: html
editor: visual
bibliography: references.bib
execute:
  echo: true
  message: false
  warning: false
editor_options:
  chunk_output_type: console
---
# Task 0: Load R packages
The required R packages are loaded at the beginning of the script.

1. Run the code contained in the code-chunk and fix any errors. (**Tip: Use the green play button in the top right corner of the code-chunk.**)
```{r}
library(readr)
library(dplyr)
library(ggplot2)
```
# Task 1: Data import
**Fill in the blanks**

1. A code-chunk has already been created
2. Import the CSV file named 'country_level_data_0.csv', located in the 'raw_data' directory, using the `read_csv()` function
3. Use the assignment operator (`<-`) to assign the data to an object named `global_waste_data`
4. Run the code contained in the code-chunk and fix any errors
5. Next to the code-chunk option `#| eval:`, change the value from `false` to `true` (**Info: This tells Quarto to eval(uate) the code-chunk on render**)
6. **Render:** Render this file to HTML
7. **Add:** Open the Git pane and add all files to the staging area (**Tip: Tick all checkboxes in the 'Staged' column**)
8. **Commit:** Commit pending changes; add a meaningful commit message
9. **Push:** Push changes to the remote repository (i.e. GitHub)
```{r}
#| eval: false
___ <- read_csv("___")
```
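For orientation, here is a minimal sketch of the `read_csv()` pattern on made-up data. The file, the object name `toy_data`, and the column names are hypothetical and not part of the exercise; the temporary file only exists so that the example runs on its own.

```r
library(readr)

# Write a tiny, made-up table to a temporary CSV file (hypothetical data)
toy_path <- tempfile(fileext = ".csv")
write_csv(data.frame(country = c("A", "B"), tons = c(10, 20)), toy_path)

# Read the file and assign the result to an object with `<-`
toy_data <- read_csv(toy_path, show_col_types = FALSE)
toy_data
```

In the exercise, the string passed to `read_csv()` is the path to the CSV file rather than a temporary file.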
# Task 2: Data tidying (and some transformation)
**Fill in the blanks**

1. A code-chunk has already been created
2. Start with the `global_waste_data` object
3. Add the pipe operator (`%>%`) and on a new line use the `select()` function
4. Inside the parentheses write the names of the following variables:
    - country_name
    - iso3c
    - income_id
    - total_msw_total_msw_generated_tons_year
    - population_population_number_of_people
5. Add the pipe operator (`%>%`) and on a new line use the `rename()` function
6. Rename two variables:
    - (1) from 'total_msw_total_msw_generated_tons_year' to 'msw_tons_year'
    - (2) from 'population_population_number_of_people' to 'population'
7. Use the assignment operator (`<-`) to assign the data to an object named `global_waste_data_small`
8. Run the code contained in the code-chunk and fix any errors
9. Next to the code-chunk option `#| eval:` change the value from `false` to `true`
10. Render
11. Add, Commit, Push
```{r}
#| eval: false
global_waste_data_small <- global_waste_data %>%
  select(country_name,
         iso3c,
         ___,
         ___,
         population_population_number_of_people) ___
  rename(___ = total_msw_total_msw_generated_tons_year,
         ___ = population_population_number_of_people)
```
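The `select()` and `rename()` steps can be sketched on a small made-up table. All object and column names below are hypothetical and chosen only for illustration. Note that `rename()` takes its arguments in the order `new_name = old_name`.

```r
library(dplyr)

# Made-up table with one column we do not need and one overly long name
toy <- tibble(
  id = 1:3,
  a_very_long_column_name = c(2.5, 3.1, 4.8),
  unused = c("x", "y", "z")
)

toy_small <- toy %>%
  select(id, a_very_long_column_name) %>%   # keep only these columns
  rename(value = a_very_long_column_name)   # new_name = old_name

names(toy_small)
```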
# Task 3: Data transformation
**Fill in the blanks**

1. A code-chunk has already been created
2. Start with the `global_waste_data_small` object
3. Add the pipe operator (`%>%`) and on a new line use the `mutate()` function
4. Create a new variable named 'capita_kg_year' by dividing 'msw_tons_year' by 'population' and multiplying the result by [?]
5. Run the code contained in the code-chunk and fix any errors
6. Next to the code-chunk option `#| eval:` change the value from `false` to `true`
7. Render
8. Add, Commit, Push
```{r}
#| eval: false
global_waste_data_kg_year <- ___ %>%
  ___(___ = msw_tons_year / population * ___) %>%
  mutate(income_id = factor(income_id,
                            levels = c("HIC", "UMC", "LMC", "LIC")))
```
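As a rough illustration of the two `mutate()` calls, the sketch below computes a ratio and fixes the order of a categorical variable on made-up data. The table, its column names, and the units are hypothetical and unrelated to the waste data. Converting a column with `factor(..., levels = ...)` sets the order in which the categories later appear on a plot axis.

```r
library(dplyr)

# Made-up table: a total amount and a count per group
toy <- tibble(
  group = c("low", "high", "mid"),
  total_g = c(500, 1500, 900),
  n = c(5, 10, 3)
)

toy_rate <- toy %>%
  mutate(g_per_item = total_g / n) %>%   # new variable from two existing ones
  mutate(group = factor(group,
                        levels = c("high", "mid", "low")))   # explicit order

levels(toy_rate$group)
```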
# Task 4: Data visualisation
**Fill in the blanks**

1. A code-chunk has already been created
2. Use the `global_waste_data_kg_year` object
3. Use aesthetic mappings to plot the income category on the x-axis and MSW generation per capita on the y-axis
4. Run the code contained in the code-chunk and fix any errors
5. Use a search engine of your choice to figure out how to remove the legend from the plot (**Tip:** Add the name of the ggplot2 package to your query and focus on results from Stack Overflow)
6. Run the code contained in the code-chunk and fix any errors
7. Next to the code-chunk option `#| eval:` change the value from `false` to `true`
8. Render
9. Add, Commit, Push
```{r}
#| eval: false
ggplot(data = global_waste_data_kg_year,
       mapping = aes(x = ___,
                     y = ___,
                     color = income_id)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1, alpha = 1/4, size = 3) +
  labs(x = "income category",
       y = "MSW generation per capita [kg/yr]") +
  theme_minimal(base_size = 14)
```
```
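To show how aesthetic mappings work in general, here is a small sketch that uses the `mpg` dataset shipped with ggplot2 instead of the waste data. The choice of `class` and `hwy` is only an illustration: a categorical variable is mapped to the x-axis and a numeric variable to the y-axis.

```r
library(ggplot2)

# `mpg` ships with ggplot2; `class` is categorical, `hwy` is numeric
p <- ggplot(data = mpg,
            mapping = aes(x = class,
                          y = hwy,
                          color = class)) +
  geom_boxplot(outlier.shape = NA) +        # boxplots without outlier points
  geom_jitter(width = 0.1, alpha = 1/4)     # raw observations on top

p
```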
# Task 5: Data communication
1. In the YAML header, replace `html` with `gfm` (GitHub Flavored Markdown)
2. Render, and close the pop-up window that opens
3. Add, Commit, Push
4. Open your GitHub repository for this exercise, and click on the file with the `.md` extension (`ae-10b-lifecycle.md`)
# Task 6: Complete assignment
1. Open an issue in the repository for this exercise to let us know you have completed it. Mention @larnsce in the issue.
# References