---
title: 'Assignment 2: Base R (8 marks)'
output:
    html_document:
        toc: false
---
*To submit this assignment, upload the full document on Blackboard,
including the original questions, your code, and the output. Submit
your assignment as a knitted `.pdf` (preferred) or `.html` file.*
1. Variable assignment (1 mark)
a. Assign the value `5` to the variable/object `a`. Display `a`.
(0.25 marks)
b. Assign the result of `10/3` to the variable `b`. Display `b`.
(0.25 marks)
c. Assign the product of `a` and `b` to `c`. Display `c`. (0.25
marks)
d. Write a function that adds two numbers and returns their sum.
Use it to assign the sum of `a` and `b` to `c`. Display `c`. (In
practice, R already has a more general built-in function for
this, `sum()`, e.g. `c <- sum(a, b)`.) (0.25 marks)
2. Vectors (1 mark)
a. Create a vector `v` with all integers from 0 to 30, and a vector
`w` with every third integer in the same range. (0.25 marks)
b. What is the difference in lengths of the vectors `v` and `w`?
(0.25 marks)
c. Create a new vector, `v_square`, with the square of elements 3,
6, 7, 10, 15, 22, 23, 24, and 30 from the variable `v`. *Hint:
Use indexing rather than a for loop.* (0.25 marks)
d. Calculate the mean and median of the first five values from
`v_square`. (0.25 marks)
3. Boolean indexing (1 mark)
a. Create a boolean vector `v_bool` indicating which elements of `v`
are greater than 20. How many values are over 20? *Hint: R treats
`TRUE` as 1 and `FALSE` as 0 in arithmetic, so you can use simple
arithmetic to find this out.* (0.5 marks)
b. Use the variable `v_bool` as an index to extract the elements
from `v` that are bigger than 20. What are the min and max
values of this new vector? (0.5 marks)
4. Data frames (2 marks)
a. There are many built-in data frames in R, which you can find
[more details about
online](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html).
What are the column names of the built-in data frame `beaver1`?
How many observations (rows) and variables (columns) are there?
(0.5 marks)
b. How can you view the first 5 rows of this data frame? How can
you view the last 5 rows? (0.5 marks)
c. What are the min, mean, and max body temperature in this data set?
*Hint: Remember that each column in a data frame is a vector, so
you can use the same functions as in the previous question on
vectors.* (0.5 marks)
d. Use the `summary` function to display an overview of the `temp`
column. (0.5 marks)
5. Data frames with dplyr (3 marks)
a. Use dplyr to calculate the mean temperature in the `beaver1`
dataset. (0.25 marks)
b. Calculate the mean temperature for day 346. (0.5 marks)
c. Rather than using `filter()` to calculate the mean for each day
separately, the more convenient `group_by()` can be used to
aggregate measurements by a categorical variable (such as the `day`
column in `beaver1`). Use this approach to calculate the mean
temperature and activity level for each of the days in the
dataset. (0.5 marks)
d. Express in writing what the average activity level from the
above calculation means. *Hint: Remember that you can [read a
description of the columns
online](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html).*
(0.25 marks)
e. How many observations are there per day in this dataset? (0.25
marks)
f. How many observations are there per day when the beaver is
active outside the retreat? (0.25 marks)
g. Was the body temperature higher when the beaver was active or
inactive? Is this what you would expect? (0.25 marks)
h. Group by activity level *and* the day of the observation.
Which variable seems to be more related to high body
temperature: activity level or day of measurement? (0.25 marks)
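
The hints in questions 2c, 3a, and 5c (indexing instead of looping, arithmetic on `TRUE`/`FALSE`, and `group_by()` instead of repeated `filter()` calls) can be illustrated on data other than the assignment's. The following is a minimal sketch, not a solution: it assumes dplyr is installed, uses the built-in `mtcars` data frame in place of `beaver1`, and the vector `x` and the names `picked`, `is_big`, `n_big`, `big_values`, `mean_mpg_cyl4`, and `by_cyl` are made up for the example.

```r
library(dplyr)

# A made-up example vector: 10, 20, ..., 100
x <- seq(10, 100, by = 10)

# Indexing with a vector of positions selects several elements at once,
# so no for loop is needed. R positions start at 1.
picked <- x[c(2, 5, 9)]   # elements 2, 5 and 9: 20, 50, 90

# A comparison returns a logical vector with one TRUE/FALSE per element.
is_big <- x > 60

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values.
n_big <- sum(is_big)      # 4 (the values 70, 80, 90, 100)

# A logical vector can also be used as an index to keep the TRUE positions.
big_values <- x[is_big]

# One group at a time with filter(): this has to be repeated per group.
mean_mpg_cyl4 <- mtcars %>%
    filter(cyl == 4) %>%
    summarize(mean_mpg = mean(mpg))

# All groups at once with group_by(): one row per distinct value of cyl.
by_cyl <- mtcars %>%
    group_by(cyl) %>%
    summarize(mean_mpg = mean(mpg), n_cars = n())

print(by_cyl)
```

The grouped summary returns one row per group, so the value obtained with `filter(cyl == 4)` appears as one of its rows. The same pattern applies to any column that holds a small number of distinct values.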