---
title: "Bio-494 Week 1"
author: "Noah"
output: pdf_document
---
## HEADER 1

### HEADER 2

**BOLD**

*ITALICS*

this is a list

* item 1
* item 2
* item 3

Any text you write outside of code "chunks" is just text.
It is how you annotate your code.

The next three lines are a chunk of R code, bookended by the three back-ticks.
```{r}
library(tidyverse)
```
Here I am reading in two comma-delimited files.
```{r}
movies_imdb=read_delim("movies/movies_imdb.txt",delim=",")
movies_rottentom=read_delim("movies/movies_rottentom.txt",delim=",")
```
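As a minimal sketch of what `read_delim()` does with `delim=","`, the chunk below writes a tiny comma-delimited file to a temporary location and reads it back. The file contents and the names `toy_path` and `toy_movies` are made up for illustration; they are not part of the course data.

```{r}
library(readr)

# Hypothetical two-row file, written to a temporary path for illustration only.
toy_path <- tempfile(fileext = ".txt")
writeLines(c("movie_title,title_year,imdb_score",
             "Movie A,1999,7.1",
             "Movie B,2005,6.4"),
           toy_path)

# delim = "," tells read_delim() that fields are separated by commas;
# the first line of the file is used as the column names.
toy_movies <- read_delim(toy_path, delim = ",")
toy_movies
```

Column types are guessed from the values, so the year and score columns come back as numbers and the title as text.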
## Including Plots
Before plotting, select the columns you need, for example:
```{r}
movies_imdb %>% select(movie_title,title_year,duration,imdb_score)
```
```{r}
imdb_minimal=movies_imdb %>% select(movie_title,title_year,duration,imdb_score)
```
```{r}
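# imdb_minimal was created in the previous chunk.
# full_join() keeps every row from both tables; the IMDB column
# movie_title is matched against the Rotten Tomatoes column title.
joined=full_join(imdb_minimal,movies_rottentom,by=c("movie_title"="title"))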
```
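To make the behaviour of `full_join()` with differently named key columns concrete, here is a small sketch on two invented tables. The tables `toy_imdb` and `toy_rt`, their values, and the column `rt_rating` are assumptions for illustration only; the real files will have other columns.

```{r}
library(dplyr)

# Hypothetical toy tables; the values are invented for illustration.
toy_imdb <- tibble(movie_title = c("Movie A", "Movie B"),
                   imdb_score  = c(7.1, 6.4))
toy_rt   <- tibble(title     = c("Movie B", "Movie C"),
                   rt_rating = c(80, 55))

# Match movie_title (left table) against title (right table).
toy_joined <- full_join(toy_imdb, toy_rt, by = c("movie_title" = "title"))
toy_joined
```

Movie A has no Rotten Tomatoes row and Movie C has no IMDB row, so each gets `NA` in the columns that come from the other table. The key column keeps the left-hand name, `movie_title`.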
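ggplot data: `ggplot()` starts a plot from a data frame; `summary()` shows what the plot object contains so far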
```{r}
plot_imdb = ggplot(movies_imdb)
summary(plot_imdb)
```
ggplot aesthetics: `aes()` maps columns of the data to visual properties such as the x and y positions
```{r}
plot_imdb = ggplot(movies_imdb) + aes(x=title_year,y=imdb_score)
summary(plot_imdb)
```
Add layers to a ggplot object with +
```{r}
plot_imdb = ggplot(movies_imdb)
plot_imdb = plot_imdb + aes(x=title_year,y=imdb_score)
summary(plot_imdb)
```
ggplot geoms: a geom decides how the data are drawn; `geom_point()` draws a scatter plot
```{r}
plot_imdb = plot_imdb + geom_point()
summary(plot_imdb)
plot_imdb
```
back to the slides for a second.
some nice default themes
```{r}
plot_imdb = plot_imdb + theme_classic(base_size = 20)
plot_imdb
```
and axis labels
```{r}
plot_imdb + xlab("movie release year") + ylab("IMDB score")
plot_imdb + xlab("movie release\n(year)") + ylab("IMDB score")
plot_imdb=plot_imdb + xlab("movie release year") + ylab("IMDB score")
```
ggplot scale: scales control the axes; here `limits` sets the range shown on each axis
```{r}
plot_imdb + scale_y_continuous(limits=c(0,10))+scale_x_continuous(limits=c(1975,2010))
```
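One detail worth knowing about scale limits: values outside `limits` are treated as missing and dropped, which also changes anything computed from the data, such as a fitted line. `coord_cartesian()` zooms in without dropping data. The sketch below uses a made-up data frame (`toy_scores`) rather than the course data, and the object names are hypothetical.

```{r}
library(ggplot2)

# Hypothetical data for illustration only.
toy_scores <- data.frame(title_year = c(1970, 1980, 1990, 2000, 2015),
                         imdb_score = c(6.0, 7.5, 5.5, 8.0, 6.5))

toy_plot <- ggplot(toy_scores) +
  aes(x = title_year, y = imdb_score) +
  geom_point()

# Scale limits: the 1970 and 2015 rows fall outside the range and are dropped
# (ggplot2 reports this with a warning when the plot is drawn).
toy_limited <- toy_plot + scale_x_continuous(limits = c(1975, 2010))

# Coordinate limits: same visible window, but all rows are kept.
toy_zoomed <- toy_plot + coord_cartesian(xlim = c(1975, 2010))

toy_limited
toy_zoomed
```

For a plain scatter plot the two look the same; the difference matters once a statistic such as a smoothed line is added, because it is then computed from different sets of rows.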
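ggplot statistics: `stat_smooth(method="lm")` adds a fitted straight line with a confidence band; the second call turns the band off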
```{r}
plot_imdb + stat_smooth(method="lm")
plot_imdb + stat_smooth(method="lm",se=FALSE)
```
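To see the numbers behind the line that `stat_smooth(method="lm")` draws, the same kind of model can be fitted directly with `lm()`. This is a sketch on made-up data (`toy_fit_data`), regressing y on x, which is the default formula `stat_smooth()` uses for `method="lm"`.

```{r}
# Hypothetical data for illustration: the scores rise by 0.5 every 10 years,
# so the fitted slope is 0.05 per year.
toy_fit_data <- data.frame(title_year = c(1980, 1990, 2000, 2010),
                           imdb_score = c(6.0, 6.5, 7.0, 7.5))

# Fit imdb_score as a straight-line function of title_year.
toy_fit <- lm(imdb_score ~ title_year, data = toy_fit_data)

# Intercept and slope of the fitted line.
coef(toy_fit)
```

On the course data the equivalent call would presumably be `lm(imdb_score ~ title_year, data = movies_imdb)`, assuming those columns are numeric.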