Read in the data
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
What is the date range of this season?
What is the range of points scored in a game by one team? Which team scored the most points in a game? Which team scored the fewest?
Calculate the regular season (`seasTyp`) record (win/loss) for each team (hint: use `group_by()` and `summarize()` or `count()`)
```{r}
```
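One possible way to approach the record calculation, sketched on a small made-up table rather than the real data. The object name `toy` and all of its values are invented for illustration, and the `"Win"`/`"Loss"` coding of `teamRslt` is an assumption, so check the actual values in your data first.

```r
library(dplyr)

# Toy stand-in for the real data; every value here is invented.
# Assumption: teamRslt is coded as "Win" / "Loss".
toy <- tibble(
  teamAbbr = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  seasTyp  = c("Regular", "Regular", "Regular", "Regular", "Regular", "Playoff"),
  teamRslt = c("Win", "Win", "Loss", "Loss", "Win", "Win")
)

# Keep regular-season games, then count games per team and result.
# count() is shorthand for group_by() followed by summarize(n = n()).
record <- toy %>%
  filter(seasTyp == "Regular") %>%
  count(teamAbbr, teamRslt)

record
```

The result has one row per team and result, with the number of games in column `n`.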
Which team had the best record? Which had the worst record?
If that was too easy, then calculate the record per month. Which team had the best month?
Calculate a new variable, win %, for each team. Then plot win % vs. average points per game (`teamPTS`).
```{r}
```
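A minimal sketch of one way to get win % and average points per team, again on invented data. The names `toy`, `team_summary`, `win_pct` and `avg_pts` are illustrative, and the `"Win"` coding of `teamRslt` is an assumption.

```r
library(dplyr)
library(ggplot2)

# Toy data: three teams, four games each (all values invented).
toy <- tibble(
  teamAbbr = rep(c("AAA", "BBB", "CCC"), each = 4),
  teamRslt = c("Win", "Win", "Win", "Loss",
               "Win", "Loss", "Loss", "Loss",
               "Win", "Win", "Loss", "Loss"),
  teamPTS  = c(110, 105, 120, 98,
               99, 101, 95, 90,
               108, 112, 100, 97)
)

# mean() of a logical vector is the share of TRUE values, i.e. the win rate.
team_summary <- toy %>%
  group_by(teamAbbr) %>%
  summarize(
    win_pct = mean(teamRslt == "Win") * 100,
    avg_pts = mean(teamPTS)
  )

# Scatter plot with a team label above each point.
p <- ggplot(team_summary, aes(x = avg_pts, y = win_pct, label = teamAbbr)) +
  geom_point() +
  geom_text(vjust = -1)
p
```

The same pattern should carry over to point differential by summarizing `mean(teamPTS - opptPTS)` instead of `mean(teamPTS)`.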
If that's too easy, then plot win % vs. average point differential (`teamPTS - opptPTS`)
```{r}
```
Now calculate wins/losses each for home and away games (`teamLoc`). Who had the best home record? Who had the best away record? Can you plot Home vs. Away win % (or number of wins)?
```{r}
```
Make that an interactive plotly plot. Try editing the `tooltip` text to show you the team for each data point.
```{r}
```
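One way the home vs. away comparison and the interactive version might fit together, sketched on invented data. The `"Home"`/`"Away"` values of `teamLoc` and the `"Win"` coding of `teamRslt` are assumptions, and `toy`, `home_away` and `win_pct` are illustrative names.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)

# Toy data: two teams, two home and two away games each (all values invented).
toy <- tibble(
  teamAbbr = rep(c("AAA", "BBB"), each = 4),
  teamLoc  = rep(c("Home", "Home", "Away", "Away"), times = 2),
  teamRslt = c("Win", "Win", "Win", "Loss",
               "Win", "Loss", "Loss", "Loss")
)

# Win % per team and location, then one column per location.
home_away <- toy %>%
  group_by(teamAbbr, teamLoc) %>%
  summarize(win_pct = mean(teamRslt == "Win") * 100, .groups = "drop") %>%
  pivot_wider(names_from = teamLoc, values_from = win_pct)

# The `text` aesthetic is not drawn by ggplot2 (it may warn about it),
# but ggplotly() can pick it up for the hover tooltip.
p <- ggplot(home_away, aes(x = Home, y = Away, text = teamAbbr)) +
  geom_point()

ggplotly(p, tooltip = "text")
```

Passing `tooltip = "text"` restricts the hover box to the team abbreviation; adding `"x"` and `"y"` to that vector would show the win percentages as well.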
Now make a plot or two in plotly using the `plot_ly()` function. Try both univariate and multivariate plots. For help, see [this cheat sheet](https://images.plot.ly/plotly-documentation/images/r_cheat_sheet.pdf).
```{r}
```
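As a starting point, here is a hedged sketch of one univariate and one multivariate `plot_ly()` call on invented data. The data frame `toy` and its values are made up, and the `"Win"`/`"Loss"` coding of `teamRslt` is an assumption.

```r
library(plotly)

# Toy game-level data (all values invented).
toy <- data.frame(
  teamPTS  = c(110, 105, 120, 98, 99, 101, 95, 90),
  opptPTS  = c(101, 99, 111, 104, 97, 108, 102, 100),
  teamRslt = c("Win", "Win", "Win", "Loss", "Win", "Loss", "Loss", "Loss")
)

# Univariate: distribution of points scored.
fig_hist <- plot_ly(toy, x = ~teamPTS, type = "histogram")

# Multivariate: team points vs. opponent points, colored by result.
fig_scatter <- plot_ly(
  toy,
  x = ~teamPTS, y = ~opptPTS, color = ~teamRslt,
  type = "scatter", mode = "markers"
)

fig_hist
fig_scatter
```

Unlike `ggplotly()`, `plot_ly()` maps columns with the `~column` formula syntax rather than `aes()`.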
Hard challenge: use plotly or gganimate to show how the home and away win percentages changed from month to month.
```{r}
```
Can you run a PCA on the per-team mean win/loss data? Hint: to start, you can use `summarize_all()` and `select_if()`.
Plot the first two PCs and see if you can color it by win/loss. What are the variables that load strongly onto PC1 and PC2?
```{r}
# Per-team, per-result column means; drop columns containing NA
nba_mean <- nba %>%
  filter(seasTyp == "Regular") %>%
  group_by(teamAbbr, teamRslt) %>%
  summarize_all(mean, na.rm = TRUE) %>%
  select_if(~ sum(is.na(.)) == 0)
```
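From a table shaped like `nba_mean`, the PCA itself could be sketched roughly as follows. This uses base R's `prcomp()` on an invented stand-in called `nba_mean_toy`; the column `exampleStat` is hypothetical and only there to give the PCA a third variable, and the `"Win"`/`"Loss"` coding is an assumption.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for nba_mean: one row per team and result (all values invented).
# `exampleStat` is a hypothetical column, not a known variable in the data.
nba_mean_toy <- tibble(
  teamAbbr    = rep(c("AAA", "BBB", "CCC"), each = 2),
  teamRslt    = rep(c("Win", "Loss"), times = 3),
  teamPTS     = c(112, 101, 108, 97, 115, 103),
  opptPTS     = c(100, 109, 99, 106, 104, 113),
  exampleStat = c(25, 21, 24, 20, 27, 22)
)

# ungroup() matters on the real nba_mean, which is still grouped by teamAbbr.
# Center and scale so that variables on different scales contribute comparably.
pca <- nba_mean_toy %>%
  ungroup() %>%
  select_if(is.numeric) %>%
  prcomp(center = TRUE, scale. = TRUE)

# Attach the PC scores back to the team and result labels for plotting.
scores <- bind_cols(
  nba_mean_toy %>% select(teamAbbr, teamRslt),
  as_tibble(pca$x)
)

ggplot(scores, aes(x = PC1, y = PC2, color = teamRslt)) +
  geom_point()

# Loadings: variables with large absolute values load strongly on that PC.
pca$rotation[, 1:2]
```

On the real data, any zero-variance column would need to be dropped before scaling, since `prcomp()` cannot rescale a constant column to unit variance.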
More challenges: What about doing this on a game-by-game basis? What variables load strongly onto PC1 and PC2? Does it predict a win/loss for the focal team?
Super hard challenge: plot win percentages on a North American map, using each team's city as the location.
```{r}
```
Only slightly super harder challenge: make your map interactive.
```{r}
```