# Multiple-Dataset GIS Operations / Visualization pt. 2
## Learning Objectives
- Combine two datasets with spatial join
- Perform spatial aggregation (point in polygon)
- Manipulate data with dplyr
- Save a ggplot image
## Functions Learned
- `st_join()`
- `select()`
- `count()`
- `arrange()`
- `st_geometry()`
- `ggsave()`
```{block type="rmdinfo"}
Hint: For each new function we go over, type `?` in front of it in the console to pull up the help page.
```
## Interactive Tutorial
```{block type="rmdinfo"}
This workshop's script can be found [here](https://github.com/spatialanalysis/workshop-scripts/blob/master/gis-visualization/spring-2019/R/spatial-join.R).
```
## Challenges
We've been reading shapefiles that we've downloaded, but we can also read data directly from a website using an "API". APIs are often a great way to get data without having to download it manually.
We're going to read data from the Chicago Data Portal:
- [Libraries](https://data.cityofchicago.org/Education/Libraries-Locations-Hours-and-Contact-Information/x8fc-8rcq) points
- [Community Area](https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6) polygons
Click on the "API" button to directly access the data, rather than having to download a csv via "Export".
```{block type="learncheck"}
**Challenge**
```
0. Which one of these is a geographic data format?
![](figs/libraries.png)
```{block type="learncheck"}
```
```{block type="learncheck"}
**Challenge**
```
1. Fill in the following script:
```{r}
# Load libraries for use
# Read in libraries and areas data
areas <- st_read("https://data.cityofchicago.org/resource/igwz-8jzy.geojson")
# Project both
# Make a ggplot with libraries and community areas
```
```{block type="learncheck"}
```
```{block type="learncheck"}
**Challenge**
```
2. Which community areas have no libraries? Use `st_intersects` and `filter` to make a map.
```{r}
# Load library with filter() in it
# Find which areas intersect with libraries and save as a variable called "intersects"
# Filter areas to those *without* libraries. Save as a variable called "no_lib". Hint: use "==" instead of ">" in the logical comparison
# Make a ggplot with libraries, community areas, and community areas without libraries
```
```{block type="learncheck"}
```
```{block type="rmdwarning"}
The order in which you give arguments to `st_intersects` matters! I always have to look it up, but for point-in-polygon, you want the polygon first, then the points.
```
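To see why the order matters, here is a minimal sketch on made-up data. The objects `toy_areas` and `toy_libs` and the helper `square()` are hypothetical stand-ins, not the Chicago layers. The result has one entry per feature of the *first* argument, so swapping the arguments changes what gets counted.

```{r}
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

# Two toy areas and two toy libraries, both inside area "A"
toy_areas <- st_sf(community = c("A", "B"),
                   geometry = st_sfc(square(0), square(2)))
toy_libs <- st_sf(name = c("lib1", "lib2"),
                  geometry = st_sfc(st_point(c(0.5, 0.5)), st_point(c(0.2, 0.8))))

# Polygons first: one entry per area, counting the libraries inside it
lengths(st_intersects(toy_areas, toy_libs))  # 2 0

# Points first: one entry per library, counting the areas it falls in
lengths(st_intersects(toy_libs, toy_areas))  # 1 1
```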
One question you may be asking yourself is, how many libraries are in each area?
We can tackle this with an operation known as a *spatial join*. We join information about the polygons to the points, so that each point records which community area it falls in. More formally, we add the attributes of one layer to another, based on where their features sit relative to each other.
```{block type="rmdwarning"}
A **spatial join** is not the same as an **attribute join**, which is based on common column (attribute) values between two datasets. Spatial joins are based on a *spatial relationship*: is this point inside this polygon?
You can try doing an **attribute join** on community area number/name with this [Public Health dataset](https://data.cityofchicago.org/resource/iqnk-2tcu.csv), and the command `left_join()` from `dplyr`.
```
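For contrast, an attribute join might look like the sketch below. Both tables are invented for illustration: the column names `area_num` and `birth_rate` and all the values are assumptions, not taken from the Public Health dataset. The match is made purely on the shared `area_num` column, with no geometry involved.

```{r}
library(dplyr)

# Hypothetical lookup table of areas
toy_areas <- data.frame(area_num = c(1, 2, 3),
                        community = c("A", "B", "C"))

# Hypothetical health table; area 3 is deliberately missing
toy_health <- data.frame(area_num = c(1, 2),
                         birth_rate = c(12.5, 15.1))  # made-up values

# Keep every row of toy_areas; unmatched areas get NA
joined <- left_join(toy_areas, toy_health, by = "area_num")
joined
```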
The syntax is generally as follows, for point-in-polygon:
> `st_join(point_sf, poly_sf)`
### A simple example
![](figs/point_in_poly.png)
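The same idea can be sketched in code with made-up data. The names `toy_areas`, `toy_libs`, and `square()` are hypothetical, not part of the workshop script. Each point picks up the attributes of the polygon it falls in, and a point outside every polygon gets `NA`.

```{r}
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

toy_areas <- st_sf(community = c("A", "B"),
                   area_num = c(1, 2),
                   geometry = st_sfc(square(0), square(2)))

# lib1 is in A, lib2 is in B, lib3 is outside both areas
toy_libs <- st_sf(name = c("lib1", "lib2", "lib3"),
                  geometry = st_sfc(st_point(c(0.5, 0.5)),
                                    st_point(c(2.5, 0.5)),
                                    st_point(c(5, 5))))

# Points first, polygons second: the result has one row per library
libs_with_area <- st_join(toy_libs, toy_areas)
libs_with_area
```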
We can spatial join just one attribute, or a few. We can use `select()` to choose attributes.
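As a small sketch of that, again on hypothetical toy data, we can trim the polygon layer with `select()` before joining. The geometry column stays attached to an `sf` object automatically, so only the attribute columns need to be listed.

```{r}
library(sf)
library(dplyr)

# Hypothetical helper: a unit square whose lower-left corner is at (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

toy_areas <- st_sf(community = c("A", "B"),
                   area_num = c(1, 2),
                   geometry = st_sfc(square(0), square(2)))
toy_libs <- st_sf(name = c("lib1", "lib2"),
                  geometry = st_sfc(st_point(c(0.5, 0.5)), st_point(c(2.5, 0.5))))

# Keep only the community name, then join it to the points
areas_name_only <- select(toy_areas, community)
libs_with_name <- st_join(toy_libs, areas_name_only)
names(libs_with_name)
```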
Once we've done our spatial join, we can manipulate the data with `count()` and `arrange()` to figure out which community areas have the most libraries. This is also known as spatial aggregation.
If working with the spatial data gets too clumsy or slow, we can drop the geometry column by assigning `NULL` to it, as in `st_geometry(x) <- NULL`.
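Putting these steps together, one possible sketch on made-up data joins, drops the geometry, then counts and sorts. All object names here are hypothetical. The geometry is dropped before counting only to keep the result a plain table; that ordering is a choice made for this example.

```{r}
library(sf)
library(dplyr)

# Hypothetical helper: a unit square whose lower-left corner is at (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

toy_areas <- st_sf(community = c("A", "B", "C"),
                   geometry = st_sfc(square(0), square(2), square(4)))

# Two libraries in A, one in B, none in C
toy_libs <- st_sf(name = c("lib1", "lib2", "lib3"),
                  geometry = st_sfc(st_point(c(0.2, 0.2)),
                                    st_point(c(0.8, 0.8)),
                                    st_point(c(2.5, 0.5))))

libs_with_area <- st_join(toy_libs, toy_areas)

# Drop the geometry column; what is left is an ordinary data frame
st_geometry(libs_with_area) <- NULL

# Count libraries per community, largest first
lib_counts <- libs_with_area %>%
  count(community) %>%
  arrange(desc(n))
lib_counts
```

Note that area "C" does not appear in `lib_counts`: the join starts from the points, so an area with no libraries never produces a row.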
### Saving your plots
We ran out of time for this last time, but to save a ggplot image, you can use `ggsave()` after a ggplot2 command. You can adjust the width and height of the image in arguments to the function.
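A minimal sketch of `ggsave()`, assuming a toy plot and a temporary file rather than the workshop's `figs/` folder. The objects `toy_pts`, `toy_plot`, and `out_file` are hypothetical names. Passing the plot explicitly avoids relying on whichever plot happened to be drawn last.

```{r}
library(sf)
library(ggplot2)

# Two made-up points to plot
toy_pts <- st_sf(name = c("lib1", "lib2"),
                 geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1))))

toy_plot <- ggplot(toy_pts) +
  geom_sf()

# Write to a temporary file; width and height are in inches by default
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = toy_plot, width = 6, height = 4)

file.exists(out_file)
```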
```{block type="learncheck"}
**Challenge**
```
Save one of the plots we've made in this workshop to `figs/name-of-plot.png`.
```{block type="learncheck"}
```
## Links
All the links in this workshop:
- Link to Chicago Libraries data: https://data.cityofchicago.org/Education/Libraries-Locations-Hours-and-Contact-Information/x8fc-8rcq
- Link to Chicago Community Areas data: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6