-
Notifications
You must be signed in to change notification settings - Fork 17
/
Copy pathrasters2_analysis.Rmd
282 lines (213 loc) · 12.8 KB
/
rasters2_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
---
title: "EDS Raster workshop 2: analyzing with rasters"
author: "Casey O'Hara"
date: "11/2/2020"
output:
html_document:
code_folding: show
toc: true
toc_depth: 3
toc_float: yes
number_sections: false
theme: cerulean
highlight: haddock
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, messages = FALSE, warnings = FALSE)
### load packages. Since raster has a "select" function but we more often
### use the tidyverse::select, load raster first, and let tidyverse overwrite
### that function.
library(raster)
library(tidyverse)
library(sf)
library(fasterize)
library(here)
```
# Summary
The previous script focused on creating and developing rasters in preparation for analysis. Here we'll focus on different ways you can use rasters to perform an analysis.
``` {r copy plot_rast utility function from first script}
### The above lines of code to plot the raster with the sf
### are coded into a function here, for future ease of use:
plot_rast <- function(rast, sf, palette = 'viridis', outline = 'red') {
### function to plot a raster with a simple features border overlaid
rast_crs <- as.character(crs(rast))
sf_crs <- as.character(crs(sf))
pal <- hcl.colors(n = 50, palette = palette)
fill_name <- names(rast)
### convert raster to points
rast_df <- rasterToPoints(rast) %>%
as.data.frame() %>%
setNames(c('x', 'y', 'z'))
### ggplot the raster points and the simple features
x <- ggplot() +
geom_raster(data = rast_df, aes(x, y, fill = z)) +
scale_fill_gradientn(colors = pal) +
geom_sf(data = sf, color = outline, fill = NA) +
labs(fill = fill_name)
return(x)
}
moz_eez_sf <- read_sf(here('_spatial/moz_eez/se_afr_eez.shp')) %>%
filter(id == 5)
```
# Basic raster math
Since rasters are just a grid of numeric values, essentially a matrix, we can do the same operations on a raster as we can with a basic vector or matrix. Here we'll walk through some simple but useless examples based on the rasters we already created.
```{r}
### operating on a vector
vec <- 1:10
vec * 2 ### multiply by a scalar
vec + 20 ### add a scalar
eez_rast <- raster(here('_spatial/se_afr_eez_6933.tif'))
plot(eez_rast)
plot(eez_rast * 2)
plot(10 - eez_rast)
plot(eez_rast^3)
plot(log(eez_rast))
```
We can also add, subtract, multiply, or divide by a raster with the same parameters.
```{r}
vec2 <- 100:109
vec + vec2
ss_rast <- raster(here('_spatial/ss_rast_6933.tif'))
plot(ss_rast)
plot(eez_rast/10 + ss_rast)
```
You can also use indexing to change value in one raster based on the value in another raster. Use the `raster::values()` function to basically convert the raster values into a vector that's less ambiguous to work with. We already did this to fill in gaps in the silky shark range.
```{r}
x <- values(eez_rast)
unique(x) ### each number corresponds to an EEZ; Mozambique is 5
ss_rast2 <- ss_rast ### copy it over
### keep all shark probabilities in Moz, but set all outside the Moz EEZ to 0
values(ss_rast2)[values(eez_rast != 5) & !is.na(values(ss_rast))] <- 0
plot(ss_rast2)
### could also set non-Moz EEZ values to NA, basically applying a mask.
```
How can you use this basic raster math?
* basic combinations of rasters to quickly calculate ratios, sums, etc.
* rescale a raster by dividing by the max value `rast_rescale <- rast / maxValue(rast)`
* use a raster of presence/absence (as ones and zeros) to turn on or turn off values in a different raster (similar to a mask)
* "flatten" a raster to all ones and NAs by dividing the raster by itself (something divided by itself = 1, 0 divided by 0 = NaN)
```{r}
hcaf_rast <- raster(here('_spatial/hcaf_rast_6933.tif'))
hcaf_rescaled <- hcaf_rast / maxValue(hcaf_rast)
plot_rast(hcaf_rescaled, moz_eez_sf)
```
# `raster::calc()`
The `calc` function is handy for more complex calculations. It is especially useful when you have a stack of raster layers and want to calculate across them all at once. A raster `stack` object is just multiple layers stacked together. A raster `brick` is similar. At a basic level where we are now, the difference is pretty unimportant.
```{r}
### tell it a vector of files to include in the stack.
rast_files <- list.files(here('_spatial'), pattern = '_6933.tif', full.names = TRUE)
rast_stack <- raster::stack(rast_files)
plot(rast_stack)
### or tell it which rasters already in memory to include in the stack.
rast_stack <- stack(eez_rast, ss_rast, hcaf_rescaled)
rast_prod <- calc(rast_stack, fun = prod, na.rm = TRUE)
plot_rast(rast_prod, moz_eez_sf, palette = 'inferno')
```
# Distance
Sometimes we need to know the distance from one cell to an important feature. For example, how far offshore is a particular cell? How far is a cell from the nearest port? etc. The `raster::distance()` function is super handy for that. It calculates a distance for every NA cell to the nearest non-NA cell.
```{r}
port_rast <- raster(here('_spatial/moz_ports_6933.tif'))
port_dist_rast <- raster::distance(port_rast)
plot(port_dist_rast)
port_df <- port_rast %>%
rasterToPoints() %>%
as.data.frame()
### all cells are given a value, but we only care about EEZ cells:
port_dist_rast <- port_dist_rast %>% mask(eez_rast)
plot_rast(port_dist_rast, moz_eez_sf, palette = 'Berlin') +
geom_point(data = port_df, aes(x, y), color = 'green', size = 2)
```
# Reclassify
`raster::reclassify` is similar to `subs()` in that you are replacing existing values with new values. Here, however, we can identify a range of values to replace with a new single value. This is great for categorizing a raster. For example, instead of a probability of 0-100%, maybe we would like to communicate "high", "medium", "low" probabilities (scored as 1, 2, 3 - a raster can only have numeric data, not character). We need to create a reclassification matrix to do this.
```{r}
### a vector:
### from to class
m <- c(0.00, 0.33, 1,
0.33, 0.67, 2,
0.67, 1.00, 3)
### wrap the vector into a matrix by rows
rcl_mtx <- matrix(m, ncol=3, byrow=TRUE)
plot_rast(ss_rast, moz_eez_sf)
### reclassify the silky shark map using this
ss_reclass <- reclassify(ss_rast, rcl_mtx)
{
plot(ss_reclass, legend = FALSE, col = hcl.colors(n = 3));
legend("topright", legend = c("low", "med", "high"), fill = hcl.colors(n = 3))
}
```
# Zonal statistics
The `raster::zonal()` function lets you easily calculate summary statistics of one raster, based on zones in another. For example, if we wanted to calculate the mean probability of finding silky sharks in any particular cell within the various SE African EEZs, `zonal` is our tool. It returns a matrix.
```{r}
mean_prob <- zonal(x = ss_rast, z = eez_rast, fun = 'mean', na.rm = TRUE)
mean_prob_df <- as.data.frame(mean_prob) ### data.frame class, to use in subs()
### Let's substitute these into the EEZ raster to map out the mean probabilities
mean_prob_rast <- subs(eez_rast, mean_prob_df, by = 'zone', which = 'mean')
plot_rast(mean_prob_rast, moz_eez_sf, palette = 'Zissou 1')
```
# Rasters as dataframes
Sometimes you just love the tidyverse so much you never want to leave. So we can take advantage of the fact that a raster is just a vector of numbers mapped to a spatial grid, and put it into a dataframe format where we can use mutate, group_by, summarize, etc. Operations with the raster format are generally pretty efficient and fast. But if working with very sparse data (e.g., species ranges that are small relative to the size of the global oceans), it may make sense to work with the data in dataframes and csvs, and just assign values spatially at the end.
Here we will create a cell ID raster at the reprojected Mozambique CRS (EPSG 6933), instead of the reprojected HCAF raster from before. This way, every cell gets its own unique ID. We will then use `values()` to put rasters into the dataframe as columns, aligned with cell IDs, so we can later use subs() to put the data back into a spatial raster.
Let's put together an analysis where we can identify priority areas for protection. Some goals, and a quickie math formula to try to quantify these goals:
* prioritize areas with high probability of containing both priority shark spp: hammerhead and silky sharks
* prioritize areas with a high overall shark species richness
* protecting areas closer to landing sites (ports) will impose greater costs on fishermen, but areas farther from ports are more difficult to patrol and enforce protection, so we want to find a happy medium.
* the cost function might look something like $aD^2_{port} - bD_{port} + c$, where the first term makes it more expensive to protect the farther out you go (e.g., more gas to patrol), and the second term reflects lower costs imposed on local fishermen the farther you go out.
So we can write this as a rough equation like this, where we want to maximize the outcome:
$$Priority = \frac{P_{hh} \times P_{ss} \times \text{(relative species richness)}^\alpha}{\text{(cost function)}^\beta}$$
where $P_x$ is probability of finding shark type $x$, $\alpha$ is how strongly species richness should be weighted (higher $\alpha$, more weight given to overall richness relative to the two priority spp), and $\beta$ is relative weight given to cost function.
Then we can break the priority into quantiles, and highlight the best 10% of area and the worst 10% of area.
```{r}
cellid_rast <- raster(eez_rast) %>%
setValues(1:ncell(eez_rast))
### load the hammerhead and spp richness rasters from the last script...
hh_rast <- raster(here('_spatial/hh_rast_6933.tif'))
spp_rich_rast <- raster(here('_spatial/spp_richness_6933.tif'))
shark_df <- data.frame(cell_id = values(cellid_rast),
p_silky = values(ss_rast),
p_hhead = values(hh_rast),
spp_rich = values(spp_rich_rast),
eez = values(eez_rast),
d_port = values(port_dist_rast) / 1000) %>%
### now we can drop any cells outside the Mozambique EEZ (id = 5)
filter(eez == 5)
### Now lets build our model
alpha <- 1; beta <- 1; a <- 2; b <- 1000; c <- 2e5
model_df <- shark_df %>%
mutate(priority_spp = p_silky * p_hhead,
rel_spp_rich = spp_rich / max(spp_rich, na.rm = TRUE),
cost_fxn = a * d_port^2 - b * d_port + c) %>%
mutate(priority = priority_spp * rel_spp_rich^alpha / cost_fxn^beta,
pri_qtile = ntile(priority, n = 10))
ggplot(model_df, aes(x = d_port, y = cost_fxn)) +
geom_point()
model_map <- subs(cellid_rast, model_df, by = 'cell_id', which = 'priority')
plot_rast(log10(model_map), moz_eez_sf, palette = 'viridis')
```
``` {r}
x <- model_df %>%
filter(pri_qtile %in% c(1, 10))
priority_map <- subs(cellid_rast, x, by = 'cell_id', which = 'pri_qtile')
plot_rast(priority_map, moz_eez_sf, palette = 'RdYlGn', outline = 'yellow') +
geom_point(data = port_df, aes(x, y), shape = 21, color = 'black', fill = 'white', size = 2)
```
# Raster to polygons
Most of the time we would be turning polygons into rasters, to analyze with other rasters. But for their analysis, the Future4Fins group is planning to use `prioritizr`, essentially an R implementation of the conservation planning software Marxan. But `prioritizr` wants planning unit data as polygons, not a raster! Let's turn our Silky Shark raster into polygons, each cell becoming a (square-shaped) polygon of its own. NOTE: for the purposes of this workshop, let's use the half-degree cells version - because at 10 km cells, this might be a bit processing intensive, and the results won't look good on a plot. For the real analysis they could consider the 10 km x 10 km cells to get a better resolution.
The rasterToPolygons function returns an old school Spatial (`sp`) object, so we can use the `st_as_sf()` (basically `as(x, 'sf')`) function to turn it into a happy little Simple Features `sf` object! And we can easily write this out as a geopackage `.gpkg` file.
``` {r}
### load the half-degree silky shark data
ss_rast_hcaf <- raster(here('_spatial/silkyshark_rast.tif'))
ss_poly_sp <- raster::rasterToPolygons(ss_rast_hcaf, dissolve = FALSE)
ss_poly_sf <- st_as_sf(ss_poly_sp)
ggplot(ss_poly_sf) +
geom_sf(aes(fill = silkyshark_rast), size = .1)
write_sf(ss_poly_sf, here('_spatial/silkyshark_prob.gpkg'))
```
# Review: what have we done in this script?
* Practiced some basic raster math
* Used the `calc()` function to calculate across layers of a `RasterStack`
* Created a raster of distance from a cell to the nearest non-NA cell
* Used `reclassify()` to replace continuous values in a raster with discrete categories, similar to `subs()`
* Used `zonal()` to calculate zonal statistics (e.g., mean value across an EEZ)
* Put a bunch of rasters into a dataframe, calculated a complicated model, and then put the results back into a raster using `subs()`
* Converted a raster back into polygons (both Spatial `sp` object and Simple Features `sf` object) and saved it.
* Ate tiny sandwiches at home (optional)