forked from corneliussenf/phenoBayes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
179 lines (117 loc) · 8.67 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
# phenoBayes: Bayesian hierarchical models for estimating spatial and temporal patterns in vegetation phenology from Landsat time series
This repository describes and documents a set of models used to estimate spatial and temporal patterns in vegetation phenology from Landsat data.
**References**
Senf, C., Pflugmacher D., Heurich, M. and Krueger T. (2017) A Bayesian hierarchical model for estimating spatial and temporal variation in vegetation phenology from Landsat time series. *Remote Sensing of Environment*, 194, 155-160.
**How to cite the code**
If you use the code you might want to cite the above mentioned publication and the following DOI:
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.260099.svg)](https://doi.org/10.5281/zenodo.260099)
**License**
This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International][1] license.
[1]: https://creativecommons.org/licenses/by-sa/4.0/
## Introduction
Yet there is only one type of models implemented, the **base phenological model**. The base phenological model takes a sample of dense Landsat time series and estimates a set of five spring phenological parameters (season spectral minimum, season spectral magnitude, start of season, green-up rate, and summer green-down). Simultaneously, the model estimates the temporal (inter-annual) variation in the start of season parameter as random effect. That way, changes in spring phenology can be investigated over the time period 1985 to 2016 at any location in the world.
In order to run the models presented here, you need to first install `rstan`. Information on how to installing `rstan` on your machine can be found here: https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started.
## Base phenological model
### Simulated data
Using the `simulate_data` function in the R folder we can simulate a set of time series resembling the properties of Landsat time series acquired over a braodleaved forest. There are several parameters that can be adjusted, yet we stick to the defaults and simulate 9 time series:
```{r}
source("R/simulate_data.R")
dat <- simulate_data(M = 9)
dat_spring <- subset(dat$data, dat$data$doy <= 200)
```
**Note:** In order for the simulation function to work properly, the `MASS` package needs to be installed!
The function returns a list containing 1) a data frame with the simulated data, 2) the parameters used for creating the simulated data, and 3) a plot of the simulated data. Following the simulation of the data, we subset the data to only include winter and spring observations. This step simply reduces the number of data points and thus the computational resources needed for sampling the posteriors.
Using the simulated data we can set up the model via the `rstan` interface. The base phenological model for the start of season is called *base_pheno_spring.stan* and is located in the Stan folder:
```{r}
# Load library
library(rstan)
# Set rstan settings
rstan_options(auto_write = TRUE)
options(mc.cores = 4)
# Define a centering vector for centering the mean of the five phenological parameters
mu_beta <- c(0, # Season minimum
0.5, # Season amplitude
0.15, # Green-up rate
120, # Start of season
0.0005) # Summer green-down
mu_beta <- matrix(mu_beta, ncol = max(dat_spring$pixel), nrow = 5) # Translate into a matrix with dimension: 5*n_pixel
# Define a scaling vector for scaling the variances of the five phenological parameters
sigma_beta_scale <- c(0, # Season minimum
0.05, # Season amplitude
0.05, # Green-up rate
5, # Start of season
0.00001) # Summer green-down
# Gather the data into a list in order to pass it to Stan
datalist <- list(N = nrow(dat_spring),
NY = max(dat_spring$year),
NP = max(dat_spring$pixel),
doy = dat_spring$doy,
vi = dat_spring$vi,
year = dat_spring$year,
pixel = dat_spring$pixel,
mu_beta = mu_beta,
sigma_beta_scale = sigma_beta_scale)
# Pass the datalist and the model to Stam
fit <- stan(file = "Stan/pheno_base_spring.stan", data = datalist, iter = 2000, chains = 4)
```
First, we load the `rstan` library and tell the package to use four cores in parallel. Second, we define two vectors used for centering and scaling the five phenological parameters. This step is purely for numerical reasons and giving informative centering and scaling vectors significantly increases sampling speed. The vectors can be formulated based on prior knowledge or based on other data sources (i.e., climate data or phenological records). Here we set the vector according to the parameters used to simulate the time series. Third, we need to translate the data into a list in order to call it with `rstan` The data list is then given to the `stan` function together with the phenological model in the final line. The sampling is set to 2,000 interations (including 1,000 warm-up iterations) and four chains (run in parallel using four cores).
There are several checks you need to perform on the resulting `stanfit` object in order to trust your results (i.e., convergence of chains, mixing between chains, among others). We won't go into detail here, but we strongly recommend reading the `rstan` documentation: https://cran.r-project.org/web/packages/rstan/vignettes/rstan.html. A relativel frequent convergence issue is called **divergent transitions**. Divergent trasitions indicate a serious problem that can either be erased by adapting the sampler settings (see: http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup) or by giving more informative centering and scaling vectors.
We can inspect the posteriors using the `parameter_summary` function in the R folder. In particular, we can inspect the temporal variation in the start of season by plotting its median and 95% credible interval over the years:
```{r}
source("R/parameter_summary.R")
phi_summary <- parameter_summary(fit, type = "temporal")
# Note that we make use of the ggplot2 package, which you need to load
library(ggplot2)
ggplot(phi_summary, aes(x = year, y = Q50)) +
geom_point() +
geom_errorbar(aes(ymin = Q025, ymax = Q975)) +
labs(x = "Year", y = "Variation in SOS")
```
Similarly, we can compare the spatial distribution of phenological parameters among the nine pixels:
```{r}
beta_summary <- parameter_summary(fit, type = "spatial")
ggplot(beta_summary, aes(x = factor(pixel), y = Q50)) +
geom_point() +
geom_errorbar(aes(ymin = Q025, ymax = Q975)) +
labs(x = "Pixel", y = "Estimate") +
facet_wrap(~parameter, scales = "free")
```
### Real data
We repeat the above shown exercise with real Landsat data aquired over broadleaved forest stands in the Bavarian Forest National Park in southern Germany. The data set is located in the data folder and stored as RData object. The data totals 100 Enhanced Vegetation Index (EVI) time series and are identical to the data used in the above mentioned publication.
```{r}
load("data/landsat_broadleaved_bavarian_forest.RData")
dat_spring <- subset(dat, dat$doy <= 200)
mu_beta <- matrix(c(0, 0.3, 0.15, 120, 0), ncol = max(dat$pixel), nrow = 5)
sigma_beta_scale <- c(0.1, 0.1, 0.1, 10, 0.001)
datalist <- list(N = nrow(dat_spring),
NY = max(dat_spring$year_index),
NP = max(dat_spring$pixel),
doy = dat_spring$doy,
vi = dat_spring$evi,
year = dat_spring$year_index,
pixel = dat_spring$pixel,
mu_beta = mu_beta,
sigma_beta_scale = sigma_beta_scale)
fit <- stan(file = "Stan/pheno_base_spring.stan", data = datalist, iter = 2000, chains = 4)
```
We here also can inspect the temporal and spatial patterns of phenology:
```{r}
# Temporal
phi_summary <- parameter_summary(fit, type = "temporal")
ggplot(phi_summary, aes(x = year + 1984, y = Q50)) +
geom_point() +
geom_errorbar(aes(ymin = Q025, ymax = Q975)) +
labs(x = "Year", y = "Variation in SOS")
# Spatial
beta_summary <- parameter_summary(fit, type = "spatial")
ggplot(beta_summary, aes(x = factor(pixel), y = Q50)) +
geom_point() +
geom_errorbar(aes(ymin = Q025, ymax = Q975)) +
labs(x = "Pixel", y = "Estimate") +
facet_wrap(~parameter, scales = "free")
```
## Future models
We will gradually add a variety of models:
- A base autumn model
- Spring/autumn models containing temporal predictors (i.e., climate)
- Spring/autumn models containing spatial predictors (i.e., topography)