spatial-mlr3.Rmd

---
title: A Spatial Statistics Playground w/ mlr3
output:
  html_document:
    anchor_sections: true
    code_folding: show
    css: [assets/style.css]
    df_print: default
    lib_dir: docs/libs
    math_method:
      engine: katex
    self_contained: FALSE
    toc: true
    toc_depth: 6
    toc_float:
      collapsed: false
    extra_dependencies:
      !expr list(htmltools::htmlDependency('chunk-names', '1.0', paste0(getwd(), '/assets'), script='chunk-names.js', all_files = FALSE))
---

```{r setup, include=FALSE}
library(knitr)

# chunk output ----
options(digits.secs = 3)
opts_chunk$set(results = 'hold')
opts_chunk$set(message = FALSE)
# set code width for default print method - max 10000
knit_hooks$set(print.width = function(before, options) {
  if (!is.null(options$print.width)) {
    if (before) {
      if (options$print.width > 10000) options$print.width <- 10000
      options(width = options$print.width)
    } else {
      options(width = 80)
    }
  }
})

# globals ----
output <- "docs"

# styling ----
if (!dir.exists(paste0(output, '/assets/'))) dir.create(paste0(output, '/assets/'), recursive = T)
file.copy("assets/style.css", sprintf("%s/assets/", output), overwrite = T, copy.date = TRUE)

# images - path, compression & resolution ----
fig.path <- paste0(output, '/assets/img/')
if (!dir.exists(fig.path)) dir.create(fig.path, recursive = T)
opts_chunk$set(fig.path = fig.path)
knit_hooks$set(pngquant = hook_pngquant)
opts_chunk$set(pngquant = '')
opts_hooks$set(fig.screen = function(options) {
  mbp14_dpi <- 254/1.5
  mbp14_width <- 3023
  mbp14_height <- 1889
  if (options$fig.screen == TRUE) {
    options$dpi <- mbp14_dpi
    options$fig.dim = c(mbp14_width/mbp14_dpi, mbp14_height/mbp14_dpi)
  }
  options
})

# chunk names as headers & anchors ----
opts_chunk$set(link = TRUE)
knit_hooks$set(link = function(before, options) {
  if (options$link == TRUE & ! grepl("^unnamed-chunk-[0-9]*$", options$label)) {
    if (before) {
      h <- 6
      paste0(
        '<div id="_', gsub("\\.", "", options$label), '_" class="chunk section level', h, '">',
        '<h', h, ' class="hasAnchor">', options$label,
        '<a href="#_', gsub("\\.", "", options$label),
        '_" class="anchor-section" aria-label="Anchor link to header"></a>',
        '</h', h, '>'
      )
    } else {
      '</div>'
    }
  }
})
```

## Foreward

Here I explore handling spatial data to build predictive models. And in particularly I use the [mlr3](https://mlr3.mlr-org.com) ecosystem as my modeling framework. Much of this code comes from the [Resources](#resources) gathered below. 

All of the computations are ran via GitHub Actions. Thus, the [hardware / compute specs](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources) are minimal - but the workflow to render this R Markdown is entirely reproducible. 

Enjoy.

## Setup

```{r packages, warning=FALSE}
# from pca-mlr3-pipelines
library(mlr3verse)
library(data.table)
library(future)
library(igraph)
library(ggfortify)
library(scattermore)
library(R6)
library(rlang)
# from Statistical Learning | Geocomputation with R
library(lgr)
library(sf)
library(terra)
library(progressr)
library(mlr3spatiotempcv)
library(spDataLarge)
library(tmap)
library(tmaptools)
library(raster)
library(pROC)
library(tictoc)
library(ggplot2)
library(mlr3extralearners)
```

```{r mlr3verse_info}
mlr3verse_info()
```

## Resources

- [Geocomputing with R](https://geocompr.robinlovelace.net)
    - [Chapter 12 Statistical learning](https://geocompr.robinlovelace.net/spatial-cv.html)
        - https://github.com/Robinlovelace/geocompr/blob/main/12-spatial-cv.Rmd
        - https://github.com/Robinlovelace/geocompr/blob/main/code/12-cv.R
- [Chapter 8.3 Spatiotemporal Analysis | mlr3 Book](https://mlr3book.mlr-org.com/special-tasks.html#spatiotemporal)
- [Handling of Spatial Data | mlr v2 articles](https://mlr.mlr-org.com/articles/tutorial/handling_of_spatial_data.html)
- [Spatial Data Science](https://keen-swartz-3146c4.netlify.app)
    - [Chapter 12 Spatial Interpolation](https://keen-swartz-3146c4.netlify.app/interpolation.html)
    - [Chapter 16 Spatial Regression](https://keen-swartz-3146c4.netlify.app/spatglmm.html)

## Landslide Susceptibility

### Data

```{r data}
data("lsl", "study_mask", package = "spDataLarge")
lsl <- as.data.table(lsl)
ta <- terra::rast(system.file("raster/ta.tif", package = "spDataLarge"))
```

```{r map-landslides}
lsl_sf <- st_as_sf(lsl, coords = c("x", "y"), crs = "EPSG:32717")

# terra generates errors if this object is called named 'slope'
slope1 <- ta$slope * pi / 180
aspect <- terra::terrain(ta$elev, v = "aspect", unit = "radians")
hs <- terra::shade(slope = slope1, aspect = aspect)
# so far tmaptools does not support terra objects
bbx <- tmaptools::bb(
  raster::raster(hs), xlim = c(-0.0001, 1),
  ylim = c(-0.0001, 1), relative = TRUE
)

map <- tm_shape(hs, bbox = bbx) +
  tm_grid(
    col = "black", n.x = 1, n.y = 1, labels.inside.frame = FALSE,
    labels.rot = c(0, 90), lines = FALSE
  ) +
  tm_raster(palette = gray(0:100 / 100), n = 100, legend.show = FALSE) +
  tm_shape(ta$elev) +
  tm_raster(alpha = 0.5, palette = terrain.colors(10), legend.show = FALSE) +
  tm_shape(lsl_sf) +
  tm_bubbles(
    "lslpts", size = 0.2, palette = "-RdYlBu", title.col = "Landslide: "
  ) +
  tm_layout(inner.margins = 0) +
  tm_legend(bg.color = "white")
```

```{r map-landslides-fig, echo=FALSE, cache=FALSE}
map
```

```{r lsl}
lsl
```

### Spatial Bias

#### GLM Classif - Predictive Model

```{r glm-fit-pred}
fit <- glm(
  lslpts ~ slope + cplan + cprof + elev + log10_carea,
  family = binomial(),
  data = lsl
)

pred <- terra::predict(ta, model = fit, type = "response")
```

```{r map-landslides-glm-bias-pred}
sv_study_mask <- terra::vect(study_mask)

map_glm <- tm_shape(hs, bbox = bbx) +
  tm_grid(
    col = "black", n.x = 1, n.y = 1, labels.inside.frame = FALSE,
    labels.rot = c(0, 90), lines = FALSE
  ) +
  tm_raster(palette = "white", legend.show = FALSE) +
  # hillshade
  tm_shape(terra::mask(hs, sv_study_mask), bbox = bbx) +
  tm_raster(palette = gray(0:100 / 100), n = 100, legend.show = FALSE) +
  # prediction raster
  tm_shape(terra::mask(pred, sv_study_mask)) +
  tm_raster(
    alpha = 0.5, palette = "Reds", n = 6, legend.show = TRUE, title = "Susceptibility"
  ) +
  tm_layout(
    legend.position = c("left", "bottom"), legend.title.size = 0.9, inner.margins = 0
  )
```

```{r map-landslides-glm-bias-pred-fig, echo=FALSE, cache=FALSE}
map_glm
```

```{r auroc}
pROC::auc(pROC::roc(lsl$lslpts, fitted(fit)))
```

### Spatial CV using mlr3

Here we introduce spatial cross validation to combat spatial autocorrelation and bias. As mentioned previously, we'll use the [mlr3](https://mlr3.mlr-org.com) framework to build our model.

Specifically, we'll run 4 types of models. The first 2 will be specified via an mlr3 'design'. A design is a table of scenarios (models to be evaluated) of unique combinations of [`Task`](https://mlr3.mlr-org.com/reference/Task.html), [`Learner`](https://mlr3.mlr-org.com/reference/Learner.html), and [`Resampling`](https://mlr3.mlr-org.com/reference/Resampling.html) objects.

We create the design using `benchmark_grid` and can run the design object using `benchmark`. 

We will parallelize the execution of our models as much as possible. This depends on the number of cores that are available on our machine and by how we instruct `future`s to be resolved by specifying the [future topology](https://future.futureverse.org/articles/future-3-topologies.html) via `future::plan`. An optimal approach can also depend on types of resampling methods used by each of our models (e.g. nested resampling can be executing in parallel in the inner resampling loop), as well as the dimensions of our design. 

#### GLM Classif - Model Evaluation

The point of this section is to retrieve a bias-reduced performance estimate. We'll do this by using the logistic regression learner "classif.log_reg" first for predicting landslide susceptibility. 

In the next section, we will use a learner that has hyperparameters (an SVM), which we will tune. The goal of training that model will be to maximize predictive performance. 

##### Task 

Note, both `TaskClassifST$new` & `as_task_classif_st` can accept `sf` objects, e.g. our `lsl_sf` object. When this is the case, spatial metadata can be extracted and used for input arguments to the task (e.g. `coordinate_names` & `crs` in the `extra_args` list).

However, apparently the task converts the `sf` object into a `data.table` object, which we know could become memory intensive when handling large data.

By default, all variables *other* than the `target` parameter & the `coordinate_names` within the `backend`/`x` object are used as predictor variables. By default `coords_as_features` is set to `FALSE`, which instructs the task to not use `coordinate_names` as predictors. Set this to `TRUE` to use them as predictors.

```{r task_new}
task_new <- mlr3spatiotempcv::TaskClassifST$new(
  id = "lsl",
  backend = mlr3::as_data_backend(lsl), 
  target = "lslpts", 
  positive = "TRUE",
  extra_args = list(
    coordinate_names = c("x", "y"),
    coords_as_features = FALSE,
    crs = "EPSG:32717"
  )
)

task_new
```

```{r task}
task <- as_task_classif_st(
  x = lsl,
  target = "lslpts",
  positive = "TRUE",
  coordinate_names = c("x", "y"),
  coords_as_features = FALSE,
  crs = "EPSG:32717"
)

task
```

Creating the tasks via the above methods seem to make them identical, but that's not ***strictly*** the case. 

```{r}
identical(task_new, task)
```

```{r plt_duo}
mlr3viz::autoplot(task, type = "duo")
```

```{r plot_pairs, fig.screen=TRUE}
mlr3viz::autoplot(task, type = "pairs")
```

##### Learner

We'll use a logistic regression learner for this task since the response variable of `lsl$lslpts` is binary. 

```{r print.width=10000}
as.data.table(mlr_learners) %>% `[`(key == "classif.log_reg")
```

```{r}
learner <- lrn("classif.log_reg", predict_type = "prob")
# to make sure that training does not stop b/c of any failing models, we define a fallback learner
learner$fallback <- lrn("classif.featureless", predict_type = "prob")
```

##### Resampling

```{r}
resamplings <- list(
  rsmp("repeated_spcv_coords", folds = 5, repeats = 100),
  rsmp("repeated_cv", folds = 5, repeats = 100)
)
```

##### Design & Benchmark Grid

```{r}
design <- benchmark_grid(
  tasks = task,
  learners = learner,
  resamplings = resamplings
)

design
```

##### Execution - Training

Set seed for reproducibility.

```{r}
set.seed(1)

plan(multisession)
```

```{r cache=TRUE, cache.lazy=FALSE}
lgr::get_logger("mlr3")$set_threshold("warn")

tic()
progressr::with_progress(
  bmr <- benchmark(
    design = design,
    store_models = FALSE,
    store_backends = FALSE,
    encapsulate = "evaluate"
  )
)
toc()
```

##### Model Performance Evaluation

```{r}
p_auroc <- autoplot(bmr, measure = msr("classif.auc"))
p_auroc$labels$y = "AUROC"
p_auroc$layers[[1]]$aes_params$fill = c("lightblue2", "mistyrose2")

p_auroc + scale_x_discrete(labels=c("spatial CV", "conventional CV"))
```

```{r}
autoplot(bmr) + scale_x_discrete(labels=c("spatial CV", "conventional CV"))
```

#### SVM Classif - Predictive Model

##### Learner

```{r print.width=10000}
as.data.table(mlr_learners) %>% `[`(grepl("svm", key) & task_type == "classif")
```

```{r}
lrn_ksvm <- lrn("classif.ksvm", predict_type = "prob", kernel = "rbfdot", type = "C-svc")
# to make sure that tuning does not stop b/c of any failing models, we define a fallback learner
lrn_ksvm$fallback <- lrn("classif.featureless", predict_type = "prob")
```

##### Hyperparameter Tuner Strategy

```{r}
# five spatially disjoint k-means partitions
tune_level_spcv <- rsmp("spcv_coords", folds = 5)
# randomly sample partitions
tune_level_cv <- rsmp("cv", folds = 5)
# use 50 randomly selected hyperparameters
terminator <- trm("evals", n_evals = 50)
tuner <- tnr("random_search")
# define the outer limits of the randomly selected hyperparameters
search_space <- paradox::ps(
  C = paradox::p_dbl(lower = -12, upper = 15, trafo = function(x) 2^x),
  sigma = paradox::p_dbl(lower = -15, upper = 6, trafo = function(x) 2^x)
)
```

```{r}
at_ksvm_spcv = mlr3tuning::AutoTuner$new(
  learner = lrn_ksvm,
  resampling = tune_level_spcv, # spatially disjoint k-fold k-means partitioning
  measure = mlr3::msr("classif.auc"), # performance measure
  terminator = terminator, # n iterations of unique randomly selected hyperparameters
  tuner = tuner, # specify random search
  search_space = search_space, # predefined hyperparameter search space
  store_models = TRUE
)

at_ksvm_cv = mlr3tuning::AutoTuner$new(
  learner = lrn_ksvm,
  resampling = tune_level_cv,
  measure = mlr3::msr("classif.auc"),
  terminator = terminator,
  tuner = tuner,
  search_space = search_space
)
```

##### Execution - Training

```{r}
set.seed(1)

plan(multisession)
```

```{r train-auto, cache=TRUE, cache.lazy=FALSE}
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

tic()
progressr::with_progress(
  {
    at_ksvm_spcv$train(task = task)
    at_ksvm_cv$train(task)
  }
)
toc()

# explicit assignment for caching entire object
at_ksvm_spcv <- at_ksvm_spcv
at_ksvm_cv <- at_ksvm_cv
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$learner$state$model
```

```{r print.width=10000, cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$tuning_result$learner_param_vals
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]
```

```{r print.width=10000, cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]$score()
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]$learner
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]$learners
```

```{r cache=TRUE, dependson='train-auto'}
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]$learners[[1]]$model
at_ksvm_spcv$model$tuning_instance$archive$benchmark_result$resample_results$resample_result[[1]]$learners[[5]]$model
```

##### k-folds CV Spatial Partitioning  

```{r fig.screen=TRUE, cache=FALSE, dependson='train-auto'}
autoplot(
  object = at_ksvm_spcv$archive$benchmark_result$resamplings$resampling[[1]],
  task = task, fold_id = 1:5
)
```

##### k-folds CV Random Partitioning

```{r fig.screen=TRUE, cache=FALSE, dependson='train-auto'}
autoplot(
  object = at_ksvm_cv$archive$benchmark_result$resamplings$resampling[[1]],
  task = task, fold_id = 1:5
)
```

##### Prediction

```{r preds_svm, cache=FALSE, cache.lazy=FALSE, dependson='train-auto'}
tic()
pred_spcv = terra::predict(
  object = ta, model = at_ksvm_spcv$model$learner$state$model,
  type = "probabilities", na.rm = T
)
pred_cv = terra::predict(
  object = ta, model = at_ksvm_cv$model$learner$state$model,
  type = "probabilities", na.rm = T
)
toc()
```

##### Prediction Maps

Here we display landslide susceptibility.

```{r}
tmap_pred <- function(preds, hs, bbx, sv_study_mask, palette) {
  t_map <- tm_shape(shp = hs, bbox = bbx) +
    tm_grid(
      col = "black", n.x = 1, n.y = 1, labels.inside.frame = FALSE,
      labels.rot = c(0, 90), lines = FALSE
    ) +
    tm_raster(palette = "white", legend.show = FALSE) +
    # hillshade
    tm_shape(terra::mask(x = hs, mask = sv_study_mask), bbox = bbx) +
    tm_raster(palette = gray(0:100 / 100), n = 100, legend.show = FALSE) +
    # add prediction raster
    tm_shape(terra::mask(x = preds, mask = sv_study_mask)) +
    tm_raster(
      alpha = 0.5, palette = palette, n = 6, legend.show = TRUE, title = "Susceptibility"
    ) +
    tm_layout(
      legend.position = c("left", "bottom"), legend.title.size = 0.9, inner.margins = 0
    )
  return(t_map)
} 
```

```{r maps-svm, cache=FALSE, dependson='preds_svm'}
map_svm_spcv <- tmap_pred(
  preds = pred_spcv[[1]], hs = hs, bbx = bbx, sv_study_mask = sv_study_mask, palette = "Reds"
)
map_svm_cv <- tmap_pred(
  preds = pred_cv[[1]], hs = hs, bbx = bbx, sv_study_mask = sv_study_mask, palette = "Reds"
)
```

```{r fig.screen=TRUE, cache=FALSE, dependson=c('maps-svm')}
tmap_arrange(map, map_glm, map_svm_spcv, map_svm_cv)
```

##### Prediction Data

```{r fig.screen=TRUE, cache=FALSE}
plot(ta)
```