-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmgcviz.Rmd
500 lines (403 loc) · 20.4 KB
/
mgcviz.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
---
title: "An introduction to mgcViz: visual tools for GAMs"
author: "Matteo Fasiolo and Raphael Nedellec"
date: "`r Sys.Date()`"
output:
html_document:
toc: true
number_sections: true
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{mgcViz_vignette}
%\VignetteEncoding{UTF-8}
---
<style>
body {
text-align: justify}
</style>
```{r setup, include=FALSE}
library(knitr)
library(rgl)
opts_chunk$set(out.extra='style="display:block; margin: auto"', fig.align="center", tidy=FALSE)
knit_hooks$set(webgl = hook_webgl)
if (!requireNamespace("rmarkdown", quietly = TRUE) ||
!rmarkdown::pandoc_available("1.14")) {
warning(call. = FALSE, "These vignettes assume rmarkdown and pandoc
version 1.14. These were not found. Older versions will not work.")
knitr::knit_exit()
}
```
# mgcViz basics
The `mgcViz` R package (Fasiolo et al, 2018) offers visual tools for Generalized Additive Models (GAMs). The visualizations provided by `mgcViz` differs from those implemented in `mgcv`, in that most of the plots are based on `ggplot2`'s powerful layering system. This has been implemented by wrapping several `ggplot2` layers and integrating them with computations specific to GAM models. Further, `mgcViz` uses binning and/or sub-sampling to produce plots that can scale to large datasets ($n \approx 10^7$), and offers a variety of new methods for visual model checking/selection.
This document introduces the following categories of visualizations:
1. **smooth and parametric effect plots**: layered plots based on `ggplot2` and interactive 3d visualizations based on the `rgl` library;
2. **model checks**: interactive QQ-plots, traditional residuals plots and layered residuals checks along one or two covariates;
3. **special plots**: differences-between-smooths plots in 1 or 2D and plotting slices of multidimensional smooth effects.
## Layered smooth effect plots
Here we describe effect-specific plotting methods and then we move to the `plot.gamViz` function, which wraps several effect plots together.
### Effect-specific plots
Let's start with a simple example with two smooth effects:
```{r 1, message = F}
library(mgcViz)
n <- 1e3
dat <- data.frame("x1" = rnorm(n), "x2" = rnorm(n), "x3" = rnorm(n))
dat$y <- with(dat, sin(x1) + 0.5*x2^2 + 0.2*x3 + pmax(x2, 0.2) * rnorm(n))
b <- gam(y ~ s(x1) + s(x2) + x3, data = dat, method = "REML")
```
Now we convert the fitted object to the `gamViz` class. Doing this allows us to use the tools offered by `mgcViz` and to save quite a lot of time when producing multiple plots using the same fitted GAM model.
```{r 2}
b <- getViz(b)
```
We extract the first smooth component using the `sm` function and we plot it.
The resulting `o` object contains, among other things, a `ggplot` object. This
allows us to add several visual layers.
```{r 3}
o <- plot( sm(b, 1) )
o + l_fitLine(colour = "red") + l_rug(mapping = aes(x=x, y=y), alpha = 0.8) +
l_ciLine(mul = 5, colour = "blue", linetype = 2) +
l_points(shape = 19, size = 1, alpha = 0.1) + theme_classic()
```
We added the fitted smooth effect, rugs on the x and y axes, confidence lines at 5 standard deviations, partial residual points and we changed the plotting theme to `ggplot2::theme_classic`. Functions such as `l_fitLine` or `l_rug` are effect-specific layers. To see all the layers available for each effect plot we do:
```{r 5}
listLayers(o)
```
Similar methods exist for 2D smooth effect plots, for instance if we fit:
```{r 6}
b <- gam(y ~ s(x1, x2) + x3, data = dat, method = "REML")
b <- getViz(b)
```
we can do
```{r 7}
plot(sm(b, 1)) + l_fitRaster() + l_fitContour() + l_points()
```
<!-- This can be converted to an interactive `plotly` plot as follows: -->
<!-- ```{r 8, eval = FALSE} -->
<!-- # Cannot run this when building the pdf for this vignette, but do try it! -->
<!-- library(plotly) -->
<!-- ggplotly( plot(sm(b, 1)) + l_fitRaster() + l_points() + l_fitContour() ) -->
<!-- ``` -->
We can extract the parametric effect `x3` using the `pterm` function (which is
the parametric equivalent of `sm`). We can then plot the two effects on a grid using
the `gridPrint` function:
```{r 8a, message = F, warning = F, fig.width=10, fig.height=4}
gridPrint(plot(sm(b, 1)) + l_fitRaster() + l_fitContour() + labs(title = NULL) + guides(fill=FALSE),
plot(pterm(b, 1)) + l_ciPoly() + l_fitLine(), ncol = 2)
```
If needed, we can convert a `gamViz` object back to its original form by doing:
```{r 9}
b <- getGam(b)
class(b)
```
The only reason to do so might be to save some memory (`gamViz` objects store some extra objects).
### The `plot.gamViz` method
The `plot.gamViz` is the `mgcViz`'s equivalent of `mgcv::plot.gam`. This function
wraps together the plotting methods related to each specific smooth or parametric effect, which can
save time when doing GAM modelling. Consider this model:
```{r 10, results='hide'}
dat <- gamSim(1,n=1e3,dist="normal",scale=2)
dat$fac <- as.factor( sample(letters[1:6], nrow(dat), replace = TRUE) )
b <- gam(y~s(x0)+s(x1, x2)+s(x3)+fac, data=dat)
```
To plot all the effects we do:
```{r 11}
b <- getViz(b)
print(plot(b, allTerms = T), pages = 1) # Calls print.plotGam()
```
Here `plot` calls `plot.gamViz`, and setting `allTerms = TRUE` makes so that also the parametric terms
are plotted. We are calling `print` (which uses the `print.plotGam` method) explicitly, because we want to put all plots on one page. Alternatively we could have simply done:
```{r 11a, eval = FALSE}
plot(b)
```
which plots only the smooth effects, diplaying one on each page.
Notice that `plot.gamViz` returns an object of class `plotGam`, which is initially empty.
The layers in the previous plots (e.g. the rug and the confidence interval lines) have been
added by `print.plotGam`, which adds some default layers to empty `plotGam` objects. This can be
avoided by setting `addLay = FALSE` in the call to `print.plotGam`. A `plotGam` object in
considered not empty if we added objects of class `gamLayer` to it, for instance:
```{r 13}
pl <- plot(b, allTerms = T) + l_points() + l_fitLine(linetype = 3) + l_fitContour() +
l_ciLine(colour = 2) + l_ciBar() + l_fitPoints(size = 1, col = 2) + theme_get() + labs(title = NULL)
pl$empty # FALSE: because we added gamLayers
print(pl, pages = 1)
```
Here all the functions starting with `l_` return `gamLayer` objects. Notice that some layers
are not relevant to all smooths. For instance, `l_fitContour` is added only to the second smooth.
The `+.plotGam` method automatically adds each layer only to compatible effect plots.
We can plot individuals effects by using the `select` arguments. For instance:
```{r 14}
plot(b, select = 1)
```
where only the default layers are added. Obviously we can have our custom layers instead:
```{r 15}
plot(b, select = 1) + l_dens(type = "cond") + l_fitLine() + l_ciLine()
```
where the `l_dens` layer represents the conditional density of the partial residuals. Parametric effects
always come after smooth or random effects, hence to plot the factor effect we do:
```{r 15a}
plot(b, allTerms = TRUE, select = 4) + geom_hline(yintercept = 0)
```
### Interactive `rgl` smooth effect plots
`mgcViz` provides tools for generating interactive plots of multidimensional smooths
via the `rgl` R package. Here is an example where we are plotting a 2D slice of
a 3D smooth effect, with confidence surfaces:
```{r 16, warning = F, webgl=TRUE}
library(mgcViz)
n <- 500
x <- rnorm(n); y <- rnorm(n); z <- rnorm(n)
ob <- (x-z)^2 + (y-z)^2 + rnorm(n)
b <- gam(ob ~ s(x, y, z))
b <- getViz(b)
plotRGL(sm(b, 1), fix = c("z" = 0), residuals = TRUE)
```
The `fix` argument is used to determine where the 3D effect should be sliced along the z-axis. The plot also shows a subset of residuals (colour-coded depending on sign) that fall close (in term of Euclidean distance) to the selected slice. Notice that `plotRGL` is not layered at the moment, and most options need to be specified in the initial function call. However, the interactive plot can still be manipulated once the `rgl` window is open. For instance here we change the aspect ratio:
```{r 17}
aspect3d(1, 2, 1)
```
We then close the window using `rgl.close()`.
## Model checking
### New version of traditional model checks
Most of the model checks provided by `mgcv` are contained in `qq.gam` and `gam.check`.
`mgcViz` substitutes them with the more advanced `qq.gamViz` and `check.gamViz` methods.
#### The `qq.gamViz` method
Consider the following model with binomial responses:
```{r 19, results='hide'}
set.seed(0)
n.samp <- 400
dat <- gamSim(1,n = n.samp, dist = "binary", scale = .33)
p <- binomial()$linkinv(dat$f) ## binomial p
n <- sample(c(1, 3), n.samp, replace = TRUE) ## binomial n
dat$y <- rbinom(n, n, p)
dat$n <- n
lr.fit <- gam(y/n ~ s(x0) + s(x1) + s(x2) + s(x3),
family = binomial, data = dat,
weights = n, method = "REML")
lr.fit <- getViz(lr.fit)
```
We can get a QQ-plot of the residuals as follows:
```{r 20}
qq(lr.fit, method = "simul1", a.qqpoi = list("shape" = 1), a.ablin = list("linetype" = 2))
```
Here `method` determines the method used to compute the QQ-plot, while the arguments
starting with `a.` are lists that will be passed directly to the corresponding `ggplot2`
layer (`geom_point` and `geom_abline` here). We can remove the confidence intervals and
show all simulated (model-based) QQ-curves as follows:
```{r 21}
qq(lr.fit, rep = 20, showReps = T, CI = "none", a.qqpoi = list("shape" = 19), a.replin = list("alpha" = 0.2))
```
Importantly, `mgcViz::qq.gam` can handle large datasets by discretizing the QQ-plot before
plotting. For instance, let's increase `n.samp` in the previous example:
```{r 22, results='hide'}
set.seed(0)
n.samp <- 20000
dat <- gamSim(1,n=n.samp,dist="binary",scale=.33)
p <- binomial()$linkinv(dat$f) ## binomial p
n <- sample(c(1,3),n.samp,replace=TRUE) ## binomial n
dat$y <- rbinom(n,n,p)
dat$n <- n
lr.fit <- bam(y/n ~ s(x0) + s(x1) + s(x2) + s(x3)
, family = binomial, data = dat,
weights = n, method = "fREML", discrete = TRUE)
lr.fit <- getViz(lr.fit)
```
Here the `discrete` argument determines whether the QQ-plot is discretized or not.
Notice that we can compute the QQ-plot, store it in `o` and then plot it (via `print.qqGam`).
```{r 23}
o <- qq(lr.fit, rep = 10, method = "simul1", CI = "normal", showReps = TRUE,
a.replin = list(alpha = 0.1), discrete = TRUE)
o
```
The coarseness of the discretization grid is determined by the `ngr` argument:
```{r 24}
o <- qq(lr.fit, rep = 10, method = "simul1", CI = "normal", showReps = TRUE,
ngr = 1e2, a.replin = list(alpha = 0.1), a.qqpoi = list(shape = 19))
o
```
We can zoom into particular areas of the plot using the `zoom` generic. Also, given that `qq.gamViz` and `zoom.qqGam` output objects of class `plotSmooth`, we can arrange them using the `gridPrint`:
```{r 24a}
gridPrint(o, zoom(o, xlim = c(2, 2.5), ylim = c(2, 2.5)), ncol = 2)
```
The QQ-plot can also be manipulated interactively using the `shine` generic, which transforms it into a shiny app:
```{r 24b, eval = FALSE}
# Cannot run this when building the pdf for this vignette, but do try it!
shine(o)
```
#### The `check.gamViz` method
The `check.gamViz` method is similar to `mgcv::gam.check`, with the difference that it
produces a sequence of `ggplot` objects and that it sub-samples the residuals to
avoid over-plotting (or stalling entirely) when dealing with large data sets.
Here is an example:
```{r 25}
set.seed(0)
dat <- gamSim(1, n = 200)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)
b <- getViz(b)
check(b,
a.qq = list(method = "tnorm",
a.cipoly = list(fill = "light blue")),
a.respoi = list(size = 0.5),
a.hist = list(bins = 10))
```
The `a.qq` argument is a list that gets passed directly to `qq.gamViz`. Similarly,
`a.repoi` is passed to `ggplot2::geom_points` and `a.hist` to `ggplot2::geom_hist`.
### New layered model checks
The `qq.gamViz` and `check.gamViz` functions are not layered, and in fact require using lists of arguments
to be passed to the underlying `ggplot2` layers. Instead the methods described in this
section are fully layered, hence easy to extend and customize.
#### One dimensional checks using `check1D`
This function allows to verify how the residuals vary along one covariate. Consider the
following model:
```{r 26}
set.seed(4124)
n <- 5e3
x <- rnorm(n); y <- rnorm(n);
z <- as.factor( sample(letters[1:6], n, replace = TRUE) )
ob <- (x)^2 + (y)^2 + (0.2*abs(x) + 1) * rnorm(n)
b <- gam(ob ~ s(x) + s(y) + z)
b <- getViz(b)
```
Here the responses variance varies a lot along $x$. Assume that we didn't know this, but
that we wanted to find out whether the residuals are heteroscedastic. We can start by doing
the following:
```{r 27, fig.width=6, fig.height=3}
ck1 <- check1D(b, "x")
ck2 <- check1D(b, "z")
gridPrint(ck1, ck2, ncol = 2)
```
This produces two views along $x$ and $z$, but as you can see the plots are initially empty. We might
want to add a layer showing the conditional distribution of the residuals along $x$ and $z$ and another
containing a rug:
```{r 28, fig.width=10, fig.height=4}
gridPrint(ck1 + l_dens(type = "cond", alpha = 0.8) + l_rug(alpha = 0.2),
ck2 + l_points() + l_rug(alpha = 0.2), layout_matrix = matrix(c(1, 1, 1, 2, 2), 1, 5))
```
The left plot suggests that the variance of the residuals might be lower in the middle ($x=0$), but it
is not entirely clear. The `l_densCheck` layer gives a more clear answer in this case:
```{r 29}
ck1 + l_densCheck()
```
This layers adds an heatmap proportional to $\{p(r|x)^{1/2} - p_m(r|x)^{1/2}\}^{1/3}$, where $r$ are the residuals, while $p$ and $p_m$ are their empirical and theoretical (model based) densities. In particular, $p$ is estimated using the the fast k.d.e. method of Wand (1994) (implemented by the `kernSmooth` package) and $p_m$ is a standard normal density here. This plot makes clear that the residuals are over-dispersed when $x$ is far from zero.
The `l_gridCheck1D` provides another way of finding residuals patterns. For instance:
```{r 30, fig.width=10, fig.height=4}
b <- getViz(b, nsim = 50)
gridPrint(check1D(b, "x") + l_gridCheck1D(gridFun = sd, showReps = TRUE),
check1D(b, "z") + l_gridCheck1D(gridFun = sd, showReps = TRUE), ncol = 2)
```
Here we converted `b` again using `getViz` with `nsim = 50`. This is because `l_gridCheck1D` needs some simulations to compute the confidence intervals. The simulations are done by `getViz` and then stored
inside `b`. `l_gridCheck1D` simply bins the residuals according to their $x$ values, and evaluates a user-defined function (`sd` here) over the observed and simulated residuals.
#### Two dimensional checks using `check2D`
`check2D` is quite similar to `check1D`, but looks at the residuals along two covariates. Here is an
example where the mean effect follows the Rosenbrock function:
```{r 31}
set.seed(566)
n <- 5e3
X <- data.frame("x1"=rnorm(n, 0.5, 0.5), "x2"=rnorm(n, 1.5, 1),
"fac"=as.factor( sample(letters[1:6], n, replace = TRUE) ))
X$y <- (1-X$x1)^2 + 100*(X$x2 - X$x1^2)^2 + rnorm(n, 0, 2)
b <- gam(y ~ te(x1, x2, k = 5), data = X)
b <- getViz(b, nsim = 50)
```
We start by generating two 2D views:
```{r 32}
ck1 <- check2D(b, x1 = "x1", x2 = "x2")
ck2 <- check2D(b, x1 = X$fac, x2 = "x2") + labs(x = "fac")
```
Then we add the `l_gridCheck2D` layer:
```{r 33}
ck1 + l_gridCheck2D(gridFun = mean)
ck2 + l_gridCheck2D(gridFun = mean)
```
`l_gridCheck2D` bins the observed and simulated residuals, summarizes them using a scalar-valued
function (`mean` here), and adds a heatmap of the observed summary in each cell, normalized
using the `nsim` summaries obtained using the simulations. In the first plot above, the pattern in the residual means is not very well visible, due to outliers on the far right. The pattern is made more visible by zooming on the center of the distribution and by changing the size of the bins:
```{r 34}
ck1 + l_gridCheck2D(bw = c(0.05, 0.1)) + xlim(-1, 1) + ylim(0, 3)
```
As for smooth effect plots, we can list the available layers by doing:
```{r 35}
listLayers( ck1 )
```
The most sophisticated layer is probably `l_glyphs2D` which we illustrate here using an heteroscedastic model:
```{r 36}
set.seed(4124)
n <- 5e3
dat <- data.frame("x1" = rnorm(n), "x2" = rnorm(n))
dat$y <- (dat$x1)^2 + (dat$x2)^2 + (1*abs(dat$x1) + 1) * rnorm(n)
b <- gam(y ~ s(x1) + s(x2), data = dat)
b <- getViz(b)
ck <- check2D(b, x1 = "x1", x2 = "x2", type = "tnormal")
```
Similarly to `l_gridCheck2D`, `l_glyphs2D` bins the residuals according to two covariates, but the user-defined function used to summarize the residuals in each bin has to return a `data.frame`, rather than a scalar. Here is
an example:
```{r 37}
glyFun <- function(.d){
.r <- .d$z
.qq <- as.data.frame( density(.r)[c("x", "y")], n = 100 )
.qq$colour <- rep(ifelse(length(.r)>50, "black", "red"), nrow(.qq))
return( .qq )
}
ck + l_glyphs2D(glyFun = glyFun, ggLay = "geom_path", n = c(8, 8),
mapping = aes(x=gx, y=gy, group = gid, colour = I(colour)),
height=1.5, width = 1)
```
Each glyph represend a kernel density of the residuals, with colours indicating whether we have more (black) or less (red) that 50 observations in that bin. It is clear that the residuals are much less variable for $x \approx 0$ than elsewhere. We can do the same using binned worm-plots:
```{r 38}
glyFun <- function(.d){
n <- nrow(.d)
px <- qnorm( (1:n - 0.5)/(n) )
py <- sort( .d$z )
clr <- if(n > 50) { "black" } else { "red" }
clr <- rep(clr, n)
return( data.frame("x" = px, "y" = py - px, "colour" = clr))
}
ck + l_glyphs2D(glyFun = glyFun, ggLay = "geom_point", n = c(10, 10),
mapping = aes(x=gx, y=gy, group = gid, colour = I(colour)),
height=2, width = 1, size = 0.2)
```
Notice that worm-plots (Buuren and Fredriks, 2001) are simply rotated QQ-plots. An horizontal plot indicates well specified residual model. An increasing (decreasing) worm indicates over (under) dispersion.
# Special plots
## Differences-between-smooths plots
Plotting the difference between two smooth effects can be useful when working with by-factor
smooths. For example, here we are fitting a model containing a smooth along `x2` for each value
of the `fac` factor variable:
```{r 39, results='hide'}
set.seed(6898)
dat <- gamSim(1,n=500,dist="normal",scale=20)
dat$fac <- as.factor( sample(c("A1", "A2", "A3"), nrow(dat), replace = TRUE) )
bs <- "cr"; k <- 12
b <- gam(y ~ s(x2,bs=bs,by = fac), data=dat)
b <- getViz(b)
```
We can plot the difference between the smooths corresponding to levels `"A1"` and `"A2"`,
by extracting the two smooths and then feeding them to the `plotDiff` generic:
```{r 39a}
plotDiff(s1 = sm(b, 1), s2 = sm(b, 2)) + l_ciPoly() +
l_fitLine() + geom_hline(yintercept = 0, linetype = 2)
```
Notice that the credible intervals for the difference smooth produced by `plotDiff` take into account the cross-covariance between the two smooths.
## Plotting multiple slices of multi-dimensional smooth effects
When dealing with smooth effects defined on more than two dimensions, it is often useful to visualize
them as a sequence of 2D slices. The `plotSlice` function provides such functionality, by exploiting the
faceting tools offerered by `ggplot2`. For instance, here we are slicing a 4D smooth effect along two variables:
```{r 40}
n <- 1e3
x <- rnorm(n); y <- rnorm(n); z <- rnorm(n); z2 <- rnorm(n)
ob <- (x-z)^2 + (y-z)^2 + z2^3 + rnorm(n)
b <- gam(ob ~ s(x, y, z, z2))
v <- getViz(b)
# Plot slices across "z" and "x"
pl <- plotSlice(x = sm(v, 1),
fix = list("z" = seq(-2, 2, length.out = 3), "x" = c(-1, 0, 1)))
pl + l_fitRaster() + l_fitContour() + l_points() + l_rug()
```
References
==========
- Buuren, S. v. and Fredriks, M. (2001) Worm plot: a simple diagnostic device for modelling
growth reference curves, Statistics in medicine, 20, 1259–1277.
- Fasiolo, M., Nedellec, R., Goude, Y. and Wood, S.N., 2019. Scalable visualization methods for modern generalized additive models. Journal of computational and Graphical Statistics, pp.1-9.
- Murdoch, D. (2001) Rgl: An r interface to opengl, in Proceedings of DSC, p. 2.
- Wand, M. P. (1994) Fast computation of multivariate kernel estimators, Journal of Computational and Graphical Statistics, 3, 433–445
- Wickham, H. (2009) ggplot2: Elegant Graphics for Data Analysis, Springer-Verlag New York.
- Wickham, H. (2010) A layered grammar of graphics, Journal of
Computational and Graphical Statistics, 19, 3–28.
- Wood, S.N. (2017) Generalized Additive Models: An Introduction with R (2nd edition).
Chapman and Hall/CRC.