Granger test with R.Rmd

---
title: "Four ways to perform a Granger Causality test with R"
output: github_document
---

The Granger causal test is widely used in time series analysis, but the various ways in which it may be performed in R are poorly documented. I published before a function to perform the Toda-Yamamoto bivariate and multivariate versions of the Granger causality test (see at  [Toda-Yamamoto-Causality-Test](https://github.com/nicolarighetti/Toda-Yamamoto-Causality-Test/blob/main/README.md)). In the few lines that follow, I am going to focus on its standard version, suggesting four different approaches.

For starters, a definition. A time series X is said to Granger cause another time series Y if past values of X and Y predict Y significantly better than past values of Y only. (Granger, 1969, click  [here](https://github.com/nicolarighetti/Toda-Yamamoto-Causality-Test/blob/main/README.md) for other details). We can test that hypothesis using linear regression modeling and a Wald test for linear restrictions. The Wald test compares the performance of a restricted model for Y, which excludes X, to a non-restricted model for Y, which includes X.

Firstly, let's create two simulated time series. It is important to note that the data are represented as time series using the base R *ts* function. Some functions described here do not work well if the data is not in this format. Also note that the Granger causality test has several assumptions. I'm not going to go into that. I just add that it's based on linear modeling, so besides specific assumptions of the test, the usual linear regression assumptions apply to it.

```{r}
set.seed(1234)

ts1 <- ts(rnorm(n=5000))
ts2 <- ts(rnorm(n=5000))
```

## First approach: lmtest::grangertest

The first way to perform a Granger causality test in R is to use the *grangertest* function provided by the [lmtest](https://cran.r-project.org/web/packages/lmtest/index.html) package. We can test if the series *ts1* Granger-causes the series *t2* with a simple line of code. It turns out the *ts1* does not Granger-cause *ts2*.

```{r}
lmtest::grangertest(ts1, ts2, order=3, test="F")
```

While *grangertest* is a quick way to perform a Granger causal test, it is quite rough, as it does not rely on any modelling of the series. It is important to fit an appropriate linear model to the series, also just to check if the assumptions hold.

## Second approach: dynlm::dynlm + lmtest::waldtest

A second, more flexible approach is based on performing a Wald test on two fitted linear regression models. As defined in the Granger causality test, linear regression models include past lags of the variables. This type of model can be easily fitted with the function [dynlm](https://cran.r-project.org/web/packages/dynlm/index.html) of the homonymous R package. Next, if the assumptions of the linear models and Granger test hold, a Wald test can be performed by using the function *waldtest* of the already mentioned library *lmtest*.

```{r message=FALSE, warning=FALSE}
library(dynlm)
```

```{r}
unrestricted_ts2 <- dynlm(ts2 ~ L(ts2, 1:3) + L(ts1, 1:3))
restricted_ts2 <- dynlm(ts2 ~ L(ts2, 1:3))

lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F")                  
```

This approach has the advantage that the series can be modeled in a flexible manner before testing Granger's causality. For instance, by adding a seasonal term or other variables when necessary.

Moreover, this approach permits the specification of heteroscedasticity-consistent (HC) or heteroscedasticity and autocorrelation consistent standard errors (HAC), also in the Newey West form, for instance by using the functions provided by the [sandwich](https://cran.r-project.org/web/packages/sandwich/index.html) package.

```{r}
lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::vcovHC) 
lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::vcovHAC) 
lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::NeweyWest) 
```

## Third approach: dynlm::dynlm + aod::wald.test

The third approach is a version of the latter. A Wald test can also be performed with the wald.test function in the [aod](https://cran.r-project.org/web/packages/aod/index.html) package. This is trickier because you have to manually specify the terms to be tested. However, this extra flexibility may be useful in certain cases. By default, *wald.test* will calculate a Wald Chi-square test. The Chi-squared version of the test can be also computed with the *waldtest* function of *lmtest* by specifying the option test="Chisq". Similarly, the F or Chi-squared test can be performed by using the the function *linearHypothesis* in the [car](https://cran.r-project.org/web/packages/car/index.html) package.

```{r}
aod::wald.test(b=coef(unrestricted_ts2), 
               Sigma=vcov(unrestricted_ts2),
               Terms = 5:7,
               verbose = T)
```

## Fourth approach: vars::VAR + vars::causality

The fourth way to perform a Granger causality test in R is probably the most commonly used, and consists in using the package [vars](https://cran.r-project.org/web/packages/vars/index.html) to fit a VAR model (which is basically a system of linear regression models) and subsequently, run the Granger test using the function *causality* provided by the same package.

```{r}
tsDat <- ts.union(ts1, ts2) 
tsVAR <- vars::VAR(tsDat, p = 3)
vars::causality(tsVAR, cause = "ts1")$Granger
```

The *causation* function offers several interesting options. For example, it implements a bootstrapping procedure. It is also possible to specify the heteroscedasticity-consistent estimation of the covariance matrix of the coefficient estimates in the regression models (HC), but other types of estimation of the covariance matrix (e.g., HAC) are not supported.