Prediction_CSC.Rmd

---
title: "External validation of the performance of competing risks prediction models: a guide through modern methods -  Cause specific hazard models"
always_allow_html: true
output:
  github_document:
    toc: true
    toc_depth: 5
  keep_text: true
  pandoc_args: --webtex
---
  
## Steps
  
The steps taken in this file are:   
1. To develop a competing risks prediction model using the cause specific hazards approach.
2. To assess the performance of the model in terms of calibration, discrimination and overall prediction error. We calculate the apparent (no internal validation) validation and the external validation.
3. To assess the potential clinical utility the model using decision curve analysis.

### Installing and loading packages and import data

The following libraries are used in this file, the code chunk below will a) check whether you already have them installed, b) install them for you if not already present, and c) load the packages into the session.

```{r setup, include=FALSE}
# Knitr options
knitr::opts_chunk$set(
  fig.retina = 3,
  fig.path = "imgs/Prediction_CSC/",
  echo = FALSE
)
```

```{r, wdlib, message=FALSE, warning=FALSE, echo=TRUE}
# Use pacman to check whether packages are installed, if not load
if (!require("pacman")) install.packages("pacman")
library(pacman)

pacman::p_load(
  survival,
  rms,
  cmprsk,
  riskRegression,
  mstate,
  pseudo,
  pec,
  riskRegression,
  plotrix,
  knitr,
  splines,
  kableExtra,
  gtsummary,
  boot,
  tidyverse,
  rsample,
  gridExtra,
  webshot
)

# Import data ------------------
rdata <- readRDS(here::here("Data/rdata.rds"))
vdata <- readRDS(here::here("Data/vdata.rds"))

rdata$hr_status <- relevel(rdata$hr_status, ref = "ER and/or PR +")
vdata$hr_status <- relevel(vdata$hr_status, ref = "ER and/or PR +")
```

We loaded the development data (rdata) and the validation data (vdata).
More details about development and validation data are provided in the manuscript.

### Descriptive statistics

```{r, import}
rsel <- rdata[, c("id", "age", "size", "ncat", "hr_status")]
vsel <- vdata[, c("id", "age", "size", "ncat", "hr_status")]
rsel$dt <- 1
vsel$dt <- 2
cdata <- rbind(rsel, vsel)
cdata$dt <- factor(
  cdata$dt,
  levels = c(1, 2),
  labels = c("Development data", "Validation data")
)
label(cdata$age) <- "Age"
label(cdata$size) <- "Size"
label(cdata$ncat) <- "Nodal status"
label(cdata$hr_status) <- "Hormon receptor status"
units(cdata$age) <- "years"
units(cdata$size) <- "cm"
```


```{r tab1_bis}
gtsummary::tbl_summary(
  data = cdata |> select(-id),
  label = list(age ~ "Age (years)", size ~ "Size (cm)"),
  by = "dt",
  type = all_continuous() ~ "continuous2",
  statistic = all_continuous() ~ c(
    "{mean} ({sd})",
    "{median} ({min}, {max})"
  ),
) |>
  gtsummary::as_kable_extra() |>
  kableExtra::kable_styling("striped")
```

## Goal 1 - develop a competing risks prediction model

### 1.1 Cumulative incidence curves

First, we draw the cumulative incidence curves of breast cancer recurrence.

<details>
  <summary>Click to expand code</summary>
```{r, cuminc, fig.align='center', eval=FALSE, echo=TRUE}
# Expand datasets -------------------------

# Expand data to prepare for fitting the model
rdata.w <- crprep(
  Tstop = "time",
  status = "status_num",
  trans = c(1, 2),
  id = "id",
  keep = c("age", "size", "ncat", "hr_status"),
  data = rdata
)
# Save extended data with weights for recurrence (failcode=1)
# and non recurrence mortality (failcode=2)
rdata.w1 <- rdata.w |> filter(failcode == 1)
rdata.w2 <- rdata.w |> filter(failcode == 2)
vdata.w <- crprep(
  Tstop = "time",
  status = "status_num",
  trans = c(1, 2),
  id = "id",
  keep = c("age", "size", "ncat", "hr_status"),
  data = vdata
)
vdata.w1 <- vdata.w |> filter(failcode == 1)
vdata.w2 <- vdata.w |> filter(failcode == 2)

# Development set --------
mfit_rdata <- survfit(
  Surv(Tstart, Tstop, status == 1) ~ 1,
  data = rdata.w1, 
  weights = weight.cens
)
mfit_vdata <- survfit(
  Surv(Tstart, Tstop, status == 1) ~ 1,
  data = vdata.w1, 
  weights = weight.cens
)
par(xaxs = "i", yaxs = "i", las = 1)
oldpar <- par(mfrow = c(1, 2), mar = c(5, 5, 1, 1))
plot(
  mfit_rdata,
  col = 1, lwd = 2,
  xlab = "Years since BC diagnosis",
  ylab = "Cumulative incidence recurrence", bty = "n",
  ylim = c(0, 0.25), 
  xlim = c(0, 5), 
  fun = "event", 
  conf.int = TRUE
)
title("Development data")
plot(
  mfit_vdata,
  col = 1, lwd = 2,
  xlab = "Years since BC diagnosis",
  ylab = "Cumulative incidence recurrence", bty = "n",
  ylim = c(0, 0.25), 
  xlim = c(0, 5), 
  fun = "event",
  conf.int = TRUE
)
title("Validation data")
par(oldpar)

# Cumulative incidences
smfit_rdata <- summary(mfit_rdata, times = c(1, 2, 3, 4, 5))
smfit_vdata <- summary(mfit_vdata, times = c(1, 2, 3, 4, 5))
```
</details>

```{r, cuminc, fig.align='center', eval=TRUE}
```

The R packages and functions `cmprsk::cuminc()` and `mstate::Cuminc()`are good and easy alternatives to estimate the cumulative incidence function. 

```{r, res_ci, fig.align='center'}
res_ci <- cbind(
  1 - smfit_rdata$surv,
  1 - smfit_rdata$upper,
  1 - smfit_rdata$lower,
  1 - smfit_vdata$surv,
  1 - smfit_vdata$upper,
  1 - smfit_vdata$lower
)
res_ci <- round(res_ci, 2)
rownames(res_ci) <- c("1-year", "2-year", "3-year", "4-year", "5-year")
colnames(res_ci) <- rep(c("Estimate", "Lower .95", "Upper .95"), 2)
kable(res_ci, row.names = TRUE) |>
  kable_styling("striped", position = "center") |>
  add_header_above(c(" " = 1, "Development data" = 3, "Validation data" = 3))
```

The 5-year cumulative incidence of breast cancer recurrence was 14% (95% CI: 11-16%), and 10% (95%CI: 8-12%).

### 1.2 Check non-linearity of continuous predictors

Here we investigate the potential non-linear relation between continuous predictors (i.e. age and size) and the outcomes. We apply three-knot restricted cubic splines using `rms::rcs()` function (details are given in e.g. Frank Harrell's book 'Regression Model Strategies (second edition)', page 27. We assess the potential non-linearity graphically (plotting the two continuous predictors against the log relative hazards (XB or linear predictor) of both event types. Also, we compare the models with and without splines based on the AIC using `stats::AIC()`.

<details>
  <summary>Click to expand code</summary>
```{r,ff, warning=FALSE, fig.align='center', eval=FALSE, echo=TRUE}
# Models without splines
fit_csh <- CSC(
  Hist(time, status_num) ~ age + size + ncat + hr_status,
  data = rdata,
  fitter = "cph"
)

fit_csc1 <- fit_csh$models$`Cause 1`
fit_csc2 <- fit_csh$models$`Cause 2`

# Models with splines
dd <- datadist(rdata)
options(datadist = "dd")

# Recurrence
fit_csc1_rcs <- cph(
  Surv(time, status_num == 1) ~ rcs(age, 3) + rcs(size, 3) + ncat + hr_status,
  x = T,
  y = T,
  surv = T,
  data = rdata
)
# print(fit_csc1_rcs)
# print(summary(fit_csc1_rcs))
# print(anova(fit_csc1_rcs))
P_csc1_age_rcs <- Predict(fit_csc1_rcs, "age")
P_csc1_size_rcs <- Predict(fit_csc1_rcs, "size")
options(datadist = NULL)
# Non-recurrence mortality
dd <- datadist(rdata)
options(datadist = "dd")
fit_csc2_rcs <- cph(
  Surv(time, status_num == 2) ~ rcs(age, 3) + rcs(size, 3) + ncat + hr_status,
  x = T,
  y = T,
  surv = T,
  data = rdata
)
# print(fit_csc2_rcs)
# print(summary(fit_csc2_rcs))
# print(anova(fit_csc2_rcs))
P_csc2_age_rcs <- Predict(fit_csc2_rcs, "age")
P_csc2_size_rcs <- Predict(fit_csc2_rcs, "size")
options(datadist = NULL)
oldpar <- par(mfrow = c(2, 2), mar = c(5, 5, 1, 1))
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  P_csc1_age_rcs$age,
  P_csc1_age_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Age at breast cancer diagnosis",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(65, 95)
)
polygon(
  c(P_csc1_age_rcs$age, rev(P_csc1_age_rcs$age)),
  c(P_csc1_age_rcs$lower, rev(P_csc1_age_rcs$upper)),
  col = "grey75",
  border = FALSE
)
par(new = TRUE)
plot(
  P_csc1_age_rcs$age,
  P_csc1_age_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Age at breast cancer diagnosis",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(65, 95)
)
title("Recurrence")
# CSC 1- size
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  P_csc1_size_rcs$size,
  P_csc1_size_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Size of breast cancer",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(0, 7)
)
polygon(
  c(P_csc1_size_rcs$size, rev(P_csc1_size_rcs$size)),
  c(P_csc1_size_rcs$lower, rev(P_csc1_size_rcs$upper)),
  col = "grey75",
  border = FALSE
)
par(new = TRUE)
plot(
  P_csc1_size_rcs$size,
  P_csc1_size_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Size of breast cancer",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(0, 7)
)
title("Recurrence")
par(xaxs = "i", yaxs = "i", las = 1)
options(datadist = NULL)
# CSC 2- age
plot(
  P_csc2_age_rcs$age,
  P_csc2_age_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Age at breast cancer diagnosis",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(65, 95)
)
polygon(
  c(P_csc2_age_rcs$age, rev(P_csc2_age_rcs$age)),
  c(P_csc2_age_rcs$lower, rev(P_csc2_age_rcs$upper)),
  col = "grey75",
  border = FALSE
)
par(new = TRUE)
plot(
  P_csc2_age_rcs$age,
  P_csc2_age_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Age at breast cancer diagnosis",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(65, 95)
)
title("Non recurrence mortality")
# CSC 2 - size
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  P_csc2_size_rcs$size,
  P_csc2_size_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Size of breast cancer",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(0, 7)
)
polygon(
  c(P_csc2_size_rcs$size, rev(P_csc2_size_rcs$size)),
  c(P_csc2_size_rcs$lower, rev(P_csc2_size_rcs$upper)),
  col = "grey75",
  border = FALSE
)
par(new = TRUE)
plot(
  P_csc2_size_rcs$size,
  P_csc2_size_rcs$yhat,
  type = "l",
  lwd = 2,
  col = "blue",
  bty = "n",
  xlab = "Size of breast cancer",
  ylab = "log Relative Hazard",
  ylim = c(-2, 2),
  xlim = c(0, 7)
)
title("Non recurrence mortality")
options(datadist = NULL)
par(oldpar)
```
</details>

```{r,ff, warning=FALSE, fig.align='center', eval=TRUE}
```

```{r, res_aic, fig.align='center'}
res_AIC <- matrix(c(
  AIC(fit_csc1), AIC(fit_csc1_rcs),
  AIC(fit_csc2), AIC(fit_csc2_rcs)
),
byrow = T,
ncol = 2,
nrow = 2,
dimnames =
  list(
    c(
      "Recurrence specific hazard",
      "Non recurrence mortality"
    ),
    c(
      "AIC without splines",
      "AIC with splines"
    )
  )
)
kable(res_AIC, row.names = TRUE) |>
  kable_styling("striped", position = "center")
```

Both the graphical comparison and the AIC comparison suggested no relevant departure from linear relations between the continuous predictors (age and size) and the cause-specific hazards (recurrence and non-recurrence mortality).  

### 1.3 Checking proportional hazards assumption

We now examine the fits further by checking the proportionality of the cause-specific hazards.

<details>
  <summary>Click to expand code</summary>
```{r,ph_csc1, message=FALSE, warning=FALSE,fig.align='center', echo=TRUE, eval=FALSE}
zp_csc1 <- cox.zph(fit_csc1, transform = "identity")
par(las = 1, xaxs = "i", yaxs = "i")
# c(bottom, left, top, right)
oldpar <- par(mfrow = c(2, 2), mar = c(5, 6.1, 3.1, 1))
sub_title <- c("Age", "Size", "Lymph node status", "HR status")
for (i in 1:4) {
  plot(
    zp_csc1[i],
    resid = F,
    bty = "n",
    xlim = c(0, 5)
  )
  abline(0, 0, lty = 3)
  title(sub_title[i])
}
mtext("Recurrence", side = 3, line = -1, outer = TRUE, font = 2)
par(oldpar)
kable(round(zp_csc1$table, 3)) |>
  kable_styling("striped", position = "center")
```
</details>

```{r,ph_csc1, message=FALSE, warning=FALSE,fig.align='center', echo=FALSE, eval=TRUE}
```

<details>
  <summary>Click to expand code</summary>
```{r,ph_csc2, message=FALSE, warning=FALSE,fig.align='center',echo=TRUE,eval=FALSE}
zp_csc2 <- cox.zph(fit_csc2, transform = "identity")
par(las = 1, xaxs = "i", yaxs = "i")
# c(bottom, left, top, right)
oldpar <- par(mfrow = c(2, 2), mar = c(5, 6.1, 3.1, 1))
sub_title <- c("Age", "Size", "Lymph node status", "HR status")
for (i in 1:4) {
  plot(
    zp_csc2[i],
    resid = F,
    bty = "n",
    xlim = c(0, 5)
  )
  abline(0, 0, lty = 3)
  title(sub_title[i])
}
mtext(
  "Non-recurrence mortality",
  side = 3,
  line = -1,
  outer = TRUE,
  font = 2
)
par(oldpar)
kable(round(zp_csc2$table, 3)) |>
  kable_styling("striped", position = "center")
```
</details>

```{r,ph_csc2, message=FALSE, warning=FALSE,fig.align='center',echo=FALSE,eval=TRUE}
```

The statistical tests showed a potential violation of the proportional hazards assumption for nodal status in the model for non-recurrence mortality. For simplicity we ignore this violation in the remainder.

### 1.4 Examine the fit of the models

+ Cox proportional hazard model for recurrence

```{r, summary_csc1, fig.align='center',warning=FALSE}
dd <- datadist(rdata)
options(datadist = "dd")
options(prType = "html")
fit_csc1_cph <- cph(
  Surv(time, status_num == 1) ~ age + size + ncat + hr_status,
  x = T,
  y = T,
  surv = T,
  data = rdata
)
print(fit_csc1_cph)
# print(summary(fit_csc1_cph))
options(datadist = NULL)
```

+ Cox proportional hazard model for non recurrence mortality

```{r, summary_csc2, fig.align='center',warning=FALSE}
dd <- datadist(rdata)
options(datadist = "dd")
options(prType = "html")
fit_csc2_cph <- cph(
  Surv(time, status_num == 2) ~ age + size + ncat + hr_status,
  x = T,
  y = T,
  surv = T,
  data = rdata
)
print(fit_csc2_cph)
# print(summary(fit_csc2_cph))
options(datadist = NULL)
```

The coefficients of the models indicated that larger tumor size, positive nodal status and negative hormone receptor status status were associated with higher risk to develop a breast cancer recurrence, while older patients and larger tumors are associated with higher risk of non recurrence mortality. The associations were considered statistically significant at the usual alpha = 0.05.  

### 1.5 Plot of predictors vs estimated risk at 5 years in the validation data

To get further insight into the effect of the covariates, we plot the covariate values observed in the validation set against the estimated absolute risk of breast cancer recurrence at 5 years. This gives an idea of the size of the effects.

<details>
  <summary>Click to expand code</summary>
```{r, plot_risk, fig.align='center',warning=FALSE, echo=TRUE, eval=FALSE}
# Models -------------
fit_csh <- CSC(
  formula = Hist(time, status_num) ~ age + size + ncat + hr_status,
  data = rdata
)

# External validation at 5 years
horizon <- 5

# Calculate predicted probabilities
vdata$pred <- predictRisk(
  fit_csh,
  cause = 1,
  newdata = vdata,
  times = horizon
)

# Age
oldpar <- par(mfrow = c(2, 2))
par(xaxs = "i", yaxs = "i", las = 1)
plot(vdata$age,
  vdata$pred,
  bty = "n",
  xlim = c(65, 100),
  ylim = c(0, .6),
  xlab = "Age, years",
  ylab = "Estimated risk"
)
lines(
  lowess(vdata$age, vdata$pred),
  col = "red",
  lwd = 2
)

# Size
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  vdata$size,
  vdata$pred,
  bty = "n",
  xlim = c(0, 12),
  ylim = c(0, .6),
  xlab = "Size of tumor",
  ylab = "Estimated risk"
)
lines(
  lowess(vdata$size, vdata$pred),
  col = "red",
  lwd = 2
)

# HR status
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  vdata$hr_status,
  vdata$pred,
  ylim = c(0, .6),
  bty = "n",
  xlab = "Receptor status",
  ylab = "Estimated risk"
)

# Nodal status
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  vdata$ncat,
  vdata$pred,
  ylim = c(0, .6),
  bty = "n",
  xlab = "Nodal status",
  ylab = "Estimated risk"
)
par(oldpar)
```
</details>

```{r, plot_risk, fig.align='center',warning=FALSE, echo=FALSE, eval=TRUE}
```

## Goal 2 - Assessing performance of a competing risks prediction model

Here we evaluate the performance of the prediction model in terms of calibration, discrimination and overall prediction error. 


### 2.1 Calibration

We assess calibration by:

- The calibration plot as a graphical representation of calibration using
 the pseudo observations and the subdistribution hazard approach
 - Numerical summaries of calibration:  
 - The observed vs expected ratio (O/E ratio) ;  
 - The squared bias, i.e., the average squared difference between primary event indicators and risk predictions;
 - The integrated Calibration Index (ICI), i.e., the average absolute difference between primary event indicators and risk predictions;  
 - E50, E90 and Emax denote the median, 90th percentile and the maximum of the absolute differences between primary event indicators and risk predictions;  
 - Calibration intercept/slope estimated using pseudo observations:  
 - If on average the risk estimates equal the primary event indicators, the calibration intercept will be zero. A negative calibration intercept indicates that the risk estimates are on average too high and a positive intercept indicates that the risk estimates are on average too low.   
 - A calibration slope between 0 and 1 indicates overfitting of the model, i.e., too extreme predictions, both on the low and on the high         end. A calibration slope >1 indicates predictions do not show enough variation.  
      
#### 2.1.1 Calibration using pseudo observations

We calculate calibration plot and numerical summaries of calibration as
ICI, E50, E90, Emax and root squared bias using pseudo value approach.

##### 2.1.1.1 Calibration plot using pseudo observations

<details>
  <summary>Click to expand code</summary>
```{r, cal, fig.align='center',warning=FALSE,message=FALSE,eval=FALSE,echo=TRUE, fig.width=5, fig.height=5}
# Models ----------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)

# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Predicted risk estimation
pred <- predictRisk(fit_csh,
  cause = primary_event,
  times = horizon,
  newdata = vdata
)


# Calibration plot (pseudo-obs approach) ----------------------------------
# First compute riskRegression::Score()
score_vdata <- Score(
  list("csh_validation" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = vdata,
  conf.int = TRUE,
  times = horizon,
  #  metrics = c("auc", "brier"),
  summary = c("ipa"),
  cause = primary_event,
  plots = "calibration"
)

calplot_pseudo <- plotCalibration(
  x = score_vdata,
  brier.in.legend = FALSE,
  auc.in.legend = FALSE,
  cens.method = "pseudo",
  bandwidth = 0.05, # leave as NULL for default choice of smoothing
  cex = 1,
  round = FALSE, # Important, keeps all unique risk estimates rather than rounding
  xlim = c(0, 0.6),
  ylim = c(0, 0.6),
  rug = TRUE,
  xlab = "Predictions",
  bty = "n"
)
title("Calibration plot using pseudo observations")
```
</details>

```{r, cal, fig.align='center',warning=FALSE,message=FALSE,eval=TRUE,echo=FALSE, fig.width=5, fig.height=5}
```

Calibration plot suggests that the prediction model seems to overestimate the observed outcome proportion of the primary event, especially at the lower and higher values of the estimated risk.

##### 2.1.1.2 Numerical summaries of calibration using pseudo observations

<details>
  <summary>Click to expand code</summary>
```{r, num_cal_pseudo, fig.align='center',warning=FALSE,message=FALSE,echo=TRUE,eval=TRUE}

# We can extract predicted and observed, observed will depend on degree of smoothing (bandwidth)
dat_pseudo <- calplot_pseudo$plotFrames$csh_validation

# Calculate difference between predicted and observed (make sure to use all estimated risks, not just unique ones)
diff_pseudo <- pred - dat_pseudo$Obs[match(pred, dat_pseudo$Pred)]

# Collect all numerical summary measures
numsum_pseudo <- c(
  "ICI" = mean(abs(diff_pseudo)),
  setNames(quantile(abs(diff_pseudo), c(0.5, 0.9)), c("E50", "E90")),
  "Emax" = max(abs(diff_pseudo)),
  "Root squared bias" = sqrt(mean(diff_pseudo^2))
)
```
</details>


```{r, res_cal_pseudo, fig.align='center',warning=FALSE,message=FALSE,echo=FALSE}
# rootmeansquared, E50, E90, Emax, int, slope,
k <- 4
res_num <- rbind(
  "Calibration measures - pseudo observations" = numsum_pseudo
)

res_num <- round(res_num, k)
kable(res_num) |>
  kable_styling("striped", position = "center")
```
Numerical calibration measures identified overestimation of the risk especially in the higher values of the observed outcome proportion of the primary event.

#### 2.1.2 Calibration using the subdistribution hazard approach
Here we assess calibration using the subdistribution hazard approach (Austin et al.)

##### 2.1.2.1 Calibration plot using the subdistribution hazard approach

<details>
  <summary>Click to expand code</summary>
```{r, plot_sd, fig.align='center',warning=FALSE,message=FALSE,eval=FALSE,echo=TRUE, fig.width=5, fig.height=5}
# Models ----------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)


# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)


# Calibration plot (flexible regression approach) -------------------------

# Add estimated risk and complementary log-log of it to dataset
vdata$pred <- predictRisk(fit_csh,
  cause = primary_event,
  newdata = vdata,
  times = horizon
)
vdata$cll_pred <- log(-log(1 - pred))

# 5 knots seems to give somewhat equivalent graph to pseudo method with bw = 0.05
n_internal_knots <- 5 # Austin et al. advise to use between 3 (more smoothing, less flexible) and 5 (less smoothing, more flexible)
rcs_vdata <- ns(vdata$cll_pred, df = n_internal_knots + 1)
colnames(rcs_vdata) <- paste0("basisf_", colnames(rcs_vdata))
vdata_bis <- cbind.data.frame(vdata, rcs_vdata)

# Use subdistribution hazards (Fine-Gray) model
form_fgr <- reformulate(
  termlabels = colnames(rcs_vdata),
  response = "Hist(time, status_num)"
)

# Regress subdistribution of event of interest on cloglog of estimated risks
calib_fgr <- FGR(
  formula = form_fgr,
  cause = primary_event,
  data = vdata_bis
)

# Add observed and predicted together in a data frame
dat_fgr <- cbind.data.frame(
  "obs" = predict(calib_fgr, times = horizon, newdata = vdata_bis),
  "pred" = vdata$pred
)

# Calibration plot
dat_fgr <- dat_fgr[order(dat_fgr$pred), ]
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  x = dat_fgr$pred,
  y = dat_fgr$obs,
  type = "l",
  xlim = c(0, 0.6),
  ylim = c(0, 0.6),
  xlab = "Predictions",
  ylab = "Observed outcome proportion",
  bty = "n"
)
abline(a = 0, b = 1, lty = "dashed", col = "red")
title("Calibration plot using subdistribution hazard approach", 
      cex.main = .90)
```
</details>

```{r, plot_sd, fig.align='center',warning=FALSE,message=FALSE,echo=FALSE,eval=TRUE, fig.width=5, fig.height=5}
```

Calibration plot suggests that the prediction model seems to overestimate the observed outcome proportion of the primary event, especially at the lower and higher values of the estimated risk.

##### 2.1.2.2 Numerical summaries of calibration using the subdistribution hazard approach
Here we assess calibration using the subdistribution hazard approach (Austin et al.)

<details>
  <summary>Click to expand code</summary>
```{r, num_cal_sd, fig.align='center',warning=FALSE,message=FALSE,eval=FALSE, echo=TRUE}
# Numerical summary measures
diff_fgr <- dat_fgr$pred - dat_fgr$obs

numsum_fgr <- c(
  "ICI" = mean(abs(diff_fgr)),
  setNames(quantile(abs(diff_fgr), c(0.5, 0.9)), c("E50", "E90")),
  "Emax" = max(abs(diff_fgr)),
  "Root squared bias" = sqrt(mean(diff_fgr^2))
)
```
</details>

```{r, num_cal_sd, fig.align='center',warning=FALSE,message=FALSE,echo=FALSE,eval=TRUE, fig.width=5, fig.height=5}
```

```{r, num_cal_sd_out, fig.align='center',warning=FALSE,message=FALSE,echo=FALSE}
k <- 4
res_num <- rbind(
  "Calibration measures - subdistribution " = numsum_fgr
)

res_num <- round(res_num, k)
kable(res_num) |>
  kable_styling("striped", position = "center")
```
Numerical calibration measures identified overestimation of the risk especially in the higher values of the observed outcome proportion of the primary event.

#### 2.1.2.3 Calibration plot using pseudo-observations (LOESS smoothing)

<details>
  <summary>Click to expand code</summary>
```{r loess_pseudo, echo=TRUE, eval=FALSE, fig.width=5, fig.height=5, fig.align='center'}
# Use pseudo-observations calculated by Score() (can alternatively use pseudo::pseudoci)
pseudos <- data.frame(score_vdata$Calibration$plotframe)
pseudos <- pseudos[order(pseudos$risk), ]

# Use linear loess (weighted local regression with polynomial degree = 1) smoothing
smooth_pseudos <- predict(
  stats::loess(pseudovalue ~ risk, data = pseudos, degree = 1, span = 0.33), 
  se = TRUE
)

# Calibration plot (reported in manuscript):

# First, prepare histogram of estimated risks for x-axis
spike_bounds <- c(-0.075, 0)
bin_breaks <- seq(0, 0.6, length.out = 100 + 1)
freqs <- table(cut(pred, breaks = bin_breaks))
bins <- bin_breaks[-1]
freqs_valid <- freqs[freqs > 0]
freqs_rescaled <- spike_bounds[1] + (spike_bounds[2] - spike_bounds[1]) * 
  (freqs_valid - min(freqs_valid)) / (max(freqs_valid) - min(freqs_valid))

# Produce plot
par(xaxs = "i", yaxs = "i", las = 1)
plot(
  x = pseudos$risk, 
  y = pseudos$pseudovalue,
  xlim = c(0, 0.6), 
  ylim = c(spike_bounds[1], 0.6),
  yaxt = "n",
  frame.plot = FALSE,
  xlab = "Estimated risks",
  ylab = "Observed outcome proportions", 
  type = "n"
)
axis(2, seq(0, 0.6, by = 0.1), labels = seq(0, 0.6, by = 0.1))
polygon(
  x = c(pseudos$risk, rev(pseudos$risk)),
  y = c(
    pmax(smooth_pseudos$fit - qt(0.975, smooth_pseudos$df) * smooth_pseudos$se, 0),
    rev(smooth_pseudos$fit + qt(0.975, smooth_pseudos$df) * smooth_pseudos$se)
  ),
  border = FALSE,
  col = "lightgray"
)
abline(a = 0, b = 1, col = "gray")
lines(x = pseudos$risk, y = smooth_pseudos$fit, lwd = 2)
segments(
  x0 = bins[freqs > 0], 
  y0 = spike_bounds[1], 
  x1 = bins[freqs > 0], 
  y1 = freqs_rescaled
)
```
</details>

```{r loess_pseudo, echo=FALSE, eval=TRUE, fig.width=5, fig.height=5, fig.align='center'}
```


#### 2.1.3 Observed and Expected ratio

<details>
  <summary>Click to expand code</summary>
```{r, oe_ratio, fig.align='center',warning=FALSE,message=FALSE,echo=TRUE}

# Models ----------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)


# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Add estimated risk and complementary log-log of it to dataset
pred <- predictRisk(fit_csh,
  cause = primary_event,
  newdata = vdata,
  times = horizon
)

## Observed/Expected ratio --------------------------------------------
# First calculate Aalen-Johansen estimate (as 'observed')
obj <- summary(survfit(Surv(time, status) ~ 1,
  data = vdata
),
times = horizon
)

aj <- list(
  "obs" = obj$pstate[, primary_event + 1],
  "se" = obj$std.err[, primary_event + 1]
)


# Calculate O/E
OE <- aj$obs / mean(pred)

# For the confidence interval we use method proposed in Debray et al. (2017) doi:10.1136/bmj.i6460
k <- 2
alpha <- 0.05
OE_summary <- cbind(
  "OE" = OE,
  "Lower .95" = exp(log(OE) - qnorm(1 - alpha / 2) * aj$se / aj$obs),
  "Upper .95" = exp(log(OE) + qnorm(1 - alpha / 2) * aj$se / aj$obs)
)

OE_summary <- round(OE_summary, k)
```
</details>

```{r, res_OE, echo=FALSE}
# OE, rootmeansquared, E50, E90, Emax, int, slope,

kable(OE_summary) |>
  kable_styling("striped", position = "center")
```

Observed and expected ratio shown slight overestimation of the risk predicted by the model.

#### 2.1.4 Calibration intercept and slope using pseudo observations


<details>
  <summary>Click to expand code</summary>
```{r, cal_intslope_pseudo, fig.align='center',warning=FALSE,message=FALSE, echo=TRUE}

# Models ----------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)


# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Predicted risk estimation
pred <- predictRisk(fit_csh,
  cause = primary_event,
  times = horizon,
  newdata = vdata
)


# Calibration plot (pseudo-obs approach) ----------------------------------
# First compute riskRegression::Score()
score_vdata <- Score(
  list("csh_validation" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = vdata,
  conf.int = TRUE,
  times = horizon,
  #  metrics = c("auc", "brier"),
  summary = c("ipa"),
  cause = primary_event,
  plots = "calibration"
)


## Calibration intercept and slope --------------------------------------
# Use pseudo-observations calculated by Score() (can alternatively use pseudo::pseudoci)
pseudos <- data.frame(score_vdata$Calibration$plotframe)

# Note:
# - 'pseudos' is the data.frame with ACTUAL pseudo-observations, not the smoothed ones
# - Column ID is not the id in vdata; it is just a number assigned to each row of
# the original validation data sorted by time and event indicator
pseudos$cll_pred <- log(-log(1 - pseudos$risk)) # add the cloglog risk ests

# Fit model for calibration intercept
fit_cal_int <- geese(
  pseudovalue ~ offset(cll_pred),
  data = pseudos,
  id = ID,
  scale.fix = TRUE,
  family = gaussian,
  mean.link = "cloglog",
  corstr = "independence",
  jack = TRUE
)

# Fit model for calibration slope
fit_cal_slope <- geese(
  pseudovalue ~ offset(cll_pred) + cll_pred,
  data = pseudos,
  id = ID,
  scale.fix = TRUE,
  family = gaussian,
  mean.link = "cloglog",
  corstr = "independence",
  jack = TRUE
)

# Perform joint test on intercept and slope
betas <- fit_cal_slope$beta
vcov_mat <- fit_cal_slope$vbeta
wald <- drop(betas %*% solve(vcov_mat) %*% betas)
# pchisq(wald, df = 2, lower.tail = FALSE)
```
</details>

```{r, res_cal_intslope, echo=FALSE}
k <- 2
res_cal <- rbind(

  # Value, confidence interval and test for calibration intercept
  "Intercept" = with(
    summary(fit_cal_int)$mean,
    c(
      "estimate" = estimate,
      `2.5 %` = estimate - qnorm(0.975) * san.se,
      `97.5 %` = estimate + qnorm(0.975) * san.se
    )
  ),

  # Value, confidence interval and test for calibration slope
  "Slope" = with(
    summary(fit_cal_slope)$mean["cll_pred", ],
    c(
      "estimate" = 1 + estimate,
      `2.5 %` = 1 + (estimate - qnorm(0.975) * san.se),
      `97.5 %` = 1 + (estimate + qnorm(0.975) * san.se)
    )
  )
)

res_cal <- round(res_cal, k)
kable(res_cal) |>
  kable_styling("striped", position = "center")
```

The calibration intercept was estimated at -0.15 [95% CI -0.36 to 0.05] also pointing towards slight overestimation (though not statistically significant). This number for example means that for an estimated risk of 30%, the expected actual risk is 1-0.7^(exp(-0.15))=26%. The calibration slope was 1.22 [95% CI 0.84 to 1.60], which would indicate too homogeneous predictions but the wide confidence interval precludes any firm conclusions from it. The p-value for the joint test on calibration intercept and slope was 0.09.

### 2.2 Discrimination

We here calculate

+ The 5-year C-index. More details are in the main manuscript and its references;
+ The 5-year time-dependent AUC. More details are in the manuscript and in its references;  
  + Plot time-dependent AUC over the time;
  
+ Royston-Sauerbrei D statistic and R<sup>2</sup><sub>D</sub> as a measure of prognostic separation;  

  
#### 2.2.1 C-index and time-dependent AUC

<details>
  <summary>Click to expand code</summary>
```{r, discrimination,warning=FALSE,message=FALSE,echo=TRUE}
# Models
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)

# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# C-index
# Development set (Apparent validation)

C_rdata <- pec::cindex(
  object = fit_csh,
  formula = Hist(time, status_num) ~ 1,
  cause = primary_event,
  eval.times = horizon,
  data = rdata
)$AppCindex$CauseSpecificCox

# Validation set
C_vdata <- pec::cindex(
  object = fit_csh,
  formula = Hist(time, status_num) ~ 1,
  cause = primary_event,
  eval.times = horizon,
  data = vdata
)$AppCindex$CauseSpecificCox


# Bootstraping C-index to calculate the bootstrap percentile confidence intervals

B <- 100
set.seed(1234)
rboot <- bootstraps(rdata, times = B) # development - bootstrap
vboot <- bootstraps(vdata, times = B) # validation - bootstrap

C_boot <- function(split) {
  pec::cindex(
    object = fit_csh,
    formula = Hist(time, status_num) ~ 1,
    cause = primary_event,
    eval.times = horizon,
    data = analysis(split)
  )$AppCindex$CauseSpecificCox
}

# Run time-dependent AUC in the bootstrapped development and validation data
# to calculate the non-parametric CI through percentile bootstrap
rboot <- rboot |> mutate(
  C_rboot = map_dbl(splits, C_boot),
)
vboot <- vboot |> mutate(
  C_vboot = map_dbl(splits, C_boot),
)


# Time-dependent AUC ---------

# Development data
score_rdata <- Score(
  list("csh_development" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = rdata,
  conf.int = TRUE,
  times = horizon,
  metrics = c("auc"),
  cause = primary_event,
  plots = "calibration"
)

# Validation data
score_vdata <- Score(
  list("csh_validation" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = vdata,
  conf.int = TRUE,
  times = horizon,
  metrics = c("auc"),
  cause = primary_event,
  plots = "calibration"
)
```
</details>

```{r, res_disc, fig.align='center',echo=FALSE}
alpha <- .05
k <- 2
res_discr_csh <- matrix(c(
  ## C-index
  # Development CSH1
  C_rdata,
  quantile(rboot$C_rboot, probs = alpha / 2),
  quantile(rboot$C_rboot, probs = 1 - alpha / 2),

  # Validation CSH1
  C_vdata,
  quantile(vboot$C_vboot, probs = alpha / 2),
  quantile(vboot$C_vboot, probs = 1 - alpha / 2),

  ## Time-dependent AUC
  # Development CSH1
  score_rdata$AUC$score$AUC,
  score_rdata$AUC$score$AUC -
    qnorm(1 - alpha / 2) * score_rdata$AUC$score$se,
  score_rdata$AUC$score$AUC +
    qnorm(1 - alpha / 2) * score_rdata$AUC$score$se,

  # Validation CSH1
  score_vdata$AUC$score$AUC,
  score_vdata$AUC$score$AUC -
    qnorm(1 - alpha / 2) * score_vdata$AUC$score$se,
  score_vdata$AUC$score$AUC +
    qnorm(1 - alpha / 2) * score_vdata$AUC$score$se
),
nrow = 2, ncol = 6, byrow = T,
dimnames = list(
  c("C-index", "Time dependent AUC"),
  rep(c("Estimate", "Lower .95", "Upper .95"), 2)
)
)
res_discr_csh <- round(res_discr_csh, k)
kable(res_discr_csh) |>
  kable_styling("striped", position = "center") |>
  add_header_above(c(
    " " = 1,
    "Development data" = 3,
    "Validation data" = 3
  ))
```

The time-dependent AUC at 5 years was 0.71 in the validation set.

#### 2.2.2 Plot Area under the curve(s) over the time
We plot the time-dependent AUCs over the follow-up time using development and validation data. 

<details>
  <summary>Click to expand code</summary>
```{r, plot_AUCs, warning=FALSE,echo=TRUE, eval=FALSE}

# Models --------------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)
primary_event <- 1 # Set to 2 if cause 2 was of interest

# AUCs development data
AUC_rdata <- Score(
  list("csh_development" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = rdata,
  conf.int = TRUE,
  times = seq(0.4, 5.1, 0.2),
  metrics = c("auc"),
  cause = primary_event
)

# AUCs validation data
AUC_vdata <- Score(
  list("csh_validation" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = vdata,
  conf.int = TRUE,
  times = seq(0.5, 5.01, 0.1),
  metrics = c("auc"),
  cause = primary_event
)

# Plot
par(las = 1, xaxs = "i", yaxs = "i")
oldpar <- par(mfrow = c(1, 2))
plot(AUC_rdata$times,
  AUC_rdata$AUC$score$AUC,
  type = "l",
  bty = "n",
  xlim = c(0, 5),
  ylim = c(.5, 1),
  lwd = 2,
  xlab = "Time (years)",
  ylab = "AUC",
  lty = 1
)
polygon(c(
  AUC_rdata$times,
  rev(AUC_rdata$times)
),
c(
  AUC_rdata$AUC$score$lower,
  rev(AUC_rdata$AUC$score$upper)
),
col = rgb(160, 160, 160, maxColorValue = 255, alpha = 100),
border = FALSE
)
lines(AUC_rdata$times,
  AUC_rdata$AUC$score$AUC,
  col = "black",
  lwd = 2,
  lty = 2
)
title("Development data", adj = 0)

# Validation data
plot(AUC_vdata$times,
  AUC_vdata$AUC$score$AUC,
  type = "l",
  bty = "n",
  xlim = c(0, 5),
  ylim = c(.5, 1),
  lwd = 2,
  xlab = "Time (years)",
  ylab = "AUC",
  lty = 1
)
polygon(c(
  AUC_vdata$times,
  rev(AUC_vdata$times)
),
c(
  AUC_vdata$AUC$score$lower,
  rev(AUC_vdata$AUC$score$upper)
),
col = rgb(160, 160, 160, maxColorValue = 255, alpha = 100),
border = FALSE
)
lines(AUC_vdata$times,
  AUC_vdata$AUC$score$AUC,
  col = "black",
  lwd = 2,
  lty = 2
)
title("Validation data", adj = 0)
```
</details>

```{r, plot_AUCs, warning=FALSE,echo=FALSE, eval=TRUE}
```

#### 2.2.3 Royston-Sauerbrei D statistic and R<sup>2</sup><sub>D</sub>

We calculate Royston-Sauerbrei D statistic and R<sup>2</sup><sub>D</sub> as a measure of explained variation

<details>
  <summary>Click to expand code</summary>
```{r, royston_R2D, fig.align='center', message=FALSE, warning=FALSE,echo=TRUE}
# function to calculate D and R2D
source(here::here("R/roystonD.R"))

# Models ----------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)


# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Predicted risk estimation
rdata$pred <- predictRisk(fit_csh,
  cause = primary_event,
  times = horizon,
  newdata = rdata
)

vdata$pred <- predictRisk(fit_csh,
  cause = primary_event,
  times = horizon,
  newdata = vdata
)

# Rank based on predicted values at 5 years

rdata$rank <- rank(rdata$pred,
  na.last = TRUE,
  ties.method = "first"
)

vdata$rank <- rank(vdata$pred,
  na.last = TRUE,
  ties.method = "first"
)


res_rdata <- royston_R2D_cmprsk(
  pred = rdata$pred,
  data = rdata,
  primary_event = 1,
  time = rdata$time,
  status = rdata$status_num
)

res_vdata <- royston_R2D_cmprsk(
  pred = vdata$pred,
  data = vdata,
  primary_event = 1,
  time = vdata$time,
  status = vdata$status_num
)
```
</details>

```{r, royston_R2D, warning=FALSE,echo=FALSE, eval=TRUE}
```

```{r, res_D_R2D, fig.align='center',echo=FALSE}
k <- 2
res_D <- rbind(
  "Development data" = res_rdata,
  "Validation data" = res_vdata
)

res_D <- round(res_D, k)
kable(res_D) |>
  kable_styling("striped", position = "center")
```

The R<sup>2</sup><sub>D</sub> showed how much of the observed variation in the outcome is explained by the prognostic model. The model explained 15% and 26% of the observed variation in the development and validation data, respectively.


### 2.3 Overall prediction error

We calculate the Brier Score, and the scaled Brier scale (also indicated by scaled Brier score or IPA) and the corresponding confidence intervals.

Some confidence intervals are calculated using the bootstrap percentile method.
```{r, bootstrap,warning=FALSE}
# Bootstrapping data
set.seed(20201214)
B <- 10 # number of bootstrap samples
rboot <- bootstraps(rdata, times = B)
vboot <- bootstraps(vdata, times = B)
```

<details>
  <summary>Click to expand code</summary>
```{r, overall, warning=FALSE,  fig.align='center', message=FALSE, echo=TRUE}

# Models -------------------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)
fit_csc1 <- fit_csh$models$`Cause 1`
fit_csc2 <- fit_csh$models$`Cause 2`

# Overall performance measures ----------------
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Development data
score_rdata <- Score(
  list("csh_development" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = rdata,
  conf.int = TRUE,
  times = horizon,
  metrics = c("auc", "brier"),
  summary = c("ipa"),
  cause = primary_event,
  plots = "calibration"
)

# Validation data
score_vdata <- Score(
  list("csh_validation" = fit_csh),
  formula = Hist(time, status_num) ~ 1,
  cens.model = "km",
  data = vdata,
  conf.int = TRUE,
  times = horizon,
  metrics = c("auc", "brier"),
  summary = c("ipa"),
  cause = primary_event,
  plots = "calibration"
)

# Bootstrap ------
# Functions to expand data and calculate Brier, IPA and AUC in bootstrap
# samples.
# For Brier and AUC, bootstrap should be computationally faster when
# data has more than 2000 rows (see ?riskRegression::Score).
# Our data has 1000 row so we will need only bootstrap to calculate
# confidence intervals of the scaled Brier (IPA) since
# it is not provided by riskRegression::Score() function.


# Score functions in any bootstrap data
score_boot <- function(split) {
  Score(
    list("csh_validation" = fit_csh),
    formula = Hist(time, status_num) ~ 1,
    cens.model = "km",
    data = analysis(split),
    conf.int = TRUE,
    times = horizon,
    metrics = c("auc", "brier"),
    summary = c("ipa"),
    cause = primary_event,
    plots = "calibration"
  )
}

# Development data
rboot <- rboot |> mutate(
  score = map(splits, score_boot),
  scaled_brier = map_dbl(score, function(x) {
    x$Brier$score[model == "csh_validation"]$IPA
  })
)
# Validation data
vboot <- vboot |> mutate(
  score = map(splits, score_boot),
  scaled_brier = map_dbl(score, function(x) {
    x$Brier$score[model == "csh_validation"]$IPA
  })
)
```
</details>

```{r, overall, warning=FALSE,echo=FALSE, eval=TRUE}
```

```{r, res_ov, fig.align='center',echo=FALSE}
# Table overall measures
alpha <- .05
k <- 2 # number of digits
res_ov_csh <- matrix(unlist(c(
  # Brier score - apparent validation
  score_rdata$Brier$score[model == "csh_development"]$Brier,
  score_rdata$Brier$score[model == "csh_development"]$lower,
  score_rdata$Brier$score[model == "csh_development"]$upper,

  # Brier score - external validation
  score_vdata$Brier$score[model == "csh_validation"]$Brier,
  score_vdata$Brier$score[model == "csh_validation"]$lower,
  score_vdata$Brier$score[model == "csh_validation"]$upper,

  # IPA - apparent validation
  score_rdata$Brier$score[model == "csh_development"]$IPA,
  quantile(rboot$scaled_brier, probs = alpha / 2),
  quantile(rboot$scaled_brier, probs = 1 - alpha / 2),

  # IPA - external validation
  score_vdata$Brier$score[model == "csh_validation"]$IPA,
  quantile(vboot$scaled_brier, probs = alpha / 2),
  quantile(vboot$scaled_brier, probs = 1 - alpha / 2)
)),
nrow = 2, ncol = 6,
byrow = T,
dimnames =
  list(
    c("Brier", "scaled Brier"),
    rep(c("Estimate", "Lower .95 ", "Upper .95"), 2)
  )
)
res_ov <- round(res_ov_csh, 2) # Digits
kable(res_ov) |>
  kable_styling("striped", position = "center") |>
  add_header_above(c(
    " " = 1,
    "Development data" = 3,
    "Validation data" = 3
  ))
```

Note: unexpectedly, the point estimate for the Brier score is lower (thus better) and for the scaled Brier score is higher (thus better) in the validation data compared to the development data.

## Goal 3 -  Clinical utility

Clinical utility can be measured by the net benefit and plotted in a decision curve. Details about net benefit, decision curve calculation and interpretation are provided in the manuscript (see also the appendix) and its references.

<details>
  <summary>Click to expand code</summary>
```{r, dca, message=FALSE,warning=FALSE, fig.align='center',echo=TRUE,eval=FALSE}
# Run the stdca function to calculate the net benefit and the elements needed to develop decision curve analysis
source(here::here("R/stdca.R"))

# Models ------------------------------
fit_csh <- CSC(Hist(time, status_num) ~
age + size +
  ncat + hr_status,
data = rdata,
fitter = "cph"
)

# useful objects
primary_event <- 1 # Set to 2 if cause 2 was of interest
horizon <- 5 # Set time horizon for prediction (here 5 years)

# Development data
# calculation estimated risk
rdata$pred5 <- predictRisk(fit_csh,
  cause = primary_event,
  newdata = rdata,
  times = horizon
)
rdata <- as.data.frame(rdata)
dca_rdata <- stdca(
  data = rdata,
  outcome = "status_num",
  ttoutcome = "time",
  timepoint = horizon,
  predictors = "pred5",
  xstop = 0.35,
  ymin = -0.01,
  graph = FALSE,
  cmprsk = TRUE
)
# Decision curves plot
oldpar <- par(
  xaxs = "i",
  yaxs = "i",
  las = 1,
  mar = c(6.1, 5.8, 4.1, 2.1),
  mgp = c(4.25, 1, 0)
)
plot(dca_rdata$net.benefit$threshold,
  dca_rdata$net.benefit$pred5,
  type = "l",
  lwd = 2,
  lty = 1,
  xlab = "",
  ylab = "Net Benefit",
  xlim = c(0, 0.5),
  ylim = c(-0.10, 0.10),
  bty = "n",
  xaxt = "n"
)
legend("topright",
  c("Treat all", "Treat none", "Prediction model"),
  lwd = c(2, 2, 2),
  lty = c(1, 2, 1),
  col = c("darkgray", "black", "black"),
  bty = "n"
)
lines(dca_rdata$net.benefit$threshold,
  dca_rdata$net.benefit$none,
  type = "l",
  lwd = 2,
  lty = 4
)
lines(dca_rdata$net.benefit$threshold,
  dca_rdata$net.benefit$all,
  type = "l",
  lwd = 2,
  col = "darkgray"
)
axis(1,
  at = c(0, 0.1, 0.2, 0.3, 0.4, 0.5)
)
axis(1,
  pos = -0.145,
  at = c(0.1, 0.2, 0.3, 0.4, 0.5),
  labels = c("1:9", "1:4", "3:7", "2:3", "1:1")
)
mtext("Threshold probability", 1, line = 2)
mtext("Harm to benefit ratio", 1, line = 5)
title("Development data")
par(oldpar)


# Validation data
# Predicted probability calculation
vdata$pred5 <- predictRisk(fit_csh,
  cause = primary_event,
  newdata = vdata,
  times = horizon
)
vdata <- as.data.frame(vdata)
# Run decision curve analysis
# Development data
# Model without PGR
dca_vdata <- stdca(
  data = vdata,
  outcome = "status_num",
  ttoutcome = "time",
  timepoint = 5,
  predictors = "pred5",
  xstop = 0.45,
  ymin = -0.01,
  graph = FALSE,
  cmprsk = TRUE
)
# Decision curves plot
oldpar <- par(
  xaxs = "i",
  yaxs = "i",
  las = 1,
  mar = c(6.1, 5.8, 4.1, 2.1),
  mgp = c(4.25, 1, 0)
)
plot(dca_vdata$net.benefit$threshold,
  dca_vdata$net.benefit$pred5,
  type = "l",
  lwd = 2,
  lty = 1,
  xlab = "",
  ylab = "Net Benefit",
  xlim = c(0, 0.5),
  ylim = c(-0.10, 0.10),
  bty = "n",
  xaxt = "n"
)
lines(dca_vdata$net.benefit$threshold,
  dca_vdata$net.benefit$none,
  type = "l",
  lwd = 2,
  lty = 4
)
lines(dca_vdata$net.benefit$threshold,
  dca_vdata$net.benefit$all,
  type = "l",
  lwd = 2,
  col = "darkgray"
)
legend("topright",
  c("Treat all", "Treat none", "Prediction model"),
  lwd = c(2, 2, 2),
  lty = c(1, 2, 1),
  col = c("darkgray", "black", "black"),
  bty = "n"
)
axis(1,
  at = c(0, 0.1, 0.2, 0.3, 0.4, 0.5)
)
axis(1,
  pos = -0.145,
  at = c(0.1, 0.2, 0.3, 0.4, 0.5),
  labels = c("1:9", "1:4", "3:7", "2:3", "1:1")
)
mtext("Threshold probability", 1, line = 2)
mtext("Harm to benefit ratio", 1, line = 5)
title("Validation data")
par(oldpar)
```
</details>

```{r, dca, message=FALSE,warning=FALSE, fig.align='center',echo=FALSE,eval=TRUE}
```

If we choose a threshold of 20\%, the model had a net benefit of 0.011 in the development data. In the validation data, the model had a net benefit of 0.014 choosing a threshold of 20\%. 

## Reproducibility ticket

```{r repro_ticket, echo=TRUE}
sessionInfo()
```