This repository contains code and a synthetic data set for reproducing the modeling framework in Vana and Hornik (2020).
The synthetic data set synthetic_dat.rda was generated with the synthpop R package (Nowok et al., 2016) by replacing all observations with values simulated from probability distributions specified to preserve key features of the actual observed data.
The data set contains 19952 rows and 13 columns:
- firm_id: integer containing the firm id, maximum is 2528
- year_id: integer containing the year id, maximum is 19
- R1: integer corresponding to the rating classes assigned by rater 1
- R2: integer corresponding to the rating classes assigned by rater 2
- R3: integer corresponding to the rating classes assigned by rater 3
- D: binary indicator for default
- X1 to X7: standardized covariates.
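As a quick sanity check on the layout described above, the following sketch loads the file and verifies the column names. The helper `check_layout` is hypothetical and not part of the repository. The name of the object stored inside synthetic_dat.rda is not stated here, so the code looks it up instead of assuming it, and the file path is an assumption about where the file sits in your checkout.

```r
# Expected columns, as listed in the data description above.
expected_cols <- c("firm_id", "year_id", "R1", "R2", "R3", "D", paste0("X", 1:7))

# Hypothetical helper: returns TRUE if `dat` has the described columns,
# otherwise stops with an informative error.
check_layout <- function(dat) {
  missing_cols <- setdiff(expected_cols, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # D is described as a binary default indicator.
  # Allowing NA here is an assumption, not something the description states.
  stopifnot(all(dat$D %in% c(0, 1, NA)))
  TRUE
}

# load() returns the names of the restored objects, so the object name
# inside the .rda file does not need to be known in advance.
rda_path <- "synthetic_dat.rda"  # assumed location; adjust as needed
if (file.exists(rda_path)) {
  obj_names <- load(rda_path)
  dat <- get(obj_names[1])
  check_layout(dat)
  print(dim(dat))  # the description above states 19952 rows and 13 columns
}
```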
The Stan code for the five models introduced in the paper can be found in the Stan folder:
- Model_S1_logit_priors-eps-N_bias-0.stan
- Model_S2_logit_priors-eps-N_bias-diffgammas.stan
- Model_D1_logit_priors-a-HN-b-AR1-eps-AR1_bias-0.stan
- Model_D2_logit_priors-a-HN-b-AR1-eps-AR1_bias-diffgammas.stan
- Model_PM_logit_priors-a-HN-b-AR1-eps-AR1_bias-diffbetas-delta-AR1.stan
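For convenience, the model labels used in the paper can be mapped to the Stan files listed above. This is a minimal sketch: it assumes the working directory is the repository root, and it compiles a model only if the rstan package is installed and the file is found. It is not necessarily how the analysis scripts in this repository load the models.

```r
# Map the model labels (S1, S2, D1, D2, PM) to the Stan file names listed above.
stan_files <- c(
  S1 = "Model_S1_logit_priors-eps-N_bias-0.stan",
  S2 = "Model_S2_logit_priors-eps-N_bias-diffgammas.stan",
  D1 = "Model_D1_logit_priors-a-HN-b-AR1-eps-AR1_bias-0.stan",
  D2 = "Model_D2_logit_priors-a-HN-b-AR1-eps-AR1_bias-diffgammas.stan",
  PM = "Model_PM_logit_priors-a-HN-b-AR1-eps-AR1_bias-diffbetas-delta-AR1.stan"
)

# Assumption: the working directory is the repository root.
stan_paths <- file.path("Stan", stan_files)
names(stan_paths) <- names(stan_files)

# Compile one model as an example, guarded so the snippet also runs
# when rstan is not installed or the file is not present.
if (requireNamespace("rstan", quietly = TRUE) && file.exists(stan_paths[["S1"]])) {
  model_s1 <- rstan::stan_model(file = stan_paths[["S1"]])
}
```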
In the folder Simulation, the Simulation.Rmd file can be used to reproduce the analysis from the online appendix of the paper.
In the folder Synthetic_Data, the file Synthetic_Data_Analysis.Rmd contains the code for reproducing the out-of-sample analysis. It repeatedly estimates the five models presented in the paper (PM, S1, S2, D1, D2) using the RStan package (Stan Development Team, 2020) for different training and test samples from synthetic_dat.rda. In the one-step-ahead prediction exercise, we train the model on data containing years 1 to t and then evaluate the log predictive likelihoods for the following year t + 1. We illustrate the approach for
The code creates a folder Results_Synthetic_Data, which contains the resulting .rda files for each test period. Code for computing the out-of-time measures is provided.
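The one-step-ahead scheme described above (train on years 1 to t, evaluate on year t + 1) amounts to an expanding-window split on year_id. The sketch below shows one way to build such splits. The function `make_expanding_splits` and its `first_test_year` argument are hypothetical and are not taken from Synthetic_Data_Analysis.Rmd, the toy data frame stands in for the real data, and year_id is assumed to contain no missing values.

```r
# Hypothetical helper: for each test year, the training set holds all rows
# from earlier years and the test set holds the rows of that year.
make_expanding_splits <- function(dat, first_test_year,
                                  last_test_year = max(dat$year_id)) {
  stopifnot(first_test_year <= last_test_year)
  test_years <- seq(first_test_year, last_test_year)
  splits <- lapply(test_years, function(test_year) {
    list(
      test_year = test_year,
      train = dat[dat$year_id < test_year, , drop = FALSE],
      test  = dat[dat$year_id == test_year, , drop = FALSE]
    )
  })
  names(splits) <- paste0("year_", test_years)
  splits
}

# Toy panel: 3 firms observed over 5 years.
toy <- data.frame(firm_id = rep(1:3, each = 5), year_id = rep(1:5, times = 3))
splits <- make_expanding_splits(toy, first_test_year = 4)

# Number of training rows per test year.
sapply(splits, function(s) nrow(s$train))
```

Each element of `splits` could then be passed to a model-fitting routine, with the log predictive likelihood evaluated on the `test` part.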
Beata Nowok, Gillian M. Raab, Chris Dibben (2016). synthpop: Bespoke Creation of Synthetic Data in R. Journal of Statistical Software, 74(11), 1-26. doi:10.18637/jss.v074.i11
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. http://mc-stan.org/.