A Variational Autoencoder Approach to Conditional Generation of Possible Future Volatility Surfaces

rotmanfinhub/vol-surface-vae-pub

Volatility Surface VAE

This code base is for the following paper:

Jacky Chen, John Hull, Zissis Poulos, Haris Rasul, Andreas Veneris, Yuntao Wu, "A Variational Autoencoder Approach to Conditional Generation of Possible Future Volatility Surfaces", to appear in The Journal of Financial Data Science, 2025.

We use a conditional VAE (CVAE) combined with an LSTM to generate volatility surfaces conditioned on a context of arbitrary length.

data_preproc

This folder contains all the code for data cleaning/generation.

  • data_preproc.py:
    • Cleans the data downloaded from WRDS.
    • Generates the 5x5 volatility surface grid (moneyness x time to maturity).
    • Usage is shown in spx_volsurface_generation.ipynb and spx_convert_to_grid.ipynb.
    • Note: You may need to download the S&P 500 index prices from Yahoo Finance, using ticker ^GSPC.
  • sabr_gen.py:
    • Generates the SABR volatility surface grid (K/S moneyness x time to maturity) used in Appendix C of the paper.
    • Usage is shown in sabr_volsurface.ipynb.
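As an illustration of what a SABR grid over K/S moneyness and time to maturity looks like, here is a minimal sketch based on the standard Hagan et al. (2002) lognormal implied-volatility approximation. It is not the code in sabr_gen.py: the grid points, the parameter values, and the function names are all hypothetical, and the forward is assumed equal to the spot (zero rates and dividends).

```python
import math

# Illustrative grid points only; the repository's actual grid may differ.
MONEYNESS = [0.8, 0.9, 1.0, 1.1, 1.2]          # K/S
MATURITIES = [1 / 12, 0.25, 0.5, 1.0, 2.0]     # years


def sabr_implied_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal SABR implied-volatility approximation."""
    one_minus_beta = 1.0 - beta
    log_fk = math.log(F / K)
    fk_pow = (F * K) ** (one_minus_beta / 2.0)

    # Time-dependent correction term, shared by the ATM and non-ATM cases.
    correction = 1.0 + (
        one_minus_beta ** 2 / 24.0 * alpha ** 2 / fk_pow ** 2
        + rho * beta * nu * alpha / (4.0 * fk_pow)
        + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2
    ) * T

    if abs(log_fk) < 1e-12:
        # At the money, z / x(z) tends to 1.
        return alpha / fk_pow * correction

    z = nu / alpha * fk_pow * log_fk
    x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
    denom = fk_pow * (
        1.0
        + one_minus_beta ** 2 / 24.0 * log_fk ** 2
        + one_minus_beta ** 4 / 1920.0 * log_fk ** 4
    )
    return alpha / denom * (z / x) * correction


def sabr_grid(spot, alpha, beta, rho, nu):
    """Return a 5x5 list of lists: rows are moneyness, columns are maturity."""
    # Assumption: forward == spot (zero rates and dividends).
    return [
        [sabr_implied_vol(spot, m * spot, T, alpha, beta, rho, nu) for T in MATURITIES]
        for m in MONEYNESS
    ]


# Hypothetical parameter values, chosen only to produce a plausible skew.
grid = sabr_grid(spot=100.0, alpha=0.2, beta=1.0, rho=-0.5, nu=0.5)
```

With a negative rho, the rows for low K/S carry higher implied volatility than the rows for high K/S, which is the familiar equity-index skew.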

To preprocess the data, use the following two files in the main directory:

  • spx_volsurface_generation.ipynb: uses data_preproc.py to clean the data and generate a dataframe containing the interpolated implied volatility surface (IVS) data.
  • spx_convert_to_grid.ipynb: converts the dataframe generated by spx_volsurface_generation.ipynb into the 5x5 numpy grid.
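The conversion step can be pictured as pivoting a long-format table into one 5x5 array per day. The sketch below is an assumption about the general shape of that step, not the notebook's code: plain tuples stand in for dataframe rows, and the grid points and the function name to_grids are hypothetical.

```python
import numpy as np

# Hypothetical grid points; rows are moneyness, columns are days to maturity.
MONEYNESS = (0.8, 0.9, 1.0, 1.1, 1.2)
TTM_DAYS = (30, 91, 182, 365, 730)


def to_grids(records):
    """Pivot (date, moneyness, ttm_days, iv) records into a (days, 5, 5) array.

    Grid points with no record are left as NaN.
    """
    dates = sorted({r[0] for r in records})
    date_idx = {d: i for i, d in enumerate(dates)}
    m_idx = {m: i for i, m in enumerate(MONEYNESS)}
    t_idx = {t: i for i, t in enumerate(TTM_DAYS)}

    grids = np.full((len(dates), len(MONEYNESS), len(TTM_DAYS)), np.nan)
    for day, m, ttm, iv in records:
        grids[date_idx[day], m_idx[m], t_idx[ttm]] = iv
    return dates, grids


# Synthetic records with a simple linear skew, for illustration only.
records = [
    (day, m, t, 0.2 + 0.1 * (1.0 - m))
    for day in ("2020-01-02", "2020-01-03")
    for m in MONEYNESS
    for t in TTM_DAYS
]
dates, grids = to_grids(records)
print(grids.shape)  # (2, 5, 5)
```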

eval_scripts

This folder contains code for generating distributions of surfaces for a single day and for multiple days, along with the relevant evaluation functions, such as histogram plotting and latent manipulation.

vae

This folder contains all the code for VAE definitions.

  • base: the base VAE, encoder and decoder classes
  • dense_vae: VAE that flattens the input and treats everything as a 1D vector
  • conv_vae: VAE that uses 2D convolutional layers for encoder and decoder
  • cvae: conditional VAE that uses Conv2D/Linear layers for encoder and decoder
  • cvae_with_mem: cvae with memory added; supports LSTM, GRU, or RNN (default: LSTM).
  • cvae_with_mem_randomized: same as cvae_with_mem, but with a variable context length, generating one day forward. This is the model used in the paper.
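For readers new to VAEs, the pieces these model classes have in common are the reparameterization trick and a loss that combines reconstruction error with a KL term. The NumPy sketch below shows those generic pieces only. It is not taken from this repository: the function names and the kl_weight parameter are hypothetical, and the repository's actual loss and weighting may differ.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per sample."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)


def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Mean squared reconstruction error plus a weighted KL term.

    kl_weight is a hypothetical knob for illustration.
    """
    recon = np.mean((x - x_hat) ** 2, axis=tuple(range(1, x.ndim)))
    return float(np.mean(recon + kl_weight * kl_to_standard_normal(mu, log_var)))


rng = np.random.default_rng(0)
mu = np.zeros((4, 3))          # batch of 4, latent dimension 3
log_var = np.zeros((4, 3))
z = reparameterize(mu, log_var, rng)
```

In a conditional VAE, the encoder and decoder additionally receive the conditioning information (here, the context surfaces); the loss keeps the same two-term structure.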

Other code:

  • datasets: the custom dataset definitions used for the models. The classes whose __getitem__ returns a dictionary are the ones currently used.
  • datasets_randomized: the custom dataset definitions used for cvae_with_mem_randomized; they generate data points with variable context length.
  • utils: code used for setting random seeds, training, testing and evaluation.
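To make the idea of variable context length concrete, the following sketch draws a random number of preceding days as context and uses the next day as the target, returning a dictionary from __getitem__. It is a simplified stand-in written with NumPy, not the classes in datasets_randomized; the class name, dictionary keys, and the min_ctx/max_ctx parameters are hypothetical.

```python
import numpy as np


class RandomContextWindows:
    """Toy dataset: random-length context of past surfaces, next-day target."""

    def __init__(self, surfaces, min_ctx=1, max_ctx=5, seed=0):
        self.surfaces = np.asarray(surfaces)   # shape (days, 5, 5)
        self.min_ctx = min_ctx
        self.max_ctx = max_ctx
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # Every target day needs at least max_ctx days of history before it.
        return len(self.surfaces) - self.max_ctx

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        target_day = idx + self.max_ctx
        # Generator.integers excludes the upper bound, hence the + 1.
        ctx_len = int(self.rng.integers(self.min_ctx, self.max_ctx + 1))
        return {
            "context": self.surfaces[target_day - ctx_len:target_day],
            "target": self.surfaces[target_day],
            "context_len": ctx_len,
        }


# Synthetic data: 10 days of 5x5 surfaces.
surfaces = np.arange(10 * 25, dtype=float).reshape(10, 5, 5)
dataset = RandomContextWindows(surfaces, min_ctx=1, max_ctx=5)
sample = dataset[0]
```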

Training and Surface generation

  • param_search.py can be used to search for optimal parameters and train the models
  • generate_surfaces.py can be used to generate distributions of surfaces over a time horizon
  • generate_surfaces_max_likelihood.py can be used to generate the maximum-likelihood surfaces (the latent for the generated date is set to zero)
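The difference between generating a distribution of surfaces and generating a single maximum-likelihood surface comes down to how the latent for the generated date is chosen: sampled from the prior many times, or fixed at zero (the mode of a standard normal prior). The sketch below illustrates this with a toy linear decoder. The decoder, its weights, and all names are hypothetical stand-ins, not the trained models or scripts from this repository.

```python
import numpy as np


def toy_decoder(z, context_surface, weights):
    """Hypothetical decoder: the context surface plus a linear function of z.

    z has shape (..., latent_dim); the result has shape (..., 5, 5).
    """
    flat = context_surface.reshape(-1) + z @ weights
    return flat.reshape(z.shape[:-1] + (5, 5))


rng = np.random.default_rng(0)
latent_dim = 3
weights = 0.01 * rng.standard_normal((latent_dim, 25))
last_surface = np.full((5, 5), 0.2)   # stand-in for the encoded context

# Distribution of surfaces: decode many latents drawn from the N(0, I) prior.
samples = toy_decoder(rng.standard_normal((1000, latent_dim)), last_surface, weights)
lower, upper = np.quantile(samples, [0.05, 0.95], axis=0)

# Maximum-likelihood surface: decode the zero latent.
max_likelihood = toy_decoder(np.zeros(latent_dim), last_surface, weights)
```

The per-grid-point quantiles of the sampled surfaces give a band around the single surface decoded from the zero latent.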

Table/Plot generation

The following file contains the code for generating the tables and plots in the final paper:

  • main_analysis.py

Detailed implementations are in analysis_code.

S&P 500 data:

The S&P 500 option price data is downloaded from WRDS via Get Data: OptionMetrics/Ivy DB US/Options/Option Prices.

Step 1:

  • Date Range: 2000-01-01 to 2023-02-28

Step 2:

  • SECID = 108105
  • Option Type: Both
  • Exercise Type: Both
  • Security Type: Both

Step 3:

  • Query Variables: all

Step 4:

  • Output Format: *.csv
  • Compression: *.zip
  • Date Format: YYYY-MM-DD
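Given those output settings (zipped CSV, YYYY-MM-DD dates), a downloaded file can be read with the Python standard library alone. The sketch below is an assumption about how one might load it, not the code in data_preproc.py. The column names date, exdate, and strike_price, and the convention that strike_price is stored as the strike multiplied by 1000, are assumptions about the OptionMetrics export and should be checked against the actual file header; the function names are hypothetical.

```python
import csv
import io
import zipfile
from datetime import date


def read_option_rows(zip_source, csv_name=None):
    """Read all rows of a CSV inside a zip archive as dictionaries.

    zip_source may be a path or a binary file-like object. If csv_name is
    omitted, the first member of the archive is used.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = csv_name or archive.namelist()[0]
        with archive.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def add_features(row, spot):
    """Derive days to maturity and K/S moneyness from one option row.

    Assumes columns 'date', 'exdate', and 'strike_price' (strike x 1000).
    """
    quote_day = date.fromisoformat(row["date"])
    expiry = date.fromisoformat(row["exdate"])
    strike = float(row["strike_price"]) / 1000.0
    return {
        "ttm_days": (expiry - quote_day).days,
        "moneyness": strike / spot,
    }
```

For the full 2000 to 2023 download, reading everything into a list may use a lot of memory; iterating over the reader row by row, or reading in chunks with a dataframe library, would be a reasonable alternative.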

Models and parsed data can be downloaded from Google Drive
