Appears in AISTATS 2024
We investigate the impact of including distributional dependence in diffusion models for time series analysis and generative modeling. The main inspiration is the transformer architecture, which includes dependence on the distribution of tokens. We consider generalizations of this architecture described by McKean-Vlasov processes. In this work, we propose a series of neural network architectures for parameterizing these stochastic processes and investigate the benefits such parameterizations bring to relevant machine learning tasks.
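To make "distributional dependence" concrete, the following is a minimal sketch of a particle approximation to a McKean-Vlasov SDE using Euler-Maruyama. It is a textbook-style illustration, not the architecture proposed in the paper: the drift `-theta * (x - mean)`, the function name `simulate_mean_field_sde`, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def simulate_mean_field_sde(n_particles=200, n_steps=100, dt=0.01,
                            theta=1.0, sigma=0.5, seed=0):
    """Euler-Maruyama for dX = -theta * (X - E[X]) dt + sigma dW.

    E[X] is replaced by the empirical mean of the particles, so each
    particle's drift depends on the whole ensemble (illustrative assumption,
    not the parameterization used in the paper).
    """
    rng = random.Random(seed)
    x = [rng.gauss(2.0, 1.0) for _ in range(n_particles)]  # initial particles
    path = [list(x)]
    for _ in range(n_steps):
        mean_field = sum(x) / n_particles  # empirical stand-in for E[X]
        x = [xi - theta * (xi - mean_field) * dt
             + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for xi in x]
        path.append(list(x))
    return path


path = simulate_mean_field_sde()
print(len(path), len(path[0]))  # 101 200
```

Setting `sigma=0.0` removes the noise, in which case the particles contract toward their common mean. This is one simple way to see the effect of the mean-field term in isolation.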
The supplementary material contains the code and the YAML files needed to replicate the results in the paper. The main implementation of our proposed structure is in The parameter estimation experiments on both real and synthetic datasets, as well as the generative modeling experiments, can be run using the command:
python -f yaml_filepath -d device -e experiment_folder
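As a rough guide to the three flags, here is a hypothetical sketch of how a script could parse them with `argparse`. The flag meanings are inferred from the placeholder names in the command above. The `dest` names, defaults, help strings, and example paths are assumptions and are not taken from the repository's scripts.

```python
import argparse


def build_parser():
    # Hypothetical parser: semantics inferred from the placeholders
    # yaml_filepath, device and experiment_folder, not from the actual code.
    parser = argparse.ArgumentParser(
        description="Run one experiment described by a YAML config.")
    parser.add_argument("-f", dest="yaml_filepath", required=True,
                        help="path to the YAML config file")
    parser.add_argument("-d", dest="device", default="cpu",
                        help="compute device (assumed values: cpu, cuda:0)")
    parser.add_argument("-e", dest="experiment_folder", required=True,
                        help="folder for the experiment's outputs")
    return parser


# Example invocation with made-up paths, for illustration only.
args = build_parser().parse_args(
    ["-f", "config/add_noise/example.yaml", "-d", "cpu", "-e", "results/demo"])
print(args.yaml_filepath, args.device, args.experiment_folder)
```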
We also provide our implementation of deepAR in, which serves as a benchmark on the real datasets. The deepAR results can be replicated with the command:
python -f yaml_filepath -d device -e experiment_folder
The generative experiments using the linear Fokker-Planck equation can be replicated with the command:
python -f yaml_filepath -d device -e experiment_folder
Finally, the
is a script for analyzing our results on synthetic and real TS data ( and generative experiments (
The experiments are structured as follows:
- config (All yaml files)
    - add_noise (synthetic dataset)
    - OU_jump (OU process with jumps)
    - EEG_by_electrode/EEG_by_electrode_a (non-alcoholic/alcoholic EEG experiments for MLP and MeanFieldMLP)
        - subject1-5
    - EEG_by_deepAR/EEG_by_deepAR_a (non-alcoholic/alcoholic EEG experiments for deepAR with different backbones)
        - subjecta1-5
    - Chemotaxi (chemotaxis experiments for MLP and MeanFieldMLP)
    - Chemotaxi_deepAR (chemotaxis experiments for deepAR)
    - generative_120_10_bridge_eightgauss (synthetic eight-Gaussian generative experiments with 120 particles, i.e. 15 particles * 8 Gaussians, and 10 bridges)
        - 2,10,30,50,100d
    - generative_100_30_bridge_realGen1_tanh (real-data generative experiments with 100 particles and 30 bridges, tanh activation)
        - power, gas, cortex, hepmass, miniboone
    - generative_realGen_others (real-data generative experiments with other baselines, including MAF, WGAN, VAE and Score-Based SDE)
        - score_based_sde includes
    - appendix_* (experiments in appendix)
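The particle count in `generative_120_10_bridge_eightgauss` comes from 15 particles for each of 8 Gaussians. A minimal sketch of such a sample in two dimensions follows. The ring layout, `radius`, `std`, and the function name `sample_eight_gaussians` are assumptions for illustration, and the sketch does not cover how the higher-dimensional variants are built.

```python
import math
import random


def sample_eight_gaussians(per_mode=15, radius=2.0, std=0.1, seed=0):
    """Draw per_mode points from each of 8 Gaussians centered on a ring.

    Only the 15 * 8 = 120 count comes from the config description; the
    layout and scales are illustrative assumptions.
    """
    rng = random.Random(seed)
    points = []
    for k in range(8):
        angle = 2.0 * math.pi * k / 8
        cx, cy = radius * math.cos(angle), radius * math.sin(angle)
        for _ in range(per_mode):
            points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return points


particles = sample_eight_gaussians()
print(len(particles))  # 120
```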
Real Data source (and preprocessing pipeline):
The two real TS datasets can be obtained at the following link:
Power, Gas, Miniboone, Hepmass preprocessing pipelines and data are obtained from:
Cortex data is obtained from:
Our GLOW implementation is adapted from
MAF implementation is adapted from
WGAN implementation is adapted from
VAE implementation is adapted from
Score-Based SDE implementation is adapted from the official tutorial:
If the code was helpful, please use the following citation:
title={Neural McKean-Vlasov Processes: Distributional Dependence in Diffusion Processes},
author={Yang, Haoming and Hasan, Ali and Ng, Yuting and Tarokh, Vahid},
booktitle={International Conference on Artificial Intelligence and Statistics},