Official code repository for the paper: Approximate Bayesian Inference via Bitstring Representations

This repository is the official implementation of the methods in the publication:

  • Aleksanteri Sladek, Martin Trapp and Arno Solin (2025). Approximate Bayesian Inference via Bitstring Representations. In Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence (UAI). [OpenReview] [Proceedings] [arXiv]

[Figure: demo.png]

Installing Dependencies

This codebase is primarily written in Python 3. Dependencies can be installed into a new environment with the conda command-line tool using the provided environment.yml file, which lists the Python version used and all required Python packages. A new environment can be created via the following command:

conda env create -f environment.yml

This command will create a new environment called 'bitvi' with the appropriate Python version and the packages needed to run the code. It can be activated via:

conda activate bitvi

For further information, see the conda documentation.

Additionally, elements of this codebase require installing the bayesianize Python library. Clone the repository from https://github.com/microsoft/bayesianize into a folder within the same directory in which the bitvi codebase is located, e.g.:

GitHub
  |
  |-> bitvi
  |     |-> src
  |     |-> scripts 
  .     .
  .     .
  .     .
  |-> bayesianize
  |     |-> bnn
  .     .
  .     .
  .     .
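
For example, from the parent directory (GitHub in the sketch above), the library can be cloned with:

git clone https://github.com/microsoft/bayesianize.git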

Running the code on a GPU

The codebase is built on the PyTorch Python package, so the code can easily be run on a GPU to speed up execution. This may require changing the environment.yml file to have conda install the pytorch-gpu package (instead of just pytorch, which may install the CPU-only version of the library) and updating the conda environment.
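
To verify that PyTorch can see the GPU, the standard device-selection idiom applies (generic PyTorch, nothing specific to this codebase):

import torch

# Falls back to the CPU if no CUDA device is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)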

Interactive example

A good place to start with the codebase is the marimo notebook we've provided, which contains an interactive example of Figure 1. The example shows how to initialize a 1D BitVI model and approximate a mixture density with it. It also visualizes how the approximation quality changes as the number of bits used for the approximation is varied. The notebook can be found in notebooks/bitvi_1d_example.py.
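
To convey the core idea without depending on the repository's classes, here is a minimal self-contained sketch (not the repo's API; all constants are hypothetical): a categorical distribution over 2**B bins, indexed by B-bit strings, fit to an unnormalized 1D mixture density by minimizing the reverse KL divergence, as in VI.

import torch

B = 4                                    # number of bits
K = 2 ** B                               # number of bins
edges = torch.linspace(-4.0, 4.0, K + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
log_width = torch.log(edges[1] - edges[0])

def log_target(x):
    # Unnormalized two-component Gaussian mixture.
    return torch.logsumexp(torch.stack([-0.5 * (x + 2.0) ** 2,
                                        -0.5 * (x - 2.0) ** 2]), dim=0)

logits = torch.zeros(K, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    log_q = torch.log_softmax(logits, dim=0)
    # Reverse KL (up to a constant) between the bin distribution and the
    # target, treating each bin as a uniform slab of the given width.
    loss = torch.sum(log_q.exp() * (log_q - log_width - log_target(centers)))
    loss.backward()
    opt.step()

Increasing B refines the grid of bins, which is exactly the effect the notebook visualizes.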

Running experiments in the paper

The experiments in the paper consist of training the BitVI model on 1D densities, on 2D densities, and as a Bayesian NN posterior approximation on the Moons and Banana classification datasets and the UCI benchmark. The sections below give details on how to reproduce these experiments.

Learning a 1D density

The main entry point for approximating a known and possibly unnormalized 1D density is the Python script scripts/fit_density_1d.py. An example of running this script to reproduce the experiment shown in Figure 1 of the paper is provided with the bash script scripts/figure_1.sh. In the repository directory, run:

./scripts/figure_1.sh

Results will be stored in the experiments folder, under a sub-folder with the bash script's name.

Learning a 2D density

The main entry point for approximating a known and possibly unnormalized 2D density is the Python script scripts/fit_density_2d.py. An example of running this script to reproduce the experiment shown in Figure 2 of the paper is provided with the bash script scripts/figure_2.sh. In the repository directory, run:

./scripts/figure_2.sh

Running this script will run 4-bit BitVI on the Gaussian mixture, Neal's funnel, multi-modal Gaussian, ring, and banana target densities. Results will be stored in the experiments folder, under a sub-folder with the bash script's name.
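
For reference, the standard form of one of these targets, Neal's funnel, can be written as follows (a textbook parameterization; the exact constants used in the scripts may differ):

import math
import torch

def funnel_log_density(v, x):
    # Standard Neal's funnel: v ~ N(0, 3^2), x | v ~ N(0, exp(v)).
    log_p_v = -0.5 * (v / 3.0) ** 2 - math.log(3.0) - 0.5 * math.log(2 * math.pi)
    log_p_x = -0.5 * x ** 2 * torch.exp(-v) - 0.5 * v - 0.5 * math.log(2 * math.pi)
    return log_p_v + log_p_x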

Another example, which additionally runs full-covariance Gaussian VI on the above densities, is given by scripts/figure_11.sh, which reproduces the results shown in Figure 11.

./scripts/figure_11.sh

Approximating a Bayesian Neural Network posterior

The next set of experiments in the paper involves approximating the posterior distribution of a Bayesian Neural Network trained on the Moons and Banana classification tasks, as well as on a selection of UCI datasets via the Bayesian Benchmarks library. The following sections detail how to run these experiments.

Moons and Banana experiments

The experiment illustrated in Figure 6 consists of a BNN trained on the Moons dataset, with the posterior distribution approximated via BitVI and fully-factorized Gaussian VI (FFGVI). For further comparison, a standard NN is also trained. These experiments can be recreated by running the script:

./scripts/figure_6.sh

which will train a BNN with 2-bit, 4-bit, and 8-bit BitVI variational families. It will also run the scripts for training the FFGVI version and the regular NN. These tasks are performed by the Python scripts train_bnn_bitvi.py, train_bnn_ffgvi.py, and train_nn.py in the scripts folder.
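
If you want to inspect the Moons data itself, it is commonly generated with scikit-learn (a hedged example; the sample count and noise level used by the training scripts may differ, and the repo may ship its own copy in data/):

from sklearn.datasets import make_moons

# Hypothetical settings for illustration only.
X, y = make_moons(n_samples=500, noise=0.1, random_state=0)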

The 'chopping the banana' experiment illustrated in Figure 8 can be recreated with the scripts/chop_bits.py Python script. Running this script requires a previously trained BitVI model with a sufficiently high bit count; the directory containing this model is given as input, and the script iteratively reduces the number of bits used in the BitVI model. Running the script scripts/figure_8.sh will train a BNN on the Banana dataset with the same hyperparameters as were used for creating Figure 8. To run it, execute:

./scripts/figure_8.sh

This will create a directory in the experiments folder experiments/figure_8/<experiment_name_here>. Pass this directory as an argument to chop_bits.py:

python scripts/chop_bits.py --model_dir experiments/figure_8/<experiment_name_here>

This will create a new directory experiments/chop_bits/<experiment_name_here> with the decision boundary visualized for different numbers of bits.
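
Conceptually, reducing the bit count amounts to marginalizing out the least-significant bits of the leaf distribution. A minimal sketch of this idea (not the actual chop_bits.py logic; it assumes leaves are ordered most-significant bit first):

import torch

def chop_one_bit(probs: torch.Tensor) -> torch.Tensor:
    # probs has shape (2**B,); summing sibling leaves yields the
    # (2**(B-1),)-bin marginal, i.e. the same model with one bit fewer.
    return probs.view(-1, 2).sum(dim=1)

probs = torch.softmax(torch.randn(16), dim=0)   # a 4-bit distribution
print(chop_one_bit(probs).shape)                # torch.Size([8])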

Entropy experiment

The experiment illustrated in Figure 9 of the paper can be recreated by running:

./scripts/figure_9.sh

This bash script will execute the file scripts/smoothness_figure.py, which trains and visualizes a 16-bit BitVI model on an increasingly complex density function. The script shows the training progress in an interactive window, visualizing the true and approximate densities in the top figure and the circuit entropy as a function of the circuit depth (i.e., the number of bits used) in the lower figure.
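
One way to read "entropy as a function of circuit depth" is the Shannon entropy of the marginal over the first d bits; a hedged sketch of that quantity (not necessarily how smoothness_figure.py computes it):

import torch

def entropy_at_depth(probs: torch.Tensor, d: int) -> torch.Tensor:
    # probs has shape (2**B,); marginalize the trailing B - d bits,
    # then compute the entropy of the resulting 2**d bins.
    marginal = probs.view(2 ** d, -1).sum(dim=1).clamp_min(1e-12)
    return -(marginal * marginal.log()).sum()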

Bayesian Benchmarks experiment

The Bayesian Benchmarks experiment is illustrated in Table 1 of the paper. The experiment consists of training 2-bit, 4-bit, and 8-bit BitVI, full-covariance Gaussian VI, and fully-factorized Gaussian VI models on several datasets. The hyperparameters for each dataset and model combination are defined in the JSON files stored in src/uci_experiment/econfigs. These are aggregated into a text file containing a list of commands (which can be run on a cluster, for example via SLURM) by the scripts/make_grid.py script. You can recreate the commands for the experiments conducted for Table 1 by running:

./scripts/table_1.sh

The text files will be stored in src/uci_experiment/econfigs. The commands in these text files can be executed on a cluster with the provided SLURM script in scripts/slurm, for example:

scripts/slurm/launch.sh src/uci_experiment/econfigs/uci_bitvi_8bit.txt

or the commands can be run individually via another script you write. Note that you will most likely need to modify the SLURM scripts for your cluster. Each command runs a 5-fold cross-validated experiment.

Once all the generated commands have been run, the results are aggregated using the script src/uci_experiment/process_results.py, which generates a file aggregated_result.json. This is then processed into a LaTeX table via src/uci_experiment/results_to_latex.py. Note that this script bolds results in the table based on a t-test.
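
For illustration, a paired t-test of the kind typically used for such bolding (the scores below are made up, and results_to_latex.py may use a different variant or threshold):

from scipy.stats import ttest_rel

best = [0.91, 0.88, 0.93, 0.90, 0.89]     # hypothetical per-fold scores
other = [0.85, 0.86, 0.84, 0.88, 0.83]
stat, pval = ttest_rel(best, other)
# If the difference is not significant, both methods would be bolded.
print(pval < 0.05)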

Codebase structure

The codebase is roughly organized as follows. The src folder contains reusable code, such as class and function definitions. The scripts folder contains code that uses src to perform various tasks, such as the experiments outlined in the sections above. The data folder contains the data required for running the experiments. Finally, an experiments folder is created by many of the scripts for storing experiment results.

Noteworthy pieces of code

The main contribution of this paper, the deterministic probabilistic circuit (PC) forming the variational family in BitVI, is contained in the file src/density_models/circuits.py. This file contains three classes: TreeCircuit1D, TreeCircuitND and ParallelTreeCircuit (a sketch of the underlying binary-tree idea follows the list below).

  • The class TreeCircuit1D represents a binary tree structured deterministic PC over 1 random variable.
  • The class TreeCircuitND represents a binary tree structured deterministic PC over N random variables. This structure iteratively cycles through each dimension's bits and constructs a binary tree of depth N * num_bits. Hence, it does not scale well beyond a few dimensions and bits.
  • Finally, ParallelTreeCircuit is a parallelization of K instances of TreeCircuit1D. It is intended for performing mean-field VI on the parameters of a BNN, where ParallelTreeCircuit represents the fully-factorized joint distribution over the parameters of a single layer's weight matrix.
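
To make the binary-tree structure concrete, here is a self-contained sketch of the idea (hypothetical names, not the classes' actual API): each internal node holds a branch probability for its bit, and a leaf's probability is the product of the branch probabilities along its root-to-leaf path.

import torch
import torch.nn.functional as F

def leaf_log_probs(branch_logits: torch.Tensor) -> torch.Tensor:
    # branch_logits: (2**B - 1,) logits, one per internal node in
    # breadth-first order; returns log probabilities of the 2**B leaves.
    num_bits = (branch_logits.numel() + 1).bit_length() - 1
    log_p = torch.zeros(1)
    idx = 0
    for depth in range(num_bits):
        n = 2 ** depth
        logits = branch_logits[idx: idx + n]
        idx += n
        branch = torch.stack([F.logsigmoid(-logits),   # bit = 0
                              F.logsigmoid(logits)],   # bit = 1
                             dim=1)
        # Extend every path in the tree by one bit.
        log_p = (log_p.unsqueeze(1) + branch).reshape(-1)
    return log_p

logits = torch.randn(2 ** 4 - 1)            # a 4-bit tree has 15 nodes
print(leaf_log_probs(logits).exp().sum())   # ~1.0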

Acknowledgements

  • This codebase relies on the bayesianize library by Microsoft for performing mean-field Gaussian VI on Bayesian Neural Networks.

  • This codebase utilizes code from the Bayesian Benchmarks library by Hugh Salimbeni et al. for evaluating our method on various UCI datasets.

  • This codebase uses several of the datasets provided in the UCI Machine Learning Repository via the Bayesian Benchmarks library.

  • This codebase utilizes code from the improved-hyperparameter-learning library by Rui Li et al. for running the Bayesian Benchmarks experiments.

  • This codebase utilizes code from the squared-npcs library by Lorenzo Loconte et al. for running grid experiments.

Citation

If you want to cite the paper, you can use the following BibTeX entry:

@InProceedings{sladek2025approximate,
  title     = {Approximate Bayesian Inference via Bitstring Representations},
  author    = {Sladek, Aleksanteri and Trapp, Martin and Solin, Arno},
  booktitle = {Proceedings of the 41st Conference on Uncertainty in Artificial Intelligence},
  pages     = {3939--3957},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR}
}

License

This software is provided under the MIT license, unless otherwise stated. Portions of the software are under the Apache 2.0 License and GPL 3.0 License.
