bandit-simulations

Simulation code for bandit algorithms. Presentation: https://docs.google.com/presentation/d/1D2xYWAkfR0exT9pfThozlPL8S_zqgrKYEAVWA77Q_xA/edit?usp=sharing

MHA Project

TS-Contextual Bandit

The TS-Contextual Bandit algorithm is used for many bandit designs in the Mental Health America (MHA) project.

Detailed code for the TS-Contextual Bandit can be accessed here.

TS-Traditional Bandit

The TS-Traditional Bandit algorithm uses Thompson Sampling with a Beta-Bernoulli model.

Detailed code for the TS-Traditional Bandit can be accessed here.
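As a rough illustration of Beta-Bernoulli Thompson Sampling, the sketch below keeps success and failure counts per arm, draws one sample from each arm's Beta posterior, and plays the arm with the largest draw. It is not the repository's implementation: the function names, the uniform Beta(1, 1) prior, and the reward probabilities in the usage line are all assumptions made for the example.

```python
import random


def thompson_sample_arm(successes, failures, rng):
    """Draw one sample per arm from its Beta posterior and return the argmax."""
    # Assumed prior: Beta(1, 1), so the posterior is Beta(successes + 1, failures + 1).
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def simulate_ts(true_probs, n_steps, seed=0):
    """Run a Bernoulli bandit for n_steps and return per-arm success/failure counts."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(n_steps):
        arm = thompson_sample_arm(successes, failures, rng)
        # Bernoulli reward with the (hypothetical) true success probability of the arm.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = simulate_ts([0.2, 0.8], n_steps=500, seed=1)
    print("pulls per arm:", [a + b for a, b in zip(s, f)])
```

Over enough steps the posterior of the better arm concentrates, so it is sampled as the maximum more and more often.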

TS-PostDiff Bandit

The TS-PostDiff Bandit algorithm is similar to TS-Traditional but uses a threshold c to adjust the policy. Like epsilon-greedy, it mixes a Uniform Random policy with the TS-Traditional policy.

Detailed code for the TS-PostDiff Bandit can be accessed here.
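One way the threshold c can mix the two policies is sketched below for two arms. The decision rule is an assumption for illustration, since the text above does not spell it out: draw one sample from each arm's posterior, assign uniformly at random if the two samples differ by less than c, and otherwise fall back to a fresh Thompson Sampling draw. The function name and the Beta(1, 1) prior are likewise hypothetical, and the repository's code may differ.

```python
import random


def postdiff_choose_arm(successes, failures, c, rng):
    """Choose between two arms, mixing Uniform Random and Thompson Sampling."""

    def sample(arm):
        # Assumed Beta(1, 1) prior on each arm's success probability.
        return rng.betavariate(successes[arm] + 1, failures[arm] + 1)

    # Step 1 (assumed rule): if the sampled difference is below c, use Uniform Random.
    if abs(sample(0) - sample(1)) < c:
        return rng.randrange(2)

    # Step 2: otherwise take a fresh Thompson Sampling draw and play the larger sample.
    return 0 if sample(0) > sample(1) else 1


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [postdiff_choose_arm([30, 10], [10, 30], c=0.1, rng=rng) for _ in range(1000)]
    print("share of arm 0:", picks.count(0) / len(picks))
```

Under this rule, c = 0 reduces to plain Thompson Sampling, and any c above 1 always assigns uniformly at random, because two sampled probabilities can never differ by more than 1.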

Setup

Install packages

  • This code supports Python 3.9+.
  • pip install -r requirements.txt
  • Sometimes packages are installed for the wrong Python version because pip is already associated with a different Python installation:
    • If you run pip install and continue to run into 'package not found' issues, try python3.9 -m pip install <PackageName>.
    • pip is itself written in Python, so the command above lets you choose which Python version runs pip and, therefore, which version the packages are installed for.
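To check which interpreter is actually in use, a short script like the one below can help. It is only a sketch: the helper names and the package name in the usage line are placeholders, not part of this repository. It reports whether the running interpreter meets the 3.9 minimum and builds a pip command bound to that same interpreter.

```python
import sys


def meets_minimum_version(minimum=(3, 9)):
    """Return True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


def pip_install_command(package_name):
    """Return an argv list that installs package_name with this interpreter's pip."""
    # Using sys.executable ties pip to the interpreter that is running this script.
    return [sys.executable, "-m", "pip", "install", package_name]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("Python 3.9+:", meets_minimum_version())
    # "some_package" is a placeholder name for illustration.
    print("install with:", " ".join(pip_install_command("some_package")))
```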

How To Run?

Note: If you are running this code in a Jupyter Notebook or Google Colab environment, append --notebook_mode=True to each of the following commands.

Running Simulations

To run simulations for different policy settings, run the following command from the root directory of this repository:

python main.py simulate --config_path=<path_to_your_configs_file> --output_path=<path_to_your_outputs> --checkpoint_path=<path_to_your_checkpoints>

This command writes the simulation results and the evaluation results to two separate directories under <path_to_your_outputs>.
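When launching the simulation from a script or a notebook cell, the command line can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository; it uses only the flags documented above, and the paths in the usage line are placeholders.

```python
import sys


def build_simulate_command(config_path, output_path, checkpoint_path, notebook_mode=False):
    """Assemble the argv for `python main.py simulate` with the documented flags."""
    cmd = [
        sys.executable,
        "main.py",
        "simulate",
        f"--config_path={config_path}",
        f"--output_path={output_path}",
        f"--checkpoint_path={checkpoint_path}",
    ]
    if notebook_mode:
        # Required when running under Jupyter Notebook or Google Colab.
        cmd.append("--notebook_mode=True")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with your own files and directories.
    print(" ".join(build_simulate_command("configs.json", "outputs", "checkpoints")))
```

The resulting list can be passed to subprocess.run(cmd, check=True) with the repository root as the working directory.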