SafePolicyImprovementNonstationary

This repository contains the code for the paper "Towards Safe Policy Improvement for Non-Stationary MDPs".

Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas, Towards Safe Policy Improvement for Non-Stationary MDPs (NeurIPS, 2020) [NeurIPS] [ArXiv] [Code]

For dependencies, see the file Project.toml.

Set up the Conda environment for Julia

ENV["PYTHON"] = "/path/to/miniconda3/envs/<env_name>/bin/python"
] build PyCall
build IJulia

add https://github.com/ScottJordan/EvaluationOfRLAlgs.git 
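The REPL steps above could also be scripted. The following is a minimal sketch, not part of the original instructions. It assumes `julia` is on the PATH and that PyCall and IJulia are already installed in the active Julia environment. It passes the interpreter path through the `PYTHON` environment variable (read by PyCall's build step via `ENV["PYTHON"]`) and uses the Pkg API in place of package mode. `PYTHON_BIN`, `JULIA_SETUP`, and the `DRY_RUN` switch are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: non-interactive version of the REPL setup steps.
# PYTHON_BIN is a placeholder; point it at your Conda environment's Python.
PYTHON_BIN="${PYTHON_BIN:-/path/to/miniconda3/envs/<env_name>/bin/python}"

# Julia code using the Pkg API instead of the REPL's package mode (`]`).
JULIA_SETUP='using Pkg
Pkg.build("PyCall")
Pkg.build("IJulia")
Pkg.add(url="https://github.com/ScottJordan/EvaluationOfRLAlgs.git")'

# DRY_RUN (hypothetical switch, default 1) prints the command instead of running it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'PYTHON=%s julia -e %s\n' "$PYTHON_BIN" "'$JULIA_SETUP'"
else
    PYTHON="$PYTHON_BIN" julia -e "$JULIA_SETUP"
fi
```

Run it with `DRY_RUN=0` to execute the build. By default it only prints the command, so the placeholder path can be checked first.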

Install Glucose Simulator

conda activate <env_name>  # Activate the Conda environment specified above.
cd python/SimGlucose
pip install -e .

If you get MKL errors when trying to use the simulator from Julia, uninstall the Conda NumPy package that is built with MKL (usually the default) and install the build without MKL.
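One possible way to make that swap is sketched below. It assumes the environment uses the Anaconda defaults channel, where the `nomkl` metapackage selects non-MKL builds; other channels may need a different approach. The `run` helper, `ENV_NAME`, and `DRY_RUN` are hypothetical names introduced for illustration, and `<env_name>` is the same placeholder as above.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder environment name; replace with the environment created earlier.
ENV_NAME="${ENV_NAME:-<env_name>}"

# Assumption: installing "nomkl" together with numpy replaces the MKL-linked
# NumPy with a non-MKL build on the defaults channel.
run conda install -n "$ENV_NAME" nomkl numpy
```

Set `DRY_RUN=0` once the printed command looks right for your environment.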

To reproduce the results in the paper, run the files experiments/bandit_swarm.jl and experiments/glucose_swarm.jl.

The Jupyter notebook experiments/plots.ipynb contains code for plotting and analyzing the results.
