Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

This is the code for reproducing the results of the paper Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization, accepted as a notable-top-5% paper at ICLR 2023.

The discrete version of IVR on Atari datasets can be found at https://github.com/ryanxhr/Discrete_IVR.

Usage

Our code is built on the JAX implementation of IQL (https://github.com/ikostrikov/implicit_q_learning). The paper's results can be reproduced by running ./run_mujoco.sh, ./run_antmaze.sh and ./run_kitchen.sh.
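
As a convenience, the three reproduction scripts named above could be launched in sequence with a small wrapper like the one below. This is only a sketch, not part of the repository: it assumes it is run from the repository root and that dependencies have already been installed (for example from the repository's requirements.txt). The variable names scripts, ran and skipped are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: run the three reproduction scripts named in the README, one after another.
# Assumption: executed from the repository root with dependencies already installed.
set -u

scripts=(run_mujoco.sh run_antmaze.sh run_kitchen.sh)
ran=0
skipped=0

for script in "${scripts[@]}"; do
  if [ -f "./$script" ]; then
    echo "running $script"
    bash "./$script"          # a failing script does not stop the remaining ones
    ran=$((ran + 1))
  else
    echo "skipping $script (not found in $(pwd))"
    skipped=$((skipped + 1))
  fi
done

echo "ran=$ran skipped=$skipped"
```

Each script can also be run on its own, as described above, if only one benchmark family is needed.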

Bibtex

@inproceedings{xu2023offline,
  title  = {Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization},
  author = {Haoran Xu and Li Jiang and Jianxiong Li and Zhuoran Yang and Zhaoran Wang and Victor Wai Kin Chan and Xianyuan Zhan},
  year   = {2023},
  booktitle = {International Conference on Learning Representations},
}
