
# Bandits

Joseph Lee edited this page Nov 11, 2024 · 4 revisions

## Multi-armed Bandits

### Epsilon

**Figure 2.2**

```sh
python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +bandit.random_argmax=true experiment.tag=fig2.2 experiment.upload=true
```
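The sweep above compares epsilon values 0, 0.01, and 0.1. A minimal sketch of what a single epsilon-greedy run computes, assuming the standard 10-armed Gaussian testbed with sample-average updates and random tie-breaking at the argmax (mirroring `+bandit.random_argmax=true`; this is an illustrative stand-in, not the repo's actual implementation):

```python
import numpy as np

def run_bandit(steps=1000, epsilon=0.1, k=10, rng=None):
    """One epsilon-greedy run on a k-armed Gaussian testbed
    (hypothetical helper, not the code behind run.py)."""
    rng = rng or np.random.default_rng()
    q_true = rng.normal(0.0, 1.0, k)   # true action values q*(a)
    Q = np.zeros(k)                    # value estimates
    N = np.zeros(k)                    # per-action pull counts
    rewards = np.empty(steps)
    for t in range(steps):
        if rng.random() < epsilon:
            a = rng.integers(k)        # explore: uniform random action
        else:
            # greedy with random tie-breaking (cf. random_argmax=true)
            a = rng.choice(np.flatnonzero(Q == Q.max()))
        r = rng.normal(q_true[a], 1.0) # reward ~ N(q*(a), 1)
        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]      # sample-average update
        rewards[t] = r
    return rewards
```

Averaging such reward traces over `run.n_runs=2000` independent testbeds gives the curves in the figure.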

**Figure 2.3** (wandb report: https://api.wandb.ai/links/josephjnl/53gxgbcc)

```sh
python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0.1 +bandit.random_argmax=true bandit.alpha=0.1 bandit.Q_init=0 experiment.tag=fig2.3 experiment.upload=true
python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0 +bandit.random_argmax=true bandit.alpha=0.1 bandit.Q_init=5 experiment.tag=fig2.3 experiment.upload=true
```
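These two commands contrast a realistic start (`Q_init=0` with `epsilon=0.1`) against optimistic initial values (`Q_init=5` with `epsilon=0`), both using the constant step size `alpha=0.1`. A sketch of that update rule, under the same testbed assumptions as above (hypothetical helper, not the repo's code):

```python
import numpy as np

def run_constant_alpha(steps=1000, epsilon=0.0, Q_init=5.0,
                       alpha=0.1, k=10, rng=None):
    """Greedy/epsilon-greedy run with a constant step size and
    optimistic initial estimates (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    q_true = rng.normal(0.0, 1.0, k)
    Q = np.full(k, Q_init)             # optimistic start drives early exploration
    rewards = np.empty(steps)
    for t in range(steps):
        if rng.random() < epsilon:
            a = rng.integers(k)
        else:
            a = rng.choice(np.flatnonzero(Q == Q.max()))
        r = rng.normal(q_true[a], 1.0)
        # constant step size: recency-weighted average instead of 1/N
        Q[a] += alpha * (r - Q[a])
        rewards[t] = r
    return rewards
```

With `Q_init=5` every arm initially looks better than any realistic reward, so even the pure greedy agent (`epsilon=0`) samples all arms before settling, which is the effect Figure 2.3 illustrates.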