GitHub - raoariel/finite-memory-automaton: A finite memory automaton for static and dynamic two-armed Bernoulli bandit problems

A Finite Memory Automaton for Static and Dynamic Two-Armed Bernoulli Bandit Problems

Abstract

The multi-armed bandit problem is characterized by a repeated decision of selecting an arm amongst a set of arms with unknown payoff probabilities; the payoff probabilities may remain constant or change over time. An agent gains information about the payoff probabilities from past actions and aims to learn an optimal arm selection strategy for maximal payoff. The two-armed Bernoulli bandit (TABB) problem is a special class of multi-armed bandit problems where there are exactly two arms with payoff structures that follow Bernoulli distributions. Existing approaches to the TABB primarily rely on perfect recall of past actions to generate estimates for arm payoff probabilities; it is further assumed that the decision maker knows a priori whether arm payoff probabilities can change. We present a different approach based on finite automata which demonstrates that an agent can learn a low regret strategy without knowing whether arm payoff probabilities are static or dynamic and without having perfect recall of past actions. Roughly speaking, the automaton works by maintaining a relative ranking of arms based on payoff probabilities rather than estimating precise payoff probabilities.

Full paper available upon request. Contact arielrao[at]cmu.edu.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A Finite Memory Automaton for Static and Dynamic Two-Armed Bernoulli Bandit Problems

Abstract

About

Releases

Packages

raoariel/finite-memory-automaton

Folders and files

Latest commit

History

Repository files navigation

A Finite Memory Automaton for Static and Dynamic Two-Armed Bernoulli Bandit Problems

Abstract

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages