Updated readme
lukearcus committed Oct 17, 2022
1 parent cdbcdb2 commit 92ab3e8
Showing 3 changed files with 26 additions and 2 deletions.
2 changes: 1 addition & 1 deletion FSP.py
@@ -24,7 +24,7 @@ def gen_data(self, pi, beta, eta):
         #import pdb; pdb.set_trace()
         sigma = []
         for p in range(self.num_players):
-            sigma.append((1-eta)*pi[p]+eta*beta[p])
+            sigma.append((1-eta)*pi[p]+eta*beta[p]) # this step might be wrong
         D = [[] for i in range(self.num_players)]
         for i in range(self.n):
             res = self.play_game(sigma)
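The line flagged in this hunk builds each player's sampling strategy as the convex combination `(1-eta)*pi[p] + eta*beta[p]`. The sketch below illustrates that step and one alternative reading. It assumes a policy is a list of per-information-state action distributions (plain lists stand in for whatever array type FSP.py actually uses), and the helper names `mix_behavioural` and `sample_policy_per_episode` are hypothetical, not part of the repository. The second helper treats the mixture as being over whole strategies, drawing either `pi` or `beta` once per episode. The commit does not say whether this is the problem the author suspects.

```python
import random


def mix_behavioural(pi, beta, eta):
    """Per-state convex combination, mirroring (1-eta)*pi[p] + eta*beta[p]."""
    return [
        [(1 - eta) * a + eta * b for a, b in zip(pi_state, beta_state)]
        for pi_state, beta_state in zip(pi, beta)
    ]


def sample_policy_per_episode(pi, beta, eta, rng=random):
    """Alternative reading: use beta with probability eta, else pi, for a whole episode."""
    return beta if rng.random() < eta else pi


# Toy policies for one player: two information states, two actions each.
# The numbers are made up for illustration only.
pi = [[0.5, 0.5], [1.0, 0.0]]
beta = [[0.0, 1.0], [0.0, 1.0]]

sigma = mix_behavioural(pi, beta, eta=0.25)
```

Both helpers return valid policies. The difference is that the per-state mix blends action probabilities at every decision point, while the per-episode draw keeps each episode consistent with a single component policy.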
24 changes: 24 additions & 0 deletions README.md
@@ -1 +1,25 @@
# General Sum Off-Belief Learning

Run main.py for RL, OBL or OT-RL, and main\_FSP.py for fictitious self-play. The following options can be used with either script, although some have no effect on FSP.

Options:
**--lvls** LEVELS
Select the number of OBL/OT-RL levels to run through; defaults to 10.
**--game** kuhn/leduc
Choose either Kuhn poker or Leduc hold 'em.
**-ab, --avg_bel**
Generate an averaged belief (over levels), and use this in OBL.
**-ap, --avg_pol**
Generate the averaged policy across levels and use this when evaluating.
**-al, --avg_learn**
When carrying out OBL, use the opponent's averaged policy to find their action.
**-a, --all_avg**
Use averaged belief, policy and learning together.
**--debug**
Prints out debugging information.
**-v**
Prints out some information about progress.
**--obl**
Uses OBL to learn.
**--ot_rl**
Uses OT-RL to learn, updating lower-level policies based on the distribution induced at higher levels.
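
The options above could be declared with `argparse` roughly as follows. This is only a sketch under assumptions: the actual parser in main.py is not shown here, so the `dest` names (other than `num_lvls`, which main.py reads from the options dictionary), the `verbose` name for `-v`, the lack of a default for `--game`, and the way `-a` switches on the three averaging flags are all illustrative guesses.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags (not the repository's code)."""
    parser = argparse.ArgumentParser(description="General Sum Off-Belief Learning (sketch)")
    parser.add_argument("--lvls", dest="num_lvls", type=int, default=10, metavar="LEVELS",
                        help="number of OBL/OT-RL levels to run through")
    # No default is documented for --game, so none is assumed here.
    parser.add_argument("--game", choices=["kuhn", "leduc"], help="game to play")
    parser.add_argument("-ab", "--avg_bel", action="store_true",
                        help="use a belief averaged over levels in OBL")
    parser.add_argument("-ap", "--avg_pol", action="store_true",
                        help="evaluate the policy averaged across levels")
    parser.add_argument("-al", "--avg_learn", action="store_true",
                        help="use the opponent's averaged policy during OBL")
    parser.add_argument("-a", "--all_avg", action="store_true",
                        help="averaged belief, policy and learning")
    parser.add_argument("--debug", action="store_true", help="print debugging information")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="print progress information")
    parser.add_argument("--obl", action="store_true", help="learn with OBL")
    parser.add_argument("--ot_rl", action="store_true", help="learn with OT-RL")
    return parser


def parse_options(argv=None):
    """Return the options as a dict; -a is assumed to imply -ab, -ap and -al."""
    options = vars(build_parser().parse_args(argv))
    if options["all_avg"]:
        options["avg_bel"] = options["avg_pol"] = options["avg_learn"] = True
    return options


# Example: equivalent to `python main.py --lvls 5 --game leduc -a --obl`.
options = parse_options(["--lvls", "5", "--game", "leduc", "-a", "--obl"])
```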
2 changes: 1 addition & 1 deletion main.py
@@ -9,7 +9,7 @@
 import logging
 from multiprocessing import Pool
 log = logging.getLogger(__name__)
-NUM_LOOPS=10
+NUM_LOOPS=1
 
 def run(options, games_per_lvl=100000, exploit_freq= 1):
     num_lvls = options["num_lvls"]