About • Installation • Background • Features • Getting Started • Documentation • References • License
Imperfecto is a Python module of heavily commented implementations of algorithms and games with imperfect information, such as Rock-Paper-Scissors and Poker. See Features for the list of games and algorithms we provide, Installation for instructions on how to install this module, and Getting Started for usage and how to customize the module for your own game. For complete documentation, see Documentation.
Clone this repo and install it as a module.
$ git clone https://github.com/vlongle/Imperfecto.git
$ cd imperfecto
$ pip3 install -e .
Try importing the module to check that the installation was successful.
$ python3
>>> import imperfecto
Games such as Chess are known as "perfect-information" games because no player holds private information: every player can observe the other's moves and the full game state (i.e., the chess board).
However, games such as Poker are "imperfect-information" games because each player holds private cards. Games with simultaneous moves, such as rock-paper-scissors, can also be modeled as imperfect-information games since neither player knows the opponent's move ahead of time.
Games with imperfect information are typically modeled as extensive-form games (EFGs). An EFG models a sequential game with a game tree. Players take turns making moves until a terminal node is reached, where the payoffs are revealed. Players make decisions at decision nodes; however, a player does not know exactly which node they are in. Instead, they only know that they are in some set of nodes known as an information set. An information set represents all the possible worlds that are consistent with what the player knows.
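To make the information-set idea concrete, here is a small standalone sketch (not part of the library) for a one-card poker deal from the deck {J, Q, K}. Player 0 only observes their own card, so all deals that agree on that card are indistinguishable to them and belong to the same information set:

```python
from itertools import permutations

CARDS = ["J", "Q", "K"]

# every possible deal of one private card to each of two players
deals = list(permutations(CARDS, 2))  # 6 deals in total

# player 0 only sees deal[0] (their own card), so deals that agree on
# that card are indistinguishable to them: they form one information set
info_sets = {}
for deal in deals:
    info_sets.setdefault(deal[0], []).append(deal)

# e.g. holding "J", player 0 cannot tell ("J", "Q") from ("J", "K")
```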
Games | Progress |
---|---|
Rock-paper-scissors | ✔️ |
Asymmetric rock-paper-scissors | ✔️ |
Bar-crowding game | ✔️ |
Prisoner's Dilemma | ✔️ |
Kuhn Poker | ✔️ |
Leduc Poker | |
Texas Hold 'Em | |
Algorithm | Progress |
---|---|
Regret Matching | ✔️ |
Counterfactual Regret Minimization (CFR) | ✔️ |
Monte Carlo CFR | |
Deep CFR | |
Bandits | |
Opponent modeling | |
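Regret matching is the foundation the other algorithms build on. As a rough standalone sketch (independent of the library's actual implementation), two regret-matching agents playing rock-paper-scissors against each other look like this; each agent plays actions in proportion to their accumulated positive regret, and the time-averaged strategy approaches the uniform Nash equilibrium:

```python
import random

ACTIONS = ["rock", "paper", "scissors"]

def payoff(a, b):
    """+1 if action a beats action b, 0 on a tie, -1 on a loss."""
    if a == b:
        return 0
    wins = {("rock", "scissors"), ("paper", "rock"), ("scissors", "paper")}
    return 1 if (a, b) in wins else -1

def get_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total <= 0:
        return [1.0 / len(regrets)] * len(regrets)  # fall back to uniform
    return [p / total for p in pos]

def train(n_iters=20000, seed=0):
    rng = random.Random(seed)
    regrets = [[0.0] * 3, [0.0] * 3]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(n_iters):
        strats = [get_strategy(r) for r in regrets]
        moves = [rng.choices(range(3), weights=s)[0] for s in strats]
        for p in range(2):
            opp = moves[1 - p]
            got = payoff(ACTIONS[moves[p]], ACTIONS[opp])
            for a in range(3):
                # regret: how much better action a would have done vs opp's move
                regrets[p][a] += payoff(ACTIONS[a], ACTIONS[opp]) - got
                strategy_sums[p][a] += strats[p][a]
    # the average strategy (not the last one) approximates the equilibrium
    return [[s / n_iters for s in ss] for ss in strategy_sums]
```

Note that it is the average strategy over all iterations that converges, not the current strategy, which keeps oscillating.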
Look at the demos in the imperfecto/demos folder and try any of them. For example, run (from the repo's root folder)
$ python3 imperfecto/demos/regret_matching_demo.py --help
to print the available options for that demo file. Then pick your desired options and run the demo. For example,
$ python3 imperfecto/demos/regret_matching_demo.py --game PrisonerDilemmaGame --train_regret_matching
Add a new game by writing an ACTION_ENUM class that defines the actions available to each player. Then write a Game class that inherits from our NormalFormGame. You will need to implement the get_payoffs function.
```python
from enum import IntEnum
from typing import Sequence

from imperfecto.games.game import NormalFormGame
from imperfecto.utils import lessVerboseEnum

class MY_CUSTOM_ACTION(lessVerboseEnum, IntEnum):
    ...

class MyCustomNormalFormGame(NormalFormGame):
    actions = MY_CUSTOM_ACTION
    n_players = ...

    def get_payoffs(self, history: Sequence[MY_CUSTOM_ACTION]) -> Sequence[float]:
        ...
```
See imperfecto/games/prisoner_dilemma.py for an example. You can then use your custom game with the functions available in imperfecto/demos/regret_matching_demo.py. For example,
```python
...
from imperfecto.demos import to_train_regret_matching

Game = MyCustomNormalFormGame
to_train_regret_matching(Game)
```
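For concreteness, here is a standalone sketch of what a get_payoffs implementation could look like for the Prisoner's Dilemma. The payoff values below are the textbook ones and are an assumption on our part; the library's imperfecto/games/prisoner_dilemma.py may use different numbers, and this sketch omits the NormalFormGame base class so it runs on its own:

```python
from enum import IntEnum
from typing import Sequence

class PD_ACTION(IntEnum):
    COOPERATE = 0
    DEFECT = 1

# textbook Prisoner's Dilemma payoffs (an assumption, not the library's values):
# (player 0's payoff, player 1's payoff), keyed by the pair of actions
PAYOFFS = {
    (PD_ACTION.COOPERATE, PD_ACTION.COOPERATE): (-1.0, -1.0),
    (PD_ACTION.COOPERATE, PD_ACTION.DEFECT): (-3.0, 0.0),
    (PD_ACTION.DEFECT, PD_ACTION.COOPERATE): (0.0, -3.0),
    (PD_ACTION.DEFECT, PD_ACTION.DEFECT): (-2.0, -2.0),
}

def get_payoffs(history: Sequence[PD_ACTION]) -> Sequence[float]:
    # in a one-shot normal-form game, history holds one move per player
    return PAYOFFS[(history[0], history[1])]
```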
- Implement other variations of counterfactual regret minimization, such as Monte Carlo CFR and Deep CFR.
- Implement visualizations of training, such as this (https://medium.com/hackernoon/artificial-intelligence-poker-and-regret-part-2-ee2e329d6571) or this (https://locusofctrl.github.io/blog/posts-output/2019-02-03-male-strategy/).
- Implement bandit algorithms (e.g., Exp3, UCB, Beta Thompson sampling; see https://www.youtube.com/watch?v=XxTgX8FlDlI and https://www.cs.ubc.ca/labs/lci/mlrg/slides/Introduction_to_Bandits.pdf). There is also a whole book on bandits (https://tor-lattimore.com/downloads/book/book.pdf).
- Consider opponent modeling as well (https://abhinavrobinson.medium.com/ai-wins-rock-paper-scissors-mostly-65e0542f945b is a nice summary; see also the CMU slides at http://www.cs.cmu.edu/afs/cs/academic/class/15780-s13/www/lec/ganzfried_gt.pdf).
- Maybe also implement some deep RL methods, such as Q-learning and actor-critic, for these games.
- More games: Leduc Poker, Texas Hold 'Em.
- Rock-paper-scissors game visualization: https://www.kaggle.com/jamesmcguigan/rock-paper-scissors-multi-armed-bandit, https://github.com/seekpl/rps-game
Please see https://vlongle.github.io/Imperfecto/ for full documentation.
- An Introduction to Counterfactual Regret Minimization (Neller & Lanctot)
- Steps to building a Poker AI (Thomas Trenner)
- Thumbnail: an image of the French poker player Patrick Bruel, sourced from knowyourmeme.com
- Images for the RockPaperScissor Web visualizer: https://github.com/seekpl/rps-game
MIT License © Long Le 2022.