OnitamaAI : An Artificial Intelligence Implementation of the Onitama Board Game

This repository showcases multiple traditional and Deep RL approaches to creating an artificial agent that plays Onitama proficiently. The Minimax agents of depth 5 and above were unbeaten by any human who played against them.

The following algorithms were implemented:

  1. Minimax with Alpha-Beta Pruning (single and multi-processor versions)
  2. Monte-Carlo Tree Search
  3. Deep Deterministic Policy Gradient
  4. Deep Double Dueling Q Networks

The full details and investigation of algorithms 1 & 2 can be found in TraditionalAI_Report.pdf, and the details for algorithms 3 & 4 can be found in DeepRL_Report.pdf.
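
As a brief illustration of the core idea behind algorithm 1, a minimal minimax-with-alpha-beta-pruning sketch is shown below. The state/move interface used here (get_legal_moves, apply_move, evaluate, is_terminal) is a hypothetical placeholder and does not mirror the repository's actual classes:

def minimax(state, depth, alpha, beta, maximising):
    # Sketch only; the state API is illustrative, not the repository's.
    # Leaf node: depth budget exhausted or the game has ended.
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    if maximising:
        best = float("-inf")
        for move in state.get_legal_moves():
            best = max(best, minimax(state.apply_move(move), depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:  # prune: the minimising player will never allow this branch
                break
        return best
    else:
        best = float("inf")
        for move in state.get_legal_moves():
            best = min(best, minimax(state.apply_move(move), depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if beta <= alpha:  # prune
                break
        return best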

Setup

To set up, clone this repository and use a Python 3.6.9 environment with the following libraries installed:

torch==1.8.1
numpy==1.18.5
seaborn==0.11.1
matplotlib==3.3.3
tqdm==4.59.0
torchsummary==1.5.1
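
The installed versions can be checked quickly with a snippet such as the one below (a convenience sketch, not part of the repository):

# Convenience sketch: print installed versions to compare against the list above.
import torch, numpy, seaborn, matplotlib, tqdm

print(torch.__version__)       # expected 1.8.1
print(numpy.__version__)       # expected 1.18.5
print(seaborn.__version__)     # expected 0.11.1
print(matplotlib.__version__)  # expected 3.3.3
print(tqdm.__version__)        # expected 4.59.0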

This board game is modelled as a Markov Decision Process (MDP). An example of an MDP is shown below:

Traditional AI Gameplay

To play against the Minimax and MCTS agents, run the play_onitama function from play_onitama.py:

play_onitama(game_mode, first_move = None, verbose = 1, minimax_depth = None,
    minimax_depth_red = None, minimax_depth_blue = None, aivai_turns = 500,
    timeLimit = None, iterationLimit = None, iteration_red = None,
    iteration_blue = None, mcts_efficiency = "space", parallel = None)
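
For example, a human-vs-Minimax game at depth 5 might be started as follows. The game_mode string used here is hypothetical; the values actually accepted are defined in play_onitama.py:

# Sketch of a typical call: a human player against a depth-5 Minimax agent.
# "player_vs_minimax" is a hypothetical game_mode string; see play_onitama.py
# for the values it actually accepts.
from play_onitama import play_onitama

play_onitama("player_vs_minimax", minimax_depth = 5, verbose = 1)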

The game_mode controls the type of the two players: it can be minimax vs minimax, a human player vs MCTS, and so on. The strengths of the AIs are set via the other parameters. Pairing MCTS against Minimax leads to the following results matrix:

Deep RL Training

To train the Deep RL agents, go to Train.py and adjust the parameters of the onitama_deeprl_train function. An example call is shown below:

onitama_deeprl_train("train", "DDPG", 10000, "insert_training_name", "minimax", 1,
    discount_rate = 0.99, lr_actor = 0.001, lr_critic = 0.001, tau = 0.005,
    board_input_shape = [10, 5, 5], card_input_shape = 10, num_actions = 40,
    max_mem_size = 1000000, batch_size = 128, epsilon = 1, epsilon_min = 0.01,
    update_target = None, val_constant = 10, invalid_penalty = 500,
    hand_of_god = True, use_competing_AI_replay = False, win_loss_mem_size = 1000,
    desired_win_ratio = 0.6, use_hardcoded_cards = True, reward_mode = "simple_reward",
    minimax_boost = 1, mcts_boost = 5000, plot_every = 1000,
    win_loss_queue_size = 100, architecture = "actions_only", moving_average = 50,
    verbose = False, valid_rate_freeze = 0.95)

The algorithm can be set to either DDPG or D3QN, and the competing agent used for training can be either Minimax or MCTS. The competing Minimax or MCTS agent boosts its strength once the Deep RL agent's win rate over the last win_loss_queue_size episodes exceeds desired_win_ratio. The architecture of the neural network (refer to DeepRL_Report.pdf for the details) can be set as well. The board state for Deep RL is represented as such:

And the card state as such:
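
These representations correspond to the board_input_shape = [10, 5, 5] and card_input_shape = 10 arguments of onitama_deeprl_train. As a minimal sketch, dummy inputs of those shapes can be built as follows (what each board plane and card entry encodes is an assumption here, not the repository's exact scheme; see DeepRL_Report.pdf):

import torch

# Sketch only: dummy tensors matching board_input_shape = [10, 5, 5] and
# card_input_shape = 10. The semantics of the 10 board planes and of the
# 10-dimensional card vector are assumptions, not the repo's exact encoding.
board_state = torch.zeros(1, 10, 5, 5)  # batch of 1: 10 planes over the 5x5 board
card_state = torch.zeros(1, 10)         # batch of 1: 10-dimensional card encoding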

One example of a neural network architecture used is shown below, where the validity and actions branches are computed separately and multiplied together at the final output layer. More details can be found in DeepRL_Report.pdf.
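
A minimal sketch of that two-branch idea, assuming the input shapes above and num_actions = 40, is given below; the layer sizes and exact wiring are illustrative assumptions, not the architecture documented in the report:

import torch
import torch.nn as nn

# Sketch only: an "actions" branch and a "validity" branch computed from shared
# features and multiplied element-wise to give the final 40-dimensional output.
# Layer sizes are illustrative, not the architecture from DeepRL_Report.pdf.
class TwoBranchSketch(nn.Module):
    def __init__(self, num_actions=40):
        super().__init__()
        self.board_conv = nn.Sequential(
            nn.Conv2d(10, 32, kernel_size=3, padding=1), nn.ReLU(), nn.Flatten())
        self.card_fc = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
        self.actions_branch = nn.Linear(32 * 5 * 5 + 32, num_actions)
        self.validity_branch = nn.Sequential(
            nn.Linear(32 * 5 * 5 + 32, num_actions), nn.Sigmoid())

    def forward(self, board, cards):
        features = torch.cat([self.board_conv(board), self.card_fc(cards)], dim=1)
        return self.actions_branch(features) * self.validity_branch(features)

# Example: out = TwoBranchSketch()(torch.zeros(1, 10, 5, 5), torch.zeros(1, 10))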

Collaborators

Arijit Dasgupta

Chong Yu Quan
