The 3D version of Tic Tac Toe is implemented as an OpenAI Gym environment. The `learning` folder includes several Jupyter notebooks with the deep neural network models used to implement a computer player.
The traditional (2D) Tic Tac Toe has a very small game space (3^9 states). In comparison, the 3D version in this repo has a much larger space, on the order of 3^27, or about 7.6 trillion states. This makes computer players that rely on searching and pruning the game space prohibitively expensive.
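As a quick check on those numbers, the bound is simply 3 raised to the number of cells, since each cell is either empty, x, or o:

```python
# Each cell holds one of 3 values: empty, x, or o.
print(3 ** (3 * 3))      # 19683 states for the classic 3x3 board
print(3 ** (3 * 3 * 3))  # 7625597484987, about 7.6 trillion, for the 3x3x3 cube
```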
Instead, the current learning models are based on policy gradients and deep Q-learning. The DQN model has produced very promising results. Feel free to experiment on your own and contribute if interested. The PG-based model needs more work :)
The repo is also open to pull requests and collaboration, in both game development and learning.
- Base dependency: `gym`.
- Plot-rendering dependencies: `numpy`, `matplotlib`.
- DQN learning dependencies: `tensorflow`, `numpy`.
To install, run:

```sh
# In your virtual environment
pip install gym-tictactoe
```
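The plot rendering mode and the learning notebooks need the optional dependencies listed above; if they are not already in your environment, install them separately (assuming they are not pulled in automatically by the package):

```sh
pip install numpy matplotlib   # plot rendering
pip install tensorflow         # DQN notebooks
```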
Currently, two environments with different rendering modes are supported.

To use textual rendering, create the environment as `tictactoe-v0`, like so:
```python
import gym
import gym_tictactoe

def play_game(actions, step_fn=input):
    env = gym.make('tictactoe-v0')
    env.reset()

    # Play actions in action profile
    for action in actions:
        print(env.step(action))
        env.render()
        if step_fn:
            step_fn()
    return env

actions = ['1021', '2111', '1221', '2222', '1121']
_ = play_game(actions, None)
```
The output produced is:
```
Step 1:
- - - - - - - - -
- - x - - - - - -
- - - - - - - - -
Step 2:
- - - - - - - - -
- - x - o - - - -
- - - - - - - - -
Step 3:
- - - - - - - - -
- - x - o - - - x
- - - - - - - - -
Step 4:
- - - - - - - - -
- - x - o - - - x
- - - - - - - - o
Step 5:
- - - - - - - - -
- - X - o X - - X
- - - - - - - - o
```
The winning sequence after gameplay: (0,2,1), (1,2,1), (2,2,1).
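From the example above, each action appears to be a 4-digit string: the first digit is the player (1 or 2) and the remaining digits are the cell coordinates. Below is a minimal sketch of decoding it, with `decode_action` being an illustrative helper rather than part of the package:

```python
def decode_action(action):
    """Split a 4-digit action string into (player, (x, y, z))."""
    player = int(action[0])
    coords = tuple(int(c) for c in action[1:])
    return player, coords

print(decode_action('1021'))  # (1, (0, 2, 1)): player 1 marks cell (0, 2, 1)
```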
To use plot rendering (matplotlib), create the environment as `tictactoe-plt-v0`, like so:
```python
import gym
import gym_tictactoe

def play_game(actions, step_fn=input):
    env = gym.make('tictactoe-plt-v0')
    env.reset()

    # Play actions in action profile
    for action in actions:
        print(env.step(action))
        env.render()
        if step_fn:
            step_fn()
    return env

actions = ['1021', '2111', '1221', '2222', '1121']
_ = play_game(actions, None)
```
This produces the following gameplay:
(Step 1 through Step 5 are rendered as matplotlib figures.)

The current models are under the `learning` folder. See the Jupyter notebook for DQN learning with a 2-layer neural network and an actor-critic technique.
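For a rough sense of what such a model looks like, here is a minimal 2-layer Q-network sketch over the flattened 3x3x3 board; the layer sizes and Keras usage are assumptions for illustration, not the notebook's exact architecture:

```python
import tensorflow as tf

# Hypothetical sketch: a small 2-layer Q-network for the 3x3x3 board.
# Input: 27 cell values; output: one Q-value per candidate cell.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(27,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(27),
])
model.compile(optimizer='adam', loss='mse')
```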
Sample game plays produced by the trained model (the winning sequence is (0,0,0), (1,0,0), (2,0,0)):