Description
As I mentioned in another issue, I've been working on training an AI agent to play Othello/Reversi. I wanted to report that I've had some pretty decent success using AlphaZero.jl — much more than I was able to achieve with PyTorch, TensorFlow, or Flux.jl directly. That's the good news. The not-so-good news is that while I've gotten a relatively good player, it's still not that great. It easily beats really bad players (like me) and plays roughly 50/50 against a basic minimax heuristic (translated from https://github.com/sadeqsheikhi/reversi_python_ai).
In my training, I've done around 25 iterations (the repository is here: https://git.sr.ht/~bwanab/AZ_Reversi.jl). The loss seems to have flatlined around iteration 10 and slopes very gradually upward after that.
Are there any particular hyperparameters I should look at? One thing I tried that didn't seem to make much difference was making the net a little bigger by increasing the number of residual blocks from 5 to 8.
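For reference, here's a sketch of the kind of change I mean, assuming a `ResNetHP` network spec along the lines of AlphaZero.jl's connect-four example (the filter and head sizes below are illustrative, not my actual values):

```julia
using AlphaZero  # NetLib below refers to AlphaZero's Flux-based network library

# Before: the 5-block tower.
# netparams = NetLib.ResNetHP(num_blocks=5, ...)

# After: 8 residual blocks, everything else left unchanged.
netparams = NetLib.ResNetHP(
  num_blocks=8,                 # was 5; a deeper residual tower
  num_filters=128,              # illustrative; keep whatever you use now
  conv_kernel_size=(3, 3),
  num_policy_head_filters=32,
  num_value_head_filters=32,
  batch_norm_momentum=0.1)
```

Changing only `num_blocks` keeps the comparison clean, but it didn't noticeably move the loss curve for me.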