Selfplay league runner with flexible configuration and algo presets #58
This is very much WIP.
For the league runner to work, two entrypoints must be provided: `train` and `evaluate`. Both take as arguments the path to the saved agent and the path to the saved opponent; a minimal sketch of this interface is below.
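A sketch of what the two entrypoints might look like — the names `train` and `evaluate` come from the description above, while the signatures, return convention, and docstrings are assumptions for illustration:

```python
def train(agent_path: str, opponent_path: str) -> None:
    """Load the agent from agent_path, train it against the (frozen)
    opponent from opponent_path, and save the updated agent back.
    The body is omitted: this is exactly the part the league runner
    does not need to know about."""
    ...


def evaluate(agent_path: str, opponent_path: str) -> float:
    """Play evaluation matches between the two saved agents and
    return the agent's win rate (assumed return convention)."""
    ...
```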
League configuration (a YAML file) gives the ability to control:

The runner keeps track of win rates in a payoff table and of MMR by running Bayesian updates on TrueSkill ratings. The win-rate and MMR information can be used to decide which opponent to play next; a sketch of this bookkeeping follows.
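A minimal sketch of that bookkeeping using the `trueskill` package (`Rating` and `rate_1vs1` are its actual API); the payoff-table layout and method names are illustrative, not this PR's implementation:

```python
from collections import defaultdict
import trueskill


class LeagueStats:
    """Tracks pairwise win rates (payoff table) and per-player TrueSkill MMR."""

    def __init__(self):
        # payoff[a][b] = [wins of a vs b, games of a vs b]
        self.payoff = defaultdict(lambda: defaultdict(lambda: [0, 0]))
        self.ratings = defaultdict(trueskill.Rating)

    def record(self, agent: str, opponent: str, agent_won: bool) -> None:
        wins_games = self.payoff[agent][opponent]
        wins_games[0] += int(agent_won)
        wins_games[1] += 1
        # Bayesian TrueSkill update: winner's rating goes first.
        if agent_won:
            self.ratings[agent], self.ratings[opponent] = trueskill.rate_1vs1(
                self.ratings[agent], self.ratings[opponent]
            )
        else:
            self.ratings[opponent], self.ratings[agent] = trueskill.rate_1vs1(
                self.ratings[opponent], self.ratings[agent]
            )

    def winrate(self, agent: str, opponent: str) -> float:
        wins, games = self.payoff[agent][opponent]
        return wins / games if games else 0.5
```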
As an example, 2 presets are implemented:
The league supports resume from checkpoint.
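A hedged sketch of what checkpoint/resume could look like for the `LeagueStats` sketch above; the JSON layout and file handling are assumptions, not the PR's actual checkpoint format:

```python
import json
import pathlib

import trueskill


def save_checkpoint(stats: "LeagueStats", path: str) -> None:
    # Persist payoff counts and rating parameters so a run can resume.
    state = {
        "payoff": {a: dict(row) for a, row in stats.payoff.items()},
        "ratings": {p: [r.mu, r.sigma] for p, r in stats.ratings.items()},
    }
    pathlib.Path(path).write_text(json.dumps(state))


def load_checkpoint(path: str) -> "LeagueStats":
    state = json.loads(pathlib.Path(path).read_text())
    stats = LeagueStats()
    for a, row in state["payoff"].items():
        for b, wins_games in row.items():
            stats.payoff[a][b] = wins_games
    for p, (mu, sigma) in state["ratings"].items():
        stats.ratings[p] = trueskill.Rating(mu=mu, sigma=sigma)
    return stats
```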
There are still a lot of issues, including
It's actually pretty hard to iterate on the league runner with the MicroRTS env, since it takes a long time to get any training done. I'm mostly iterating on SlimeVolley and some other PettingZoo envs. I'm also thinking about shipping the league runner as a separate package that could be used as a library and/or a CLI tool (from an implementation perspective, it's completely independent of the training details and of the env being used). WDYT @vwxyzjn?
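For what it's worth, a sketch of how the CLI-tool idea could stay decoupled from training code — the runner only ever touches the two entrypoints. The config key `entrypoints_module` and the argparse interface here are hypothetical:

```python
import argparse
import importlib

import yaml


def main() -> None:
    parser = argparse.ArgumentParser(description="Run a selfplay league")
    parser.add_argument("config", help="Path to the league YAML config")
    args = parser.parse_args()

    with open(args.config) as f:
        config = yaml.safe_load(f)

    # Everything env/algo-specific lives behind the user-supplied
    # train/evaluate entrypoints, imported by (assumed) dotted path.
    module = importlib.import_module(config["entrypoints_module"])
    train, evaluate = module.train, module.evaluate
    ...  # league loop: pick opponent, call train/evaluate, update stats


if __name__ == "__main__":
    main()
```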