OldMatches

amchess edited this page Apr 30, 2023 · 1 revision

Test an engine against another

The best way is to play a match from a set of starting positions. These positions must be chosen carefully: the goal is to test the two engines in as many different strategic situations as possible. Because the positions come from the opening phase, we considered the following types of center, well known in chess literature:

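  1. Closed

Example: [diagram: closed center]

  2. Open

Example: [diagram: open center]

  3. Mobile

Example: [diagram: mobile center]

  4. Static

Example: [diagram: static center]

  5. Hanging pawns

Example: [diagram: hanging pawns]

  6. Small

Example: [diagram: small center]

  7. Classical

Example: [diagram: classical center]

  8. Isolated pawn

Example: [diagram: isolated pawn]

  9. Saw Tooth

Example: [diagram: saw tooth]

  10. Other

Example: [diagram: other]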

From these 10 center configurations, we created 10 databases of very high-quality chess games and then the 10 corresponding opening books. Next, we built a first battery of 40 starting positions, each to be played with reversed colors. There are 4 positions per center type (so, 4 x 10 = 40), chosen on the basis of the

  1. performance of the move: roughly, the ratio between the success of the move and the number of games in which it was played
  2. probability of the move: roughly, the ratio between the success of the move and the total number of games that reached the corresponding position.

In particular,

a) the first game maximizes the performance for both sides

b) the second game maximizes the probability for both sides

c) the third game maximizes the performance for White and the probability for Black

d) the fourth game maximizes the probability for White and the performance for Black
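The four selection rules can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually used to build the books: `MoveStats`, `pick` and `VARIANTS` are hypothetical names, "success" is assumed to mean the points scored with the move (win = 1, draw = 0.5), and the total number of games in the position is assumed to be the sum over the candidate moves.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Hypothetical per-move statistics taken from an opening book."""
    move: str
    score: float       # assumed meaning of "success": points scored with the move
    games_played: int  # games in which the move was played


def performance(m: MoveStats) -> float:
    # Success of the move relative to the games where it was played.
    return m.score / m.games_played


def probability(m: MoveStats, total_games: int) -> float:
    # Success of the move relative to all games that reached the position.
    return m.score / total_games


def pick(candidates: list[MoveStats], criterion: str) -> MoveStats:
    """Return the candidate move that maximizes the given criterion."""
    total = sum(m.games_played for m in candidates)
    if criterion == "performance":
        return max(candidates, key=performance)
    if criterion == "probability":
        return max(candidates, key=lambda m: probability(m, total))
    raise ValueError(f"unknown criterion: {criterion}")


# Criterion used for (White, Black) in each of the four games a) to d).
VARIANTS = {
    "a": ("performance", "performance"),
    "b": ("probability", "probability"),
    "c": ("performance", "probability"),
    "d": ("probability", "performance"),
}

if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    candidates = [MoveStats("e4", 55.0, 100), MoveStats("c4", 14.0, 20)]
    for name, (white, black) in VARIANTS.items():
        print(name, "White by", white, "->", pick(candidates, white).move,
              "| Black criterion:", black)
```

With these made-up numbers, a rarely played move with a high score wins on performance, while the most popular move wins on probability.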

Proceeding from a to d, the games become sharper. So, the TestSuiteCenterType folder contains

  1. the TestSuite40.pgn file with a, b, c and d.
  2. the TestSuite30.pgn file with a, b and c.
  3. the TestSuite20.pgn file with a and b.
  4. the TestSuite10.pgn file with a.

Clearly, going from 1 to 4, we have to choose a longer time control.

We're not interested at all in very short games.

So, we used only suite 4: the test suite with 10 starting positions, each played with both colors (so, a match of 20 games).

We started from a formula given in an old Fritz program manual:

M = 2 f t

where

  • M = hash table size in kilobytes

  • f = clock frequency in megahertz

  • t = average time per move in seconds
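As a quick check of the arithmetic, the formula can be evaluated in Python. This is a minimal sketch: the function names are hypothetical, and the 3000 MHz clock is an assumed example value, not a figure from the original test setup.

```python
def hash_size_kb(clock_mhz: float, seconds_per_move: float) -> float:
    """Hash table size in kilobytes from M = 2 * f * t."""
    return 2 * clock_mhz * seconds_per_move


def seconds_per_move_for(hash_kb: float, clock_mhz: float) -> float:
    """Invert the formula: average time per move that fills a given hash size."""
    return hash_kb / (2 * clock_mhz)


if __name__ == "__main__":
    f = 3000  # assumed clock frequency in MHz (example value only)
    t = 60    # average time per move in seconds
    m_kb = hash_size_kb(f, t)
    print(f"M = {m_kb:.0f} KB (about {m_kb / 1024:.0f} MB)")
```

Under these assumptions, 60 seconds per move on a 3000 MHz machine corresponds to 360000 KB of hash, roughly 352 MB.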

So, we did not go below t = 60 seconds. We assume 30 moves per game, which gives a standard time control of 30 minutes per player per game. The problem with this is the frequent time trouble (zeitnot) in the chess GUI we used (Arena). So, we added a bonus, based on the following formula:

c = t + 30b/60 = t + b/2

where

  • c = total time in minutes for each player in a game without increment
  • t = base time in minutes for each player in a game
  • b = the bonus (increment) in seconds per move.

In our case, we have

c = 30

and, if we take, for example, t = 25, the formula gives b = 10,

so we have the following equivalence:

a 30-minute game (active chess) corresponds to a 25/10 game (25 minutes plus a 10-second increment per move). We think the latter is the best time control for testing an engine in a match.
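The equivalence can be verified with a few lines of Python. This is a sketch with hypothetical function names; the only assumption taken from the text is a game length of 30 moves, which is where the factor 30b/60 = b/2 comes from (30 increments of b seconds, converted to minutes).

```python
MOVES_PER_GAME = 30  # game length assumed in the text


def equivalent_minutes(base_minutes: float, increment_seconds: float,
                       moves: int = MOVES_PER_GAME) -> float:
    """Total minutes per player: c = t + moves * b / 60."""
    return base_minutes + moves * increment_seconds / 60


def increment_for(target_minutes: float, base_minutes: float,
                  moves: int = MOVES_PER_GAME) -> float:
    """Increment in seconds that makes base_minutes equivalent to target_minutes."""
    return (target_minutes - base_minutes) * 60 / moves


if __name__ == "__main__":
    print(equivalent_minutes(25, 10))  # 25/10 game expressed as total minutes
    print(increment_for(30, 25))       # increment needed to reach 30 minutes
```

Other base times work the same way: for example, a 20-minute base would need a 20-second increment to match 30 minutes over 30 moves.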