Tennis In-Game Win Probability with R

An experiment in building an in-game win probability model for tennis matches. The model is trained with XGBoost.

Sample matches

Venus v Sabrina

Federer v Nadal

Accuracy

Women's model

Accuracy Women

Men's model

Accuracy Men

Feature Importance

Women's model

Feature Importance Women

Men's model

Feature Importance Men

Reference

Data is from the Match Charting Project.

Development notes

Run on the command line:

$ R --no-save < tennis-win-probability.R

Data notes and cleanup tasks to do:

  • Records are numbered by point, Pts (approximately 100 records per match)
  • Set1 and Set2 are the sets won by player 1 and player 2, respectively
  • Gm1 and Gm2 are likewise the games won by player 1 and player 2
  • The model uses points, games, and sets
  • The identity of the player serving the ball is not currently included in the model
  • Add estimated points (EPA) for potentially even greater accuracy
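
As a rough illustration of the notes above, the sketch below shows one way the point, game, and set columns could be turned into model inputs. It is not taken from tennis-win-probability.R. The toy rows are invented, the label column `p1_won_match` and the helper `build_features` are hypothetical names, and the choice of score differences as features is an assumption. The XGBoost step runs only if the xgboost package is installed.

```r
# Toy rows standing in for point-level records. All values are invented
# for illustration; column meanings follow the notes above.
points <- data.frame(
  Pts  = c(1, 50, 100, 1, 60, 120),  # point number within the match
  Set1 = c(0, 1, 1, 0, 0, 0),        # sets won so far by player 1
  Set2 = c(0, 0, 0, 0, 1, 1),        # sets won so far by player 2
  Gm1  = c(0, 2, 5, 0, 1, 3),        # games won so far by player 1
  Gm2  = c(0, 1, 3, 0, 4, 5),        # games won so far by player 2
  p1_won_match = c(1, 1, 1, 0, 0, 0) # hypothetical label: did player 1 win?
)

# Hypothetical helper: express the score as differences from player 1's view.
build_features <- function(df) {
  data.frame(
    point_number = df$Pts,
    set_diff     = df$Set1 - df$Set2,
    game_diff    = df$Gm1 - df$Gm2
  )
}

features <- build_features(points)

# Optional: fit a small binary classifier if xgboost is available.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(
    data  = as.matrix(features),
    label = points$p1_won_match
  )
  model <- xgboost::xgb.train(
    # min_child_weight = 0 only so that splits are possible on six toy rows
    params  = list(objective = "binary:logistic", min_child_weight = 0),
    data    = dtrain,
    nrounds = 10
  )
  # Predicted probability that player 1 wins, one value per record
  win_prob <- predict(model, dtrain)
}
```

Adding the server as a feature, which the notes list as not yet included, would amount to one more column in `build_features`.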