What's the workflow? #45
Don't we need to tell TrueSkill how many wins or losses a specific player has? How do I do this when performing the rating recalculation?
Why is the output from my above code modifying the sigma value when I've not done anything? Don't you need to tell it who the winner was?
What is the "ranking table"?
In order to make a leaderboard, wouldn't we need some way to compare each player's score against the others'? These all seem independent and related to a single match. However, we need to be able to determine the skill of each and every player so we can match them with players of similar score...
No. I suggest you read up on TrueSkill. I have no idea why you're trying to implement it with such a poor understanding of what it is and does. No offense intended, just genuinely curious. Here is an implementation of it: a simple form that calls this library and reports results. The code is here for your reference: https://github.com/bernd-wechner/TrueSkill-Server

To help you out a little: TrueSkill is a Bayesian model that, given an estimate of each player's skill (modelled as a Gaussian with a mean and standard deviation) and the result of the game (taking into consideration only the ranking, namely who came first, second, third and so on, not scores or any other measures), will return a new estimate of skill for each of those players. The aim is to use the outcome of one game to update the estimate of skill we have for each of the players. It's because you clearly seem not to know this that I wonder why you are here playing around with the TrueSkill Python implementation. Heungsub has written rather excellent documentation precisely to answer all these questions you have.
Again, RTFM and you'll be fine. But to help out a little: when you provide the list [t1, t2] you are providing the rankings in a game, namely stating that t1 played a game against t2, t1 came first and t2 came second. TrueSkill then calculates new ratings (skill estimates) based on the old ratings (skill estimates) given that outcome. The expected outcome is sketched below.
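A minimal sketch of that call, using the library's documented rate() function (the one-player-per-team setup here is just for illustration):

```python
from trueskill import Rating, rate

t1 = [Rating()]  # team 1: a single player on the default rating
t2 = [Rating()]  # team 2: a single player on the default rating

# The list order is the ranking: t1 came first, t2 came second.
(new_t1,), (new_t2,) = rate([t1, t2])

print(new_t1)  # the winner's mu goes up
print(new_t2)  # the loser's mu goes down; sigma shrinks for both
```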
Why? Again, read up about TrueSkill; the web is full of documentation. But in short:
If you want leaderboards, yes, you track player ratings, and every time they play a game you update those ratings. The TrueSkill rating is typically µ − (µ0 ⁄ σ0) × σ (a small sketch follows below), and it's explained well enough here: http://trueskill.info/help.html Then yes, you match players of comparable ratings with one another. Hope that's some help. But this is not a TrueSkill help forum; it's just a Python package kindly written by Heungsub Lee and gifted to the Python community.
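As a sketch of that leaderboard number, assuming the package's expose() shortcut (which, with the default settings, computes µ − 3σ; the example rating here is made up):

```python
import trueskill
from trueskill import Rating

r = Rating(mu=30.0, sigma=4.0)  # hypothetical player after some games

# expose() returns mu - k*sigma where k = mu0/sigma0 (3 with the defaults),
# so a brand-new player exposes as exactly 0.
print(trueskill.expose(r))  # 30 - 3*4 = 18.0
print(r.mu - 3 * r.sigma)   # the same value computed by hand
```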
Thanks for the response. I appreciate it.
Yeah, that makes sense. So I would match people who are within a range, with some sort of allowed deviation.

Out of curiosity, the stuff I've read so far is the following: are these good resources that I've been using?

I guess I'm trying to understand how I should represent the player's score to the player. A lot of games use Elo; for example, StarCraft 2 has an MMR rating value which is displayed to you. What value would I be displaying to players on a leaderboard so they know where they stand? The calculation of TrueSkill values is a bit difficult because there are two values, µ and sigma. The Microsoft site says the rating should be displayed as mu − (k × sigma), but this would be a small value, some number between 0 and 25, which I don't feel gives enough of an understanding for a player. Also, TrueSkill only becomes accurate after 46 games for a 2-team 4v4 game, according to the official website, so a player who just starts out will appear much better than they are.

Also, the "place" in terms of that ranks variable: is that basically "first place" and "second place", where 0 is a winner and 1 could be a loser?

I know you mentioned that this is not a help forum, but you have been very helpful. Do you know where I could go to ask additional questions? I was playing around with the website you made and noticed this: why does the rating go down to 0 for the second player?
The default values are described at http://trueskill.info/help.html. The paper is not needed unless you're keen on papers and math, and the references you have are all excellent. The defaults create an initial rating of 0 for all players (with µ0 = 25 and σ0 = 25⁄3, the rating µ0 − 3σ0 = 25 − 25 = 0). Ratings on our leaderboards are currently hovering around the 20 mark at the top of the leaderboard.
No, it does not guarantee 0 to 25, and this is trivial to change, but consider that the rating is a real number, not an integer. For example, on one of our leaderboards ratings range from −2.9 to 24.7. We choose to display them to one decimal place; the leader's rating might well be 24.68756342, for example. If you want to change the range you can tune all of the TrueSkill parameters, which are again well described here: http://trueskill.info/help.html (in fact Beta should be tuned over time from recordings, though the authors fail to document how, and I have yet to get a response from them on that subject). But it is child's play to scale the existing rating after the fact as you wish: you can multiply it by 4, or 10, or whatever, and present it as an integer rather than a real number.
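For instance, a hypothetical display helper along those lines:

```python
def display_rating(rating, scale=10):
    """Hypothetical helper: scale the conservative rating for display."""
    conservative = rating.mu - 3 * rating.sigma  # mu0/sigma0 = 3 with defaults
    return round(conservative * scale)           # e.g. 24.68756342 -> 247
```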
It didn't go down to 0; it was 0 to begin with, it just didn't go up: old rating 0, new rating 0, change 0. Note that ratings can and do go negative. That's because all players start with a rating of 0, and when unknown players compete, some players' ratings go up and the other players' ratings go down. I've never seen ours drop below about −4 ... In a 2-player game they don't go negative; in a six-player game the loser will go negative. You can certainly play around with that site, and in fact interact with it using the URL only (as an API) to get a feel for how TrueSkill behaves. Good for debugging as well ;-). And the code is all open, and you can host it yourself too for fun ... all possible. Bear in mind TrueSkill is a numeric implementation of a Bayesian learning system for random variables ... nice and wordy, what? The point being that it is not a simple algorithm and its results and behaviours can take some interpretation. You can see the negative-ratings behaviour in the sketch below.
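A quick free-for-all sketch of that behaviour (six new players, with the listed order as the finishing order; "exposed" ratings are the µ − 3σ figures discussed above):

```python
import trueskill
from trueskill import Rating, rate

# Six brand-new players, one per "team"; list order = finishing order.
teams = [(Rating(),) for _ in range(6)]
results = rate(teams)

for place, (r,) in enumerate(results, start=1):
    print(place, round(trueskill.expose(r), 2))
# Everyone started with an exposed rating of 0; after this one game the
# exposed ratings spread out around it, and those at the bottom can go negative.
```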
It's not without bugs though. Notably, an inescapable property of the TrueSkill algorithm, and of all Bayesian algorithms, and in fact of any sensible skill modelling system, is that the results are not order independent. That is, the order of game results impacts ratings significantly. This is of low concern to computer gamers, as the computer logs the game and time and can submit results and get rating updates. But for games where the results are entered manually into a rating system, there is a risk of order betrayal ... that is, getting the order wrong. Why is this relevant? Because "Jack beats Jill" gives Jack and Jill different ratings to "Jill beats Jack". This is a natural property of sensible skill modelling (and of Elo and TrueSkill, which are attempts at that, applying Bayes' Theorem for updates), but on any manual entry system it has some dire consequences in terms of data integrity (a sketch of the order dependence follows at the end of this comment). If you want TrueSkill help though, I'd strongly recommend StackOverflow over GitHub issues; the authors themselves are known to help out there. For example: I have yet to reply to Tom there ;-). But cherish his response. It's a good sample of where too much delving into the math behind TrueSkill will lead you (I'm exploring draw modelling).
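A quick sketch of that order dependence (the same two results entered in a different order, using rate_1vs1, where the first argument is the winner):

```python
from trueskill import Rating, rate_1vs1

# Order 1: A beats B, then B beats C.
a, b, c = Rating(), Rating(), Rating()
a, b = rate_1vs1(a, b)
b, c = rate_1vs1(b, c)
print(a, b, c)

# Order 2: the same two results, entered the other way around.
a2, b2, c2 = Rating(), Rating(), Rating()
b2, c2 = rate_1vs1(b2, c2)
a2, b2 = rate_1vs1(a2, b2)
print(a2, b2, c2)  # the final ratings differ from Order 1
```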
I'm guessing you can close this issue now?
Here is a sample leaderboard script, after 3 games between 3 players. It features the conservative rating, or minimum rating. It also shows the maximum rating and the number of games. The rating column is the mean, or average, rating. By looking at minrating and maxrating, one can estimate that a pairing is not too bad: if the minrating of the stronger player is above the maxrating of the other player, then the pairing is bad. Players can just look at the minrating and maxrating to see their limits. The code follows; running it prints the leaderboard table.
code"""
sample.py
"""
from trueskill import Rating, rate_1vs1
import pandas as pd
def main():
UNCERTAINTY_FACTOR = 3
p1 = Rating()
p2 = Rating()
p3 = Rating()
p1g, p2g, p3g = 0, 0, 0
# Report game result from 3 games.
p1, p2 = rate_1vs1(p1, p2)
p1g += 1
p2g += 1
p2, p3 = rate_1vs1(p2, p3)
p2g += 1
p3g += 1
p1, p3 = rate_1vs1(p1, p3, drawn=True)
p1g += 1
p3g += 1
# Build a rating list
# Prepare data input.
input_1 = {
'name': "John",
'tskill': p1,
'games': p1g,
}
input_2 = {
'name': "Peter",
'tskill': p2,
'games': p2g,
}
input_3 = {
'name': "Sandro",
'tskill': p3,
'games': p3g,
}
inputs = []
inputs.append(input_1)
inputs.append(input_2)
inputs.append(input_3)
# Create a dictionary of player data.
dic = {}
for i in inputs:
dic.update({i['name']: i})
pdata = []
for n, v in dic.items():
name = n
mean_trueskill = v['tskill'].mu
uncertainty = v['tskill'].sigma
games = v['games']
min_trueskill = mean_trueskill - UNCERTAINTY_FACTOR * uncertainty # conservative
max_trueskill = mean_trueskill + UNCERTAINTY_FACTOR * uncertainty
pdata.append({'name': name, 'rating': mean_trueskill, 'uncertainty': uncertainty,
'minrating': min_trueskill, 'maxrating': max_trueskill, 'games': games})
# Convert pdata to dataframe to easier handle the table.
df = pd.DataFrame(pdata)
df = df.sort_values(by=['rating', 'uncertainty', 'games'], ascending=[False, True, False]) # high rating first
df = df.reset_index(drop=True)
df.index += 1
print('TrueSkill Leaderboard')
print(df.to_string())
print()
print(f'uncertainty_factor={UNCERTAINTY_FACTOR}')
if __name__ == "__main__":
main() |
I'm trying to understand the workflow for this TrueSkill tool.
The way I imagine doing matchmaking is that my master server will attempt to find players who are already in a lobby with ratings similar to mine, within some sort of offset. However, the first question here is: what is a fair range of offsets for skill level? +50/−50? Second, what type of data should I be adding to my database tables for this to be saved?
Is the workflow like this:
Do you have an example workflow of how this could be used?
From reading the official TrueSkill documentation, it says that for a 4v4 2-team game you would need a minimum of 46 matches to determine, with a high degree of belief, the player's skill. Where is the number of matches stored in your code?
Should my database store both the sigma and mu values of each player? The official calculation I found online for player skill is μ − k × σ where k = 3. I found this on the official Microsoft page.