Rankit was created to provide a more "scientific" way of ranking rankable objects.
We rank objects by assigning each one a score, which we call its rating. Traditionally, ratings are generated by computing the average score, the median, or some other summary statistic. Although this approach is widely accepted, it can be biased in extreme cases. An average is easy to manipulate when the number of scores an object can receive is unrestricted. Weighting is one countermeasure, but it only mitigates the problem rather than solving it.
Rankit provides a variety of ranking methods beyond the simple average. These include well-known sports ranking systems such as the Massey, Colley, Keener, and Elo ranking systems. Some of the methods borrow ideas from PageRank and HITS, and there is also a ranking system that aims to predict score differences.
To further counter ranking manipulation, rankit also includes methods for merging rankings and measures of the distance between different ranking results.
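To make "distance between ranking results" concrete, here is a minimal sketch of one common measure, the Kendall tau distance, which counts the pairs of items that two rankings order differently. This is standalone Python written for illustration: the function name `kendall_tau_distance` and the second ranking `r2` are made up here, and nothing below is rankit's own API.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count item pairs that rankings a and b order differently.

    Assumes a and b are lists of the same items, best first, with no ties.
    """
    pos_a = {name: i for i, name in enumerate(a)}
    pos_b = {name: i for i, name in enumerate(b)}
    discordant = 0
    for x, y in combinations(a, 2):
        # The pair is discordant when the two rankings disagree on its order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


r1 = ["Miami", "VT", "UVA", "UNC", "Duke"]
r2 = ["VT", "Miami", "UVA", "Duke", "UNC"]  # hypothetical second ranking
distance = kendall_tau_distance(r1, r2)
print(distance)
```

A distance of 0 means the two rankings agree completely; the maximum for n items is n(n-1)/2, reached when one ranking is the reverse of the other.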
All the algorithms implemented in rankit are described in the book Who's #1? The Science of Rating and Ranking. In fact, rankit is essentially an implementation of that book.
Suppose we want to generate the Massey ranking of five NCAA American football teams from their scores in the 2005 season (this example is used more than once in the book mentioned above):
import pandas as pd
data = pd.DataFrame({
"primary": ["Duke", "Duke", "Duke", "Duke", "Miami", "Miami", "Miami", "UNC", "UNC", "UVA"],
"secondary": ["Miami", "UNC", "UVA", "VT", "UNC", "UVA", "VT", "UVA", "VT", "VT"],
"rate1": [7, 21, 7, 0, 34, 25, 27, 7, 3, 14],
"rate2": [52, 24, 38, 45, 16, 17, 7, 5, 30, 52]
}, columns=["primary", "secondary", "rate1", "rate2"])
data
| | primary | secondary | rate1 | rate2 |
---|---|---|---|---|
0 | Duke | Miami | 7 | 52 |
1 | Duke | UNC | 21 | 24 |
2 | Duke | UVA | 7 | 38 |
3 | Duke | VT | 0 | 45 |
4 | Miami | UNC | 34 | 16 |
5 | Miami | UVA | 25 | 17 |
6 | Miami | VT | 27 | 7 |
7 | UNC | UVA | 7 | 5 |
8 | UNC | VT | 3 | 30 |
9 | UVA | VT | 14 | 52 |
from rankit.Table import Table
from rankit.Ranker import MasseyRanker
data = Table(data, ['primary', 'secondary', 'rate1', 'rate2'])
ranker = MasseyRanker(data)
ranker.rank()
| | name | rating | rank |
---|---|---|---|
0 | Miami | 18.2 | 1 |
1 | VT | 18.0 | 2 |
2 | UVA | -3.4 | 3 |
3 | UNC | -8.0 | 4 |
4 | Duke | -24.8 | 5 |
That's it! All you have to do is prepare the game data as a pandas DataFrame, specify the player columns and score columns, pick a ranker, and rank!
There are a variety of ranking methods to choose from, but what if you want to merge several ranking results?
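If you are curious where the Massey ratings come from, the following is a rough sketch of the textbook Massey method in plain Python. It builds the system M r = p, where M counts the games each team played (and, with a minus sign, the games between each pair), and p holds each team's total point differential. Since M is singular, the last equation is replaced by the constraint that the ratings sum to zero. The helper names `solve` and `massey_ratings` are invented for this illustration and this is not rankit's own code; in practice a linear algebra library would do the solve.

```python
games = [
    ("Duke", "Miami", 7, 52), ("Duke", "UNC", 21, 24), ("Duke", "UVA", 7, 38),
    ("Duke", "VT", 0, 45), ("Miami", "UNC", 34, 16), ("Miami", "UVA", 25, 17),
    ("Miami", "VT", 27, 7), ("UNC", "UVA", 7, 5), ("UNC", "VT", 3, 30),
    ("UVA", "VT", 14, 52),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def massey_ratings(games):
    """Return {team: rating} for (team1, team2, score1, score2) records."""
    teams = sorted({team for game in games for team in game[:2]})
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    m = [[0.0] * n for _ in range(n)]
    p = [0.0] * n
    for team1, team2, score1, score2 in games:
        i, j = index[team1], index[team2]
        m[i][i] += 1  # games played by each team on the diagonal
        m[j][j] += 1
        m[i][j] -= 1  # minus the number of games between the pair
        m[j][i] -= 1
        p[i] += score1 - score2  # cumulative point differential
        p[j] += score2 - score1
    # M is singular, so swap the last equation for "ratings sum to zero".
    m[-1] = [1.0] * n
    p[-1] = 0.0
    return dict(zip(teams, solve(m, p)))


ratings = massey_ratings(games)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team:6s}{rating:7.1f}")
```

On the five-team data above, this reproduces the ratings shown in the table: every team played every other team once, so each rating is simply the team's total point differential divided by five.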
from rankit.Ranker import MasseyRanker, ColleyRanker, KeenerRanker, MarkovRanker
from rankit.Merge import borda_count_merge
mergedrank = borda_count_merge([
MasseyRanker(data).rank(), KeenerRanker(data).rank(), MarkovRanker(data).rank()])
mergedrank
| | name | BordaCount | rank |
---|---|---|---|
0 | Miami | 12 | 1 |
1 | VT | 9 | 2 |
2 | UVA | 6 | 3 |
3 | UNC | 3 | 4 |
4 | Duke | 0 | 5 |
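For intuition, a Borda count can be sketched in a few lines of plain Python: in each ranking an item earns one point for every item ranked below it, and the points are summed across rankings. The function `borda_count` below is a made-up illustration that assumes every ranking is a best-first list with no ties; it is not how `borda_count_merge` is implemented.

```python
def borda_count(rankings):
    """Merge best-first rankings (lists of names, no ties) by Borda count."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, name in enumerate(ranking):
            # An item earns one point per item ranked below it.
            scores[name] = scores.get(name, 0) + (n - 1 - position)
    # Highest total first.
    return sorted(scores.items(), key=lambda kv: -kv[1])


order = ["Miami", "VT", "UVA", "UNC", "Duke"]
merged = borda_count([order, order, order])
for name, score in merged:
    print(name, score)
```

The scores 12, 9, 6, 3, 0 in the table above are exactly what three rankings that agree on this order produce, which tells us the Massey, Keener, and Markov rankers ordered these five teams identically.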
So that's rankit! I hope that with rankit there will be fewer disputes over ranking manipulation, and that people who are unfamiliar with the science of ranking will benefit from it.
MIT Licensed.