Ranking for Challenge Leagues

Post by **VoodooMike** » Fri Apr 29, 2016 5:00 am

Now, I'll preface this by saying I don't particularly like challenge-based leagues - my preference is matchmaking, and by now most people know my thoughts on how to make that work. I do love trying to solve problems, so to that end I decided to take a swing at the challenge ranking issue.

The Problem

Generally speaking, challenge-based leagues cannot generate any sort of reasonable ranking because teams are able to pick and choose their opponents. In many ways this is the same problem matchmaking leagues face if people are able to concede during the pre-match phase, after being able to examine their opponent's roster. Your win% is meaningless if you can simply refuse matches you think you'll lose. What does it matter if you've got a 100/0/0 record if all 100 games were your Wood Elf team against Goblins 1500 TV lower?

The Theory

For a ranking to be meaningful it has to take into account the things that ordinarily make something like win% meaningless in these environments. The two main issues I see in being able to select your opponent are the ability to control which rosters you face, and the ability to control not only TV difference, but the direction of that difference.

The data we've collected in the past few years has indicated that certain rosters have an easier time winning against certain other rosters, and that having a higher TV than your opponent likewise increases your likelihood of winning. To this end, challenge environments give people the ability to influence the outcome of their matches by selecting opponents they are most easily able to defeat.

To make a ranking that isn't meaningless we need something that promotes "positive play", which I'm defining as pretty much the opposite of the sort of cherry-picking one is able to do in challenge environments. So, the ranking system should rank people who play a more diverse set of opponents higher than those who restrict themselves to the same rosters over and over. It should also rank people who play closer TV games, or even games in which they are the TV underdog, higher than those who always play at a TV advantage. Finally, it should be able to properly rank people on their performance within the context of those things.

My first thought was that the ranking number would be additive rather than based on a simple distribution. Each roster would have its own set of "points" based on a team's performance when facing that roster, and the ranking number would be a sum of the points for each roster. This allows us to create a cumulative distribution of performance for a given roster, but limits the possible ranking number based on the number of rosters a team opts to play against.

Match Points

The number of points a match is worth is based on a base value and then adjusted for TV difference.

Base Values
-----------
Win: 100pts
Draw: 50pts
Loss: 25pts

TV Adjustment
-------------
Base Value * (1 - (0.7 * (TV advantage / 1000)))

"TV advantage" is calculated by subtracting your opponent's TV from your TV and clamping the result to the range -1000 to 1000 (anything outside that range is set to the nearest bound). The 0.7 number is based on the theory that inducements limit the chances of one side winning to, at the very least, 30%, which suggests a 70% maximum possible advantage for high TV differences. So a win at a 1000 TV advantage is worth 30 points, while a win as a 1000 TV underdog is worth 170.

Concessions are treated as 0 points for the team that concedes, making it significantly worse than finishing the game with a loss. Concessions are not counted toward the winner's ranking.
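To make the arithmetic concrete, here's a rough Python sketch of the match points rules above. The function name, the `conceded_by` parameter and the use of `None` to mean "don't count this match" are my own placeholders for illustration, not part of any existing league tool.

```python
from typing import Optional

BASE_POINTS = {"win": 100.0, "draw": 50.0, "loss": 25.0}
MAX_TV_GAP = 1000      # TV advantage is clamped to +/- this value
MAX_ADVANTAGE = 0.7    # the assumed 70% maximum possible advantage


def match_points(result: str, own_tv: int, opp_tv: int,
                 conceded_by: Optional[str] = None) -> Optional[float]:
    """Points for one match from one team's point of view.

    conceded_by is a hypothetical flag: "self", "opponent" or None.
    """
    if conceded_by == "self":
        return 0.0     # conceding is worth nothing at all
    if conceded_by == "opponent":
        return None    # not counted toward the winner's ranking
    # Clamp the TV difference to the -1000..1000 range.
    advantage = max(-MAX_TV_GAP, min(MAX_TV_GAP, own_tv - opp_tv))
    return BASE_POINTS[result] * (1 - MAX_ADVANTAGE * (advantage / MAX_TV_GAP))
```

For example, `match_points("loss", 1000, 1500)` gives 33.75, so losing as a 500 TV underdog is worth more than losing an even match (25).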

Roster Points

For each possible roster a team might face, a set of points is accumulated using the Match Points system. For the first five (5) matches played against a roster, each match's points are divided by 5 and added to the roster points. This means that for the first 5 games each game will increase the total roster points whether you win or lose.

For the 6th through 9th matches against a roster, the existing total is multiplied by (x-1)/x where x is the number of games played against that roster (including the new one), and the new game's points are divided by x and added to that total.

For the 10th game and all subsequent games the same formula is applied, but x is always 10. This means that from the 10th game on, each new game with that roster accounts for 10% of your points for that roster, slowly decreasing the contribution of older games over time. This keeps the roster-based rating weighted in favour of more recent matches.
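The three phases above can be sketched as a single update function. This is just my reading of the rules, with hypothetical names; `games_played` is assumed to be the number of games against that roster before the new match.

```python
def update_roster_points(current: float, games_played: int,
                         new_points: float) -> float:
    """Return the new roster points after one more match against a roster."""
    x = games_played + 1          # this match is the x-th against the roster
    if x <= 5:
        # Games 1-5: each match simply adds a fifth of its points.
        return current + new_points / 5
    # Games 6-9 use the real game count; game 10 onward pins x at 10.
    x = min(x, 10)
    return current * (x - 1) / x + new_points / x


# Usage: five even-TV wins (100 each), then an even-TV loss (25).
points = 0.0
for game in range(5):
    points = update_roster_points(points, game, 100.0)
points = update_roster_points(points, 5, 25.0)
```

Worth noting: written this way, the total from game 5 through game 10 is just the plain average of all games against that roster, and after game 10 it behaves like a moving average where the newest game always carries 10% of the weight.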

Total Ranking Number

The total ranking number is the Roster Points for each roster, added together.
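As a quick illustration of why the sum rewards diversity, here's a tiny sketch. The dictionary layout and the numbers in it are made up for the example.

```python
def total_ranking(roster_points: dict) -> float:
    """Total ranking number: the roster points for every roster faced, summed."""
    return sum(roster_points.values())


# Hypothetical teams: one farms a single roster, one plays three rosters.
cherry_picker = {"Goblin": 100.0}
varied_team = {"Goblin": 60.0, "Chaos": 60.0, "Wood Elf": 60.0}
```

Here `total_ranking(cherry_picker)` is 100.0 and `total_ranking(varied_team)` is 180.0, so the team with the weaker per-roster results still ranks higher because it has faced more rosters.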

Improvements

The above system is made for simplicity more than it is for accuracy. In its simplest form it will still favour agility teams over bash teams as they are more likely to win games than their less agile counterparts. Similarly, the relationship between TV difference and match outcome advantage is not actually linear.

We can improve accuracy by sacrificing simplicity. By taking the roster vs roster win%s from a large dataset (preferably one created from matchmaking) we can set different base values for match points based on the general "win rate" of one roster versus another. To set those base values we would use the formula:

Roster-specific win points: 200 * (1 - (win rate vs. roster))
Draw: win points / 2
Loss: win points / 4

So, for example, if the data says that, on average, Wood Elf teams have a 0.57 (57%) win rate against Chaos, then the win/draw/loss points for matches played by Wood Elf teams against Chaos teams would be 86/43/21 (halves rounded down). Taking the Chaos win rate against Wood Elf as 0.43 (43%), Chaos vs Wood Elf would be 114/57/28.
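A small sketch of that base value calculation follows. The function name is hypothetical, and rounding down to whole points is an assumption on my part, taken from the 21 and 28 in the example figures.

```python
import math


def roster_base_values(win_rate: float) -> dict:
    """Win/draw/loss base points from a roster-vs-roster win rate (0.0-1.0)."""
    win = 200 * (1 - win_rate)

    def whole(value: float) -> int:
        # Round away float noise first, then drop any fraction (assumed).
        return math.floor(round(value, 6))

    return {"win": whole(win), "draw": whole(win / 2), "loss": whole(win / 4)}


wood_elf_vs_chaos = roster_base_values(0.57)
chaos_vs_wood_elf = roster_base_values(0.43)
```

With a 0.50 win rate this gives back the original 100/50/25, so the simple system is just the special case where every matchup is treated as even.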

The TV advantage adjustment can also be improved by replacing the linear 0.7 * (TV advantage / 1000) term with a curvilinear regression equation fitted to the same dataset. This will adjust the expected advantage in a much more precise way.

Possible Problems

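One issue I see is with getting higher TV teams to accept challenges from lower TV teams. Without a special case in place, it is possible for a team to lose points even when they win against a team with a much lower TV: from the 6th game against a roster onward each new match is averaged into the roster points, so a win that is heavily discounted for TV advantage can be worth less than the current total and drag it down. This only happens if the higher TV team rarely plays against teams of significantly lower TV.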

To solve this I suggest that the system never lower the higher TV team's roster points for a win in cases where the higher TV team was the one being challenged. Thus, if a lower TV team challenges a higher TV team to a match, and the higher TV team accepts and wins, and that win would, based on the above system, decrease the higher TV team's points, it instead leaves them unaltered.
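Here's a minimal sketch of that special case, repeating the roster update rule so the example stands alone. All names are placeholders, and whether a protected match should still count toward the number of games played against that roster is something I haven't decided, so the sketch only returns the points.

```python
def roster_update(current: float, games_played: int, new_points: float) -> float:
    """Standard roster points update (games 1-5 add a fifth, then averaged)."""
    x = games_played + 1
    if x <= 5:
        return current + new_points / 5
    x = min(x, 10)
    return current * (x - 1) / x + new_points / x


def protected_update(current: float, games_played: int, new_points: float,
                     won: bool, own_tv: int, opp_tv: int,
                     was_challenged: bool) -> float:
    """Apply the update, but never let an accepted challenge punish a win."""
    updated = roster_update(current, games_played, new_points)
    protected = won and was_challenged and own_tv > opp_tv
    if protected and updated < current:
        return current            # leave the roster points unaltered
    return updated


# A team on 80 points after 12 games wins at a 1000 TV advantage (30 points).
accepted = protected_update(80.0, 12, 30.0, True, 2500, 1500, True)
issued = protected_update(80.0, 12, 30.0, True, 2500, 1500, False)
```

In this example `accepted` stays at 80.0, while `issued` drops to 75.0 because the higher TV team picked the fight itself.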