Improvement to Strength of Schedule

Discuss teams, ride/hotel sharing, trash talk, and event results here

Moderators: lunchmoney, TFF Mods

Purplegoo
Legend
Posts: 2256
Joined: Mon Jan 14, 2008 1:13 pm
Location: Cambridge

Re: Improvement to Strength of Schedule

Post by Purplegoo »

Thanks, I'll take a look later.

I suspect Google is safe for a while. ;)

Wulfyn
Emerging Star
Posts: 323
Joined: Mon May 19, 2014 9:33 pm

Re: Improvement to Strength of Schedule

Post by Wulfyn »

There are a further two and a half problems with strength of schedule that I would like to discuss:

1. Strength

The purpose of strength of schedule is to reward players who faced harder opponents, reflecting that they will have had a tougher run of games than someone on equal standing. However, this concept assumes that whatever system you use to determine the strength of your schedule is a true reflection of the ability of the opponents you have played. When you use how well all of your opponents did in that one tournament, things start to break down, because the final tournament placings of your opponents are not necessarily a reflection of their true ability.

For example, at the recent Dungeonbowl, to pick on 4 friends, Greshvakk and JBone finished ahead of Joemanji and Jimjimany. Gresh and JBone are very good players who are relatively new, having, like many of us, recently returned to the hobby, and they are improving with every tournament they go to. But I am sure they will not feel insulted if I say that I would have had an easier tournament facing them than facing Joe and Jim, who have multiple international caps.


1.5 Strength of Schedule of Schedule

This is related to the first point, in that it concerns a difference between the assumptions and reality. We acknowledge that Player A may have had a harder time of it because of the opponents he faced, which is one of the things SoS is trying to capture. However, we make no such allowance for any of his opponents, whose (tiebreaker) score is fed by his points total and so reflects his tough draw rather than his ability.

So if Player A, a strong player who normally gets 8/12 points (2/1/0, no BP), instead finishes on 4 points because he played 4 very strong players, 1 average player, and 1 weak player, then he gets the benefit of a high SoS from the 4 very strong players he faced. But his 4 tournament points are also added into the SoS of the average player that he faced and beat, so the average player is not recognised for having played a stronger player who just happened to get a tough draw. Thus, to be fair, if you were to use tournament points for SoS you would need a SoSoS, where the SoS points are modified by the opponents your opponents faced. And that gets very complicated.
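As a rough sketch of what plain SoS and a second-order "SoSoS" might look like, here is a toy calculation. The function names, the four made-up players and their results are all hypothetical, and summing the opponents' SoS is only one possible way to define SoSoS.

```python
from typing import Dict, List

def sos(points: Dict[str, int], opponents: Dict[str, List[str]], player: str) -> int:
    """Plain SoS: sum of the final tournament points of everyone `player` faced."""
    return sum(points[opp] for opp in opponents[player])

def sosos(points: Dict[str, int], opponents: Dict[str, List[str]], player: str) -> int:
    """Second-order SoS (an assumed definition): sum of the plain SoS of each opponent."""
    return sum(sos(points, opponents, opp) for opp in opponents[player])

# Hypothetical 2-round event scored 2/1/0: A beat B, C beat D, C beat A, B beat D.
points = {"A": 2, "B": 2, "C": 4, "D": 0}
opponents = {"A": ["B", "C"], "B": ["A", "D"], "C": ["D", "A"], "D": ["C", "B"]}

for name in points:
    print(name, points[name], sos(points, opponents, name), sosos(points, opponents, name))
```

In this toy data A and B are tied on 2 points, and plain SoS separates them because A played the eventual winner while B played the player who lost both games.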

This really is a problem of the randomness of Swiss in large tournaments with few rounds, and something like McMahon would remove the need to do this as you'd be facing more similar opponents throughout (but that also makes SoS more volatile).


2. Relative strength

When we look at SoS we consider the difficulty of a game to be based upon the skill of the opposing player. As players will finish with a tournament score of 0-12pts (in a simple 2/1/0, no BP system), SoS implies a strict relationship between points and strength that should be linear. A player with 3 points should be 50% better than a player with 2 points, because you are getting 50% more points added to your SoS. If we extend this we can see that the relationship cannot hold, simply because players with 0 points would be infinitely easier than players with any other score, and that is impossible. It cannot be a linear relationship, because at some point the relative distance would put the win chance either above 100% or below 0%.

I've not looked too deeply into the numbers behind this but these sorts of situations tend to follow a certain distribution, which is an S-curve.

[Image: S-curve chart]

Now stattos will recognise this as the cumulative normal distribution, and that is why it is common in situations like this. The important point is that as you face people who are very weak or very strong, the chance of winning flattens towards 100% or 0% respectively. And this is on a relative basis, so players who are very strong will tend to see more opponents as very weak, and players who are very weak will tend to see more opponents as very strong. In the chart you would have the chance of winning on the y-axis (0% at the bottom, 100% at the top), with the curve tending towards but never reaching either limit. On the x-axis you would have the relative strength of the opponent, with much stronger on the left and much weaker on the right.

This means that to a very strong player there are far more people squashed up on the right-hand side of that chart, where he has a near 100% chance of beating them. The difference between a 3pt and a 2pt player in terms of SoS is much larger than the difference in the chance of beating those opponents. What makes it worse is that players who are much better and can give you a good game might also end up 1pt apart, yet differ far more in how likely you are to beat them.

This generally makes SoS a much better differentiator for players who tend to finish in the middle, but quite a weak one for players at either end. Very strong players will tend to find a lot of their opponents much weaker, but those same opponents do not find the same thing, because the relationship is non-linear. This means that if SoS is the preferred way of separating tournament winners, it has an inherent weakness in not adjusting for strength vs. score.
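To put some rough numbers on the S-curve idea, here is a minimal sketch that treats the win chance as the cumulative normal of the strength gap. The strength scale, the `spread` parameter and the name `win_chance` are assumptions for illustration only, not fitted to any real results.

```python
from statistics import NormalDist

def win_chance(strength_gap: float, spread: float = 1.0) -> float:
    """Chance of beating an opponent, given my strength minus theirs.

    Both the strength scale and `spread` are arbitrary assumptions.
    """
    return NormalDist(mu=0.0, sigma=spread).cdf(strength_gap)

# Equal one-step gaps in strength give ever smaller changes in win chance
# as you move out towards either tail of the curve.
for gap in range(-3, 4):
    print(f"gap {gap:+d}: win chance {win_chance(gap):.1%}")
```

Under these assumptions, going from an even match to a gap of one step moves the win chance a lot, while going from a gap of two steps to three barely moves it, which is the flattening described above.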

Vanguard
Super Star
Posts: 922
Joined: Sun Jun 08, 2008 8:27 am
Location: Glasgow

Re: Improvement to Strength of Schedule

Post by Vanguard »

One variation on SoS that I believe is quite common is to discount the highest and lowest results from the total. So for the purposes of a tiebreaker, rather than your SoS being the total points of your six opponents, it would be the total of only four of them, with the highest and lowest contributors removed. This mitigates the impact of a particularly easy or hard draw in the early rounds. The downside is that you are now trying to differentiate coaches based on only four matches (or three if it is used for round six seeding).
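A minimal sketch of that trimmed version, assuming the input is simply the list of final points of each opponent faced (the name `trimmed_sos` is made up for illustration):

```python
from typing import List

def trimmed_sos(opponent_points: List[int]) -> int:
    """SoS with the single highest and single lowest opponent score discarded."""
    if len(opponent_points) < 3:
        raise ValueError("need at least three opponents to trim both ends")
    ordered = sorted(opponent_points)
    # Drop the first (lowest) and last (highest) entries, then total the rest.
    return sum(ordered[1:-1])

# Six hypothetical opponents: the 12 and the 0 are dropped, leaving 8 + 6 + 6 + 4.
print(trimmed_sos([12, 8, 6, 6, 4, 0]))
```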

There are also various ways of calculating SoS in the first place, including versions that take into account your opponents' opponents. For example, US College Football used to calculate SoS as (2(OR)+(OOR))/3, where OR is your Opponents' Record and OOR is your Opponents' Opponents' Record.
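As a quick illustration of that weighting, here is the formula written out. It assumes both records are already expressed as single numbers (for example win fractions); how each record is actually compiled is not covered here, and the function name is hypothetical.

```python
def weighted_sos(opp_record: float, opp_opp_record: float) -> float:
    """(2*OR + OOR) / 3: opponents' record counts double their opponents' record."""
    return (2 * opp_record + opp_opp_record) / 3

# Hypothetical inputs: opponents won 60% of their games, their opponents won 30%.
print(weighted_sos(0.6, 0.3))
```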

mubo
Star Player
Posts: 749
Joined: Mon Dec 22, 2008 7:12 pm
Location: Oxford, UK

Re: Improvement to Strength of Schedule

Post by mubo »

Some interesting points there, Dan. I absolutely agree that for SOS to work, the relationship between position and strength should be linear. My hunch is that in a 6 game BB tournament the two are not very well correlated.

I don't think your hypothetical example is quite watertight. If a good player gets only 4 points and has played good players, then according to Swiss those opponents must have had poor results too. The chance of several good players recording poor results is pretty slim. It possibly happened at DB, but I think a bad run of dice in a dice game is more likely, which makes this the same point as above: strength vs placing is not linear. Some simulations would be really interesting.
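One way such a simulation might be set up, as a rough sketch only. It assumes an even number of players, no draws, no rematch avoidance, and a win chance given by the cumulative normal of the strength gap. Every name and parameter here is made up, and it is a much simplified stand-in for real Swiss pairing.

```python
import random
from statistics import NormalDist, mean
from typing import List

def simulate_swiss(strengths: List[float], rounds: int = 6, seed: int = 0) -> List[int]:
    """Final points per player after a simplified Swiss event (2 for a win, no draws)."""
    rng = random.Random(seed)
    n = len(strengths)
    points = [0] * n
    phi = NormalDist().cdf
    for _ in range(rounds):
        # Rank by current points, shuffling players who are level on points.
        order = sorted(range(n), key=lambda i: (-points[i], rng.random()))
        # Pair neighbours in the ranking; with odd n the last player sits out unscored.
        for a, b in zip(order[0::2], order[1::2]):
            if rng.random() < phi(strengths[a] - strengths[b]):
                points[a] += 2
            else:
                points[b] += 2
    return points

def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical field of 60 coaches with normally distributed "true" strength.
field_rng = random.Random(42)
field = [field_rng.gauss(0.0, 1.0) for _ in range(60)]
final_points = simulate_swiss(field)
print("correlation of strength with final points:", pearson(field, final_points))
```

Running it over many seeds and field sizes would give a feel for how well final placing tracks strength over six rounds, at least within these simplifications.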

Something else we should consider is that Swiss places undue weight on the first few games in SOS, and this diminishes thereafter. (I think Paul made this point at DB.) I.e. if 2 players are tied for first, they will likely both have played the players around 3-6 in rounds 4-6, and therefore the SOS depends most on who they drew in R1 (and to a lesser extent R2 and R3). Should you get the tiebreaker for beating a weaker opponent as opposed to a much weaker one? It depends on the shape of your S curve I guess ;)

Glicko guy.
Team England committee member