New NAF Glicko Rankings - Updated 21-Jan-20

News and announcements from the worldwide Blood Bowl players' association

rolo
Super Star
Posts: 1188
Joined: Wed May 27, 2015 9:38 am
Location: Paradise Stadium, where the pitch is green and the cheerleaders are pretty.

Re: New NAF Glicko Rankings

Post by rolo »

Itchen Masack wrote:26 races, 2 sets of rankings. We have 52 World Number 1's :D
I suspect that Elo and Glicko will agree more often than not. For example, Jimjimany is the top Wood Elf player in both rankings. Pipey is the top Norse player in both rankings.

And that shouldn't be surprising; neither ranking system is random black magic. Win games, your ranking improves. Lose and it goes down. The only way to place high in either ranking is to play, and win, a lot of games.

I suspect that the biggest difference is going to be in the middle. Glicko becomes more confident of a ranking the more games are played. So while playing (for example) 300 games with 100 wins, 100 ties and 100 losses is about the same in Elo as not playing at all (~150 ranking), it makes a huge difference in Glicko. Glicko ratings apparently (?) also degrade over time, so winning a bunch of games years ago, and then never playing again, is worth less in Glicko than in Elo.
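
To make the "degrade over time" point concrete: in Glickman's original Glicko description it is the uncertainty (phi, the rating deviation) that grows while a player is idle, rather than the rating itself dropping. Below is a minimal sketch of that rule. The constant `c`, the length of a rating period and the starting phi of 80 are illustrative assumptions, not values taken from the NAF implementation, which may use a different variant.

```python
import math

MAX_PHI = 350.0  # conventional Glicko deviation for a completely unknown player


def inflate_phi(phi: float, periods_inactive: int, c: float = 30.0) -> float:
    """Grow the rating deviation for each rating period without games.

    This is the step-1 rule of the original Glicko system. The value of c
    is an assumption chosen only to make the numbers easy to follow.
    """
    return min(math.sqrt(phi ** 2 + c ** 2 * periods_inactive), MAX_PHI)


if __name__ == "__main__":
    for periods in (0, 4, 16, 64):
        print(periods, round(inflate_phi(80.0, periods), 1))
```

With these made-up numbers a phi of 80 is back at 100 after four idle periods, and it can never exceed the 350 given to a brand new player.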

"It's 2+ and I have a reroll. Chill out. I've got this!"
dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: New NAF Glicko Rankings

Post by dode74 »

I still have a pack of those...

straume
Emerging Star
Posts: 364
Joined: Fri Mar 28, 2014 9:21 am

Re: New NAF Glicko Rankings

Post by straume »

Question: I saw there is some connection between races. Is this the assumed ability (mu)? Or the distribution (phi)? And is it just for the starting point, or does it continue to infer between the races as we play?

In other words: Would my Dark Elf-ranking be adjusted (either by mu or phi) when I play Khemri?

mubo
Star Player
Posts: 749
Joined: Mon Dec 22, 2008 7:12 pm
Location: Oxford, UK

Re: New NAF Glicko Rankings

Post by mubo »

straume wrote:Question: I saw there is some connection between races. Is this the assumed ability (mu)? Or the distribution (phi)? And is it just for the starting point, or does it continue to infer between the races as we play?

In other words: Would my Dark Elf-ranking be adjusted (either by mu or phi) when I play Khemri?
Nope. Don't worry about affecting your (terrific) dark elf rating.

The cross talk is just when you start a new race. I take the mean mu, and the max phi of your existing rankings. So your Khemri will start at mu/phi ~1650/164.7 (mu based on rough in my head maths). I recognise this is a crude approach, but this is alpha. Plus still represents an improvement on 1500/350.
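
The seeding rule described above (mean of the existing mus, max of the existing phis, falling back to 1500/350 for a coach with no history) can be sketched as follows. The function name, the data layout and the example Dark Elf rating are hypothetical; only the rule itself and the 164.7 Orc phi come from the thread.

```python
from statistics import mean

DEFAULT_MU, DEFAULT_PHI = 1500.0, 350.0  # starting point for a coach with no history


def seed_new_race(existing: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Return a starting (mu, phi) for a race the coach has not played before.

    `existing` maps race name -> (mu, phi) for races already rated.
    """
    if not existing:
        return DEFAULT_MU, DEFAULT_PHI
    mus = [mu for mu, _ in existing.values()]
    phis = [phi for _, phi in existing.values()]
    # Mean ability across races, but only as confident as the least certain race.
    return mean(mus), max(phis)


if __name__ == "__main__":
    # Invented ratings, chosen so the result matches the ~1650/164.7 quoted above.
    coach = {"Dark Elf": (1800.0, 70.0), "Orc": (1500.0, 164.7)}
    print(seed_new_race(coach))
```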

Glicko guy.
Team England committee member
straume
Emerging Star
Posts: 364
Joined: Fri Mar 28, 2014 9:21 am

Re: New NAF Glicko Rankings

Post by straume »

Thanks for the swift reply.

Mean Mu makes sense. Now that is a sentence I never thought would see the light of day.

However....you said, max phi and 110? My current highest phi is 164.7 (Orcs played once in 2014)? Did you mean mean phi?

mubo
Star Player
Posts: 749
Joined: Mon Dec 22, 2008 7:12 pm
Location: Oxford, UK

Re: New NAF Glicko Rankings

Post by mubo »

straume wrote:Thanks for the swift reply.

Mean Mu makes sense. Now that is a sentence I never thought would see the light of day.

However....you said, max phi and 110? My current highest phi is 164.7 (Orcs played once in 2014)? Did you mean mean phi?
I use the max. I figure since you've never played tournament Khemri, and you have played tournament orcs, we can't have a better idea of your Khemri ability than your orc ability. Which I think is fair? Happy to listen to sensible petition on this point though.

Glicko guy.
Team England committee member
CyberedElf
Veteran
Posts: 257
Joined: Fri May 31, 2013 12:52 am

Re: New NAF Glicko Rankings

Post by CyberedElf »

I think mean mu is a good starting point, but I think it can be improved. I think it can be adjusted for the mean mu of the races involved. Since new coach race calculations are relatively few compared to ongoing calculations, I think a little more overhead can achieve a slightly better result. I think you could create a delta for each previous race compared to the average of all coaches that have used that race. The average of this delta could then be used to adjust the average of all coaches for the new race.
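
One possible reading of this delta idea, as a rough sketch: measure how far the coach sits above or below the field with each race already played, average those gaps, and apply the result to the field average of the new race. The per-race averages and coach ratings below are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean


def seed_mu_with_race_deltas(
    coach_mu: dict[str, float],
    race_mean_mu: dict[str, float],
    new_race: str,
) -> float:
    """Starting mu for `new_race`, adjusted for how strong each race is on average.

    coach_mu:     race -> this coach's mu with that race
    race_mean_mu: race -> mean mu of all coaches who have used that race
    """
    if not coach_mu:
        return race_mean_mu[new_race]
    # How far above (or below) the field the coach is with each race played.
    deltas = [mu - race_mean_mu[race] for race, mu in coach_mu.items()]
    return race_mean_mu[new_race] + mean(deltas)


if __name__ == "__main__":
    race_means = {"Dark Elf": 1550.0, "Orc": 1500.0, "Khemri": 1450.0}  # invented
    coach = {"Dark Elf": 1800.0, "Orc": 1500.0}  # invented
    print(seed_mu_with_race_deltas(coach, race_means, "Khemri"))
```

Compared with a plain mean of the coach's mus, this starts the coach lower with a race that the field as a whole does worse with, and higher with one the field does better with.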
mubo wrote:I figure since you've never played tournament Khemri, and you have played tournament orcs, we can't have a better idea of your Khemri ability than your orc ability. Which I think is fair? Happy to listen to sensible petition on this point though.
I agree that the khemri estimate can not be more accurate than the orc estimate, but does that justify saying that it isn't worse? I would probably start new teams at the initial phi regardless, only adjusting new mu away from average.

I love the Glicko system and I think it is a definite improvement, but I'm curious why you chose to use decay. Why does a coach not playing prove that their estimate is less significant? In physical sports this makes more sense. We don't know if the player is training as consistently etc. For more mental contests, I would argue that the decay is much less significant. I know for leaderboards it is important to have a high decay to keep positions temporary, but to me this environment does not argue for a middle choice.

Why choose 100 phi as the cutoff? The description from the NAF overview page says "it will usually take a couple of tournaments with a race before phi drops below 100, and we become confident enough of ability to provide a rank." One of my own races has played 7 tournaments over 4 years and is considered inactive. To me, that does not appear to be working as intended based on the description. Maybe it is, maybe it's the decay, maybe my opponents had a high mu. It just seems odd to me. Is there a theoretical reason for 100 being the cutoff?

Thank you for doing this!!! I'm really just being picky with my above statements. I think what you did is great.


Edited a mu to phi typo.

Darkson
Da Spammer
Posts: 24047
Joined: Mon Aug 12, 2002 9:04 pm
Location: The frozen ruins of Felstad

Re: New NAF Glicko Rankings

Post by Darkson »

CyberedElf wrote:Why choose 100 mu as the cutoff? The description from the NAF overview page says "it will usually take a couple of tournaments with a race before phi drops below 100, and we become confident enough of ability to provide a rank." One of my own races has played 7 tournaments over 4 years and is considered inactive. To me, that does not appear to be working as intended based on the description. Maybe it is, maybe it's the decay, maybe my opponents had a high mu. It just seems odd to me. Is there a theoretical reason for 100 being the cutoff?
I asked a similar (but much less detailed) question on the NAF forum, which may or may not partially answer yours:
Darkson wrote:When is (was?) the phi calculated from? As it stands, only one of my races is under 100 (and therefore active), and while I'd in no way count myself as a prolific, or even regular, coach, I have played 8 different races in the last 12 months (I'm not counting Skaven here, as I assume this is only for 'standard' BB, not the variants?).
Purplegoo wrote:There is an article in the works describing the development of the system, which will probably answer your question. While you may have played races in the last year, phi takes more into account than when you last used a team. We think it’s working OK, but will be reviewing as we go along.
I didn't question it further as my understanding of it is limited, but if working as intended it doesn't seem to add much for those that can't make dozens of events a year (which is a shame imo, as I like the idea).

Currently an ex-Blood Bowl coach, most likely to be found dying to Armoured Skeletons in the frozen ruins of Felstad, or bleeding into the arena sands of Rome or burning rubber for Mars' entertainment.
Purplegoo
Legend
Posts: 2259
Joined: Mon Jan 14, 2008 1:13 pm
Location: Cambridge

Re: New NAF Glicko Rankings

Post by Purplegoo »

I'm not really sure how you got from that reply to that conclusion, but with any luck the article I mentioned will clear up any misunderstanding around phi. There's more to it than (in)activity!

Darkson
Da Spammer
Posts: 24047
Joined: Mon Aug 12, 2002 9:04 pm
Location: The frozen ruins of Felstad

Re: New NAF Glicko Rankings

Post by Darkson »

Nope, the article didn't help at all. For example, I've played Halflings and Lizards both twice in the last 12 months, but only lizards show as active (just!) and I had a good year, managing to make 9 events (with 7 races iirc). If that's not going to be enough to make my teams active (and I can't always make that many in a year [stupid world where I have to work] ) then it's not going to be much use to me and people with similar numbers of tournaments (and note again, I like this innovation, so this isn't me just poo-pooing something new).

If the system means I, as a low attendance coach, need to concentrate on one or two races, then that's a minus point for me - I know I'm never going to top the rankings so I like variety.

If there's something else that I'm missing (which I freely admit I probably am) then please explain, as the article left me in the dark with the reply you gave on the NAF.

Currently an ex-Blood Bowl coach, most likely to be found dying to Armoured Skeletons in the frozen ruins of Felstad, or bleeding into the arena sands of Rome or burning rubber for Mars' entertainment.
Purplegoo
Legend
Posts: 2259
Joined: Mon Jan 14, 2008 1:13 pm
Location: Cambridge

Re: New NAF Glicko Rankings

Post by Purplegoo »

The article I speak of is still in the works in that it's yet to be published (eta soon). I'm not sure to what you're referring there (perhaps the OP or a version thereof?), but we're not talking about the same thing. Hence the 2+2 = 5 in your previous post, I think.

You're not the only person to seemingly equate phi entirely to activity and question what constitutes playing 'enough' to be ranked. Activity is part of, but not all of, it and I hope upon article publication your concerns / questions will be answered. While Glicko (like Elo before it) will likely not speak to 100 % of coaches, it won't require you to attend 'dozens' of events a year to get a ranking. Phi is calculated in a more nuanced manner than that. This is still super new and in alpha as Nick mentions above, as I say, more comms / improvements are incoming. ;)

mubo
Star Player
Posts: 749
Joined: Mon Dec 22, 2008 7:12 pm
Location: Oxford, UK

Re: New NAF Glicko Rankings

Post by mubo »

CyberedElf wrote:I think mean mu is a good starting point, but I think it can be improved. I think it can be adjusted for the mean mu of the races involved. Since new coach race calculations are relatively few compared to ongoing calculations, I think a little more overhead can achieve a slightly better result. I think you could create a delta for each previous race compared to the average of all coaches that have used that race. The average of this delta could then be used to adjust the average of all coaches for the new race.
I'm inclined to agree broadly. I guess part of the reason is that I don't like the idea of using "future" data when I'm starting players off in 2005 etc. It's also logistically difficult to maintain a ranking of all races at the same time. It's not easy to find an appropriate mu for the coach and for the race without doing lots of regressions, which isn't something I desperately feel like doing! Re: phi, I think 350 is too much. That's really for when we have no idea about someone. In straume's case, I think 164 is totally reasonable for him with a new race.
CyberedElf wrote:About phi < 100
You mean phi here. I appreciate it's a little unclear. The big issue is that when phi is calculated it depends heavily on the phi of the players you play. It gives us no information if you beat a phi=350 player, the system has no idea if you've beaten a good club player or someone's dog. Unfortunately due to relatively low density of games (especially repeat games vs same players) in the US etc, this tends to feel harsh on US players. We're hoping to explain this a bit better in the coming weeks.
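
The dependence on the opponent's phi can be illustrated with the textbook formulas from Glickman's original Glicko system, where a weighting function g scales down how much a result counts as the opponent's rating becomes more uncertain. This is a sketch on the 1500/350 scale used in the thread; whether the NAF code uses exactly this form (or the Glicko-2 variant) is an assumption, and the function names are hypothetical.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(phi: float) -> float:
    """Weight given to a result against an opponent with deviation phi."""
    return 1 / math.sqrt(1 + 3 * Q ** 2 * phi ** 2 / math.pi ** 2)


def expected_score(mu: float, mu_opp: float, phi_opp: float) -> float:
    """Expected result (0..1) against an opponent, discounted by their uncertainty."""
    return 1 / (1 + 10 ** (-g(phi_opp) * (mu - mu_opp) / 400))


def updated_phi(phi: float, mu: float, opponents: list[tuple[float, float]]) -> float:
    """New deviation after one rating period; opponents are (mu, phi) pairs."""
    information = 0.0
    for mu_opp, phi_opp in opponents:
        e = expected_score(mu, mu_opp, phi_opp)
        information += Q ** 2 * g(phi_opp) ** 2 * e * (1 - e)
    return 1 / math.sqrt(1 / phi ** 2 + information)


if __name__ == "__main__":
    print(round(g(30), 3), round(g(350), 3))
    print(updated_phi(200, 1500, [(1500, 50)]))   # well-established opponent
    print(updated_phi(200, 1500, [(1500, 350)]))  # brand new opponent
```

In this formulation a game against a phi=350 opponent is heavily discounted rather than ignored outright, so your own phi still shrinks, just more slowly than it would against a well-established opponent.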

Re: darkson's point- absolutely correct. This system does make it difficult to get active ratings with many teams. Most people will only have 3/4. Although I think Lycos has 13! I think this is ok- to be confident in a ranking you have to play that team regularly.

Also, everyone still gets a rating. You just may not be included in the "global rank".
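
The split between "rated" and "globally ranked" can be sketched as a simple filter. The phi < 100 cutoff is the one discussed in this thread; ordering the active coaches by mu alone is an assumption, as are the function name and the sample data.

```python
ACTIVE_PHI_CUTOFF = 100.0  # cutoff quoted from the NAF overview page in this thread


def global_rank(ratings: dict[str, tuple[float, float]]) -> list[str]:
    """Return the coaches with an active rating for one race, best first.

    `ratings` maps coach name -> (mu, phi). Everyone in it has a rating,
    but only those with phi below the cutoff appear in the ranked list.
    Sorting by mu alone is an assumption about how the rank is ordered.
    """
    active = [(coach, mu) for coach, (mu, phi) in ratings.items() if phi < ACTIVE_PHI_CUTOFF]
    active.sort(key=lambda item: item[1], reverse=True)
    return [coach for coach, _ in active]


if __name__ == "__main__":
    lizards = {  # invented example data
        "coach_a": (1700.0, 90.0),
        "coach_b": (1800.0, 150.0),  # strong, but too uncertain to be ranked
        "coach_c": (1600.0, 60.0),
    }
    print(global_rank(lizards))
```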

Glicko guy.
Team England committee member
Purplegoo
Legend
Posts: 2259
Joined: Mon Jan 14, 2008 1:13 pm
Location: Cambridge

Re: New NAF Glicko Rankings

Post by Purplegoo »

mubo wrote:Also, everyone still gets a rating. You just may not be included in the "global rank".
I think this is a great point to keep front of mind. Everyone gets a rating. If you played a bunch of games with Lizards in 2013 and have moved on since, you'll still get an idea of how the system rates you by investigating your mu against the currently active Lizard players, for instance. That stuff is all still in there, it's just not particularly sure about you in 2018, which is fair enough.

I'm a variety player too. I don't get to dozens of tournaments these days, my phis are all going to drift upwards compared to those of similarly active coaches that have 3-5 races they love and stick to. It's not unreasonable the system is more sure about them than it is me, although I appreciate it's new and it will take a little time to settle in. I don't use a race more than twice a year and have managed 12 'active' rankings, but I guess the tournaments I play at generally feature coaches with low phis. I don't think it currently takes too much 'activity' to get an active ranking - it's as much about other factors. To be expanded upon in the oft mentioned article / discussion. ;)

CyberedElf
Veteran
Posts: 257
Joined: Fri May 31, 2013 12:52 am

Re: New NAF Glicko Rankings

Post by CyberedElf »

mubo wrote:I'm inclined to agree broadly. I guess part of the reason is that I don't like the idea of using "future" data when I'm starting players off in 2005 etc. It's also logistically difficult to maintain a ranking of all races at the same time. It's not easy to find an appropriate mu for the coach and for the race without doing lots of regressions, which isn't something I desperately feel like doing! Re: phi, I think 350 is too much. That's really for when we have no idea about someone. In straume's case, I think 164 is totally reasonable for him with a new race.
I had assumed the rankings had been built by starting at the beginning and adding one game at a time. So the "current" rank of all teams when a new race is added would be available. Yes, I am aware that is re-running everything you did to get to now (if how I think you got to current data is correct). I just had a revelation, and some of what I wrote above is not 100% correct. I actually think it would be very interesting to recalculate the data using an additional dummy coach. That coach would be treated as if all games were played by the same coach. This doubles the processing required, but gives you a race mu for the new coach/race calculation, and I think it would be interesting to have total race mu to compare the races.

New phi: I think 350 is absolutely correct. Yes, you do have an idea about someone, but that is why they are given a mu not equal to 1500. Phi is confidence in the value of mu. I would argue that what we know about a coach is sufficient to change mu, but not sufficient to give confidence in the untested mu. Also, this could be easily gamed. A coach could play his first race to a low phi, and then all new races would start with a good phi. Whereas someone who played two races, one to a low phi and the other only once (so with a high phi), will have all their new races start at a high phi. That doesn't make sense.

You agree that a new phi can not be better than their worst phi. Why is it reasonable to never think it could be worse?

I do eagerly await the detailed document.

mubo
Star Player
Posts: 749
Joined: Mon Dec 22, 2008 7:12 pm
Location: Oxford, UK

Re: New NAF Glicko Rankings

Post by mubo »

CyberedElf wrote: I had assumed the rankings had been built by starting at the beginning and adding one game at a time. So the "current" rank of all teams when a new race is added would be available. Yes, I am aware that is re-running everything you did to get to now (if how I think you got to current data is correct). I just had a revelation, and some of what I wrote above is not 100% correct.
That's true. The data is structured per player. To work out the mean ranking per race is doable, but then you have to regress coach "ability" plus race "ability". That extra effort in coding + CPU is not worth it (for me). If someone wants to submit a PR, I would be very happy to include it.
CyberedElf wrote: You agree that a new phi can not be better than their worst phi. Why is it reasonable to never think it could be worse?
It's not, but it's an approximation, and crucially a better one than an established player starting at 350. It only makes a major difference in cases where coaches have one race they *always* take, then take a second and get an artificially low phi. Again, if anyone wants to suggest an alternative approach, happy to listen. I might add a min so that a new phi is never below 100 for example.
PS: don't think of phi as better/worse, but higher/lower more confident/less confident.
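
The floor floated above could look like the sketch below: keep the "max of existing phis" rule, but never let a new race start below 100. The value 100 is the example given in the post; the function name and signature are hypothetical, and nothing here is confirmed as implemented.

```python
def seed_phi(existing_phis: list[float], floor: float = 100.0, default: float = 350.0) -> float:
    """Starting phi for a new race: the coach's largest phi, but never below `floor`.

    A coach with no rated races gets the usual default of 350.
    """
    if not existing_phis:
        return default
    return max(max(existing_phis), floor)


if __name__ == "__main__":
    print(seed_phi([60.0]))         # one-race specialist: lifted to the floor
    print(seed_phi([60.0, 164.7]))  # unchanged: the max already exceeds the floor
    print(seed_phi([]))             # brand new coach
```

This only changes the outcome for coaches whose every existing phi is below the floor, which is the one-race specialist case described above.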

Glicko guy.
Team England committee member