Examining the NAF meta

News and announcements from the worldwide Blood Bowl players' association

Moderator: TFF Mods

straume
Emerging Star
Posts: 364
Joined: Fri Mar 28, 2014 9:21 am

Re: Examining the NAF meta

Post by straume »

plasmoid wrote: Whether this massive difference is due to newbies playing them more or something else is not something that can be answered, is it? I'm only pointing out that there has long been a theory that "newbies drag down the numbers for orcs" and there seems to be an unusually high number of orc players.
Actually I think I read somewhere that someone (Wulfyn?) did a test on this and compared Orc results for managers with 50+ NAF matches against those with fewer than 50, and found no difference in performance.

Wulfyn
Emerging Star
Posts: 323
Joined: Mon May 19, 2014 9:33 pm

Re: Examining the NAF meta

Post by Wulfyn »

Plasmoid, I actually did a little investigation along similar lines to yours a while back whilst Team England was looking at what races to take for Eurobowl. The problem that you have in your analysis is that it is too general to be really useful. That is to say, who cares if a race is really good at beating up stunties when, if you are looking to place high in a tournament or go to something like Eurobowl, you're not going to face them? When you are balancing the races, who are you balancing for? The person who is looking to win, or the person who finishes half way? The balance will be different because the teams faced will be different. It's why some teams are thinking about dropping Lizards from their UKTC 4 lineup - in a format where you will meet Wood Elves 25% of the time, Lizards get bummed. Right in the skink hole.


https://docs.google.com/spreadsheets/d/ ... 1765530758

The idea here was to look at what races performed well in different competitive environments. The base data will be a bit different to yours so it won't align (and because I was looking at how well teams did against others mirror matches were deliberately kept in), but the second tab is the important one as it allows you to weight the teams you will face.

This is what helped (in addition to some excellent players who are far more experienced than I) lead to the Team England thought process that (especially in a Eurobowl style environment) the tiering is:

1a Woodies, Undead, Lizards, Dark Elves
1b Dwarves, Amazons
1c Chaos Dwarves, Norse, Skaven

2a Necromantic, Humans
2b Chaos Pact, High Elves, Pro Elves, Orcs

3 Everything else (there are tiers within this but we didn't bother spending time on them as they are all much of a muchness when it comes to Eurobowl, no matter how much DocMaxx likes Slann!! :D )


But you can see that there is a rolling meta. By default we looked at what Team England took to see what teams would counter us well, and Chaos Dwarves came quite high. Right now we are discussing whether a team like Necromantic is too volatile (Lycos had an MVP 5-1-0 in Orebro followed by an underwhelming Porto; not his fault he is a great player, but that's what a team like Necro gives you). Interestingly when you look at just the top 4 teams Amazons do incredibly well, and Lizards do incredibly poorly (cos Wood Elves). Also this data is old, and a lot of it was taken before people figured out Dark Elves, so you even have a temporal meta to contend with.

That leads to another interesting thing with your meta-analysis, if you change your tiering are you just swapping one meta for another without really balancing things? We saw at last year's NAF how the uptake of the Eurobowl rules meant that we just swapped out a Tier 1 race for Necromantic. Nothing was balanced, it was just swapped. So you have to be careful in your adjustments as you need to think a few moves ahead!

Perhaps most interesting (at least to a nerd like me), is that when you pick Am, CD, DE, Dw, Li, No, Un, and WE the top 8 teams that do well against you are Un, WE, Li, CD, Dw, DE, No, Am. Is this the equilibrium we have been searching for?

straume wrote:Actually I think I read somewhere that someone (Wulfyn?) did a test on this and compared Orc results for managers with 50+ NAF matches against those with fewer than 50, and found no difference in performance.
Yep! I did this specifically to test the hypothesis that Orcs perform poorly because new players take them, and the most important part of any analyst's job is to disprove their own ideas! Again the numbers won't align with Plasmoid's as I did this a while ago, but Orcs ranked 14th overall. When I restricted the results just to Orc players who had played 60+ tournament games they came... 13th. The real surprise was Humans, who overall ranked 15th, but in the hands of an experienced player performed an excellent 8th. Undead, on the other hand, dropped from 2nd to 5th! So whilst it is true that Orcs are generally played by newer players (only Chaos had a lower average coach experience), and they do have an unusually high number of games (only Undead got close and were still <90%), it seems that, as good as they look on paper, Orcs just are not cutting it in the current tournament meta even with experienced players. Also, fwiw, very few of the top Eurobowl nations pick Orcs, and I think that this tells us something as well.

Would be interesting to see this repeated as it was a couple years ago now and the meta may have changed again.


But yeah, interesting stuff - please do continue! :D


edit: yeah would really have helped if I had opened the spreadsheet for access! Open now, sorry all.

VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Examining the NAF meta

Post by VoodooMike »

plasmoid wrote:In the past you have said yourself that including mirror matches in performance/balance would be crazy, and that if the BBRC did so knowingly then they were stupid (or some other derogatory term, I don't remember). Do you not still think so?
When assessing the roster's design against that of the other rosters I don't think they should have included mirror matches, no... but they did, so any comparisons that relate to the BBRC tiers necessarily will require they be included.

The primary issue you're going to run into with ALL these lines of investigation (even when they're done with the right statistical methods) is that all the methods chosen thus far are heavily confounded by composition - people aren't playing all the rosters with equal frequency which means that teams do not face all rosters as opponents with equal frequency. Any data used will be coloured by that environment's demographic composition, which may not be consistent between environments... yet any ideas related to "tiers" or "adjustments" will have been constructed based on the original dataset's demographics. Any flat adjustments applied will be based not on today's composition but yesteryear's... achieving what?

Now, for investigating roster "balance" overall we've been able to correct for composition in our datasets by weighting the contribution to the win rate distributions based on the frequency we see the opponent in the data... but again, that's creating the assumption of the "perfect world" which is good for design, but not particularly good if the concern is performance in the real world... and certainly not good if someone is actually planning to use the numbers to apply a blanket adjustment to real-world play.
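The weighting described here is just a dot product of per-opponent win rates with opponent frequencies. A minimal sketch with invented matchup numbers (not real NAF data):

```python
# Composition-weighted win rate for one roster. All numbers are made up.
matchups = {
    # opponent: (wins, games) for the roster being assessed
    "Wood Elf": (40, 100),
    "Undead":   (55, 100),
    "Orc":      (70, 100),
}

observed = {"Wood Elf": 0.5, "Undead": 0.3, "Orc": 0.2}   # real-world mix
uniform = {k: 1 / len(matchups) for k in matchups}        # "perfect world"

def weighted_win_rate(matchups, weights):
    """Weight each per-opponent win rate by how often that opponent appears."""
    return sum(weights[o] * w / g for o, (w, g) in matchups.items())

print(weighted_win_rate(matchups, observed))  # rate in the observed meta
print(weighted_win_rate(matchups, uniform))   # rate in an even meta
```

The gap between the two numbers is exactly the composition effect: the same matchup results read differently depending on who actually shows up.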
dode74 wrote:The second is an attempt to tier the races, which is something which can be done with cluster analysis - something Wulfyn did some time ago via PM with me. While explaining the concept he carried out a k-means analysis of that same data (up to July 2015) and came up with the following
Cluster analysis is a pretty terrible method for tiering, especially if we're using a single dimension (win rates) - single dimensional clustering is wishy-washy (I guess the term wishy-washy is itself wishy-washy), as it produces results very similar to just eyeballing things.

The issue with that form of analysis is that it is (certainly in the way you're detailing) arbitrary in nature: you are choosing how many tiers you WANT to have and then finding that many points and assigning members to them based on which point they're closest to. The members of each cluster may be closer to members of adjacent clusters than they are to other members of their own - no effort is made to ensure or even investigate orthogonality. Members from different clusters may not be significantly different from one another, either, in spite of being assigned to different groups. You could ask it for 2 clusters, or 15, and it'd happily churn out results for you.

If you were trying to tier rosters in a way that would have meaning, you'd want to find a set of n (unspecified ahead of time) clusters into which each roster falls naturally and exclusively (that each team's 95CI covers its cluster's centroid is peachy, but it is equally important that its 95CI does not cover any other cluster's centroid... or the range of any member of another cluster). Only at that point do the specific centroid values have any direct utility... if even then, as we'd still be talking about categories so broad as to stay fairly similar to what the BBRC already set out which, let's be honest, can only reliably be broken down into Tier 1 and Tier 3 when avoiding heavy overlap.
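The distinctness requirement is easy to state in code: a tier assignment only carries information if every roster's 95CI covers its own cluster's centroid and nothing else's. A sketch with invented intervals and centroids:

```python
# Check that clusters are distinct: each roster's 95% CI must cover its own
# cluster's centroid and no other. All intervals and centroids are invented.
rosters = {
    # name: (ci_low, ci_high) on the win rate
    "Wood Elf": (0.54, 0.58),
    "Undead":   (0.53, 0.57),
    "Orc":      (0.47, 0.51),
}
assignment = {"Wood Elf": 0, "Undead": 0, "Orc": 1}
centroids = [0.555, 0.49]

def distinct(rosters, assignment, centroids):
    """True only if every CI covers exactly its own cluster's centroid."""
    for name, (lo, hi) in rosters.items():
        covered = [i for i, c in enumerate(centroids) if lo <= c <= hi]
        if covered != [assignment[name]]:
            return False
    return True

print(distinct(rosters, assignment, centroids))
```

If the check fails for any roster, the "tiers" are labels without separation.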

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi Wulfyn,
thanks for contributing and for showing your stuff. If you'd like to see my spreadsheet, just let me know where to send it :)
The problem that you have in your analysis is that it is too general to be really useful
Just to clarify, I did one version of the spreadsheet for the EuroBowl data. And then, when I saw how flexible it was, I entered a different set of data, which resulted in the stuff I've described here. In my EuroBowl stuff (which I did crudely on scrap paper before EuroBowl17, and then with the spreadsheet as debrief for 2017 and early prep for 2018) the meta I used consists of the 272 race picks from 2016 and 2017 - including mirror matches. So it is a meta where 5 teams make up more than half of the meta. And the stats reflect that.

Very interesting to look at your spreadsheet.
Can I ask: just like me, you must run into the "problem" that the CI95 ranges overlap, so the tiers you describe must be muddier than that. Right?
I mean, I often speculate that the correlation between power and popularity of any particular team may be weakened by the fact that, outside of the top 5, people tend to disagree about which teams are the best. Because the numbers are too close to say for sure.

Also: Where did you get the numbers? NAF data dump? Nurgle and Chaos look very low compared to mine.
As I stated, I've used Doubleskulls' site for mine. But I really wish that his site would let me sort the stats by two criteria at once. The site only lets me sort by rulebook or by TV, so I go with LRB6+, but then I have to use unsorted data for TV1000, 1100 and 1200. Which isn't ideal. I'd love to have just the LRB6+/TV1100 stats.

As for the tiers you guys reached, did you try to take the EuroBowl roster bonuses into account? Because it seems to me that those lift Necro out of your tier 2a...?

Funny thing. Your stats, just like mine, have Orcs way down in the hierarchy. In mine they are behind lots of the EuroTier2 teams. Perhaps Orcs ought to be T2 as well (at the Euros).
That leads to another interesting thing with your meta-analysis, if you change your tiering are you just swapping one meta for another without really balancing things?
Well, what I was trying to do with my non-EuroBowl calculations is to create the ideal meta - based on the assumption that if all teams were close to equal, then teams would also get selected in equal numbers, dissolving the very concept of a "meta". Personally I'd much prefer the tactical challenge of playing in such a tournament, than in one where half the people play 5 teams.
We saw at last year's NAF how the uptake of the Eurobowl rules meant that we just swapped out a Tier 1 race for Necromantic. Nothing was balanced, it was just swapped.
Yeah, Necros are an edge case in most tournaments that divide into just 3 tiers. Put them in tier 1 with no bonus, and they struggle. Put them in tier 2 and they get perhaps just a touch too much.
But be that as it may. I built my spreadsheet in order to see what would happen to the win percentage of all teams when particular teams waxed or waned. And to use that to push everyone towards the middle.

Finally - very interesting about Orcs. And humans!
Again I wonder. Were there enough coaches to determine if the difference between the performance of the 2 groups was significant?
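One standard way to answer that question is a two-proportion z-test on the two coach groups' win rates (ignoring draws for simplicity). The counts below are invented placeholders, not the real NAF figures:

```python
# Two-proportion z-test: do experienced and newer coaches win at the same
# rate? Counts here are illustrative only.
from math import sqrt, erf

def two_prop_z(w1, n1, w2, n2):
    """Return (z, two-sided p) for H0: both groups share one true win rate."""
    p1, p2 = w1 / n1, w2 / n2
    p = (w1 + w2) / (n1 + n2)                  # pooled rate under H0
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_val = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two normal tails
    return z, p_val

# e.g. 520 wins in 1000 games vs 480 in 1000 (hypothetical numbers)
z, p = two_prop_z(520, 1000, 480, 1000)
print(z, p)
```

With hypothetical numbers of that size the difference falls just short of significance at the 5% level, which is exactly the sort of nuance raw rankings hide.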
Interestingly, the Danish BB community has a long-standing tradition of working with tier bonuses. Most notably the All Teams Viable rules used for many tournaments, including the Danish Open. For the 2017 version of ATV, Orcs got the buff that the NAF stats looked like they deserved. And everybody and their dog was all over Orcs like a bad rash. So the community clearly thought that Orcs had gotten too big a boost; so many Orc teams was not a desirable situation. But - interestingly - all those Orc teams didn't win any tournaments... So, who was wrong - the coaches or the stats...?

Cheers
Martin

Narrow Tier BB? http://www.plasmoids.dk/bbowl/NTBB.htm
Or just visit http://www.plasmoids.dk instead
Wulfyn
Emerging Star
Posts: 323
Joined: Mon May 19, 2014 9:33 pm

Re: Examining the NAF meta

Post by Wulfyn »

Hi Plasmoid - always interested in having a look, will PM you my email in the morning!

Firstly, apologies that I didn't write the opening paragraph clearly. When I said "you" I meant people in general and not you specifically. It was more me musing on all the problems that there are rather than being critical of your approach. With so much messiness in the data, and the problem of targeting, it is hard to come up with a one-size-fits-all system. A similar example is in one of the great conversations I have with mubo, joemanji, and geggster regarding tournament scoring systems. Whilst strength of schedule is the best tiebreaker in common usage, it does have some issues: does a top 20 player really see any significant difference between a rank 175/200 and a rank 195/200 opponent? SoS means that the person who plays the rank 175 player is going to be ranked ahead in the tournament tie-breaker, but if the difference in the chance of winning those match-ups is effectively zero then is it much good? The closer the player being ranked is in ability to all the players that they play, the better the system works, so SoS seems like a much better tiebreaker for a rank 100/200 player than for a top 20 player. This can be a real issue when the NAFC gets so big that, like last year, we end up with 3 players on a perfect record going into the last game, and then SoS just becomes luck of the draw in who faced the best opponent in the 1st round random draw.
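The SoS tiebreak in question reduces to summing your opponents' final scores, so the whole complaint fits in a toy example (names and scores invented):

```python
# Strength-of-schedule tiebreak in miniature. Ann and Bob finish level on
# points; the tiebreak is decided by who their (randomly drawn) opponents were.
scores = {"Ann": 9, "Bob": 9, "Cat": 6, "Dev": 3, "Eve": 7, "Fay": 4}
opponents = {"Ann": ["Cat", "Dev"], "Bob": ["Eve", "Fay"]}

def sos(player):
    """Sum of the final scores of everyone this player faced."""
    return sum(scores[o] for o in opponents[player])

print(sos("Ann"), sos("Bob"))  # Bob edges the tiebreak purely on the draw
```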

Similarly I think that tier bonuses have a problem because the stats for the midfield are not going to be the same as the stats for the top end. And that's a difficult problem to solve I think!

Plasmoid wrote:Can I ask: Just like me, you must run into the "problem" that the CI95 ranges overlap, so the tiers you describe must be more muddy than that. Right?
Yes. Firstly the tiers I described were a mix of stats and opinion specifically for EB based on experienced player input and racial selection from top teams. Sample size gets quite low as you'll appreciate! So take that list with a pinch of salt from a stats driven perspective.

This is why when trying to create tiers I prefer a k-cluster analysis over a ranked one. CI intervals just give you some idea of confidence, but really the most confident position is the average, so any tier system based on either of those approaches is quite arbitrary (well any tiering attempt is arbitrary, but that one I think more so than others).

Now some people in this thread try to use the big boy language when it comes to things like k-cluster analysis, but they are only amateurs so they don't understand the proper methodology, confidence checks, or approach to this sort of analysis. Yes it has a degree of arbitrary value to it - but then so does all statistics. So when you create the approach there are certain rules you must follow.

For example you must use a random restart approach. If you just start off with something like choosing k groups, splitting the races into equally sized groups and applying your centroids, then you're really just affirming the consequent. So you scramble the races up into the different groups randomly (not even setting parameters like the groups must be the same size) and then let it fly from there. And you do this a ton of times, all with different starting positions. In fact in the approach I took I did not even set the value of k to begin with (it was randomly selected between 3 and 10); I just published the results for k=4 and k=5 as I felt that was a practical constraint for tournaments.

Then you are looking at the speed at which the results settle. If some configurations take much longer to reach a consensus, it is an indicator that you have the wrong number for k, because those groups don't really exist in any meaningful way. That's why I made a specific note about teams that swapped tiers a lot: it indicated that they didn't fit as well as the others, which in turn is a sign that you should increase k. There are a number of other tests you can do as well that I also mentioned in part in my notes - for example High Elves ended up with 2 centroids in their CI range, which is also an indicator of instability in the clustering and a sign that k should be increased. Methods often seem weak to those that do not understand them properly.
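The restart-and-stability procedure being described can be sketched as below. This is a bare-bones 1-D Lloyd's algorithm with random starting assignments, not the actual analysis, and the win rates are invented:

```python
# Run 1-D k-means from many random starts and count how often each pair of
# rosters lands in the same cluster. Stable pairs co-cluster almost always;
# pairs that flip between runs suggest the chosen k doesn't fit the data.
import random
from itertools import combinations

rates = {"Wood Elf": 0.57, "Undead": 0.56, "Dwarf": 0.53,
         "Norse": 0.52, "Orc": 0.48, "Ogre": 0.37}

def kmeans_1d(values, k, rng):
    labels = [rng.randrange(k) for _ in values]       # scrambled start
    for _ in range(100):                              # Lloyd iterations
        cents = [sum(v for v, l in zip(values, labels) if l == c) /
                 max(1, sum(1 for l in labels if l == c)) for c in range(k)]
        new = [min(range(k), key=lambda c: abs(v - cents[c])) for v in values]
        if new == labels:
            break
        labels = new
    return labels

names, values = list(rates), list(rates.values())
rng, runs = random.Random(0), 200
co = {pair: 0 for pair in combinations(names, 2)}
for _ in range(runs):
    labels = kmeans_1d(values, k=3, rng=rng)
    for a, b in co:
        if labels[names.index(a)] == labels[names.index(b)]:
            co[(a, b)] += 1

print(co[("Wood Elf", "Undead")] / runs, co[("Wood Elf", "Ogre")] / runs)
```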

This is why the approach you are taking where you are not so much setting tiers as an a priori venture, but looking at what sort of adjustment would be required is interesting. Of course it is still a difficult venture to quantify those differences in game terms, but is the current tiering meta really working?


Your story about Orcs is particularly intriguing. For me they are solidly within Tier 2. Whilst it is easy to cherry pick things like "they didn't win" as an indicator of success of the tier boost it also does not surprise me.

Finally not sure about the data source, but I'm sure it's out of date now anyway (could have been Doubleskulls, could have been something else). Happy to revisit with updated data tho - interesting to see how things may have changed!

VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Examining the NAF meta

Post by VoodooMike »

Wulfyn wrote:This is why when trying to create tiers I prefer a k-cluster analysis over a ranked one. CI intervals just give you some idea of confidence, but really the most confident position is the average, so any tier system based on either of those approaches is quite arbitrary (well any tiering attempt is arbitrary, but that one I think more so than others).
With a single dimension k-cluster analysis is ultimately no less arbitrary unless, as I say, it shows completely discrete groupings. If not, the tiers you come up with have no useful meaning because they don't represent groups whose members are significantly different from all the members of the other groups they do not belong to. Without having that as a requirement you can create as many or as few "tiers" as you feel like because hey, why not?
Wulfyn wrote:Now some people in this thread try to use the big boy language when it comes to things like k-cluster analysis, but they are only amateurs so they don't understand the proper methodology, confidence checks, or approach to this sort of analysis. Yes it has a degree of arbitrary value to it - but then so does all statistics. So when you create the approach there are certain rules you must follow.
And some people in this thread try to use what really comes down to NOT-boy language to be circumspect with their ad hominems, though they just come off as a pussy in the process. I mean you, BTW... I wouldn't want to be a pussy about it too ;)

This is not the sort of arbitrary we have in all statistics; this is the sort of arbitrary that leads to the clusters having no meaning, because no effort is made to make the clusters distinct from one another, only to find the best way to clump things into a specified number of groups. What use are tiers in any practical sense if they are not distinct from one another? Using each cluster's centroid value (and it's just one value in a one-dimensional analysis) for anything would be disingenuous... as would implying there is a significant difference between the tiers given the fact that members of different tiers may not be significantly different from one another and, in fact, may be more similar to each other than they are to members of their own clusters.
Wulfyn wrote:Methods often seem weak to those that do not understand them properly.
Unsurprisingly, methods often seem weak when they actually are contextually weak.

Do tell, though... are all members of each "tier" you create using clustering closer to their cluster-mates than they are all non-members? Does only one roster have more than a single centroid within its 95CI? How much 95CI overlap is there between members of adjacent groups? Without having distinctness, the clusters are without utility.

kyrre
Experienced
Posts: 156
Joined: Sun Aug 28, 2016 7:44 am

Re: Examining the NAF meta

Post by kyrre »

sann0638 wrote:Ideally Jip's picture would have been larger. Bit of fun analysis - are you making the raw data available anywhere?
Is it a spreadsheet of all played matches you are after? If so, I have made one available here:

Edit: I have moved the link here: viewtopic.php?f=81&t=44538&p=789413#p789413

Last updated on Saturday, December 2, 2017. It is a csv file inside a zip. 187,955 matches.

I have tried to add classifications for LRB4-6 and swiss scoring. However, the dataset is unstructured, with big differences in quality, and most of the tournaments fail to record their ruleset and how they matched teams. Of the 3,297 tournaments, I have yet to determine the rules used for 2,162. 851 tournaments seem to be some form of the swiss system.

Are there official dates for when NAF started using the different rulebooks?

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi Kyrre,
nice work!
Now if someone could set up a way to sort all of these (or some of these, if one wanted to start at a specific date) into win percentages for each race-race matchup, then that would be super helpful!
Anyone?

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi VoodooMike, thank you for your input.
When assessing the roster's design against that of the other rosters I don't think they should have included mirror matches, no... but they did, so any comparisons that relate to the BBRC tiers necessarily will require they be included.
OK, duly noted. For the record, this is not about the BBRC tiers.
Any data used will be coloured by that environment's demographic composition, which may not be consistent between environments... yet any ideas related to "tiers" or "adjustments" will have been constructed based on the original dataset's demographics. Any flat adjustments applied will be based not on today's composition but yesteryear's... achieving what?
Well, unlike NTBB, this is looking at one particular environment: NAF LRB6/CRP resurrection games, trying to figure out what would make teams equally likely to win a tournament prior to people selecting their teams. The underlying assumption is that if no team is clearly more likely to win prior to team selection, then that will indeed make team selection harder/impossible to predict (and accordingly harder to metagame).
Now, for investigating roster "balance" overall we've been able to correct for composition in our datasets by weighting the contribution to the win rate distributions based on the frequency we see the opponent in the data...
I was intrigued by this, but I'm not sure I'm reading you right, so let me just ask:
1. Is there a proper way to calculate projected win stats for a (indeed any) given meta-composition?
2. If so, is there a proper way to calculate the CI95 range for these projected stats - or would it perhaps be misleading to even have such ranges, given that the stats aren't for actual games played?
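One common (but certainly not the only) way to approach both questions: project the rate as a composition-weighted sum of per-matchup rates, and put a Wilson 95% interval on each underlying matchup rate. The projection itself has no extra games behind it, so its uncertainty is only as honest as those inputs. All numbers below are invented:

```python
# Projected win rate for a hypothetical meta, plus a Wilson 95% interval on
# one of the underlying matchup rates. Records and mix are made up.
from math import sqrt

def wilson_95(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / games + z * z / (4 * games ** 2))
    return centre - half, centre + half

records = {"Wood Elf": (40, 100), "Orc": (70, 100)}   # (wins, games)
meta = {"Wood Elf": 0.6, "Orc": 0.4}                  # target composition

projected = sum(meta[o] * w / g for o, (w, g) in records.items())
print(projected)            # point projection for this meta
print(wilson_95(40, 100))   # uncertainty on one matchup rate
```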
Cheers
Martin

VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Examining the NAF meta

Post by VoodooMike »

plasmoid wrote:For the record, this is not about the BBRC tiers.
I realize that, and that your interest is in tournament balancing "tiers", but the potential remains for teams of a given roster to face other teams of the same roster. You're trying to predict the performance of each roster based on collective win rates, while ignoring a type of match they're likely to play. It's not that you can't do it, it's just not more accurate... though the tournament "tier" system is itself a bad system to begin with.
plasmoid wrote:Well, unlike NTBB, this is looking at one particular environment: NAF LRB6/CRP res games, trying to figure out what would make teams equally likely to win a tournament prior to people selecting their teams.
The thing with tournaments is that the roster demographics are almost certainly going to be far, far more skewed than we see in environments with a larger scale. The lower the number of teams, the more easily those demographics can deviate from any sort of norm. To that end, trying to use normalized distributions with an expectation that they'll result in proper balance in each tournament is... unrealistic. Very much so.
plasmoid wrote:The underlying assumption is that if no team is clearly more likely to win prior to team selection, then that will indeed make team selection harder/impossible to predict (and accordingly harder to metagame).
Yeah, I get the idea, and it's a noble concept, but you (and everyone here) are using an extremely over-simplistic model to achieve that end, and it isn't going to clean up the mess, just push it around. The basic idea is that you try to find distributions of team win rates, and then shift the centers of those distributions together to "balance" the groupings, but all you do is change which teams have the advantage and which are at a disadvantage. The larger and wider the distribution, the more the upper end will end up at a global advantage, and the more the lower end will end up at a global disadvantage, if you successfully align the groups.
plasmoid wrote:Is there a proper way to calculate projected winstats for a (indeed any) given meta-composition?
You can calculate projected win rates once you know the composition of the environment, but not more accurately than "could be almost anything" prior to that. Trying to use any number beforehand to balance a tournament before you know its composition would be disingenuous, but I suspect people just don't understand that it's a fool's errand.
plasmoid wrote:If so, is there a proper way to calculate the CI95 range for these projected stats - or would it perhaps be misleading to even have such ranges, given that the stats aren't for actual games played?
Oh you can make legitimate projected win rates for known compositions based on the data we have, but not that cover unknown demographic distributions beyond 50% +/- 50% (that's hyperbole, but the true numbers would be equally useless).

Now, that's not to say that there's no way to make a tournament balancing system... it's just you'll never make one in the vein of these "tournament tiers" systems that doesn't just change where the bias is located.

kyrre
Experienced
Posts: 156
Joined: Sun Aug 28, 2016 7:44 am

Re: Examining the NAF meta

Post by kyrre »

plasmoid wrote:Hi Kyrre,
nice Work!
Now if someone could set up a way to sort all of these (or some of these, if one wanted to start at a specific date) into win percentages for each race-race matchup, then that would be super helpful!
Anyone?
I can probably do that for you. Give me a few days. Anyone else is welcome to deliver before me of course.
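For anyone who wants to roll their own in the meantime, a sketch of that tally against a hypothetical CSV layout with columns race1, race2, score1, score2 (the real dump's column names may well differ):

```python
# Build a race-vs-race table of (wins, draws, games) from a match CSV.
# The column names below are assumptions about the file layout.
import csv
from collections import defaultdict

def matchup_table(path):
    """Return {(race_a, race_b): [wins_for_a, draws, games]}."""
    table = defaultdict(lambda: [0, 0, 0])
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            a, b = row["race1"], row["race2"]
            s1, s2 = int(row["score1"]), int(row["score2"])
            # record the game under both orderings so lookups are symmetric
            for x, y, sx, sy in ((a, b, s1, s2), (b, a, s2, s1)):
                rec = table[(x, y)]
                rec[2] += 1
                if sx > sy:
                    rec[0] += 1
                elif sx == sy:
                    rec[1] += 1
    return table

def win_pct(rec):
    wins, draws, games = rec
    return 100 * (wins + 0.5 * draws) / games   # counting a draw as half a win
```

Filtering by date or ruleset would then just be a matter of skipping rows before the tally.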

Do you know how they classified swiss scoring in the dataset you used? I am curious if there is some information I am missing on the matter.

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi Kyrre,
that would be awesome. I'm sorry that I can't do it myself.
It would be great if you could set it up so that it would be automated, and you (or I, or anyone) could simply edit the csv file to include the games that one wanted to include in the stats.

I don't have any info on which tournaments used Swiss, or which specific tournaments used LRB6/CRP (though Ian's site has the option to include LRB6 games only, and this includes CRP).
Cheers
Martin

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: Examining the NAF meta

Post by dode74 »

Of those matches in the csv dataset, 143,778 don't have a confirmed ruleset. 32,986 are CRP, 8,934 are LRB5, 1,906 are LRB4 and 350 are BB2016. If we (questionably) take the CRP, LRB5 and BB2016 tournaments together then we have 42,270 matches to sort into 576 bins for race vs race data. To give you an idea of how much use that's going to be: Norse and DE are middling T1 teams in terms of games played under CRP/LRB6/BB2016. They played each other 383 times, and that's relatively high. When you get to things like Vamps vs Nurgle you're into a total dataset of 16 matches.
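To put the sample-size point in numbers: under a normal approximation, the worst-case (p = 0.5) 95% margin of error on a win rate is 1.96·sqrt(0.25/n), so it shrinks only with the square root of the games played:

```python
# 95% margin of error on an observed win rate at various sample sizes.
from math import sqrt

def margin_95(n, p=0.5):
    """Normal-approximation half-width of a 95% CI on a proportion."""
    return 1.96 * sqrt(p * (1 - p) / n)

for n in (16, 383, 42270):
    print(n, round(margin_95(n), 3))
```

A 16-game bin carries a margin of roughly ±24.5 percentage points, so individual race-vs-race cells at that size are mostly noise.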

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi Mike, thanks for responding.
You're trying to predict the performance of each roster based on collective win rates, while ignoring a type of match they're likely to play. It's not that you can't do it, it's just not more accurate...
We're blessed with 24 "factions" in BB, which somewhat disguises the problem. But I think any PvP game with factions like, say, Overwatch (25?), Hearthstone (9) or Starcraft (3) ought to strive for a good balance between the factions. In this respect I find the information that "faction x never really beats itself" completely devoid of meaning, and this only gets worse if any faction gets played significantly more than average.

On the upside, if all factions (or teams) perform reasonably close to the 50% mark (and that is my ideal), then including or excluding mirror matches will make very little difference. It is only really an issue if one or more factions is overperforming in the first place.
The thing with tournaments is that the roster demographics are almost certainly going to be far, far more skewed than we see in environments with a larger scale.
True. And as long as there are match-ups like Dwarf vs Amazon, individual tournaments won't be completely balanced.
But I think that the meta-game, the game of team selection, can be balanced - i.e. that no team comes in with a substantial advantage or disadvantage already before anyone has chosen their team.
Yeah I get the idea, and its a noble concept, but you (and everyone here) are using an extremely over-simplistic model to achieve that end, and it isn't going to clean up the mess, just push it around. The basic idea is that you try to find distributions of team win rates, and then shift the centers of those distributions together to "balance" the groupings, but all you do is change which teams have the advantage, and which are at a disadvantage.
Not sure I'm reading you right here - but the EuroBowl rules (and lots of Danish tournaments) have shown me that teams can be pushed into different performance tiers. The problem, as you say, is that we may well just push teams around and create a new top/bottom. I tried to address this by creating the excel sheet that would (to some extent) show the effect on other teams of each performance tweak, so that, for example, massively boosting High Elves wouldn't just send Lizardmen into a tail spin.
You can calculate projected win rates once you know the composition of the environment, but not more accurately that "could be almost anything" prior to that.
What I meant was: is it possible to project win rates based on global composition? So in essence, given that we know which races have shown up to tournaments for the past 6 years, what the most likely outcome would be. That's what I tried to do with what I think you refer to as a normalized distribution. I get that no 10 (or 24) team tournament will indeed be normal in distribution, but trying to figure out which team/faction is likely to win (i.e. which is a stronger or weaker team choice) is still a relevant consideration... in that most attending coaches will consider it.

It is in this regard that I ask: with such a projected/normalized winrate, would it be possible to calculate CI95 ranges, or would that just essentially be humbug (making unscientific projections look more scientifically valid than they are)?
Now, that's not to say that there's no way to make a tournament balancing system... it's just you'll never make one in the vein of these "tournament tiers" systems that doesn't just change where the bias is located.
I do think that strong tier rules can make a world of difference.
Even so, I'm curious to hear your thoughts on how to fix/de-bias tournaments. Would you care to share your thoughts on that?

Cheers
Martin

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: Examining the NAF meta

Post by plasmoid »

Hi Dode,
I know that weird team A vs weird team B will not be backed by a lot of data.
I don't know how Ian kept track of this, but on Ian's site (http://naf.talkfantasyfootball.org/lrb6.html) roughly 113,000 games (226,000 results) have been sorted into the category LRB6 (which is essentially LRB6/CRP). The site has a further 36,700 games listed as LRB5, which I haven't included in my data, but there were not that many differences between LRB5 and LRB6. So at least the data situation is slightly less grim, even if we are hard pressed for data on Elf vs Underworld.
Cheers
Martin
