NTBB: Stats

koadah
Emerging Star
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: NTBB: Stats

Post by koadah »

Chaos are down to 46.26% overall in this month's Box ;)

VoodooMike
Emerging Star
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: NTBB: Stats

Post by VoodooMike »

koadah wrote:Chaos are down to 46.26% overall in this month's Box ;)
Overall isn't very meaningful for a lot of teams. Amazon and Chaos are two rosters where that is particularly true.

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: NTBB: Stats

Post by dode74 »

Hi Martin
You know, I read the pdf and the website, and I feel a little better now. I didn't find any claims anywhere that this was based on solid or bulletproof statistics. Didn't find any mention of statistics in fact - except that the BBRC seem to have managed to fit the tier1 teams into their description of tier 1.
From the PDF:
With that in mind, I’ve set out to create simple yet efficient nerfs and buffs for half of the existing teams – hoping to reduce the gap between the tiers.
Not a specific reference to actual numbers, but for those who know what "tiers" means (whether they agree with it or not) then that statement has a meaning in terms of numbers.
Coincidentally Dode, do you really think the stats say absolutely nothing?
No, but I do think that they don't justify what you appear to have justified based on the stats.
I wonder if you could/would work out a graph with margin of error for my old stats?
I can, but not right now. Better would be the FOL+BB+Your stats as the margins would be that much smaller.
In the end I guess I trusted the stats more because they lined up pretty well with what I expected/predicted prior to collecting them. And if nothing else, the NTBB will appeal to those who based on their own experience feel that the stats (and the ensuing discussion) has rightly identified the überteams. I assume we all agree that the tier 2 and tier 3 teams are not disputed(?). Right?
Confirmation bias regarding the "uberteams". Tier 2 and 3 I have no issues with: everything I have seen suggests that they are worse than the Tier 1 teams.
The team lists presented here are my shot at doing just that, first by narrowing tier 1 by essentially weakening the strongest starters. Secondly by buffing the tier 2 teams to the cusp of tier 1, and buffing the tier 3 teams to move them into the current tier 2.
Anyway, the aim is to fit the current tier 0, 1 and 2 teams into the 45-55% win bracket, tier 2 teams near the bottom.
And tier 3 teams into their own tier (2), grouped as tightly as possible around the 40% mark, leaving a significant gap up to the bottom of tier 1, and a sizeable distance to the top of tier 1.
Tier 2 and 3 I have no issues with, but your statements regarding Tier 1 are contradictory: are you intending to narrow tier 1 or not?
The website is rather more accurate, with Khemri and Humans described as "somewhat underpowered for tier 1" - and not included in tier 2(!)
I'll amend the pdf.
Fair enough, but it's probably worth clarifying the reasons for the changes to those teams as you have done here.

Regarding the use of FOL and Box, I think they are both useful. The vast majority of teams play fewer than 10-15 games in FOL, and I believe the same is true of Box. With that in mind, I agree with Mike that it is useful data.
Overall isn't very meaningful for a lot of teams. Amazon and Chaos are two rosters where that is particularly true.
That rather depends on what you are trying to achieve, surely?

VoodooMike
Emerging Star
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: NTBB: Stats

Post by VoodooMike »

dode74 wrote:That rather depends on what you are trying to achieve, surely?
Not really. If you're trying to look at "overall win%" and you see by the data that they win 100% of their games at 1000 TV then drop to 0 at 1500, but it all averages out to 50%, then you're going to be misled into believing that things are "balanced" around the horsecrap 50% concept. In every practical sense, the roster in question would be garbage and totally broken, but the average across all TVs might suggest (erroneously) otherwise. This is why we should probably have both the mean and the median showing if we're examining the overall win%s, and if we notice that they're not lining up fairly well, not foolishly bury our heads in the mean.

In terms of short-term play, we'd want to heavily focus on lower TV matches... for Chaos that'd mean a fairly low win% and for amazons a really high win%. Since amazons get progressively worse over time, and chaos gets progressively better over time (over time meaning... at higher TV levels in this case) all we're doing by looking at the overall mean win% is obscuring the relevant information about the roster at relevant TV levels.

This is also why it's a problem to claim the design goal is to have the overall win% be in a certain range... with no additional qualifications. Again, does that make a team that has 100% win percentage at half the TVs and a 0% win% in the other half... a balanced tier 1 team? Perhaps it would be more rational to say things like "this tier has a win% of 45-55% at all TV levels"?
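A minimal sketch of the point above, using invented numbers rather than anything from Box or FOL: a roster that wins everything in the low TV bands and nothing in the high ones still pools to exactly 50%. The band boundaries, game counts, and function names are all hypothetical, and win% here is simply wins divided by games (draws are ignored for the sake of the illustration).

```python
# Hypothetical per-TV-band results for one roster: (tv_band, games, wins).
# These figures are made up to mirror the extreme case described above.
BANDS = [
    ("1000-1250", 500, 500),
    ("1250-1500", 500, 500),
    ("1500-1750", 500, 0),
    ("1750-2000", 500, 0),
]


def overall_win_pct(bands):
    """Pooled win% across every game, ignoring TV entirely."""
    games = sum(g for _, g, _ in bands)
    wins = sum(w for _, _, w in bands)
    return 100.0 * wins / games


def band_win_pcts(bands):
    """Win% computed separately inside each TV band."""
    return {name: 100.0 * w / g for name, g, w in bands}


def in_tier_at_all_bands(bands, low=45.0, high=55.0):
    """True only if every band's win% sits inside [low, high]."""
    return all(low <= pct <= high for pct in band_win_pcts(bands).values())


if __name__ == "__main__":
    print("overall:", overall_win_pct(BANDS))
    print("per band:", band_win_pcts(BANDS))
    print("45-55% at all TV levels?", in_tier_at_all_bands(BANDS))
```

The overall figure lands inside the 45-55% bracket while no single band does, which is the gap between "overall win% in range" and "win% in range at all TV levels".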

So... yeah, if you're trying to achieve something stupid then I guess it'd be relevant. Otherwise...

spubbbba
Legend
Legend
Posts: 2269
Joined: Fri Feb 01, 2008 12:42 pm
Location: York

Re: NTBB: Stats

Post by spubbbba »

Do we have stats on how many games are played at different TV levels?

I know that a lot of teams in open divisions don’t make it to 20 games. With leagues, some start fresh every season to make it equal and even if they don’t a lot of coaches want a change after a season or 2.

If say only 1% of games are played at 2000+ TV how much attention should be spent on balancing the teams at that level?
Some teams almost never get above 2000, either due to attrition like the elves and stunties or through TV efficiency such as humans or Orcs. With the changes to TV management in LRB5 onwards there are quite a few teams that don’t really benefit from adding skills above a certain level. Gaining or not giving away inducements is worth more to them than taking sub par skills or inefficient re-rolls and subs.

My past and current modelling projects showcased on Facebook, Instagram and Twitter.
koadah
Emerging Star
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: NTBB: Stats

Post by koadah »

spubbbba wrote:Do we have stats on how many games are played at different TV levels?
My Box page is here.
http://www.cmanu.pwp.blueyonder.co.uk/b ... stats.html
spubbbba wrote:If say only 1% of games are played at 2000+ TV how much attention should be spent on balancing the teams at that level?
Some teams almost never get above 2000, either due to attrition like the elves and stunties or through TV efficiency such as humans or Orcs. With the changes to TV management in LRB5 onwards there are quite a few teams that don’t really benefit from adding skills above a certain level. Gaining or not giving away inducements is worth more to them than taking sub par skills or inefficient re-rolls and subs.
Maybe none for plasmoid if he is only interested in the first 30 games. ;)

2000+ may only be a small percentage but for Fumbblers those will include the Majors & League Premier divisions.
i.e. where the trophies get handed out.
I'm taking the TV of the biggest team so 2000 vs 1500 counts as a 2000 game.
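For anyone wanting to reproduce that bucketing, here is a rough sketch of the rule "the game counts at the TV of the bigger team". The 200-point band width and the function name are my own assumptions for illustration; the Box page may well slice the ranges differently.

```python
def game_tv_band(tv_a, tv_b, band_width=200):
    """Return the (low, high) TV band a game falls in.

    The game is classified by the higher of the two team values,
    so a 2000 vs 1500 match counts as a 2000 game.
    band_width is an assumed value, not taken from the Box stats page.
    """
    higher = max(tv_a, tv_b)
    low = (higher // band_width) * band_width
    return (low, low + band_width)


if __name__ == "__main__":
    print(game_tv_band(2000, 1500))
    print(game_tv_band(1190, 1000))
```

One consequence of this choice is that a low-TV team's games against much bigger opponents are counted in the opponent's band, so a band's figures include underdogs as well as teams that actually sit at that TV.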

Shteve0
Legend
Legend
Posts: 2479
Joined: Thu May 07, 2009 10:15 am
Location: Wellington, New Zealand

Re: NTBB: Stats

Post by Shteve0 »

plasmoid wrote:Right, so one of the accusations here is blasphemy? :wink:
No, it's arrogance. If you don't base your changes - not the content of your changes, or the performance of your changes, but the process of determining which teams need to be changed - off real data, you're essentially purporting to be better at game design than the BBRC.
*I disagree that the win-percentage that the tiers are based on should be based on a team's lifetime average. If a team starts very weak and finishes very strong, it may well appear balanced over a 30 game stretch, but if you play 60 games, you have 40 games of "very strong" rather than 10. In the same vein - but more importantly - teams that start very strong but have a weak finish will also be perfectly balanced in the CRP-design perception, but to my mind is quite unbalanced, because lots of leagues only play those 1st 10 games and then start over. (and because tournaments are closely related to those first 10 games).
So go get new data. If I want to set up my new TV I'll go and get an instruction manual for my new TV, not use the manual from the one I had ten years ago. This is pretty fundamental - I repeat, I'm not asking you to justify how strong you think teams are, that's pretty obvious.
*I also disagree with the tenet that it is awesome that both tier 2 and tier 3 are so wide. I know this is in contrast to both JJ, (most) of the design team, and a lot of BB players. NTBB is a set of house rules for those that agree with me on this
Pray tell, oh condescending lord of all knowledge, how exactly you have determined how wide tiers 2 and 3 are without the use of data? Can you feel it in the knuckles of your toes? Perhaps intuition?

Edit: ok, I'm cutting out of this. I accept that you're heading to the table with dode, and that's a good thing. I can see several of my points have bridled with you - you apparently don't hear the 3DB guys calling NTBB 'the next CRP' on their show - and neither of us is dealing well with this, so let's just cut it completely and move on. My point is that you need to engage real stats on the current performance of teams if you have any realistic goal of moving towards an end point, which also needs to be specified; and so long as it's clear exactly what you're doing, I have no problem with 'mis-selling'. Tiers are linked to win%; I feel strongly that if you're going to talk about them, you need to be clear on where you want to move them to, how you have determined current %s (and in what format/at what number of games/at what TV), and what % reduction you're aiming for with each team.

League and tournament hosting, blogging and individual forums - all totally free. For the most immersive tabletop sports community experience around, check out theendzone.co
plasmoid
Legend
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: NTBB: Stats

Post by plasmoid »

Hi Dode and Mike,
Not a specific reference to actual numbers, but for those who know what "tiers" means (whether they agree with it or not) then that statement has a meaning in terms of numbers.
I didn't mean that. I meant that I never claim anywhere that I have solid or bulletproof statistics (e.g. games played data) to support the moves I'm making. In fact, I had the exact same data that the BBRC used. Which I felt was OK to use in the exact same context that the BBRC did.
Now, admittedly, I thought the data was better than it is.
So my point with the above was merely that I wasn't lying.
No, but I do think that they don't justify what you appear to have justified based on the stats.
I get that.
With the 2 specific questions I asked, I was trying to understand how good/bad the stats were.
Neither you nor Mike actually replied. Ah well, just curious.
I can, but not right now. Better would be the FOL+BB+Your stats as the margins would be that much smaller.
FOL+Box+My stats could be interesting :)
Perhaps you could do the quick math to check the range for undead and zons, who seem above tier 1, overall.
Tier 2 and 3 I have no issues with, but your statements regarding Tier 1 are contradictory: are you intending to narrow tier 1 or not?
Well you did quote me quoting myself saying: narrowing tier 1 by essentially weakening the strongest starters
So what I mean is tier 1 should remain 45-55%. I just don't want any "tier 1" teams to be above that. And certainly not in short term play - even if they later drop down into 45-55%.
Make sense?
Overall isn't very meaningful for a lot of teams. Amazon and Chaos are two rosters where that is particularly true.
That rather depends on what you are trying to achieve, surely?
See, this is where Mike agrees with one of the basic tenets of NTBB (or does NTBB agree with Mike)?
I believe that a team is "broken" if it is better than tier 1 in short term play.
Taking the combined 3 sets of data, Orcs, Dwarfs and Wood Elves aren't broken in "overall" play - but I'd love to have stats for their short term performance. Heck, I'd love to see short term stats for everyone! :orc: Solid stats, mind you.

Orcs and Dwarfs I'd consider nerfing anyway. Not because of any personal preference or dislike. But because NTBB nerfs the number 1 thing that these guys fear (cpomb) - and since I won't be able to generate enough NTBB data, I'm trying to pre-empt what I think is a very likely ripple effect.

[It should be noted that NTBB to some extent addresses the extreme long term by using SE/Bank to lower the TV cap to what the BBRC originally intended (TV 220, roughly). This is because, IMO, the stat heavy teams, like chaos or elves just get better and better as TV rises - and will dominate if allowed to rise too high].

Anyway, what's short term play. I'm spitballing here.
Roughly 8 games? What would that translate to in TV? 1300TV tops?
I think there is a massive difference between an early TV1200-1300 team, and a team that has spent 50 games, to get 6 super skilled players, 5 duds and 1 reroll. Like Minmaxed Pact.
That's why I'd much rather go by "games played" than by TV.
Or to put it differently: Game 1-8 is early play. TV1200-1300 doesn't have to be.

Question is, how much data do we have with both teams in their game 1-8?
And to get more precise stats, do we try to compensate for spread of opponents by weighting each opposing race equally? (I believe Mike has suggested so in the past - which begs the question: How do we weight tier 2 and 3 teams then? Same as the others?)
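To make the weighting question concrete, here is a small sketch of the two ways to compute a roster's win%: pooling every game, or averaging the per-opponent win% so that each opposing race counts the same. The opponent names are real rosters but the game counts are invented, and the function names are hypothetical. Win% is wins divided by games, which is a simplification.

```python
# Hypothetical record of one roster against three opposing races:
# opponent -> (games, wins). The figures are invented for illustration.
RESULTS = {
    "Orc": (300, 150),
    "Goblin": (20, 16),
    "Chaos": (80, 36),
}


def raw_win_pct(results):
    """Pooled win%: every game counts once, so common opponents dominate."""
    games = sum(g for g, _ in results.values())
    wins = sum(w for _, w in results.values())
    return 100.0 * wins / games


def equal_weight_win_pct(results):
    """Average of per-opponent win%s: every opposing race counts the same."""
    per_opponent = [100.0 * w / g for g, w in results.values() if g > 0]
    return sum(per_opponent) / len(per_opponent)


if __name__ == "__main__":
    print("pooled:", raw_win_pct(RESULTS))
    print("equal weight per opponent:", equal_weight_win_pct(RESULTS))
```

With these made-up numbers the handful of Goblin games pulls the equally weighted figure well above the pooled one, which is exactly why the treatment of rarely played tier 2 and 3 opponents matters.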

Cheers
Martin

Narrow Tier BB? http://www.plasmoids.dk/bbowl/NTBB.htm
Or just visit http://www.plasmoids.dk instead
User avatar
VoodooMike
Emerging Star
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: NTBB: Stats

Post by VoodooMike »

spubbbba wrote:Do we have stats on how many games are played at different TV levels?
Yes, we know what proportion of games are being played at varying TV levels... and unsurprisingly they cluster heavily around 1000 TV because every team STARTS around that level and then how many games they play determines where the games they contribute to the total end. Does that mean we should balance only around 1000 TV? Most of the games being played at higher TV levels are played by the so-called bashy teams, should we only try to balance THEM at those levels? There has to be consideration given to the current dynamic when looking at that information... because changes made may alter the dynamic. Because the concentrations of data for certain types of teams are affected by the dynamic, there's a very real risk of OVERcompensating for the dynamic and making things go in the exact opposite direction, if you allow the current dynamic to guide you that way.
spubbbba wrote:If say only 1% of games are played at 2000+ TV how much attention should be spent on balancing the teams at that level?
As much as anywhere, I'd say. If you can figure out what is causing a team to perform a given way in a certain TV range, you can apply small changes to compensate for whatever makes it veer away from your target numbers... since you'd need to have an operating theory on that no matter which TV range you're working on, what reason is there to NOT apply it to all the TV ranges?
spubbbba wrote:Some teams almost never get above 2000, either due to attrition like the elves and stunties or through TV efficiency such as humans or Orcs.
Right, this is what I mean by the dynamic of the current game. You're saying that because some of the teams never get above 2000 we shouldn't worry about 2000+.. and I'm saying that the changes we apply may change that fact - the fact is only true because of the interplay of existing design decisions, and we're looking at reconsidering the design decisions... so we probably shouldn't let the current interaction control our decisions, when we don't know if the resulting interaction will be the same.
spubbbba wrote:Gaining or not giving away inducements is worth more to them than taking sub par skills or inefficient re-rolls and subs.
Great theory.. do you have any numbers that support it, though? The only thing we've seen the numbers support, as far as TV difference, is that the larger the gap between two teams' TV, the more likely the higher TV team is to win. That suggests inducements are not particularly good at bridging the TV gap (which we assumed anyway) but are teams that just can't compare with other rosters past a certain TV what we want? More inherent TV makes them win less.... facing higher TV teams and getting inducements... makes them win less. That just means they might as well never interact with anything past that TV. If we don't like that then we want to make that not true anymore, right? To do that we'd need to look at where they start to falter, and figure out why they do.

Ultimately, we need a firm idea of what we mean by "balanced around..." in terms of a number. I don't think it makes a lot of sense to just aggregate all data and take the mean and say "ta da, job well done, us..." if it hits 50%, because that doesn't really tell you anything about ANY particular games, as I mentioned above. It's not hard to have the mean be any number you choose, if you have a nigh-unlimited floor or ceiling. Ideally, if you said that a team was "balanced around 45-55% win%" then the graph of their win% should be, for MM play, a nice, mostly-straight line that crosses the graph in that range.... and for non-MM play, it should be a slightly curving line that starts below the bottom of the range, and curves up as TV increases, ending up above the range in the highest TV ranges, crossing the middle of their range at the average TV for all teams in all the games of our dataset.

VoodooMike
Emerging Star
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: NTBB: Stats

Post by VoodooMike »

plasmoid wrote:With the 2 specific questions I asked, I was trying to understand how good/bad the stats were.
Good and bad? By what metric? Ripeness? Flavour? Have they grown mold yet? Do they engage in premarital sex? You're "specifically" asking for non-specific information.
plasmoid wrote:Neither you nor Mike actually replied. Ah well, just curious.
Really? I thought the first part of my first posting was a reply to you on your "margins" questions.
plasmoid wrote:FOL+Box+My stats could be interesting
Perhaps you could do the quick math to check the range for undead and zons, who seem above tier 1, overall.
The volume of data you collected is going to be like pissing into the wind compared to the volume of data that Box OR FOL represent. I wouldn't assume any useful effect on the numbers by adding your data into them. If you're thinking we'd average the results, rather than combine the individual datapoints then you're on drugs - that's not how it works. That'd be saying that every game you collected results from is as important as 100, or 1000 games played elsewhere... and it'd really mean that you're asking for margins of error on three datapoints (the overall stats for each source) which would give you a sweet range that goes from 0% to 100% for everything.
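A quick sketch of the difference between combining individual datapoints and averaging the per-source results. The source names match the ones discussed here, but every game count and win count below is invented purely for illustration, and the function names are hypothetical.

```python
# Hypothetical totals for one roster per data source: source -> (games, wins).
# None of these numbers come from the real Box, FOL, or hand-collected data.
SOURCES = {
    "Box": (50000, 26000),
    "FOL": (40000, 19600),
    "manual": (400, 260),
}


def pooled_win_pct(sources):
    """Combine individual games: each game carries the same weight."""
    games = sum(g for g, _ in sources.values())
    wins = sum(w for _, w in sources.values())
    return 100.0 * wins / games


def mean_of_source_pcts(sources):
    """Average the three headline figures: each SOURCE carries the same weight."""
    pcts = [100.0 * w / g for g, w in sources.values()]
    return sum(pcts) / len(pcts)


if __name__ == "__main__":
    big_only = {k: v for k, v in SOURCES.items() if k != "manual"}
    print("pooled, big sources only:", round(pooled_win_pct(big_only), 2))
    print("pooled, all three:", round(pooled_win_pct(SOURCES), 2))
    print("mean of per-source win%:", round(mean_of_source_pcts(SOURCES), 2))
```

Under these assumed figures the pooled number barely moves when the small set is added, whereas averaging the three headline percentages lets 400 games count as much as 50,000.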
plasmoid wrote:Anyway, what's short term play. I'm spitballing here.
Roughly 8 games? What would that translate to in TV? 1300TV tops?
I think there is a massive difference between an early TV1200-1300 team, and a team that has spent 50 games, to get 6 super skilled players, 5 duds and 1 reroll. Like Minmaxed Pact.
That's why I'd much rather go by "games played" than by TV.
Or to put it differently: Game 1-8 is early play. TV1200-1300 doesn't have to be.

Question is, how much data do we have with both teams in their game 1-8?
And to get more precise stats, do we try to compensate for spread of opponents by weighting each opposing race equally? (I believe Mike has suggested so in the past - which begs the question: How do we weight tier 2 and 3 teams then? Same as the others?)
There's no reason to try to subdivide the data like this and, in fact, it would work against you to do so - reducing the power of the model substantially would automatically increase the error (or "margins of error", as you and dode like to call them). You'll be reducing the power for no good reason. This has been said before. You keep ignoring it and restating the request.
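As a rough illustration of how the margins widen when the data is sliced thin, the sketch below uses the textbook normal-approximation interval for a proportion. Treating each game as a simple win/not-win trial is an assumption made here for simplicity, the game counts are invented, and the function names are hypothetical.

```python
import math


def margin_of_error(wins, games, z=1.96):
    """Half-width, in percentage points, of the normal-approximation
    95% interval for a win proportion (z=1.96). Assumes each game is an
    independent win/not-win trial, which is a simplification."""
    p = wins / games
    return 100.0 * z * math.sqrt(p * (1.0 - p) / games)


def win_pct_interval(wins, games, z=1.96):
    """(low, high) bounds of the interval, in percent."""
    centre = 100.0 * wins / games
    half = margin_of_error(wins, games, z)
    return (centre - half, centre + half)


if __name__ == "__main__":
    # Hypothetical sample sizes: the full dataset versus a thin slice of it.
    for wins, games in [(5000, 10000), (100, 200)]:
        low, high = win_pct_interval(wins, games)
        print(f"{games:>6} games: {low:.2f}% to {high:.2f}%")
```

With 10,000 games a 50% roster is pinned to within about a point either way; with 200 games the same roster's interval runs from below 45% to above 55%, so it could not be distinguished from anything in or around tier 1.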

Is this an emerging pattern: math by intuition?

Hitonagashi
Star Player
Star Player
Posts: 664
Joined: Mon Mar 07, 2011 5:11 pm

Re: NTBB: Stats

Post by Hitonagashi »

Mike, how can you cope with the fact that there could be a statistically significant amount of games played by a min-max build type at a certain TV range?

For example:

Using Koadah's Box data, between 1100 and 1300, there are 4k games played by CD's, and 3.5k played by Pact (and 4k by Chaos, the most popular team in the division).

You go up to 1600-1800, and you have 6k by chaos...and while CD's remain high at 5k, Pact are down to 2.1k.

In the Box, teams sweetspot. By dividing the teams by TV at all, you introduce the margins of error you are worried about. I don't think limiting the model to better reflect TT environments, rather than vagaries introduced by automatching TV games, is a bad thing.

plasmoid
Legend
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen
Contact:

Re: NTBB: Stats

Post by plasmoid »

Hi Mike,
Good and bad? By what metric? Ripeness? Flavour? Have they grown mold yet? Do they engage in premarital sex? You're "specifically" asking for non-specific information. [snip] Really? I thought the first part of my first posting was a reply to you on your "margins" questions.
Actually, I was 'specifically' asking 2 specific questions. I didn't 'ask' about good or bad. And while you have replied to the questions, you certainly haven't given an answer. But never mind - I suppose the short term stats are what's interesting.
The volume of data you collected is going to be like pissing into the wind compared to the volume of data that Box OR FOL represent. I wouldn't assume any useful effect on the numbers by adding your data into them.
And yet, as you say, why would I want to reduce the power of the model for no good reason?
If you're thinking we'd average the results, rather than combine the individual datapoints then you're on drugs - that's not how it works. That'd be saying that every game you collected results from is as important as 100, or 1000 games played elsewhere...
Check back to the previous page, 3rd post from the bottom.
You'll notice that your suspicions were unfounded.
There's no reason to try to subdivide the data like this and, in fact, it would work against you to do so - reducing the power of the model substantially would automatically increase the error (or "margins of error", as you and dode like to call them). You'll be reducing the power for no good reason. This has been said before. You keep ignoring it and restating the request.
You say there's no sweetspotting (in Box). I disagree. But be that as it may.
Either way, I'm looking for data points to simulate short term play.
I can't see how adding data for long term games improves the power of the model for short term play. Quite the opposite in fact.
It would be decreasing the error (or whatever the term is) by deliberate deception.
It's like adding in the guys when doing statistics on ovarian cancer. Gives more data points, but nonsense data points.

Cheers
Martin

koadah
Emerging Star
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: NTBB: Stats

Post by koadah »

VoodooMike wrote:
dode74 wrote:That rather depends on what you are trying to achieve, surely?
Not really. If you're trying to look at "overall win%" and you see by the data that they win 100% of their games at 1000 TV then drop to 0 at 1500, but it all averages out to 50%, then you're going to be misled into believing that things are "balanced" around the horsecrap 50% concept. In every practical sense, the roster in question would be garbage and totally broken, but the average across all TVs might suggest (erroneously) otherwise. This is why we should probably have both the mean and the median showing if we're examining the overall win%s, and if we notice that they're not lining up fairly well, not foolishly bury our heads in the mean.

In terms of short-term play, we'd want to heavily focus on lower TV matches... for Chaos that'd mean a fairly low win% and for amazons a really high win%. Since amazons get progressively worse over time, and chaos gets progressively better over time (over time meaning... at higher TV levels in this case) all we're doing by looking at the overall mean win% is obscuring the relevant information about the roster at relevant TV levels.

This is also why it's a problem to claim the design goal is to have the overall win% be in a certain range... with no additional qualifications. Again, does that make a team that has 100% win percentage at half the TVs and a 0% win% in the other half... a balanced tier 1 team? Perhaps it would be more rational to say things like "this tier has a win% of 45-55% at all TV levels"?

So... yeah, if you're trying to achieve something stupid then I guess it'd be relevant. Otherwise...

I'd say that you are right when it comes to the box. I'm not so sure when it comes to leagues though.
If you play 10 games, 50 or 100 games the overall is still relevant whether you grew to 2000 or couldn't/wouldn't go past 1200 TV.
That's how leagues roll. If a team is unable to maintain TV the data should reflect that. If a team is better off sweet spotting then it should reflect that too.

The box data is specific to the box so it does seem logical to me to strip out data (at least for some queries) that doesn't fit your target environment.

In a league 50 game min/max zons will not get a steady diet of rookie teams to devour. If their min/maxing works they'll make the top division and be going against the 2000+ chaos & 1800+ dwarves.

Environment matters IMO.

VoodooMike
Emerging Star
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: NTBB: Stats

Post by VoodooMike »

Hitonagashi wrote:Mike, how can you cope with the fact that there could be a statistically significant amount of games played by a min-max build type at a certain TV range?
I don't need to cope with unsupported speculation... or are you going to show me evidence that the rosters you feel are prone to minmaxing of the sort in question (long-term optimization) display significant upward spikes in their win% in these TV ranges? No? Bueller? Anyone? Right... that's what I thought. The numbers don't support the theory, which suggests the so-called "statistically significant" problem of such min-maxing, is more perception than reality.
Hitonagashi wrote:In the Box, teams sweetspot. By dividing the teams by TV at all, you introduce the margins of error you are worried about. I don't think limiting the model to better reflect TT environments, rather than vagaries introduced by automatching TV games, is a bad thing.
I'm not worried about the "margins of error", I'm pointing out that by losing power they'll end up wide enough that they overlap the expected win%s for the various tiers... which will mean there will be zero evidence to support making any changes whatsoever. That's what I'm warning plasmoid about - that losing power is going to result in the numbers NOT supporting the need for NTBB whatsoever.
Plasmoid wrote:And yet, as you say, why would I want to reduce the power of the model for no good reason?
You're down to playing buzzwords. The point is that there will be no change in the results by adding your data to it... by all means go do it, and take a look at the effect it has on the results (basically zero). An analogy would be having the average score of 1000 students in a school district, and someone saying "but wait, Suzy here scored really well... lets put her score in the mix and see if that changes things!". Do it, but thinking that it'll make things "interesting" just suggests you don't understand how aggregation works.
Plasmoid wrote:You say there's no sweetspotting (in Box). I disagree. But be that as it may.
The sweetspot concept is like seeing shapes in the clouds. All teams' win%s will approach zero as TV approaches zero... this is simple deduction. Unless the win% rises steadily as TV increases, then the roster has to have peaks. I'm unconvinced they're meaningful, especially since not every roster has such a pattern.
Plasmoid wrote:I can't see how adding data for long term games improves the power of the model for short term play. Quite the opposite in fact.
It would be decreasing the error (or whatever the term is) by deliberate deception.
It's like adding in the guys when doing statistics on ovarian cancer. Gives more data points, but nonsense data points.
I love how people start off by saying "well, you know better than I do on this topic" and end up saying the exact opposite. Riddle me this, plasmoid... if people run the lower power statistics and it is shown that the win%s for rosters do not support any alteration in order to bring them into line with the tiers they're placed in, or even that the lower tiers' 95% confidence range extends higher than their tier would normally restrict them to... are you going to abandon NTBB? We both know the answer is "no". You're looking for ways to cook the numbers to support what you already think, not to use the numbers to guide your actions... that's not statistics, that is deliberate deception. You already asked about lowering the confidence interval... you're looking for confirmation of what you think, not information on what is actually happening.
Koadah wrote:In a league 50 game min/max zons will not get a steady diet of rookie teams to devour. If their min/maxing works they'll make the top division and be going against the 2000+ chaos & 1800+ dwarves.
Again, this doesn't matter for the sake of the data. If we're talking about how teams perform on a per-match basis, in some specific TV range, then I guarantee that Box/MM data is going to be what we want. Why? Because we already know that there is a major effect on win% related to TV difference. Nobody is looking to revisit inducements as far as I know, certainly not in this case... so the only thing adding in TV difference is going to get us is a whopping source of error, and wider, less predictable TV differences is what you'll see with non-MM data.

I do think Amazons are erroneously declared to be "minmaxing" just by taking the logical first skill on linemen. It's not minmaxing, it's just that Amazons are a bad roster. Really, any roster that puts Block or Dodge on linemen, without any major negatraits to balance it out, is going to be bamboo under people's fingernails at low TV.
Koadah wrote:Environment matters IMO.
Could be... got numbers? Other than increased variation (which is a nice way of saying statistical error) I'm not sure why we'd see different ultimate results, even if we could collect the same amount of data in that very specific environment, which we cannot.

koadah
Emerging Star
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: NTBB: Stats

Post by koadah »

VoodooMike wrote:if people run the lower power statistics and it is shown that the win%s for rosters do not support any alteration in order to bring them into line with the tiers they're placed in, or even that the lower tiers' 95% confidence range extends higher than their tier would normally restrict them to... are you going to abandon NTBB? We both know the answer is "no". You're looking for ways to cook the numbers to support what you already think, not to use the numbers to guide your actions... that's not statistics, that is deliberate deception. You already asked about lowering the confidence interval... you're looking for confirmation of what you think, not information on what is actually happening.
If we run the stripped down stats and they show that the rosters fit the tiers then that would be good enough for me.

If we are concentrating on early games then is there any evidence that even the CPOMB changes are necessary/desired?

If we leave the longer term data in then it looks to me as though orcs need a buff not a nerf.
Isn't the longer term data the 'overall' that you objected to earlier? Do you mean longer term low TV only?
VoodooMike wrote:
Koadah wrote:In a league 50 game min/max zons will not get a steady diet of rookie teams to devour. If their min/maxing works they'll make the top division and be going against the 2000+ chaos & 1800+ dwarves.
Again, this doesn't matter for the sake of the data. If we're talking about how teams perform on a per-match basis, in some specific TV range, then I guarantee that Box/MM data is going to be what we want. Why? Because we already know that there is a major effect on win% related to TV difference. Nobody is looking to revisit inducements as far as I know, certainly not in this case... so the only thing adding in TV difference is going to get us is a whopping source of error, and wider, less predictable TV differences is what you'll see with non-MM data.
I didn't think that we were looking at a TV range for NTBB. I thought it was number of games. Some teams will grow some won't. If zons min/max but can't beat the bigger teams then do they really need a nerf? I mention the ranges because that is what I have to hand.
VoodooMike wrote: I do think Amazons are erroneously declared to be "minmaxing" just by taking the logical first skill on linemen. It's not minmaxing, it's just that Amazons are a bad roster. Really, any roster that puts Block or Dodge on linemen, without any major negatraits to balance it out, is going to be bamboo under people's fingernails at low TV.
I don't think many people do call that min/maxing. Retiring players who get a 2nd skill to keep TV low is what most are on about I think. Going 10/11 line girls to get blodgers for 70k instead of 90k for blitzers. One POMB blitzer is enough if you will only ever play rookies.
VoodooMike wrote:
Koadah wrote:Environment matters IMO.
Could be... got numbers? Other than increased variation (which is a nice way of saying statistical error) I'm not sure why we'd see different ultimate results, even if we could collect the same amount of data in that very specific environment, which we cannot.
In the Fumbbl [L]eague division I see amazons at 52% overall and still only 56.25% under 1200 TV, compared to 59.38 & 62.89 in Box. And another shocker: 59.99 & 63.16 in ranked. Cherry picking worse than min/maxing? ;)

League will feature restrictions & house rules that make it an unfair comparison but I'm still surprised. Very.
