Reconsidering the ruling on Khorne?

News and announcements from the worldwide Blood Bowl players' association

Moderator: TFF Mods

koadah
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: Reconsidering the ruling on Khorne?

Post by koadah »

plasmoid wrote: Anyway, I still find it rich that you roll your eyes at me using CRP stats, when you were the one who demanded that I did.
Was I not allowed to tell anyone that I did?
So, for example, looking at Undead.
If you remove the mirror matches (data chaff) and go with a 95% confidence interval, I get:
NAF tournaments: 55.38 - 57.42 based on over 9000 games
Box between TV0-1500: 55.95 - 57.97, also based on over 9000 games.
I also looked at FOL stats in 2013 for TV0-1500, and got 59.39 - 60.49 (I no longer have the number of games for that, but based on the size of the margin, it looks like more than 9000 games here too).
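For reference, intervals like the ones above can be roughly reproduced with a normal approximation to a binomial proportion. This is only a sketch: the function name is made up, the 56.4% midpoint and the round figure of 9000 games are read off the NAF numbers quoted above, and it treats every game as an independent win/loss trial, which ignores however draws were scored.

```python
import math

def win_pct_interval(win_pct, games, z=1.96):
    """Approximate confidence interval for a win percentage.

    Assumes each game is an independent win/loss trial (normal
    approximation to a binomial proportion). z=1.96 gives roughly 95%.
    """
    p = win_pct / 100.0
    half_width = z * math.sqrt(p * (1.0 - p) / games) * 100.0
    return win_pct - half_width, win_pct + half_width

# Midpoint of the quoted NAF interval, with "over 9000" taken as 9000.
low, high = win_pct_interval(56.4, 9000)
print(f"{low:.2f} - {high:.2f}")
```

With those inputs the result lands very close to the quoted 55.38 - 57.42, which suggests the intervals were built in a similar way, though that is a guess.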

What is the problem with these stats?
As far as I can tell, there is a balance issue with Undead, and that is all I use the data for. Which I very clearly state on the website, even if you pretend that I don't.
Where on the site do you clearly state that you are only looking at 0-1500TV?
I've heard something like that before but it is easily forgotten. If that was clearer then maybe people wouldn't think that you are quite so nuts. ;)

You say that NTBB is built on top of CRP+. About CRP+ you say "CRP+ deals mainly with the long term advantage of bash". If it is only really a low TV/resurrection ruleset then you need to be clearer about that.

harvestmouse
Star Player
Posts: 510
Joined: Thu Jan 05, 2012 10:21 pm

Re: Reconsidering the ruling on Khorne?

Post by harvestmouse »

I think I was a bit drunk when I posted my last comment (I guess) as I totally missed the context of the post I was replying to!

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Hi Koadah,
sorry - I missed this:
koadah wrote: I've heard something like that before but it is easily forgotten. If that was clearer then maybe people wouldn't think that you are quite so nuts.
:orc:

It's part of the dual nature of my rules.

The PCRP+ rules are not based on any tangible data - or at least not any that I know of. They're based on "Galak's list" of things considered worth testing. As you quote, they deal mainly with long term play problems, such as the bashy slant (or indeed the ability of any fully developed high TV team to dominate). So the focus is mainly long term (which is not strictly the same as high TV).

The NTBB roster changes are based on CRP data. Therein lies the rub. CRP data gives good information about low TV play, but as PCRP+ is supposed to alter the meta-game of high TV play, high-TV CRP data really isn't very useful. (For example, all non-claw bashy teams take a performance nose-dive at high TV in CRP play.) So, having committed to using CRP data, I don't trust data above 170TV.

You may(?) remember that NTBB used to have minor nerfs to Orcs and Dwarfs. Those were based on the assumption that the CPOMB nerf would bring these teams back up the ladder. I rather liked those. But I suppose I can't have it both ways. So I took those out when I brought the data in.

So - NTBB isn't a short term ruleset per se. I just have no meaningful data to rely on for high TV - forcing me (for now) to assume that PCRP+ brings down Chaos & Nurgle, leaving room for the other bashy sides, hopefully making room for less survival oriented skill picks and more anti-elf skill picks. I know that's a big 'if'.

You can check out the data here: http://www.plasmoids.dk/bbowl/NTBB2014x.htm

To get back to the undead example: I don't think that lifetime performance is a good measure of balance. With NTBB I target teams that underperform or overperform for an extended stretch of TV (40+ points). Like the Undead at low TV.

Cheers
Martin

Narrow Tier BB? http://www.plasmoids.dk/bbowl/NTBB.htm
Or just visit http://www.plasmoids.dk instead
Gropah
Experienced
Posts: 68
Joined: Tue Feb 17, 2009 6:59 pm

Re: Reconsidering the ruling on Khorne?

Post by Gropah »

plasmoid wrote:
To get back to the undead example: I don't think that lifetime performance is a good measure[...]
Well no, for obvious reasons. :wink:

koadah
Emerging Star
Posts: 335
Joined: Fri Mar 25, 2005 5:26 pm
Location: London, UK

Re: Reconsidering the ruling on Khorne?

Post by koadah »

Gropah wrote:
plasmoid wrote:
To get back to the undead example: I don't think that lifetime performance is a good measure[...]
Well no, for obvious reasons. :wink:

LOL.

I'm a bit puzzled though. From a perpetual league point of view what else matters?

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

I think that if a team is super strong early on and weak late -or- weak early and super strong late then that is a balance issue. Lifetime performance can make both look perfectly fine.

...and I know that I'm going against how the BBRC defined balance :)
Cheers
Martin

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: Reconsidering the ruling on Khorne?

Post by dode74 »

plasmoid wrote:if a team is super strong early on and weak late -or- weak early and super strong late then that is a balance issue.
You're equating time with TV. Not all teams have to (or indeed can or should) continually increase in TV over time. Furthermore, "early" and "late" can only be in context of team lifetime: although teams potentially could go on for years, how long do they actually last (games played) in each environment?

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

[This is something I wrote offline - not a reply to Dode]

...just to elaborate: My not being content with the BBRC's definition of balance isn't meant as a critique of the BBRC.
I don't think the BBRC ever quite imagined that there would be enough data to properly examine their definition of balance. Remember, it was created before Cyanide and before FUMBBL showed any interest in LRB6/CRP.

Indeed had they intended the definition for use, they ought to have operationalized their definition. There are simply too many blanks in it.

Are the games giving the tier 1 45-55 win percentage supposed to be a reflection of whichever meta occurs?
In that case, a batch of super powerful teams could take over the meta, obliterating all other teams to the point of almost not being played, and between them, these teams could be equal and show up as roughly 50% winners.
That's certainly not balance.

Are mirror matches supposed to count? Mirror matches certainly do occur, and including them would in effect make tier 1 wider than the 45-55%, because all mirror matches pull a race's stats back towards 50. If a race completely took over the meta due to being super ridiculously broken, then mirror matches would give it a huge pull towards 50. Like some of the best teams in the NAF-stats.
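
A rough way to see the pull: if a share of a race's games are mirror matches, and those always average out to 50% for the race, then the recorded figure is a blend of 50 and the win percentage against everyone else. The function names and the 60% / 20% figures below are made up for illustration.

```python
def recorded_win_pct(non_mirror_pct, mirror_share):
    """Blend the non-mirror win percentage with the 50% that mirror matches yield."""
    return mirror_share * 50.0 + (1.0 - mirror_share) * non_mirror_pct

def non_mirror_win_pct(recorded_pct, mirror_share):
    """Undo the blend: recover the win percentage with mirror matches removed."""
    return (recorded_pct - mirror_share * 50.0) / (1.0 - mirror_share)

# Hypothetical race: 60% against other races, 20% of its games are mirrors.
recorded = recorded_win_pct(60.0, 0.20)
print(round(recorded, 2))                            # 58.0
print(round(non_mirror_win_pct(recorded, 0.20), 2))  # 60.0
```

The bigger the mirror share, the closer the recorded figure sits to 50, whatever the race does against the rest of the field.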

Or - if one does not go with whichever meta is created, then one would have to use some sort of average data. But how often would each race then have to be represented? And how much would the sub-par teams have to be represented? Are they even supposed to be included, or is the 45-55% supposed to be between just the tier 1 teams?

Or - as Dode touches upon - how is team longevity supposed to be factored in?
What if - say - a race is super powerful when developed. And durable. But super weak at the outset. Its stats could be through the roof. But many leagues just run for a short period of time, and the average career of online teams is just 5 games, so if enough short term teams are played when a team has a win percentage of 30%, then that could easily drown out the few teams that get played long enough to reach zenith and a win percentage of 70%.
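
The drowning-out effect is just a weighted average. In the sketch below the 30% and 70% come from the example above, while the game counts and the function name are invented to show the effect.

```python
def pooled_win_pct(groups):
    """Pooled win percentage over (games, win_pct) groups, weighted by games."""
    total_games = sum(games for games, _ in groups)
    weighted = sum(games * pct for games, pct in groups)
    return weighted / total_games

# Invented counts: many games by short-lived teams, few by developed ones.
lifetime = pooled_win_pct([(900, 30.0), (100, 70.0)])
print(round(lifetime, 1))  # 34.0
```

With nine in ten games played by undeveloped teams, the pooled figure barely moves off 30%, even though the developed teams win 70% of the time.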

All that probably didn't make anybody any wiser.
I'm just saying it isn't really clear what the 45-55% is supposed to mean.

Cheers
Martin

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Trying to clear my head, while considering what Dode said, I guess that there are different balance issues for the different metas.

But at the heart of it, I'd say that a team is overpowered if it can overperform (above 55%) at a Team Value that it can fairly easily maintain.

So, for tournaments, where TV is fixed, it simply means overperforming around TV120. I believe that there are fairly solid data that Wood elfs and undead are overly powerful in resurrection tournaments.

Even for non-tournaments it can be a problem if a team is overperforming at low Team Value. Especially due to short term leagues, where teams never get a chance to be balanced out by their lifetime-average stat. If you only play a (say) 10 game league then start over, then early power is everything. I get that in this regard TV-based stats are unreliable because, as Dode says, low TV and early performance are not necessarily the same thing.

But still, if you can maintain your TV low (minmaxed semi-developed zons) and fairly easily maintain your team, then you can completely wreak havoc.
This is especially true in huge (online) TV-matched leagues, where you will be matched up against someone at your TV. I guess this is why Amazons can get such absurd stats in low TV online play. In a league your league mates would develop their teams, and you'd be increasingly relying on inducements - and as we know inducements don't quite bridge the gap.

Both in TT leagues and online TV-matched play, a race which is strong at high TV and is able to maintain that TV for extended periods of time would be a balance issue. As mentioned above, this might well be hard to spot, due to popular teams masking each other's performance, and if using a lifetime stat, then the legions of short term teams could mask it completely.

Teams that are powerful at a certain TV, but have a hard time maintaining that TV, seem to me to be much less of a balance issue.

Cheers
Martin

PS - what constitutes underpowered may well be even trickier to define. Or not.

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

To clarify: what I meant by lifetime performance above was the BBRC's definition of balance - i.e. average win percentage across all games played at all team values. And I don't think that tells us that much.

As a bit of reverse engineering, it is worth noting that the only time that the BBRC ever used the percentages for anything, the data was strictly mined from non-tv-matched, fairly short term play. But other than that it was just heaped together in one big pile, because that's the kind of data we had at the time.

Cheers
Martin

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: Reconsidering the ruling on Khorne?

Post by dode74 »

plasmoid wrote: I believe that there are fairly solid data that Wood elfs and undead are overly powerful in resurrection tournaments.
According to the NAF data published for 2012-13 we can't actually say that is the case:
[Image: graph of the NAF 2012-13 data]
plasmoid wrote: average win percentage across all games played at all team values. And I don't think that tells us that much.
It tells us the team is balanced within the BBRC's definition of balance. If you want to change the definition then that's a different matter entirely.

I think you've made a lot of assumptions about what the BBRC did and did not do with data. Galak suggests otherwise.

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Hi Dode,
dode74 wrote: According to the NAF data published for 2012-13 we can't actually say that is the case:
I was actually hoping to have a reasoned conversation with someone (else?) about what would constitute a good game balance.
But if we do look at the NAF data, we might as well look at the whole thing:
http://naf.talkfantasyfootball.org/
...and then click LRB6.

We have both Wood Elfs and Undead still with a mean above 55%, but now with a lot more games.
That gives us Wood Elfs (95CI = 0.9) at 55.2 - 57.0
And Undead (95CI = 0.84) at 54.96 - 56.64

So Undead are massively more likely to be over the line than not (given the 0.4 margin and a bell curve structure of the data spread). This is using data from Swiss matching, and from an increasing number of tournaments where the top teams are at a disadvantage due to tournament rules. And that's not even considering what mirror matches do to the stats, especially in a Swiss matching system.
I'll assume (perhaps unfairly) that you won't grant me this one. Fair enough.
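
For what the "bell curve" reading amounts to: if the interval is treated as a normal curve centred on its midpoint, with the 95CI figure as 1.96 standard errors, then the share of the curve above 55% can be computed directly. This is a loose reading of a confidence interval, used here only to put a number on the claim, and the function name is made up.

```python
import math

def share_above(line, low, high, z=1.96):
    """Share of a normal curve fitted to a confidence interval that lies above `line`.

    Assumes the interval is symmetric and spans +/- z standard errors.
    """
    centre = (low + high) / 2.0
    std_err = (high - low) / 2.0 / z
    z_score = (centre - line) / std_err
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z_score / math.sqrt(2.0)))

print(f"Undead:    {share_above(55.0, 54.96, 56.64):.3f}")
print(f"Wood Elfs: {share_above(55.0, 55.2, 57.0):.3f}")
```

Under that reading both figures come out well above one half, which is all "more likely than not" needs; it says nothing about the Swiss matching or mirror match effects.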

But Wood Elfs do fall outside of the tier 1 zone. Undead just fall at the very edge.
dode74 wrote: It tells us the team is balanced within the BBRC's definition of balance.
True. I know what the stone tablets say. Which is why I said "...and I know that I'm going against how the BBRC defined balance"
I was hoping to discuss how to better define balance.
And I gave examples as to why I think the BBRC definition could be improved.
dode74 wrote: I think you've made a lot of assumptions about what the BBRC did and did not do with data. Galak suggests otherwise.
Thanks. I read that the first time he posted it. In a discussion where Vanguard says that the NTBB has been the most scrutinized and data-crunching ruleset, Galak points out that the PBBL rules were a lot more so.
I completely agree with Galak that the PBBL rules were scrutinized (venomously) a lot more than NTBB. But I do notice that he then goes on to give an example of that internet scrutiny (slash hate-mail), while not going into details about the data crunching.
I don't doubt that some data was being crunched.
But I highly doubt that the BBRC had access to a secret pool of data of such a magnitude that they could have done more precise balance calculations than what their "current" balance definition allows. If such a data pool existed - or indeed any large pool of new data - then I cannot begin to understand why they have kept the data secret contrary to the interest of the community for all these years. That sounds just a bit too tin foil hat for me.

Cheers
Martin

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

On a side note, I love the graph.
It kind of looks like an Excel sheet or something where you can just whack in the data and the graph just gets generated(?)
Do you have that in an empty version?
Cheers
Martin

Fassbinder75
Star Player
Posts: 592
Joined: Fri Mar 20, 2009 12:47 pm
Location: Melbourne, Australia

Re: Reconsidering the ruling on Khorne?

Post by Fassbinder75 »

The Swiss pairing system will lead to a natural compression of win% (better rosters will end up playing better rosters along with the inverse) so I don't necessarily think the graph is representative of a 'raw' power level.

minimakeovers.wordpress.com
VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Reconsidering the ruling on Khorne?

Post by VoodooMike »

plasmoid wrote:So Undead massively more likely to be over the line than not..
If you're working with a 95% confidence interval you don't get to say "it's close enough that..." or you're not working with a 95% confidence interval.
Fassbinder75 wrote:The Swiss pairing system will lead to a natural compression of win% (better rosters will end up playing better rosters along with the inverse) so I don't necessarily think the graph is representative of a 'raw' power level.
It's true for any environment: without specific statistical corrections applied to deal with environmental composition you're not seeing objective fact, you're only seeing data that is relevant to that environment. The high performance of rosters like Chaos in open matchmaking environments at high TV levels, for example, doesn't necessarily represent them being good at high TVs in general - it may simply mean they're good at high TVs in environments where high TVs are mostly bash teams.
