 Post subject: Re: Analysis of NAF statistics
 Posted: Wed Mar 08, 2017 10:15 pm
Super Star

Joined: Mon Nov 01, 2010 7:21 pm
Posts: 929
AndeeT wrote:
(EDIT for table jpg. I can't format forum posts for toffee!)

Write the table in another editor using a monospace font, then copy and paste it here and wrap it in code tags.

Code:
1 - 2 - 3
a . b . c
i   ii  iii


 Post subject: Re: Analysis of NAF statistics
 Posted: Thu Mar 09, 2017 12:44 am
Rookie

Joined: Thu Jul 28, 2016 8:49 pm
Posts: 28
Thanks for the discussion, everyone. I have learnt a lot here. And thank you, Steam Ball, for the lesson on formatting posts; I will give that a go later.

The points around how to measure success are extremely interesting. Conceptually, I find a draw hard to classify as either a success or a failure. It comes down to glass half full or half empty, no? Some will see a draw as a bad thing, others as a good one. Of course, this problem is not specific to Blood Bowl; it applies to any game where the outcome is trinomial. A few reasons I can think of why it's useful to consider 'success' to be (wins+(draws/2)):

1. It has long standing in American sports (I never came across it here in the U.K. until now, though), so there is a precedent for it at least.
2. Blood Bowl games often end in a draw.
3. In terms of ranking teams, you want a system that discriminates between teams so that you don't end up with too many tied ranks. Taking (draws/2) into account should (I think) increase your discriminating power over straight wins.

Dode - you mentioned on the first page of the thread that low TV bias could confound comparisons. I know that the NAF data can be split into three TR categories (100, 110, 120), so I was considering standardising for TR. Has this been done before? Or, in the case of FUMBBL, standardising for TV?
If we are considering total games at all TRs, the TR structure per race could vary a lot, so it certainly seems reasonable to look into. Either that or produce a separate analysis for each TV range.

The NAF data is already stratified into those 3 categories; what TV bands would you suggest for, e.g., FUMBBL data?

Oh, and thanks for this from page 4 of the thread:

"I think, therefore, that the "points gained" proportion is (wins + draws/3)/n and the "points not gained" proportion is (losses + draws*2/3)/n, where n is games played."

The formulae check out. It also helps my understanding to dichotomise into 'gained' and 'not gained'; I think that's where I was going wrong earlier with the whole 'w/l/d ≠ 1' thing.
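
For anyone who wants to check the arithmetic, here is a minimal sketch of the two measures discussed so far. The function names and the example record are made up for illustration and are not NAF data. The draws/3 weighting is consistent with 3 points for a win, 1 for a draw and 0 for a loss.

```python
from fractions import Fraction

def win_pct(wins, draws, losses):
    """Success rate counting a draw as half a win: (wins + draws/2) / n."""
    n = wins + draws + losses
    return (Fraction(wins) + Fraction(draws, 2)) / n

def points_split(wins, draws, losses):
    """Proportion of points gained and not gained, as quoted from page 4."""
    n = wins + draws + losses
    gained = (Fraction(wins) + Fraction(draws, 3)) / n
    not_gained = (Fraction(losses) + Fraction(2 * draws, 3)) / n
    return gained, not_gained

# Hypothetical record: 10 wins, 6 draws, 4 losses
gained, not_gained = points_split(10, 6, 4)
print(win_pct(10, 6, 4))          # 13/20
print(gained, not_gained)         # 3/5 2/5
print(gained + not_gained == 1)   # True: the two proportions always sum to 1
```

Using exact fractions avoids any rounding noise when checking that 'gained' and 'not gained' add up to 1.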


Best Wishes,

AndeeT


 Post subject: Re: Analysis of NAF statistics
 Posted: Thu Mar 09, 2017 8:23 am
Ex-Cyanide/Focus toadie

Joined: Fri Jul 24, 2009 5:55 pm
Posts: 2430
Location: Near Reading, UK
The NAF data is stratified into the bands in which most(?) tournaments take place. There is some variance within this: some 110 tournaments include skills in that 110, while others do not and add a skill pack on top. This adds a confound.

The problem with attempting to stratify by TV for MM etc. is that the number of games occurring at higher TVs is so low that the 95% CI ranges become very large. "Most" people want to know about high TV data (meaning 2k+), but sufficiently large sample sizes aren't there for lots of the races.
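
To give a feel for how quickly the interval widens as games run out, here is a rough sketch using the simple normal approximation for a proportion. This is an assumption for illustration: it is not necessarily the method used for the graphs in this thread, and a win% that counts draws as half is not strictly binomial. The sample sizes are made up.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of a normal-approximation CI for a proportion p over n games."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes at a 50% win rate
for n in (100, 1000, 10000):
    print(n, "games: +/-", round(100 * ci_half_width(0.5, n), 1), "points")
```

The half-width shrinks with the square root of the number of games, so a band with a hundredth of the games has an interval ten times as wide.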


 Post subject: Re: Analysis of NAF statistics
 Posted: Fri Mar 10, 2017 2:23 am
Rookie

Joined: Thu Jul 28, 2016 8:49 pm
Posts: 28
Hi,

Thanks dode. I will have to think a bit harder about how to stratify the FUMBBL data by TV (I am currently looking at ranked data), as the low numbers will mess with standardisation. I was thinking of the following TV bands for standardisation: 900-1200, 1200-1500, 1500-1800, 1800-2100, 2100-2400. As you say, above 2000 the numbers start dropping off rapidly, but with TV bands this wide, as long as I exclude Underworld, Goblin and Halfling from the analysis, small numbers shouldn't be an issue.


I did some work on standardising the NAF Win% (wins+(draws/2)) data by TR (TR100, TR110, TR120, as broken down at http://naf.talkfantasyfootball.org/). The assumption was that differing TR values may confound comparisons between races if races have different ratios of games played at TR100:TR110:TR120. Standardising the data by TR allows you to compare races even if the relative numbers of games played at each TR level differ.

Please see the table below. As you can see, taking TR out of the equation did not appreciably change win%. This suggests that most races have very similar ratios of games played at TR100:TR110:TR120 in the NAF, so TR shouldn't be much of a confounding factor when comparing races in NAF data. Of course, we can only standardise by what is in the data; as dode mentioned, skill pack upgrades could also be confounding, but I can't standardise for this as it is not recorded in the data.

Code:
         RACE,  WIN%,  STANDARDISED, DIFFERENCE
       UNDEAD,  56.3,          56.3,        0.0
     WOOD ELF,  55.5,          55.4,       -0.1
    LIZARDMEN,  53.8,          53.8,        0.0
       AMAZON,  53.6,          53.6,        0.0
   DARK ELVES,  53.1,          53.0,       -0.1
CHAOS DWARVES,  52.5,          52.5,        0.0
      DWARVES,  52.1,          52.1,        0.0
        NORSE,  51.9,          51.8,       -0.1
       SKAVEN,  51.3,          51.4,        0.1
  NECROMANTIC,  51.1,          51.2,        0.1
        ELVES,  50.5,          50.6,        0.1
          ORC,  48.1,          48.1,        0.0
   HIGH ELVES,  48.1,          48.0,       -0.1
       KHEMRI,  48.0,          48.1,        0.1
   CHAOS PACT,  47.6,          47.6,        0.0
        SLANN,  46.4,          46.2,       -0.2
        HUMAN,  46.0,          46.0,        0.0
   UNDERWORLD,  45.1,          45.1,        0.0
       NURGLE,  44.5,          44.4,       -0.1
        CHAOS,  43.7,          43.7,        0.0
     VAMPIRES,  43.3,          43.4,        0.1
    HALFLINGS,  34.5,          34.3,       -0.2
      GOBLINS,  32.4,          32.4,        0.0
        OGRES,  31.7,          31.4,       -0.3


With this in mind, I produced a 'dode graph' of the standardised data. Please see below:

Attachment:
TR.png


For details on standardisation, please see page 9 of the following PDF: http://www.apho.org.uk/resource/view.aspx?RID=48457
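
As a rough illustration of the idea, here is a minimal sketch of direct standardisation by TR. It assumes the "standard population" is the share of all games (all races combined) played at each TR level, and that every stratum has at least one game; whether this matches the PDF's method in every detail is an assumption. All numbers and names below are made up, not NAF data.

```python
def crude_rate(strata):
    """strata maps TR -> (points, games), where points = wins + draws/2."""
    total_points = sum(points for points, _ in strata.values())
    total_games = sum(games for _, games in strata.values())
    return total_points / total_games

def standardised_rate(strata, standard_games):
    """Weight each TR stratum's rate by the standard population's share of games."""
    total_standard = sum(standard_games.values())
    return sum(
        (standard_games[tr] / total_standard) * (points / games)
        for tr, (points, games) in strata.items()
    )

# Hypothetical race: win rates of 60%, 55% and 52% at TR100, TR110 and TR120
race = {100: (60.0, 100), 110: (110.0, 200), 120: (26.0, 50)}
# Hypothetical games played by all races combined at each TR
standard = {100: 4000, 110: 5000, 120: 1000}

print(round(100 * crude_rate(race), 1))                   # 56.0
print(round(100 * standardised_rate(race, standard), 1))  # 56.7
```

The crude and standardised figures only diverge when a race's mix of games across TR levels differs from the standard mix and its win rate also varies between TR levels.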

I am also working on a similar thing looking at standardising between LRB4/5/6 of the NAF data.

Please let me know your thoughts.

EDIT - forgot to say, the lines on the graph represent the 95% confidence interval for each race's win% (calculated using the method on page 9 of the PDF linked above).
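
For readers without the PDF to hand, here is a sketch of one common interval for a proportion, the Wilson score interval. Whether this is the exact method on page 9 of the PDF is an assumption, and the game count in the example is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical: 563 'successes' (wins + draws/2) out of 1000 games
low, high = wilson_interval(563, 1000)
print(round(100 * low, 1), round(100 * high, 1))
```

Unlike the plain normal approximation, this interval stays inside 0-100% even for small samples or extreme win rates.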

Best Wishes,

AndeeT


 Post subject: Re: Analysis of NAF statistics
 Posted: Fri Mar 10, 2017 7:01 pm
Veteran

Joined: Fri May 31, 2013 1:52 am
Posts: 184
AndeeT wrote:
First half of 95% confidence interval comparison table;

Attachment:
Confidence_Int_1.jpg

There is a different formula for comparing two proportions. You can say with 95% confidence that two proportions are different even if their 95% CIs overlap. This PDF provides the formula:
https://www.cscu.cornell.edu/news/statnews/stnews73.pdf
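
A small sketch of the point, using the standard normal-approximation interval for a difference of two independent proportions. That this matches the notation in the linked PDF is an assumption, and the win rates and game counts are made up.

```python
import math

def wald_interval(p, n, z=1.96):
    """Normal-approximation CI for a single proportion."""
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def difference_interval(p1, n1, p2, n2, z=1.96):
    """Normal-approximation CI for p1 - p2 from two independent samples."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical races: 55.0% and 51.5% win rates over 2000 games each
a = wald_interval(0.55, 2000)
b = wald_interval(0.515, 2000)
d = difference_interval(0.55, 2000, 0.515, 2000)

print("individual CIs overlap:", a[0] < b[1])
print("difference CI excludes zero:", d[0] > 0)
```

The two individual intervals overlap, yet the interval for the difference does not include zero, because the standard error of a difference is smaller than the sum of the two separate standard errors.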

 Post subject: Re: Analysis of NAF statistics
 Posted: Sat Mar 11, 2017 3:50 am
Rookie

Joined: Thu Jul 28, 2016 8:49 pm
Posts: 28
Hi CyberedElf,

Thank you for the PDF. It is interesting that you mention t-tests and confidence intervals for differences between samples. I used to do t-tests all the time but have never looked into the CI for differences in proportions or means, so that was educational for me. I think these are great when you set up an analysis specifically looking at the difference between two races. However, I don't know if I would recommend this approach for comparing all the races we have.

It is very tempting to calculate these for each of the comparisons between races. However, that is a lot of comparisons! Taking just the overlapping pairs of confidence intervals from the graph/tables I posted earlier, there are 95 unique pairs that overlap. With multiple comparisons like this, each one you make increases the chance of finding a difference purely by chance (Type I family-wise error).

Our alpha = 0.05 for making those confidence intervals, and we are making 95 comparisons. The family-wise error rate for this is 1-(1-0.05)^(95) = 0.99 (treating the comparisons as independent)! I.e. it is almost certain that we will make at least one error and find a significant difference where there is none.
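
The calculation can be reproduced in a couple of lines. The formula assumes the comparisons are independent, which pairwise comparisons between the same set of races are not strictly, so treat the result as a rough guide. The function name is made up.

```python
def family_wise_error(alpha, m):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

print(round(family_wise_error(0.05, 95), 3))   # 0.992
```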

For the moment I am happy with the conservative method of just using 95% CIs: when they don't overlap = significance, and when they do = we don't really know. There are tests that control for family-wise error, and things like Analysis of Variance (though I've only ever seen that applied to around 6 comparisons, so it might still fall down when we get to 276 pairwise comparisons between races??), but I need a few hours' sleep and a strong coffee before I start looking into those :-)...
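
As an illustration only, here is where the 276 comes from (24 races taken two at a time) and what the simplest family-wise control, a Bonferroni correction, would do to the threshold. Bonferroni is just one option, named here as an example, and not a method anyone in the thread has settled on.

```python
from math import comb
from statistics import NormalDist

n_races = 24                      # races in the standardised table
n_pairs = comb(n_races, 2)        # number of distinct race-vs-race comparisons
alpha_per_test = 0.05 / n_pairs   # Bonferroni: split alpha across all tests
z_crit = NormalDist().inv_cdf(1 - alpha_per_test / 2)

print(n_pairs)              # 276
print(round(z_crit, 2))     # two-sided critical z, well above the usual 1.96
```

The price of the correction is power: each individual comparison needs a much larger difference (or many more games) to count as significant.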


... While I am here though, I have a few questions regarding the FUMBBL data (http://fumbbldata.azurewebsites.net/stats2.html):
1. How often is the data refreshed/updated?
2. What does the 'mirror' button do? What are mirrors? (no silly answers please :-P)
3. Is there a method to extract data for specific, non-overlapping TV bands? The bands I would like are 900-1119, 1200-1499, 1500-1799, 1800-2099, 2100-2399. However, I think at the moment I may be double counting some matches, as the filters on the FUMBBL website do not allow you to be explicit. Any tips?
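
If per-match records with a TV value could be obtained (an assumption; the site's filters may not expose this), double counting can be avoided by treating the bands as half-open intervals so that every TV falls into exactly one band. The band edges below are assumptions based on the 300-wide bands proposed earlier in the thread, and the function name is made up.

```python
from bisect import bisect_right

# Half-open bands [900, 1200), [1200, 1500), ... (assumed edges)
EDGES = [900, 1200, 1500, 1800, 2100, 2400]

def tv_band(tv):
    """Return the label of the single band containing tv, or None if outside."""
    if tv < EDGES[0] or tv >= EDGES[-1]:
        return None
    i = bisect_right(EDGES, tv) - 1
    return f"{EDGES[i]}-{EDGES[i + 1] - 1}"

for tv in (1199, 1200, 2399, 2400):
    print(tv, tv_band(tv))
```

A boundary value such as 1200 lands only in the 1200-1499 band, so no match is counted in two bands.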

Best Wishes

AndeeT


 Post subject: Re: Analysis of NAF statistics
 Posted: Sat Mar 11, 2017 8:08 am
Da Spammer

Joined: Mon Aug 12, 2002 10:04 pm
Posts: 23526
Location: Fundamentaling for the BB Illuminati
AndeeT wrote:
2. What does the 'mirror' button do? what are mirrors? (no silly answers please :-P)

Mirror matches are where the same race plays against itself (e.g. Chaos vs Chaos, WE vs WE etc.), and they tend to pull stats towards 50% (as any given mirror match will result in either a W/L or a D/D).
The button you mentioned removes mirror games, so it's always race vs a different race.
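
A quick sketch of the effect, using the win% measure from earlier in the thread (wins + draws/2). The records are made up. Each mirror match adds two team-games for the same race: either one win and one loss, or two draws.

```python
def win_pct(wins, draws, losses):
    """Win% counting a draw as half a win."""
    return 100 * (wins + draws / 2) / (wins + draws + losses)

# Hypothetical record against other races: 600 W / 150 D / 250 L
non_mirror = (600, 150, 250)
# Hypothetical 100 mirror matches: 80 decisive (80 W + 80 L) and 20 drawn (40 D)
mirror = (80, 40, 80)
combined = tuple(a + b for a, b in zip(non_mirror, mirror))

print(win_pct(*non_mirror))            # 67.5
print(win_pct(*mirror))                # 50.0, whatever the results were
print(round(win_pct(*combined), 1))    # 64.6, pulled towards 50
```

The more mirror games a race plays, the further its overall figure is dragged towards 50%, regardless of how strong the race is.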

Pretty certain you can't remove these games for the NAF stats, and the BBRC never excluded them anyway.

_________________
SWTC 2017 Stunty Cup winner - never again (until next time!)


 Post subject: Re: Analysis of NAF statistics
 Posted: Sat Mar 11, 2017 9:07 am
Legend

Joined: Sun May 05, 2002 9:55 am
Posts: 5110
Location: Copenhagen
Hi AndeeT,
I did some preliminary data work a few years ago using the same site as you.
Oh, and this nice NAF site: http://naf.talkfantasyfootball.org/ - which unfortunately does not double-sort, but you can at least sort by "LRB6".

Quote:
1. How often is data refreshed/updated?

I believe updating is automated/continuous.

Quote:
2. What does the 'mirror' button do? what are mirrors? (no silly answers please )

It removes matches between 2 teams of the same race, as these are empty data that pull the results towards 50%.
Removing them is fine if you want to know how much a team wins, but contentious if you want to say anything about "balance" or "tiers" as the BBRC did not exclude mirror matches when they defined the tiers (as Darkson has already hinted).
It is perhaps worth noting that the original data that the BBRC worked with when they defined the tiers did not include the option to remove mirror matches, so perhaps this is why they chose to include data which is not only meaningless, but clouds the meaning. Or perhaps they wanted to include it in their definition, as these games are indeed played in the meta. Nobody has ever bothered to comment, AFAIK.

Quote:
3. Is there a method to extract data for specific, non-overlapping TV bands. Bands I would like are - 900-1119, 1200-1499, 1500-1799, 1800-2099, 2100-2399.

AFAIK, unfortunately no.

Down near the bottom of this page (http://www.plasmoids.dk/bbowl/NTBB2014x.htm) you can see the work that I did.
This is data from FUMBBL's Black Box just before they changed the scheduler. The FUMBBL site used to have an option to see either "pre-" or "post-" scheduler change data, but I don't see that option anymore :(
Anyway, the table has lots of little cells with numbers in them. If there is a single number in a cell, then I didn't bother to calculate a CI, because I was looking for places where teams went above or below the tier bands, and in that particular place there was no risk of that. So just ignore cells with a single number in them.
However, the cells for longer bands with 2 numbers (xx.x - xx.x) do include CIs.
...the only reason I bring this up is that I found that for quite a few teams something happens around 1500 TV, so that fits well with the divisions that you have chosen.

Good luck
Martin

_________________
Narrow Tier BB? http://www.plasmoids.dk/bbowl/NTBB.htm
Or just visit http://www.plasmoids.dk instead

