Mathammer of Kill Points - Do They Balance Out Objective Missions?

Lord Inquisitor
21-11-2009, 21:50
Yes, it's another Kill Point thread. I hope this one shows something new, so bear with me.

We've had a thread (http://www.warseer.com/forums/showthread.php?t=230121) on whether KPs are a good mission, and whether they are "balanced." Clearly, on average, the smaller army has an advantage. The pro-KP side will contend that this is the case, but that it balances out an equal and opposite advantage for high-unit-count armies in objective missions: they will have more units to take or contest objectives with, and therefore most armies will gravitate to the same number of units, balancing out the missions.

Theory

This is largely a repost from the previous thread, so feel free to skip this section if you've gone through this (or if you hate algebra!)

Assume you have two players, X and Y. X has x units, Y has y units, and we can assume these are not equal and x>y. X scores b KPs and Y scores a KPs. In order for X to win, b>a. But in order to do so, he must kill the fraction b/y of the enemy army, while Y only needs to kill the fraction a/x. Write b=ca (where c is a constant). If both sides kill the same proportion of the enemy army:

ca/y = a/x

=> c = y/x

Conclusions: with equal proportional losses, player X scores only y/x as many KPs as Y and loses. To draw (b=a), X must kill x/y times as large a fraction of the enemy army as Y does.
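
As a rough illustration of the ratios in this section, here is a minimal sketch with made-up numbers. The unit counts (15 and 10) and the helper name `kp_fractions` are assumptions for the example, not anything taken from the battle reports.

```python
from fractions import Fraction

def kp_fractions(x, y, a, b):
    """Return (fraction of Y's army killed by X, fraction of X's army killed by Y).

    x, y: unit counts of players X and Y.
    a: kill points scored by Y (units of X destroyed).
    b: kill points scored by X (units of Y destroyed).
    """
    return Fraction(b, y), Fraction(a, x)

# Hypothetical armies: X brings 15 units, Y brings 10.
x, y = 15, 10

# A drawn game: both sides score 5 kill points.
killed_by_x, killed_by_y = kp_fractions(x, y, a=5, b=5)
draw_ratio = killed_by_x / killed_by_y   # works out to x/y

# Equal proportional losses: each side loses 40% of its units.
a_equal, b_equal = 6, 4                  # 6/15 == 4/10
c = Fraction(b_equal, a_equal)           # works out to y/x

print(killed_by_x, killed_by_y, draw_ratio, c)
```

With these numbers, a 5-5 draw means X has wiped out half of Y's army while Y has destroyed only a third of X's.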

Now if we look at the objective mission (where things get more fuzzy): Player X has x scoring/contesting units and Player Y has y scoring/contesting units (and x>y again). There are z objectives. Player X loses a units and player Y loses b units. Assuming each unit can only capture/contest 1 objective each (big assumption!), that all units can contest all objectives (another big assumption) and, for simplicity's sake, that one player will win if he has at least one less untaken objective than his opponent, then the number of untaken/uncontested objectives U(x) or U(y) are:

U(x)= z-(x-a) (min 0)
U(y)= z-(y-b) (min 0)

U(x)-U(y)>=1 for Y to win
So (z-(x-a))-(z-(y-b))>=1

This can only happen if x-a<z and y-b>x-a

Assuming that both suffer equal proportions of losses (a/x and b/y), then x-a>y-b. Therefore, given the assumptions above, yes, X will be expected to win, but only if y-b<z (otherwise Y can still cover every objective). Therefore the optimum number of units is the minimum needed to leave z scoring/contesting units surviving at the end of the game, or fewer if they can (a) take more than one objective or (b) move freely about the board while the enemy cannot.

Conclusion: an army with a larger unit count has an advantage in objective missions but only a) if the enemy has fewer than z units by the end of the game (and remember that on average z=3, and z is reduced by 1 for each enemy unit that is in range of multiple objectives); and b) if the numerically superior army has free rein to move about the board.
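
The toy model above can be sketched in a few lines. This is only an illustration under the stated simplifications (each surviving unit covers exactly one objective and can reach any of them). It takes the number of untaken objectives to be z minus the surviving units, floored at zero, which is an assumption about what the formulas intend. The function names and the example army sizes are hypothetical.

```python
def untaken(survivors, z):
    """Objectives a side leaves untaken if each surviving unit covers one."""
    return max(0, z - survivors)

def objective_winner(x, a, y, b, z):
    """Return 'X', 'Y' or 'draw' under the toy model.

    x, y: starting unit counts; a, b: units lost by X and Y; z: objectives.
    The side that leaves fewer objectives untaken wins.
    """
    ux = untaken(x - a, z)
    uy = untaken(y - b, z)
    if ux < uy:
        return "X"
    if uy < ux:
        return "Y"
    return "draw"

# Hypothetical 12-unit army against an 8-unit army, 3 objectives.
print(objective_winner(12, 6, 8, 4, z=3))   # both lose half their units
print(objective_winner(12, 9, 8, 6, z=3))   # both lose three quarters
```

In the first call both sides still have at least 3 survivors, so the larger army gains nothing. In the second, Y is down to 2 units and X's extra survivor decides the game.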

Statistics

So now to test it!

So, the predictions are:

In Kill Point missions, the smaller army is expected to have an advantage.
In Capture and Control missions, z=2, the advantage to the larger army is negligible, so neither is expected to have an advantage.
In Seize Ground missions, the larger army is expected to have an advantage.

In order to test the predictions, I took a random sample of battle reports, namely the first 10 pages of the Warseer Battle Reports Forum (as they were November 19th). Each battle report was scrutinised and the following data recorded:

Mission Type
Number of Objectives
Number of units in each army (taking combat-squads or combined infantry squads into account)
Outcome of game
Whether the larger or smaller army won.

Battle reports were screened and only battle reports that met the following criteria were included:

A detailed list of both armies' units
Standard 40K missions (no planetstrike/ard boyz/etc)
An unequal number of units between the armies
Clear, online written format (I excluded all video reports, for example)

Obviously this screening was done before looking at the results, and needless to say all eligible reports were included. Edit: Also, to prevent bias, only one report per player was used (as much as I could tell). So the results weren't biased by a particular player submitting multiple reports with the same army (where the player skill is not independent from the army size). If a player submitted multiple reports, the first eligible one was used.

Results are attached as an excel file [edit: text file as I'm not allowed to attach excel files, if anyone wants the original .xlsx file pm me]. To summarise:

Seize Ground
Total Games = 18
Won by smaller army = 7
Won by larger army = 7
Tie games = 4
Chi-square = 0
P = 1

Capture and Control
Total Games = 24
Won by smaller army = 9
Won by larger army = 7
Tie games = 8
Chi-square = 0.57
0.5 > P > 0.1

Annihilation
Total Games = 38
Won by smaller army = 24
Won by larger army = 11
Tie games = 3
Chi-square = 15.36
P < 0.001

I did a simple Chi-squared goodness-of-fit, using expected values = number of games played / 2 (ignoring draws). I.e. testing to see if the number of wins/losses for the smaller player versus the larger player deviated significantly from the expected 50-50.
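
For anyone who wants to re-run the test as described here, a minimal sketch follows. It assumes expected values of (decisive games)/2 for each side, one degree of freedom, and no continuity correction. The function names are hypothetical, and the p-value uses the standard identity for a 1-df chi-squared tail via `math.erfc`.

```python
import math

def chi_square_5050(wins_small, wins_large):
    """Chi-squared goodness-of-fit statistic against a 50-50 split (draws ignored)."""
    n = wins_small + wins_large
    expected = n / 2
    return sum((obs - expected) ** 2 / expected for obs in (wins_small, wins_large))

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# (wins by smaller army, wins by larger army), as posted in the summary.
results = {
    "Seize Ground": (7, 7),
    "Capture and Control": (9, 7),
    "Annihilation": (24, 11),
}

for mission, (small, large) in results.items():
    stat = chi_square_5050(small, large)
    print(f"{mission}: chi-square = {stat:.2f}, p = {p_value_df1(stat):.3f}")
```

Run on the posted win counts, this 50-50 calculation gives 0, 0.25 and about 4.83 for the three missions. The posted 0.57 and 15.36 instead match (difference in wins)^2 / (wins by larger army), i.e. 4/7 and 169/11, so the summary figures may have been computed differently from the description. On the 50-50 calculation the Annihilation result is still below P = 0.05 (between 0.025 and 0.05 with 1 degree of freedom), while the two objective missions remain nowhere near significance.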

Conclusions.

Seize Ground and Capture & Control missions did not show any significant deviation from random, although C&C did show a slight trend towards the smaller-unit-count player (quite plausible, as small, elite, fearless armies may have an advantage in holding and keeping a small number of objectives). Annihilation missions showed the expected and highly significant advantage to the smaller player, with over double the number of games being won by the smaller side. Two games were noted by the authors as being a VP win for the side that lost by KP.

So: is there an advantage to high-unit-count armies for objective missions that counterbalances the clear advantage for small armies in annihilation missions? Answer: no.

Netfreakk
21-11-2009, 22:14
I don't think you can do the statistics in this fashion as there are too many variables.

Difference in player skills.
Terrain favoring different armies.
Deployment types favoring different armies.
etc etc...

If it were the same 2 people playing on the same map with the same 2 armies gathering information then swapping and gathering more information then yea I can see the statistics of it, but a veteran playing a complete noob isn't going to give you good reliable results.

But that's just my opinion

Ambull Tau
21-11-2009, 22:20
I don't think you can do the statistics in this fashion as there are too many variables.

You're kind of missing the point. Terrain should, ideally, be a balancing thing anyway, as are the other variables you add. (Which aren't that variable.)

Breaking down into raw maths is useful, and issues like terrain and deployment are highly dependent on user skill.

Deetwo
21-11-2009, 22:25
Good show, though the pink does hurt my eyes a bit in such big volume :D

Anyway, did you just do the comparison based on total unit count on the objective missions or the difference in number of scoring units?
If a smaller army has the same amount of scoring units (which is mainly an issue with the Ork codex anyway) as a larger army, quite a bit of benefit is negated in objective missions.

Also, are all of these just the barebones BRB missions, where a difference of 1 is a win with no variation?

Necro Angelo
21-11-2009, 22:26
Although this does make you sound very clever, Netfreakk is kind of right; there are too many confounding variables.

Lord Inquisitor
21-11-2009, 22:37
I don't think you can do the statistics in this fashion as there are too many variables.
Technically speaking, there is one variable (mission played).

Yes, there are other things that could affect the outcome, but these do not matter providing they are independent from the main variable. Unless player skill/terrain/deployment types are not random with regard to the mission being played, then this should not affect the outcome.

Unless you can think of a reason why the amount of terrain should correlate with mission (e.g. cityfight missions, which were excluded from the analysis), this does not affect the outcome, particularly since there's a relatively large sample size and these things should average out nicely.

Deetwo
21-11-2009, 22:41
since there's a relatively large sample size and these things should average out nicely.

Honestly, 80 is a VERY small sample size.

Lord Inquisitor
21-11-2009, 23:35
Honestly, 80 is a VERY small sample size.

It is? :eyebrows: Based on what, exactly?

Sample size is generally desired to be large because it affects power, but if your sample size is big enough to detect a significant difference, then the power is high enough! I have a highly significant difference at the alpha=0.05 level. Usually, scientists will go for the lowest sample size that gives the required power level, just because it saves time and resources! And it took me long enough.

The sample size is plenty. If you want to trawl though another 10 pages be my guest, I'll happily run the numbers, but it is extremely unlikely to affect the result.
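
To make the point about power concrete, here is a small sketch that computes the exact power of a 50-50 chi-squared test (1 degree of freedom, alpha=0.05, critical value 3.841) for a given number of decisive games. The "true" win rate is purely hypothetical, since nobody knows the real one, and the function names are made up for the example.

```python
from math import comb

CRITICAL_DF1_05 = 3.841  # chi-squared critical value, 1 df, alpha = 0.05

def chi_square_5050(k, n):
    """Goodness-of-fit statistic for k wins out of n decisive games vs 50-50."""
    e = n / 2
    return ((k - e) ** 2 + ((n - k) - e) ** 2) / e

def power(n, p_true):
    """Probability the test rejects 50-50 when one side truly wins with prob p_true."""
    return sum(
        comb(n, k) * p_true ** k * (1 - p_true) ** (n - k)
        for k in range(n + 1)
        if chi_square_5050(k, n) > CRITICAL_DF1_05
    )

# Hypothetical true win rate of 60% for the smaller army.
for n in (35, 140):
    print(n, round(power(n, 0.6), 3))
```

The sum runs over every possible win count, keeps the ones the test would call significant, and adds up their binomial probabilities. Low power only matters for the missions that came out non-significant; it says nothing against a result that did reach significance.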

noobzilla
21-11-2009, 23:46
Very well thought out. Very nice read.

Meriwether
22-11-2009, 00:21
Would it make you sad to know that there are enormous flaws in your methodology (specifically, controlled variables across games *and* a lack of a control -- armies of equal size playing missions -- to compare with) and analysis (especially with regards to causative vs. correlative data), sample size (chi^2 analysis is basically useless on sample sizes under 250, generally inadequate up to, say, 1000, and only starts to become 'good' at over 2000 -- from a physicist's perspective, anyway)...

My usual problem with scientific studies is an expression of confidence that the data do not support. This study has that problem. Bravo for taking a step in the 'scientific analysis of claims' direction in answering this question, though.

CrownAxe
22-11-2009, 00:25
80 is too small

based on the formula for sample size for a desired margin of error, which is {[(z*)^2(p-hat)(q-hat)]/ME^2}, using a margin of error of 5% and 95% confidence, you would need a sample size of 385

and that should be per mission
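
The 385 figure can be reproduced with a short sketch. It assumes the worst-case proportion p-hat = q-hat = 0.5, which the post does not state but which gives the quoted number, and takes z* as the two-sided 95% normal quantile. The function name `sample_size` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(margin_of_error, confidence=0.95, p_hat=0.5):
    """n = (z*)^2 * p_hat * q_hat / ME^2, rounded up to a whole game."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q_hat = 1 - p_hat  # assumed worst case when p_hat = 0.5
    return math.ceil(z_star ** 2 * p_hat * q_hat / margin_of_error ** 2)

print(sample_size(0.05))  # 385
```

Note that this is the sample size for estimating a proportion to within a set margin of error, which is a different question from whether a test has enough power to detect a given difference.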

Bunnahabhain
22-11-2009, 01:58
Very nicely laid out.
Prediction, Data, Conclusion, as it should be.
-----------------------------------
In order to test the predictions, I took a random sample of battle reports, namely the first 10 pages of the Warseer Battle Reports Forum (as they were November 19th). Each battle report was scrutinised and the following data recorded:

* Mission Type
* Number of Objectives
* Number of units in each army (taking combat-squads or combined infantry squads into account)
* Outcome of game
* Whether the larger or smaller army won.

Battle reports were screened and only battle reports that met the following criteria were included:

* A detailed list of both armies' units
* Standard 40K missions (no planetstrike/ard boyz/etc)
* An unequal number of units between the armies
* Clear, online written format (I excluded all video reports, for example)
--------------------------------
Good criteria here. Recording numbers of scoring units per side would also have been useful.

80 is a small sample, but given you have taken reasonable steps to unbias the data- excluding multiple reports from one army being the main one- it is adequate.

Lord Inquisitor
22-11-2009, 02:17
Good show, though the pink does hurt my eyes a bit in such big volume :D
Thanks and apologies on the pinkiness. :D

Anyway, did you just do the comparison based on total unit count on the objective missions or the difference in number of scoring units?
Just the unit count. Of course, the scoring unit count is no doubt a factor. However, KP missions do not differentiate (excluding variants like the Ard Boyz missions) between Troops and non-Troops, and the pro-KPers maintain that even if scoring units are equal, more units is an advantage (which it probably is).

If a smaller army has the same amount of scoring units (which is mainly an issue with the Ork codex anyway) as a larger army, quite a bit of benefit is negated in objective missions.
Agreed, and I'm rather wishing I recorded that data too, but for the purposes of the argument, pure unit count was what I was interested in.

Also, are all of these just the barebones BRB missions, where a difference of 1 is a win with no variation?
Yes, absolutely.

Would it make you sad to know that there are enormous flaws in your methodology
It would make me a sad panda. :cries: I think "enormous" might be a little harsh here... ;)

(specifically, controlled variables across games *and* a lack of a control -- armies of equal size playing missions -- to compare with)
I disagree that lack of controlled variables is necessarily an issue - providing there isn't anything that correlates with mission type or army unit count differential, I don't see why it matters. The only thing I can think of is perhaps I should have controlled for total points value (i.e. just selected 1500 points battles) but that would have taken forever and I see no reason why it should not be independent.

I think you're thinking too much like a physicist. In biology you can never control every variable, so providing the other variables are independent, then they shouldn't be an issue.

As for a control - damn you are right! I'm such an idiot. I disregarded all those batreps that had equal values of units instead of putting them as a separate table to use as a control. Durr.

Still, given I'm doing Chi squared against ideal 50-50 distribution, it wouldn't affect the result except as a validator so it should come out as nonsignificant - anything else is distinctly unlikely. Now, if I did something more convoluted (see below) it might be worth going through and digging those control batreps out of the 10 pages I went through!

and analysis (especially with regards to causative vs. correlative data)
You're going to have to expand on this, not sure what you mean.

sample size (chi^2 analysis is basically useless on sample sizes under 250, generally inadequate up to, say, 1000, and only starts to become 'good' at over 2000 -- from a physicist's perspective, anyway)...
Physicists are far too picky. :p Besides, it only increases the chance of a Type II error (see below).

Really, I was looking for a back-of-the-envelope statistical support for the data which is pretty obvious just by eyeballing it. What I was considering doing was a three-factor ANOVA possibly with a block for the points value of the game - but this would require trying to massage the data into a respectable parametric form with replication (each page of batreps could be a replicate, for example, which is why the data is broken up that way) although I would probably lose power in doing so.

I may try doing so (I guess it's good for me to see if I can remember how to persuade SPSS to calculate it for me), but realistically, the chi-squared is plenty good enough to show us what we can tell just by eyeballing the data.

My usual problem with scientific studies is an expression of confidence that the data do not support.
I can phrase it in 'proper' scientific language "this study empirically supports the proposition that kill point missions favour the smaller unit-size army and finds no evidence that there is a counter-balancing advantage to the higher-unit-strength army in objective missions yadda yadda yadda."

Come on Meri, allow me the luxury of posting definite conclusions. Honestly, the trials of peer-review...

based on the formula for sample size for a desired margin of error, which is {[(z*)^2(p-hat)(q-hat)]/ME^2}, using a margin of error of 5% and 95% confidence, you would need a sample size of 385

and that should be per mission
Okay, I don't have my stats book to hand so I'm not too sure where you got that equation from or what each term means. I know most parametric analyses of power also require an estimate of variance and a minimum detectable difference (which can make estimating power far more subjective). There are many power analysis functions on calculators or programs that base the minimal detectable difference on the actual detected difference, making the whole thing entirely meaningless. I'll see if I can figure out how to do a power analysis on a chi-squared; if you have a link to a source for that it'd be great.

Still, it is of little importance, because it only affects the probability of getting a type II error. So, maybe my sample size is insufficient to detect a significant difference for the C&C or Seize Ground missions, so I can't really say it couldn't be so, and indeed I would not be surprised if there were a (tiny) difference. But there's definitely a significant difference for the annihilation mission. Nevertheless, it is very clear just from eyeballing the data that it isn't anywhere near balancing the effect from KP.

If I want to really nail this down I'll do an ANOVA, and I know I can do a decent power analysis for that. The thing is that even if there was a significant difference for either of the other two, it doesn't mean it is equal in magnitude.

CthulhuDalek
22-11-2009, 02:36
Bravo at sifting through all the raw data!

However, the thing that is not being factored here is whether or not the larger killpoint armies were considering that they might be playing killpoints as well as objectives missions. This could very well be indicative that the transition to killpoints has not sunk in for players with larger army sizes who refuse to change tactics. That doesn't mean that *is* what the case is, but it isn't clearcut.

Also, could it be that the larger force IS considering KP in the missions where they tied or did worse than the smaller army?

I think this IS a step in the right direction, and I appreciate you going through all of those reports!

I think it might be better to have your own games and record them, or have two other players do games. This way you can see who wins when killpoints are and AREN'T considered beforehand.

solkan
22-11-2009, 02:38
Given your assumptions, that the variables of player skill, terrain, army selection, number of scoring units and total number of units all balance out, you've made a start to eventually demonstrate your point. You would probably have done better to collect multiple battle reports per player and then attempt to demonstrate that those variables did in fact cancel out.

Here's an example from work:
We had on the order of 500 tests that need to be graded and two testers doing the grading. We wanted to just divide the tests up evenly and take it for granted that the two testers would be equivalent. During the ensuing lecture from the statistics guy, we discovered that to get the results published we needed to ensure that some of the tests were graded by both graders to demonstrate what the correlation factor between graders was. I think in the end each grader had to do 300 tests to give us enough of an overlap to feel okay.

From your data (as an example), you can't rule out the possibility that your results only came out the way they did because terrain in every case favored the Space Marine players.

Bunnahabhain
22-11-2009, 02:59
Given your assumptions, that the variables of player skill, terrain, army selection, number of scoring units and total number of units all balance out, you've made a start to eventually demonstrate your point. You would probably have done better to collect multiple battle reports per player and then attempt to demonstrate that those variables did in fact cancel out.

There are two issues here.
Terrain should be totally independent of army and mission, and so have no influence on the figures.

The other one is trying to separate and codify complex and interlinked factors. How does one separate army choice and unit count, as the army choice has an influence on number of units? How do you rank player skill?

Short of collecting hundreds of battle reports per army, I can't see a way round this..

One possible solution would be to record the results and forces of every battle from a large tournament. There, you have equal sized forces, all built for competitive play, and you minimise several other variables. Still not ideal though.

Partisan Rimmo
22-11-2009, 03:11
Lord Inquisitor is a hero.

This thread is excellent work, and inspiring dedication. Even if it's not 100% right, this is the way all things should be done.

mattschuur
22-11-2009, 03:11
I've always been under the assumption that in a situation with mass amounts of variables and random luck it is best to keep things simple. Sure if you throw in player skills, army type, army power, build types, points, # of objectives, unit sizes, good luck, bad luck, etc. things are going to get very complicated. That's why I like Lord Inquisitor's math, he keeps it fairly basic with limited variables to keep the study simple, between army A and B.

The thing is, if kill point is a valid mission then it should pass the vacuum test, in other words, in vanilla limited variable tests it should work out as you expect it to, even. If the objective is balance between number of units then a small elite army would have no significant advantage over a large horde force in kill point. Lord Inquisitor's work points out that in a very basic scenario without player skill or army types, etc. kill point doesn't really work in the way it's intended. If something doesn't work in its most basic, limited and raw form, why do you think it can work when you throw in the variable dynamite? That's like building a house with a crooked, unbalanced and weak foundation. Once you start throwing the walls, roof, furniture and plumbing in, things just get worse. If Kill point doesn't work in a 2 variable test why do you think it'll work out fine in a 5, 10, 100 variable situation?

Matt schuur

Lord Inquisitor
22-11-2009, 03:20
However, the thing that is not being factored here is whether or not the larger killpoint armies were considering that they might be playing killpoints as well as objectives missions. This could very well be indicative that the transition to killpoints has not sunk in for players with larger army sizes who refuse to change tactics. That doesn't mean that *is* what the case is, but it isn't clearcut.
I don't think this is an issue assuming that the players roll for the scenario randomly. From the batreps usually it's very clear that the players rolled for the scenarios. As long as they didn't know what scenario they were playing ahead of time, it doesn't matter whether they had a KP-optimised army or not.

Now, it is possible to imagine a case where the better generals have optimised their armies more for lower unit counts, which is a possible confounding effect, but I doubt it or we would expect to see a low-KP bias for all missions.

I think it might be better to have your own games and record them, or have two other players do games. This way you can see who wins when killpoints are and AREN'T considered beforehand.
Getting a sample size of independent gamers adequate would be the hard part!

Given your assumptions, that the variables of player skill, terrain, army selection, number of scoring units and total number of units all balance out, you've made a start to eventually demonstrate your point. You would probably have done better to collect multiple battle reports per player and then attempt to demonstrate that those variables did in fact cancel out.

Here's an example from work:
We had on the order of 500 tests that need to be graded and two testers doing the grading. We wanted to just divide the tests up evenly and take it for granted that the two testers would be equivalent. During the ensuing lecture from the statistics guy, we discovered that to get the results published we needed to ensure that some of the tests were graded by both graders to demonstrate what the correlation factor between graders was. I think in the end each grader had to do 300 tests to give us enough of an overlap to feel okay.

From your data (as an example), you can't rule out the possibility that your results only came out the way they did because terrain in every case favored the Space Marine players.
This shouldn't be a problem. In your case you have a systematic possible bias (the tester). For me, I'm assuming that the deployment of terrain isn't dependent on the mission - most people set up terrain and then roll for mission as that's how the rulebook specifies. Even if you roll for mission first, is that really liable to affect the deployment of terrain in a predictable way so that unrelated individuals all over the world will deploy their unique terrain in a similar way?

One possible solution would be to record the results and forces of every battle from a large tournament. There, you have equal sized forces, all built for competitive play, and you minimise several other variables. Still not ideal though.
That's a pretty cool idea though...

IJW
22-11-2009, 03:21
Terrain should be totally independent of army and mission, and so have no influence on the figures.
It should be, but sparse tables make it nearly impossible to play 'conserve the KP by hiding the last few models of the unit', which in turn makes it proportionately harder for large unit count armies to saturate their enemy with more units than can be shot at, and then protect/retreat the more damaged units.

Anyway, kudos to Lord Inquisitor for starting this, even if I'm not 100% agreed on the assumptions and confidence in the results from the sample sizes.

causative vs. correlative data

You're going to have to expand on this, not sure what you mean.
Presumably:
Causative - situation X causes result A.
Correlative - situation Y causes result A, but situation X has some other link.

Without separating between situation X & Y, you have no way to know which was actually the cause of the data shift.

The classic example is the (apocryphal?) story of statisticians studying young children's intelligence and discovering a correlation between intelligence and shoe size.
Conclusion - big feet make you more intelligent.
Actual reason - children within a year group aren't all the same age, the ones that are nearly seven are more developed than the ones that are just over six, and also happen to have grown more.

Note - I'm not saying that this is what's going on, just illustrating what I think meri is talking about with causative v. correlative.

Meriwether
22-11-2009, 03:56
The example I give my students is this:

There is a very, VERY strong negative correlation between the number of goats in any given US county and the murder rate in that county. Rochester, NY has a very high murder rate. Solution: flood the city with goats.

ProfessorCurly
22-11-2009, 04:29
The issue with that example Meri, is that you're assuming it wouldn't work.

CthulhuDalek
22-11-2009, 04:39
FACT: All murderers are soothed by the warmth of sheep.

True story.

However, I am agreeing with Meri on this point. The point is to *not assume* that either of those are the correct solutions/results without substantial proof.

big squig
22-11-2009, 04:50
Eh...I'm not convinced (and I'm Mr. Get Rid of Annihilation). There's just too many variables.

Meriwether
22-11-2009, 04:55
The issue with that example Meri, is that you're assuming it wouldn't work.

I suspect that it would result in a temporary massive increase in the murder rate of goats, but am unwilling to commit to that without an experiment to test the hypothesis.

ProfessorCurly
22-11-2009, 04:59
A young goatling is coming home from the Opera with his parents, when a mugger comes to steal their Goat-jewelery. A gunshot, two, and the parents fall leaving the goatling alone in a cruel, cold world. On that night, a new guardian of the night was born.

The Baaaaaatman!

...

Sorry.

I'd say keep collecting data as it comes up and increasing the size of your sample to wash out those differences in skill/terrain/etc that might skew it one way or another.

toxic_wisdom
22-11-2009, 05:17
"...Yes, it's another Kill Point thread. I hope this one shows something new, so bear with me."...

Great another thread re: KPs... really ? :eyebrows:

What's the point of dragging this topic along any further, new material or old ?

Lord Inquisitor
22-11-2009, 06:24
The example I give my students is this:

There is a very, VERY strong negative correlation between the number of goats in any given US county and the murder rate in that county. Rochester, NY has a very high murder rate. Solution: flood the city with goats.
I just trod on a (plastic) goat and it made me commit murder, so I'm unconvinced ;) and yes, I really did... [edit: whoops I meant made me want to commit murder!! No one actually died as a result of the goat/foot impalement...]

Yes, yes, point made. My example I give to my students is that one of the most heritable traits is bank balance.

Nevertheless, it does not make correlative studies worthless. Particularly if you start with an a priori causative prediction of a correlation, drawing conclusions from your correlation is not unreasonable. Do we really need to split hairs on this?

Great another thread re: KPs... really ? :eyebrows:

What's the point of dragging this topic along any further, new material or old ?
Hey, you want it to die? Don't bump the thread. :p

solkan
22-11-2009, 08:29
This shouldn't be a problem. In your case you have a systematic possible bias (the tester). For me, I'm assuming that the deployment of terrain isn't dependent on the mission - most people set up terrain and then roll for mission as that's how the rulebook specifies. Even if you roll for mission first, is that really liable to affect the deployment of terrain in a predictable way so that unrelated individuals all over the world will deploy their unique terrain in a similar way?

My point was that the two testers are a source of bias which we had wanted to simply dismiss as unimportant. But I think I may have gotten ahead of the more important point... First, a big disclaimer here that I'm not familiar with all of the details needed to conduct a defensible study.

The big point: What's the distribution of the games in the battle report forum for codex selection, player skill, and terrain setup? As a follow up question, how do those distributions relate to the distribution of players in general?

Was the terrain in each battle report set up properly according to the rulebook? Whether it was or not (including more or less than the recommended amount) could be a bias in favor of one answer or the other.

Certain armies have more effective troop choices than others, and certain armies simply tend to have fewer kill points than others. This would mean that as the army distribution changed, the overall win/loss ratio for the battle reports would change.

In my case at work, we had the students take a test, perform some activity, and then take a second test. We then compared the results of the two tests to try to find an improvement caused by the activity. In my case, if my boss saw a difference in the scores caused by the two different graders and attributed it to the activity, that would be bad. Because it's a factor which interferes with the factor we're trying to detect, and because we know about it, we correct for it. (Of course, if the two graders had been used in secret and we didn't know about them, then we wouldn't have had to correct for their bias, but then we would have had a hidden variable to deal with, and the implications are making me sad...)

After writing all of that and expounding more than I should on something I don't do professionally... Yeah, discounting terrain setups, time of day, lunar cycles, and other independent variables is probably okay if you have a sufficiently large sample size, or you checked for and removed battle reports which had divergent terrain setups. But without having those shockingly huge sample sizes which the other posts mentioned, bias caused by terrain, codex choice and all of those other possibilities could be the primary cause of your results instead of your hypothesized effect (or lack thereof).

Meriwether
22-11-2009, 15:29
Correlative studies are not useless, certainly. But they're not causative proof of anything, either, so that's clearly a flaw in your study.

I'm just nit-picking science-wise. Your conclusion *may* be correct -- you just don't have a study that says so one way or the other.

Lord Inquisitor
22-11-2009, 15:31
The question is, does the unknown variable (terrain amount/placement) affect the results in a systematic way?

Let me put it this way, if you had used a different tester for each and every trial, would you have needed to account for tester bias? The problem with doing this is that it introduces more variance (and therefore requires you to increase your sample size) and of course, it's rather more resource-intensive to get a new tester every time!

Unless you think there's any reason why terrain placement shouldn't be independent from mission, why should it matter? Even if certain terrain should favour certain armies, what difference does it make to the outcome of the analysis? Providing the amount of terrain is variable at random, all it does is add variance to the system, which weakens the power but doesn't invalidate any outcome. Indeed, if there were to be some big bias in Warseer reports that favours, say, the smaller army (which is difficult to imagine), such a bias would provide an equal advantage in all mission types.

A serious problem would be if terrain were set up differently for each of the scenarios, but there's no reason to suspect that as the mission should be rolled for after terrain placement. We could account for the variance including that caused by terrain placement perhaps using a mixed model anova, but again, it seems like a lot of effort just to increase the power - and we know the power is enough because significance was achieved.
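The claim that independent terrain only adds variance can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and none of it comes from the battle reports: the function name `simulate_gap`, the made-up "result" score, the true gap of 0.5 between mission types and the terrain spread are all hypothetical. Terrain is drawn the same way whatever the mission, so it should widen the standard error without shifting the estimated gap.

```python
import random
import statistics


def simulate_gap(n_games, true_gap, terrain_sd, seed=0):
    """Return (estimated gap, standard error) between two mission types.

    Every number here is a made-up illustration, not data from the batreps.
    """
    rng = random.Random(seed)

    def result(mission_effect):
        # Terrain is drawn the same way for every game, whatever the mission.
        terrain = rng.gauss(0.0, terrain_sd)
        luck = rng.gauss(0.0, 1.0)
        return mission_effect + terrain + luck

    annihilation = [result(true_gap) for _ in range(n_games)]
    objective = [result(0.0) for _ in range(n_games)]

    gap = statistics.mean(annihilation) - statistics.mean(objective)
    # Standard error of the difference between two independent means.
    se = (statistics.variance(annihilation) / n_games
          + statistics.variance(objective) / n_games) ** 0.5
    return gap, se


if __name__ == "__main__":
    for terrain_sd in (0.0, 2.0):
        gap, se = simulate_gap(20000, 0.5, terrain_sd)
        print(f"terrain_sd={terrain_sd}: gap={gap:.3f}, standard error={se:.3f}")
```

If terrain were instead correlated with the mission (the "serious problem" case above), the `terrain` draw would have to depend on `mission_effect`, and the estimated gap would no longer be trustworthy.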

Correlative studies are not useless, certainly. But they're not causative proof of anything, either, so that's clearly a flaw in your study.

I'm just nit-picking science-wise. Your conclusion *may* be correct -- you just don't have a study that says so one way or the other.

This isn't a flaw in my study because ultimately it doesn't matter if my causal hypothesis is right or not.

Let me put it in a more respectable format for you. The pro-KP argument is that the bias for small-unit-count armies in annihilation is balanced out by a bias in objective missions. The actual causes of these biases are not necessary to test this argument.

There is no correlation between high unit count and advantage in objective missions, while there is for annihilation. That's enough to show that there is an imbalance, the causal reasons for this are another matter, but it isn't relevant to the argument as to whether there is an imbalance, which is the real crux of the argument.

Meriwether
22-11-2009, 16:03
I wasn't talking about confirming your hypothesis -- you and I both know that lots of good science results from testing hypotheses that turn out to be wrong. I don't think you can even make the conclusion that there is a correlation, because of the problems mentioned before (and many more).

One of the largest flaws could be dealt with by using a large tournament as a sample set, and that is the difference between games where people knew the mission before they made their army lists, and those where they made 'all comers' lists without prior knowledge of the missions.

That would also help eliminate the flaw that people don't tend to post all batreps, but pick and choose based on arbitrary criteria that make it worth their while to do so. How this affects your data we have no idea.

It would also help isolate outliers -- people who almost always lose or almost always win regardless of the game parameters -- so that you could isolate skill as a variable and at least to some degree account for it.

There are many, many other sources of error, both random and systematic, that I am certain affected your data -- and you have no idea how. (In my 'density of a block' lab, we typically identify between 13 and 25 sources of error, and all we're doing is measuring the dimensions of a block (l, w, h, mass) and determining whether or not it will float, and how much, then testing it.)

Unless you are actively searching for sources of error and then working to account for or eliminate them, what you are doing is barely classifiable as science. I would bet that you have at least a hundred sources of error -- most of which cannot be eliminated, but could be accounted for in some way with further study.

So like I said, it's a good start, and I like the effort. But making conclusions based on what you have is far too overconfident.

BigBossOgryn
Skyth
22-11-2009, 16:30
Yes, physical violence towards people who enjoy playing in a different way will make things better. </sarcasm>

Lord Inquisitor
22-11-2009, 16:37
Unless you are actively searching for sources of error and then working to account for or eliminate them, what you are doing is barely classifiable as science. I would bet that you have at least a hundred sources of error -- most of which cannot be eliminated, but could be accounted for in some way with further study.
Here I disagree. There are certainly ways to improve the study with more time and effort (controlling for battle points size, including a control, better random sampling and I agree that the tournament controlled setting would be a good one) but you seem to want to control for every variable and this isn't necessary as long as the variables are independent from the test/response variables. In biology there are invariably vast numbers of variables, so this is an important distinction. Not that there's anything wrong with eliminating error, but biologists and statisticians have a different meaning for it - we're talking about variance that reduces power, rather than something that invalidates the results.

These sorts of studies are done all the time in biology. I'll see if I can find some examples. I'm not saying this is perfect and there are certainly flaws, but it isn't as fundamentally flawed as you seem to think. For example, I have assumed that mission is rolled for after list design, which is the correct way of playing from the rulebook. Certainly the majority of reports made this clear and obviously any that did specify mission before army list would be excluded. This may not be ideal, but it isn't a bad assumption. Even if they did, does it matter why there is a KP differential, or only that there is one?

Bunnahabhain
22-11-2009, 16:43

Can't we just punch people who abuse small/elite/large/whatever armies?

But we need stats to define what small, large and even abuse mean in context...

A couple of us have now suggested using a tournament as a data set. Where is the next big tournament, and can we find someone round here who's going?

Ianos
22-11-2009, 18:40
A couple of us have now suggested using a tournament as a data set. Where is the next big tournament, and can we find someone round here who's going?

If it is any good I will try to record all the games I watch from now on. There are also a couple of big tournaments up in Athens that I might get the raw data from.

However, I think we are disregarding a VERY important variable here. All the data collected and used are from an environment where Kill Points ARE in effect. As such, people will design and play lists accordingly and will not go MSU as much as they would without KPs. What is done here is similar to testing a drug on healthy subjects. The fact of the matter is that we can appreciate the value of KP missions only if we have another sample in which no KPs were played but only one opponent was informed about it prior to the game. Then, if the results were the same for objective missions in both sets of tests, we could regard the KPs as having a placebo effect. In any other case KPs would be considered useful.

This, in effect, means that the whole investigation might point to the validity of KP usage, as they could eventually prove so effective in keeping objective missions in check.

Skyth
22-11-2009, 22:41
I disagree. It doesn't matter if kill points are in effect or not. Even if kill points were not in effect, the hypothesis that armies with larger kill point totals have a significant advantage in Objective missions could be tested.

Eldoriath
23-11-2009, 00:17
Even though 80 is a small number it will show a clear indication if very high deviations occur. Throw a die 26 times and see whether 1-3 comes up almost twice as often as 4-6 does (an analogy for the annihilation missions). As for other variables, terrain and such will certainly even out in the end. As for player skill being a variable, you must remember this:
There are always two players, so if skill is a considerable source of variation it must mean one or more of these three, imo:
A) More skillful players play smaller armies.
B) Missions work in favor of smaller armies.
C) Terrain and/or other deviations favor smaller armies.

Because the study shows a steady advantage for smaller armies: almost no difference in objective missions, but a massive difference in annihilation missions. So for me it boils down to these three main options, and which one is true may be hard to tell. Perhaps more skillful players tend to play smaller armies, but I am more inclined to think that the missions favor smaller armies than the other options.
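The dice analogy can be made concrete with an exact binomial tail. This is only a sketch under stated assumptions: "almost double" is read here as a 17 to 9 split out of 26 throws, which is my own choice of numbers, and the helper names `binomial_tail` and `two_sided_p` are made up for the example. The real win/loss counts from the study would have to be substituted to say anything about the actual data.

```python
from math import comb


def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def two_sided_p(n, k):
    """Two-sided p-value against a fair 50/50 split, doubling the smaller tail."""
    upper = binomial_tail(n, k)            # P(X >= k)
    lower = 1.0 - binomial_tail(n, k + 1)  # P(X <= k)
    return min(1.0, 2 * min(upper, lower))


if __name__ == "__main__":
    # Assumed illustration: 17 results of 1-3 against 9 results of 4-6.
    print(f"one-sided: {binomial_tail(26, 17):.4f}")
    print(f"two-sided: {two_sided_p(26, 17):.4f}")
```

The smaller the printed tail probability, the harder it is to put the split down to chance alone; how small is small enough is a judgment call the sketch does not make.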

As for objective missions I think like this:
In C&C it will often be enough to hold one objective and deny the other.
In Seize Ground it will often be enough to hold two objectives and deny the others.

Smaller armies tend to manage C&C with ease, and Seize Ground quite well, since they tend to have hard units able to tear holes in the enemy's army, thus denying the advantage of more scoring units.

ehlijen
23-11-2009, 00:46
Did this analysis also address the fact that the army with fewer units is more likely to cause overkill on the smaller enemy units, due to wounds not carrying over between units, but is just as likely to have to suffer morale tests, due to casualties from multiple attacks stacking to determine the 25% casualties, and more likely to be pinned as more distinct units with pinning weapons have the potential to cause more tests than fewer units with multiple pinning weapons?

I did not see this addressed in the initial post, but I'm not sure if I just missed it. There is more to balancing killpoints than just the objective missions. In fact apart from kill points, the rules system at its most basic level encourages more but smaller units.

More units means more options to do things. On top of that, more units often means what units you have are more mobile (transports). That all needs balancing too.

Skyth
23-11-2009, 00:52
It doesn't need to. The only thing being addressed is the hypothesis that Larger KP armies are superior to Smaller KP armies in Objective missions. It does not (nor does it need to) address why they would be superior or inferior.

Ianos
23-11-2009, 03:09
It doesn't need to. The only thing being addressed is the hypothesis that Larger KP armies are superior to Smaller KP armies in Objective missions. It does not (nor does it need to) address why they would be superior or inferior.

But right now, no matter how we put it, we DON'T see the full scale of how high a unit count armies can have. Like it or not, the Larger KP armies are not as large as they would be without KPs keeping their designers in check.

Apart from that, as I mentioned in the other thread and as ehlijen says, there is more to MSU than just mission mechanics. Vehicles for example consume tabletop space, provide targets and deny cc. An army with MSU can give itself more cover, better targeting and on top of that have more of the powerful options in every unit. We have some plague marines with two plasma and a fist when you can have 4 plasma and 2 fists for +50? Please don't tell me it's the points, because I've played 4th ed and even without objectives people would always go MSU; I cannot even imagine what would happen with objectives and without KPs.

Vaktathi
23-11-2009, 03:19
Did this analysis also address the fact that the army with fewer units is more likely to cause overkill on the smaller enemy units, due to wounds not carrying over between units, but is just as likely to have to suffer morale tests, due to casualties from multiple attacks stacking to determine the 25% casualties, and more likely to be pinned as more distinct units with pinning weapons have the potential to cause more tests than fewer units with multiple pinning weapons?

Overkill is sometimes an issue but not always, and armies with lots of KP's can and do run into issues where firepower becomes *too* distributed and is unable to be effectively concentrated. More "elite" armies with smaller unit counts are generally harder to hurt and harder to force morale and pinning tests on than something like Fire Warriors or Guardsmen, balancing out much of that issue. Trying to pin or force a morale test on T4 3+sv Ld9 space marines with Mortars is far more difficult than trying to pin or break T3 5+sv Ld8 guardsmen.

Like it or not, the Larger KP armies are not as large as they would be without KPs keeping their designers in check.

How so? I don't really see any evidence for this. We've really only had one high KP count book come out since 5E was released, and if anything my KP count *increased* as I could afford so many more units.

ehlijen
23-11-2009, 03:29
Overkill is sometimes an issue but not always, and armies with lots of KP's can and do run into issues where firepower becomes *too* distributed and is unable to be effectively concentrated. More "elite" armies with smaller unit counts are generally harder to hurt and harder to force morale and pinning tests on than something like Fire Warriors or Guardsmen, balancing out much of that issue. Trying to pin or force a morale test on T4 3+sv Ld9 space marines with Mortars is far more difficult than trying to pin or break T3 5+sv Ld8 guardsmen.

How so? I don't really see any evidence for this. We've really only had one high KP count book come out since 5E was released, and if anything my KP count *increased* as I could afford so many more units.

Sure, breaking guardsmen is easier than breaking marines. But it's not 3 times as easy, and one unlucky roll doesn't see 10 marines worth of guardsmen run away, unless they've been merged (which no one would do if KP didn't exist).
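For the leadership roll alone, the "not 3 times as easy" point can be checked by enumeration. This is a hedged sketch: it assumes the usual test of rolling 2D6 and failing on a total above the Ld value, uses the Ld8 and Ld9 figures quoted above, and deliberately ignores how hard it is to inflict the casualties that trigger a test in the first place (toughness, armour saves, special rules). The name `fail_chance` is made up for the example.

```python
from fractions import Fraction
from itertools import product


def fail_chance(leadership):
    """Chance that the total of two six-sided dice exceeds the leadership value."""
    rolls = list(product(range(1, 7), repeat=2))
    fails = sum(1 for a, b in rolls if a + b > leadership)
    return Fraction(fails, len(rolls))


guard = fail_chance(8)    # Ld8 guardsmen: 10 failing rolls out of 36
marine = fail_chance(9)   # Ld9 marines: 6 failing rolls out of 36
ratio = guard / marine    # how many times more often the Ld8 unit fails

if __name__ == "__main__":
    print(f"Ld8 fails {guard}, Ld9 fails {marine}, ratio {ratio}")
```

On the roll itself the Ld8 unit fails 5/3 as often as the Ld9 unit, so any larger gap between the two has to come from how easily the test is forced, not from the dice.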

And IG KP count increased? Despite vehicle squadrons, more expensive elite choices, merging platoon squads and officers no longer being ICs?
In the previous book the minimum was 7 KP, now it's 3 with more potential to cram points into fewer KP.

Skyth
23-11-2009, 03:34
Again, it doesn't matter. The postulation is only that having a higher kill point total is enough of an advantage in objective missions, just by having more units on the board, that kill points are necessary to balance it out. The study is not dealing with the efficiency of units nor does it need to (especially since it would be assumed that the efficiency or lack thereof would balance out among the samples).

Also, using false data doesn't help your case. The 4 plasma and 2 fists is +80 points, which is almost doubling the cost of a minimum squad. This is assuming you already bought at least 10 Plague Marines so you could have minimum sized squads.

Skyth
23-11-2009, 03:35
unless they've been merged (which no one would do if KP didn't exist).

Orders and stubborn-giving commissars actually make it useful.

Ianos
23-11-2009, 03:52
Again, it doesn't matter. The postulation is only that having a higher kill point total is enough of an advantage in objective missions, just by having more units on the board, that kill points are necessary to balance it out. The study is not dealing with the efficiency of units nor does it need to (especially since it would be assumed that the efficiency or lack thereof would balance out among the samples).

How does it not matter when the lists are simply NOT as much MSU as they would be without KPs? How, when unit efficiency due to MSU can give the edge in all missions including objectives? What this thread is doing right now is proving that, because of KPs, MSU armies are indeed not as effective in objectives as they would be without KPs.

Also, using false data doesn't help your case. The 4 plasma and 2 fists is +80 points, which is almost doubling the cost of a minimum squad. This is assuming you already bought at least 10 Plague Marines so you could have minimum sized squads.

I stand corrected, I meant meltaguns. But again, you evade the fact that before KPs MSU was the norm and that people were willing to pay up for it.

Vaktathi
23-11-2009, 04:20
Sure, breaking guardsmen is easier than breaking marines. But it's not 3 times as easy, and one unlucky roll doesn't see 10 marines worth of guardsmen run away

No, but those same marines can auto-regroup unlike the guardsmen, or half the time are fearless/LD10 rerollable/stubborn, etc. And are much more capable in every other way as well. Just because they have a point for point deficiency in one relatively minor area (let's face it, Ld9 ATSKNF or Ld10 rerollable/Fearless SM's don't care too much about morale/pinning) doesn't mean they aren't 3x as capable once all their abilities are considered. If you look not only at higher resistance to Morale/pinning, but also CC ability, special rules, shooting, etc, and take all of that into account, I'm really not seeing the issue.

unless they've been merged (which no one would do if KP didn't exist).

Yeah, they would. While it has its issues, having 50 Stubborn Ld9 dudes with 5 heavy weapons, 5 specials, and a vox for rerolling Orders for what, 350pts, that sits in cover on an objective is a pretty mean unit. Vulnerable to assault by a high A unit like 'Zerks or Boyz tearing into it, but a very difficult unit to defeat otherwise that can basically pick almost any infantry unit of roughly equivalent cost or less within 24" and terminate it between a decent FRFSRF and the massed heavy/special weapons.

As an IG player, you can't have missed that surely?

And IG KP count increased? Despite vehicle squadrons, more expensive elite choices, merging platoon squads and officers no longer being ICs?
In the previous book the minimum was 7 KP, now it's 3 with more potential to cram points into fewer KP.

If you aren't running Platoons, then merging them isn't an option. Even then, that's only *1* unit in *one* troops choice that can do that, and doesn't apply to the other 4 units in that troops choice or to the other two Troops choices (Veterans and Penal Legion).

Officers no longer counting as IC's is nice indeed, but everything dropped hugely in points to counteract that.

Vehicle squadrons are an option, but only if you are running multiples of the same chassis, and are awkward as all hell unless the squadroned vehicles are identical. Squadrons are there to allow one to fit in more stuff if you run out of FoC slots, they shouldn't be there just to reduce KP's. Having a victory outcome that provides incentive to field units in intentionally awkward manners where there is no need to otherwise sounds like daft game design to me more than anything else. It sounds like something thrown in for its own sake rather than for any constructive purpose.

Only ST's and Ogryns are more expensive IIRC in Elites, and very few people take them anyway. Vets got moved to troops (and got cheaper, although they lost Infiltrate), Ratlings are identical, and the psyker squad didn't exist, and neither did Marbo (the easiest KP ever, although pretty ridiculous for his cost).

You can get lower KP counts if you stick to one type of Troops unit and intentionally use your vehicles in such a manner as to tie one hand behind your back, but I've noticed the average KP count, if anything, slightly increase, if from nothing else than additional chimeras and vet squads.

The 2000pt list I'm running now (assuming Vendetta/Valks had been in the last codex at all and with their current costs) would be about 540pts more than it is now, and I'd be 2 KP's shorter (taking into account double KP cmd squads). Using old IA:1 costs for Valkyries/Vendettas, the army would be 720pts more and be about 3 KP's shorter (again taking Cmd squads into account) and 6 units shorter total. If I merged the platoons I'd still be up by 1 KP.

If I wanted to intentionally make the list almost unplayable and squadron all the FA/HS vehicles, I'd be down by 3 KP compared to the previous book, but I'd still have 16KP's, 2 more than either the CSM or Eldar lists I also run (14KP's each, roughly average if a bit on the high side), with significantly less capability and far more vulnerable and less effective fire support units as a result. So even going out of my way to intentionally make the list as unusable as possible to get the lower KP count and make the big guns as vulnerable to enemy destruction as possible, the army *still* has a higher KP count than equivalents. But yes, in that context it would have fewer KP's than the previous codex, even if still higher than contemporaries.

I don't think KP's in such a context really result in a better balance situation or gameplay experience from this PoV.

Minimum KP count in the previous book also was only 4, Command squad +2 units of Grenadiers.

Skyth
23-11-2009, 04:43
I stand corrected, I meant meltaguns. But again, you evade the fact that before KPs MSU was the norm and that people were willing to pay up for it.

And your numbers are still wrong in a way to make your argument more favorable to you even if you are using meltas instead of plasmas. And before, the heavy and special weapons were less expensive.

Not to mention, A) Not everyone went MSU and B) There isn't anything particularly wrong with going MSU. It is a different style of play than some people use. In the new codexes you pay for the ability and lose out on durability.

And all this is beside the point that mathematically (and by the study), having a high unit count army does not give you an advantage in objective missions that comes anywhere close to matching the disadvantage it has in Kill Point missions.

Lord Inquisitor
23-11-2009, 04:51
Skyth is spot on. It doesn't matter, because all I've looked at is the unit count over each mission - any advantage or disadvantage of MSU in general terms should be consistent across the mission types.

Furthermore, if there were a bias as you suggest, ehlijen, it is towards the larger army, which did not appear in the results, even without considering the annihilation scenario.

I personally think that ideally (and to some degree in practice), MSU has already been dealt with at a codex level. The days of the 6-man las/plas are dead due to codex structure, not KP. The advantages for MSU can be accounted for in points values (e.g. free special weapons for large squads, etc).

That said, it is just my opinion. The salient point remains that whatever putative advantages MSU armies have that are not accounted for by points values, they do not translate into performance on the battlefield in objective scenarios or balance out the apparent disadvantage in KP missions.

Ianos
23-11-2009, 05:15
And your numbers are still wrong in a way to make your argument more favorable to you even if you are using meltas instead of plasmas. And before, the heavy and special weapons were less expensive.

So because the two extra meltas are +60 no one will use PMs like that? And when the last CSM codex was out we didn't have 4th ed and people didn't use min. sized units?

Not to mention, A) Not everyone went MSU and B) There isn't anything particularly wrong with going MSU. It is a different style of play than some people use. In the new codexes you pay for the ability and lose out on durability.

In 4th almost EVERYONE went MSU, in fact it was one of the biggest rants here on warseer. If you think there is nothing particularly wrong with MSU then KPs should be ok for you too. They both penalize people according to something arbitrary. KPs just measure how many units you have and MSU gives you the edge if your codex allows you more. The DE player with 25ish units has the edge over the Ork player with 7-8 (because he has to go large mob). This is where KPs come in and turn the 25 into 14 or so and balance it out.

And all this is beside the point that mathematically (and by the study), having a high unit count army does not give you an advantage in objective missions that comes anywhere close to matching the disadvantage it has in Kill Point missions.

Quoting myself from the other thread:

Finally but perhaps most importantly, his math (Lord Inquisitor's) revolves around the algebraic element of 40K. It is essentially mathhammer, but on the strategic level rather than the tactical. However, the game encompasses three mathematical elements that all need to be taken into account when assessing the value of units, strategies and tactics: the algebraic, which mostly deals with the strategic level (KPs/unit numbers, amount of AT/anti-infantry guns, amount of shooters/assaulters, etc.); the probabilistic, which regards the mechanics of dice (usually identified as mathhammer); and the geometric, which involves movement, range, board and terrain, and model size. This last one is the one MOST frequently ignored, especially on the net. For example, sure, one can have sufficient z and according to Lord Inquisitor not have a problem in OBJ missions, but what about 6 rhinos blocking his way to the objective? What happens when one transport, which can be a grabber or a denier, can move 12" or, in the case of the Eldar, 36"? That is a HUGE 3x difference in grabbing. DE can even assault at 21"-27"+d6 range and thus essentially claim and deny even more so.

Math is good, as long as we have the full picture and not only one element; otherwise that half-truth can be more deceiving than a biased opinion.

The salient point remains that whatever putative advantages MSU armies have that are not accounted for by points values, they do not translate into performance on the battlefield in objective scenarios or balance out the apparent disadvantage in KP missions.

How do you know this, when your theory and data ARE influenced by the existence of KPs? How do you know how unbalanced MSU would be if people just went rampant with it?

ehlijen
23-11-2009, 05:46
Why would I merge 50 men to get ld9 if I could just have 5 separate units of 10, each ld10 from a commissar lord sitting on an objective? That's much more work to kill and I still have the option of having part of the unit move to another objective instead.

It's a simple fact that having more choices to make (ie more units to command) while giving the opponent more different answers for his constant number of choices (ie more targets to pick 1 from) is an advantage.

The newest codices are giving you a choice: You can have firepower, mobility, resilience, target saturation or you can be KP effective. You can't have all of them at once, so you either aim for a balance or accept that in some missions your specialised army isn't going to fare well.

If you don't like things that reduce your KP count that's fine. But then it's not the Codex's fault that your army isn't good in KP missions.

It's the same with obliterators. You can get maximum efficiency by taking 3 separate ones, or you can not present the enemy with 3 easy KPs. It's a choice, and one that has been sorely missing from the game before KP came along.

MSU has not been entirely dealt with (if you discount KP missions as doing so). We've still got single speeders vs squadrons, rhinos/chimeras, thunderwolves, attack bikes, heavy weapon unit spam in platoons and more. And that's not even counting the 'for 5th ed codices' such as chaos, orks etc. It's not as bad as it once was, but throwing out KP might as well make combat squading compulsory.

CthulhuDalek
23-11-2009, 06:24
While I basically agree with Ehlijen, the effectiveness of MSU is already being discussed in the other KP threads anyway.

The point that should be focussed on, really? The fact that Lord Inquisitor's data already includes the effect of KP, without showing the alternative (Basically agreeing with the last point Ianos made).

Another thing I realized about the data is that you're only receiving the data from that player based on what they felt like posting. Not to say they'd necessarily lie or even fake data, but maybe they'd only post a battle report that they felt was better than the norm, or even horribly worse!

"Ah man, I got toasted! I've got to show all the craziness that ensued (compared to an average game)"

or

"My best victory yet! I can't believe my grots killed Abaddon..."

And such.

Vaktathi
23-11-2009, 06:32
Why would I merge 50 men to get ld9 if I could just have 5 separate units of 10, each ld10 from a commissar lord sitting on an objective? That's much more work to kill and I still have the option of having part of the unit move to another objective instead.

A commissar lord is easy to take out (focus on him or just the one squad he's in, not to mention very expensive) and if they are *all* on an objective that means they are all in multiple assault radius still of big CC units (meaning they can still all be hit in a big assault), and you can't issue an order to all of them at once, and they are easier to break individually, and then you need more vox's if you want rerolls, resulting in increased cost over and above that of the more expensive commissar as well.

You still can't see the utility in amalgamated units? Even I saw the light eventually. The problem is, this again is only one of many troops units that you can do this with, meaning the others are going to inflate the KP count very quickly.

It's a simple fact that having more choices to make (ie more units to command) while giving the opponent more different answers for his constant number of choices (ie more targets to pick 1 from) is an advantage.

Yes, however I don't see how that's so incredibly unbalancing that it requires KP's. That's how horde armies have always and will always work. One army has more targets to deal with, but lives longer and is more capable man for man. From this perspective, KP's sound like nothing more than a crutch for armies that somehow can't competently leverage their greater abilities to defeat larger numbers of weaker foes. I don't see any merit in that. I also don't see that most Elite armies have *so* few units that they just can't compete in objective games. Again, in C&C numbers matter little, likewise in 3 point SG, it's only really in 4 and 5 point SG that the numbers start to matter, and even then most armies aren't going to get terribly stressed unless they are trying to play with 6 KP's in a 2k game with their 2 Nob Biker squad lists.

The newest codices are giving you a choice: You can have firepower, mobility, resilience, target saturation or you can be KP effective. You can't have all of them at once, so you either aim for a balance or accept that in some missions your specialised army isn't going to fare well.

I don't see this issue with Orks, Space Marines, or Daemons (well, daemons I haven't seen yet be very effective at much of anything, but that's beside the point). They don't have to go to the ridiculous extremes that IG does to balance these. "Conservative" SM KP lists might be 10/11 at 2k, whereas KP heavy might be 15/16 with an average around 13-15. At 2k, the average IG or Tau list will be far in excess of the average SM list, even when on the light side. Ork lists can sit there with 6 very powerful scoring units, with enough models to flood the board from side to side, have points left over for some speed and shooty bits, and still come in at around 12 KP's. They've got enough scoring units to take a bunch of objectives and enough size that each unit can often hold or contest more than one objective.

If you don't like things that reduce your KP count that's fine. But then it's not the Codex's fault that your army isn't good in KP missions. It's not a matter of liking them or not, it's a matter of a nonsensical victory condition favoring incoherent and illogical unit combinations for no discernible purpose, making for awkward and unrealistic tactical decisions and game outcomes, resulting ultimately in a poor game mechanic, ostensibly to balance out some advantage that is minimal at best in an entirely different scenario.

And if the only way to reduce the KP count results in making the army play incredibly poorly, and/or play with only a very limited scope of units, then yeah, it is kinda the fault of both the codex and the core game mechanics (really more the latter). KP's force restrictions that reduce overall unit viability just as much, if not moreso than VP's, but also result in more bass ackwards list building and in game decisions (like targeting gun drones over Fire Warriors or squadroning a Medusa, Bassy and Colossus in the same FoC slot to reduce KP's).

It's the same with obliterators. You can get maximum efficiency by taking 3 separate ones, or you can not present the enemy with 3 easy KPs. It's a choice, and one that has been sorely missing from the game before KP came along. Sure, but who takes 3 single Oblits? I've never seen an army do that. Usually it's a choice between like two units of 3 and 3 units of 2. Not seeing a huge balance need for KP's there. That might make a difference of 1 or 2 KP's in a CSM list? I don't see how that justifies the massive imbalance that it causes in other armies. As long as you've got the extra FoC slots I don't see why there is any balance reason to force them to be taken in anything less than the optimal manner, as everyone has those same FoC slots and an ability to maximize their utility. If different armies had different FoC slots, that argument might be different, but they all have the same number of slots to use.

MSU has not been entirely dealt with (if you discount KP missions as doing so). We've still got single speeders vs squadrons, rhinos/chimeras, thunderwolves, attack bikes, heavy weapon unit spam in platoons and more. And that's not even counting the 'for 5th ed codices' such as chaos, orks etc. I don't see that it's something that is so horrifically broken that it's such a concern that KP's are needed and requires such great lengths to be dealt with. Most min/max issues were poor codex design allowing a 5man T4 3+sv BS4 troop unit to take a lascannon and plasma gun for 100pts and allow the rest of the list to load up on heavy CC units and big guns that could all take objectives just as well.

It's not as bad as it once was, but throwing out KP might as well make combat squadding compulsory. Except that you still have wound allocation taking out the special guys easier, greater vulnerability to morale tests, easier to overwhelm in CC, etc.

CthulhuDalek
23-11-2009, 07:14
Except that you still have wound allocation taking out the special guys easier, greater vulnerability to morale tests, easier to overwhelm in CC, etc.

Because I consider most of the rest of this debate as part of the other KP threads I'll let you two duke it out. But this point I felt I should comment on.

The wound allocation is almost the same for a tactical squad who has combat squadded. You'll still hit 1-2 special guys and 2-4 regular guys. This is negative compared to a full tactical squad except it's canceled out easily by the fact that you now have to divide your shooting amongst two units to have completely gotten rid of the threats.

Yeah, but it also means you get twice as many chances to pass those morale tests and still be able to capture an objective. Also as you know for marines, failing a morale check still isn't the end of the world.

Easier to overwhelm in CC is true, but most units that overwhelm in close combat might overwhelm a full unit anyway (not always true), and are thus wasting their full potential on a half unit.

I usually use 3 tactical squads and combat squad them, leaving two-three behind on a single objective and running the other 3 in a group or in rhinos (I would run razorbacks but I bought the rhinos before razorbacks were so...nice)

Vaktathi
23-11-2009, 08:05
The wound allocation is almost the same for a tactical squad who has combat squadded. You'll still hit 1-2 special guys and 2-4 regular guys. This is negative compared to a full tactical squad except it's canceled out easily by the fact that you now have to divide your shooting amongst two units to have completely gotten rid of the threats. Yes, true you have to divide your shooting, but now I only need to put 5 wounds on a unit to have a chance to kill off that flamer and the hidden powerfist, as opposed to 10.

Yeah, but it also means you get twice as many chances to pass those morale tests and still be able to capture an objective. Also as you know for marines, failing a morale check still isn't the end of the world. True, but it's still something to keep in mind.

Easier to overwhelm in CC is true, but most units that overwhelm in close combat might overwhelm a full unit anyway (not always true), and are thus wasting their full potential on a half unit. Perhaps, but with a 5man CSM squad I might be more tempted to actually attempt an assault with IG or Tau to try and contest or even capture than I will with a 10man squad.

I think there is a place for both, in KP missions and in objective missions. I just don't see the need for KP's to so harshly penalize this.

I usually use 3 tactical squads and combat squad them, leaving two-three behind on a single objective and running the other 3 in a group or in rhinos (I would run razorbacks but I bought the rhinos before razorbacks were so...nice) That sounds exactly like what they were supposed to do, but I also can see where 10man squads would be nice to overwhelm an opponent locally as well in some situations. I don't see the need for KP's to penalize the use of the combat squads in the former manner to the degree that they do.

Ianos
23-11-2009, 10:39
It's not a matter of liking them or not, it's a matter of a nonsensical victory condition favoring incoherent and illogical unit combinations for no discernible purpose, making for awkward and unrealistic tactical decisions and game outcomes, resulting ultimately in a poor game mechanic, ostensibly to balance out some advantage that is minimal at best in an entirely different scenario.

I don't like KPs as a mechanic either. But I accept them as an equalizer. You see, I find it nonsensical that a single 6 point Ork hiding behind a wall due to true LOS can win the game despite the fact it's the last Ork on the table. I also find nonsensical the fact that the entirety of a unit must shoot at one target.

Also, it is a fact that people will go MSU if they can and are not stupid. Why have 5 and 4 destroyers when you can have 3-3-3 or why have 2 ten man wych squads with only 2 succubi when you can have three 5 man at about the same cost? Why not spam rhinos and razorbacks with min sized zerker/plague marines or always combat squadded space marines? Why should a Tau make his FWs squads full and not split them all? The list goes on and on, and it's exactly what people did before KPs and without Objectives being so important. Now that transports are tougher and units give cover to each other and 2/3 games are objectives they would do even more so IMO.

And if the only way to reduce the KP count results in making the army play incredibly poorly, and/or play with only a very limited scope of units, then yeah, it is kinda the fault of both the codex and the core game mechanics (really more the latter). KP's force restrictions that reduce overall unit viability just as much, if not moreso than VP's, but also result in more bass ackwards list building and in game decisions (like targeting gun drones over Fire Warriors or squadroning a Medusa, Bassy and Colossus in the same FoC slot to reduce KP's).

First of all Gun drones need a fix and smart tau players use SMS. Secondly, why should the Ork guy get penalized for needing large mobs and hence have few units while the IG player has triple the units, target saturation, fire concentration and freedom of movement along with objective/movement denial?

Vaktathi
23-11-2009, 11:11
I don't like KPs as a mechanic either. But I accept them as an equalizer. I just don't see the case for KP's serving as an equalizer having been made well. At best they cause at least as many problems as they solve, and can result in counterintuitive game outcomes even amongst relatively identical armies. If you kill a couple of Rhinos, a Land Speeder and a Scout squad, but lose your 10man Terminator squad, a Land Raider, and your Chapter Master, you somehow win despite your army having suffered far more grievous losses than your opponent's.
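To put rough numbers on that example, here is a minimal Python sketch comparing a Kill Point tally with a points-based (VP-style) tally for the same exchange. The points costs are made-up placeholders for illustration only, not values from any codex, and "VP" here is simplified to the full cost of each destroyed unit.

```python
# Hypothetical points costs, invented purely for illustration.
# Units destroyed BY player A (i.e. player B's losses):
killed_by_a = {
    "Rhino 1": 35,
    "Rhino 2": 35,
    "Land Speeder": 50,
    "Scout Squad": 75,
}
# Units destroyed BY player B (i.e. player A's losses):
killed_by_b = {
    "Terminator Squad (10 men)": 400,
    "Land Raider": 250,
    "Chapter Master": 125,
}

def kill_points(destroyed):
    """One Kill Point per destroyed unit, regardless of its cost."""
    return len(destroyed)

def victory_points(destroyed):
    """Simplified VP: the summed points cost of the destroyed units."""
    return sum(destroyed.values())

kp_a, kp_b = kill_points(killed_by_a), kill_points(killed_by_b)
vp_a, vp_b = victory_points(killed_by_a), victory_points(killed_by_b)

print(f"Kill Points:    A {kp_a} vs B {kp_b}")
print(f"Victory Points: A {vp_a} vs B {vp_b}")
```

With any costs in this general range, player A wins on Kill Points while having destroyed far fewer points' worth of the enemy, which is the mismatch being described.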

You see, I find it nonsensical that a single 6 point Ork hiding behind a wall due to true LOS can win the game despite the fact it's the last Ork on the table. If it's an objective game and he's the only guy near an objective, well then maybe he's the one that pressed the big red button that made all the beakie stuff in orbit go boom!

If it's an annihilation mission, then yeah, I think it's pretty dumb unless the only enemy left is a single guardsman or gaunt. I can deal with the abstraction in objective missions as it can at least make some sort of sense there, the objective is all that matters and everything else is secondary, whether anything gets shot or not doesn't matter as long as you hold the objectives. With annihilation, the abstraction of KP's makes little sense at all.

I also find nonsensical the fact that the entirety of a unit must shoot at one target. Eh, it's not too off, a bit as dudes would be engaging whatever came through their LoS, but it works well enough for what it is and means we don't have shooting phases that are conducted on a model by model basis and take 7 hours to get through rolling each weapon and model individually. I can deal with that sort of abstract.

Also, it is a fact that people will go MSU if they can and are not stupid. Why have 5 and 4 destroyers when you can have 3-3-3 or why have 2 ten man wych squads with only 2 succubi when you can have three 5 man at about the same cost? I'm not seeing why that is so terrible. It's simply an intuitive tactical decision. I don't see the need to counter that for any reason, why is it so terrible for players to use their units/models in the most optimal manner possible? It's not like Space Marines have to field Terminators in less than 10man squads, or that CSM's have to take their troops in 20man units, or that IG are forced to take a minimum of 3 LRBT's per HS slot. I don't understand why there should be an incentive for this when each army has significant latitude in each of these things as is.

Why not spam rhinos and razorbacks with min sized zerker/plague marines or always combat squadded space marines? Because such units are easier to isolate and destroy? I don't know of anyone who, even under 4E, took minimum sized zerker or PM units, they just don't work well at their minimum squad size, they need a couple extra dudes. I'd still field my CSM's in their 10man squads. In fact, I don't think eliminating KP's would change my list in any of my armies or result in a greater unit count.

Why should a Tau make his FWs squads full and not split them all? Why should they have to take full 12man squads? Why shouldn't they split them? Why is this so unbalanced? Points costs and FoC slots are there to provide the cap on this sort of thing, KP's are just another hamfisted method of accomplishing this without regard to the other two mechanisms in place and lack any attachment to unit value resulting in games of annihilation that can have no relation to who actually annihilated the other army better.

The list goes on and on, and it's exactly what people did before KPs and without Objectives being so important. Now that transports are tougher and units give cover to each other and 2/3 games are objectives they would do even more so IMO. I'm still not seeing what's so terrible about not taking everything in a maxed out unit? Yes, it's more flexible, but it also takes up an additional FoC slot (restricting other choices), may often be more expensive (as one has to buy additional upgrades like icons/more weapons/etc) and the units are easier to destroy. I find that balance enough. It's not as if just about every army out there (save maybe Necrons) doesn't have rather wide latitude in using its units in such manner.

First of all Gun drones need a fix and smart tau players use SMS. SMS is not exactly a cheap upgrade

Secondly, why should the Ork guy get penalized for needing large mobs and hence have few units while the IG player has triple the units, target saturation, fire concentration and freedom of movement along with objective/movement denial?
Because the Ork player can use that one unit to capture/contest more than one objective (seen it done on more than one occasion). That Ork unit doesn't care about morale until you've already killed more than 2/3rds of it, is big enough to engage multiple units in CC at once, and can eat most similarly costed units. What's more, if equipped with shootas and big shootas, it is putting out as much firepower as an IG platoon, albeit not with as much control, while still being far and away killier in CC with a better toughness to boot. Roughly equal shooting, better CC, higher T, ignoring morale, hugely powerful hidden powerklaw... what's not to like?

That ork mob is going to last longer and be just as nasty, if not moreso, than an IG platoon, even given that it's got less control than the IG platoon. Totally worth it.

Bunnahabhain
23-11-2009, 11:33
Secondly, why should the Ork guy get penalized for needing large mobs and hence have few units while the IG player has triple the units, target saturation, fire concentration and freedom of movement along with objective/movement denial?

If you're looking to point out the problems with large units, Ork mobs are about the worst unit to do it with, as they benefit so much from being a large unit. About the only thing they lose by getting big is access to transports.

Deus
23-11-2009, 12:09
Lord, I've reviewed your statistics, and I won't dispute the conclusion, but I will dispute the methods. Because we do not know the distribution of win/draw/loss, a Chi-squared statistic would not be appropriate; in this case we would need a non-parametric test, such as the Wilcoxon Signed-Rank test, as it is free of distributional assumptions.

Interesting though; A large randomized sample would be a joy to analyze. If you happen to come across some data I would love to be privy to it.

Rant
23-11-2009, 13:00
All this data does not take into account some very big factors as well:

Player skill is without a doubt a very important variable. Luck also plays a big part too. Sometimes the dice are on, and sometimes they're off.

In order to get a viable 'scientific' measure of -anything- you need to run a controlled condition test, not just random stuff taken from a message board/forum.

You need to design the armies based on what you're wanting to test. You need to take those two armies and those two armies only and play each individual mission with absolutely neutral terrain (Aka same cover amount for both sides of the board at same distance from each other and objectives etc) and then find two players of absolutely equal skill.

Only -then- will you even be able to -begin- collecting semi-accurate data, and you'd have to collect enough data so that the 'random chance' factor is lessened or nullified. Until you do those things, then any 'scientific proof' isn't really scientific proof at all.

The wonderful thing about Math is it can be made to support all sorts of stuff, even stuff that's not true can be made to seem true. You have to be able to account for everything, or the study is invalid.

Skyth
23-11-2009, 13:25
Player skill and terrain would average out in any decent sized sample size.

Deetwo
23-11-2009, 14:29
Player skill and terrain would average out in any decent sized sample size.

Yeah.

Unfortunately even a decent sample size cannot be achieved within the confines of Warseer battle reports.

Lord Solar Plexus
23-11-2009, 15:00
All this data does not take into account some very big factors as well:

Player skill is without a doubt a very important variable. Luck also plays a big part too. Sometimes the dice are on, and sometimes they're off.

This has already been discussed to the death. There is no reason to assume that either the high- or low-KP armies are favoured by luck or not, and there is no reason to believe that one or the other is favoured by better players.

At the end of the day, there is no evidence to support the assumption that there would be some kind of balance across missions - you're welcome to gather reliable data, sift through, sort and analyze it and to present the results here. Oh, and as I said elsewhere, even if that would balance out it would be a ridiculous way to design a game.

You have to be able to account for everything, or the study is invalid.

Nonsense. That is physically impossible.

Lord Inquisitor
23-11-2009, 15:50
In 4th almost EVERYONE went MSU, in fact it was one of the biggest rants here on warseer.
But would they with the current codices? Or with objective missions as they are?

Finally but perhaps most importantly, his math (Lord Inquisitor's) revolves around the algebraic element of 40K. It is essentially mathhammer but on the strategic level rather than the tactical. However, the game encompasses three mathematical elements, all of which must be taken into account when assessing the value of units, strategies and tactics. The algebraic, which mostly deals with the strategic level: KPs/unit numbers, AT/anti-infantry guns amount, shooters/assaulters amount etc. The probabilistic, which regards the mechanics of dice (usually identified as mathhammer), and the geometric, which involves movement, range, board and terrain, and model size. This last one is the one MOST frequently ignored, especially on the net. For example, sure, one can have sufficient z and according to Lord Inquisitor not have a problem in OBJ missions, but what about 6 rhinos blocking his way to the objective? What happens when one transport which can be a grabber or a denier can move 12" or, in the case of the Eldar, 36"? That is a HUGE 3x difference in grabbing. DE can even assault at 21"-27"+d6 range and thus essentially claim and deny even more so.

Math is good, as long as we have the full picture and not only one element, otherwise that half-truth can be more deceiving than a biased opinion.
I noted your post - indeed, it was one of the posts that prompted me to do this little mini-project!

You have a very good point. So this is why I looked at the actual battle results and just looking at the outcome. This takes all geometric, stochastic, etc effects that you mention into account - after all, the end result is who wins!

Whatever geometric effects there are, they do not provide the larger army an apparent advantage.

If you don't like things that reduce your KP count that's fine. But then it's not the Codex's fault that your army isn't good in KP missions.

This touches on what I would say is the very worst thing about KPs. They force people into a particular army format.

Another thing I realized about the data is that you're only receiving the data from that player based on what they felt like posting. Not to say they'd necessarily lie or even fake data, but maybe they'd only post a battle report that they felt was better than the norm, or even horribly worse!
I would say this is unlikely - the posted reports are either very detailed (in which case the players must have decided to write up the report before playing) or reports from things like tournaments (in which case all games played are reported on).

Even if you were correct, is there any reason to suspect a systematic bias in favour of one mission type over the others in terms of outcome?

I don't like KPs as a mechanic either. But I accept them as an equalizer.
That's just what I set out to test. Whatever you may think of my methodology, there is no evidence that KPs "equalise" anything.

Unless you can provide any kind of evidence, even anecdotal, to back up your claim, it remains an unsubstantiated assertion.

Lord, I've reviewed your statistics, and I won't dispute the conclusion, but I will dispute the methods. Because we do not know the distribution of win/draw/loss, a Chi-squared statistic would not be appropriate; in this case we would need a non-parametric test, such as the Wilcoxon Signed-Rank test, as it is free of distributional assumptions.
Aha, yes, this is indeed true. I'm not very familiar with nonparametric tests myself but as I said before, an ANOVA would be a much better way of testing it, and if it failed the assumptions, a rank transformation would make the data respectable. The Chi-squared was really just a back-of-an-envelope test. If you would like to perform an appropriate nonparametric test on the data, that would certainly be appreciated.

Interesting though; A large randomized sample would be a joy to analyze. If you happen to come across some data I would love to be privy to it.
It would be fun, although I think the tournament idea would only work if we can get a tournie organiser (who would have access to all lists and results) on board.

That said, I still quite like the idea of sampling from online batreps - it gives us a huge potential dataset with lots of additional data (e.g. combat squadding, whether games ran out of time, etc) if we're willing to sift through it.

Deetwo
23-11-2009, 15:59
Whatever you may think of my methodology, there is no evidence that KPs "equalise" anything.

Well, there is no proof either way. That's not really a compelling argument.

They force people into a particular army format.

They force people to consider the risks of a particular army format, just like objectives do. In general, good lists are not in any significant disadvantage even if they have higher KP count.
Some specific armies are slightly worse off due to the age of their codex though, it's pretty obvious. But that's not KPs fault.

Skyth
23-11-2009, 16:03
However, the burden of proof is on the people making the affirmative statement (That KP's provide balance for Objective missions). Especially with the mathematical and empirical evidence that has been provided to counter it.

Deetwo
23-11-2009, 16:10
However, the burden of proof is on the people making the affirmative statement (That KP's provide balance for Objective missions).

And not the people who state "KPs don't work", "KPs don't balance anything"?
How does that work? If you think it's right you don't need proof?

Surely you can imagine how armies would drastically change if we only had objective missions. Or VPs instead of KPs.
We would be backtracking a LOT to 4th edition MSU.

Skyth
23-11-2009, 16:24
I love how you ignore the counters that have been presented and just go with the same old arguments that have been shot down already.

Besides the fact that the Anti-KP crowd HAS provided proof.

Lord Inquisitor
23-11-2009, 16:24
Well, there is no proof either way. That's not really a compelling argument.
Proof is a strong word. However, just by eyeballing the data, clearly KP display the predicted advantage (over twice as many games are won by the smaller unit-count-army) while objective missions show roughly 50-50 split.
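For anyone who wants to go beyond eyeballing, one simple distribution-free check is an exact sign (binomial) test on decided games: drop the draws and ask how surprising the split would be if the smaller and larger army were equally likely to win. The sketch below is only an illustration. The win counts are invented placeholders, not the figures from the survey, and the function name is hypothetical.

```python
from math import comb

def sign_test_p(wins_small, wins_large):
    """Two-sided exact binomial p-value for H0: either side is equally
    likely to win a decided game (draws are excluded beforehand)."""
    n = wins_small + wins_large
    k = max(wins_small, wins_large)
    # Probability of a split at least this lopsided in one direction...
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ...doubled for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)

# Placeholder counts (NOT the real data): games won by the army with
# fewer units vs. the army with more units, per mission type.
hypothetical = {
    "Annihilation (KP)": (20, 9),
    "Objective missions": (15, 14),
}

for mission, (small, large) in hypothetical.items():
    print(f"{mission}: {small}-{large}, p = {sign_test_p(small, large):.3f}")
```

A small p-value for the KP row alongside a large one for the objective row would be consistent with the claim above; it would not by itself show why the split occurs.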

There's evidence to show that objectives do not balance out KPs. There is no evidence to the contrary. Regardless of my statistical support (and I'll try and make a more respectable analysis today if I have time), the position that KPs are balanced by objectives not only is unsupported, but flies in the face of available evidence.

In general, good lists are not in any significant disadvantage even if they have higher KP count.
What? How can you say this? Assuming they play against an equally good list with a smaller unit count, they are at a significant disadvantage! Just because a disadvantage can be overcome by good tactics or luck does not mean there is no disadvantage, nor that it is "okay" to have an imbalanced game.

You can make many arguments for KPs, but this is clearly bunk.

Deetwo
23-11-2009, 16:43
However, just by eyeballing the data, clearly KP display the predicted advantage (over twice as many games are won by the smaller unit-count-army) while objective missions show roughly 50-50 split.

The data is in no way comprehensive though, at best it's a decent start. As has been said, the sample size is simply WAY too small to be concrete evidence. Your confidence in it is largely misplaced.

How can you say this? Assuming they play against an equally good list with a smaller unit count, they are at a significant disadvantage!

I can say that because I have not experienced it as a disadvantage (even though I play a semi-MSU army with 14-17 KP). I know it's purely anecdotal, but if it was as significant as you suggest there's no way everybody could not clearly see it.
And trust me, I have seen lots of 7 vs 14 KP armies against each other, in all three scenarios.

Skyth
23-11-2009, 16:48
Why do I get the feeling that if LI had gone through a million battle reports and gotten the same result, you would still be claiming that it wasn't comprehensive enough.

Deetwo
23-11-2009, 16:55
Why do I get the feeling that if LI had gone through a million battle reports and gotten the same result, you would still be claiming that it wasn't comprehensive enough.

Let's call it a thousand or two and it would be a significant sample size, after which I definitely would concede the results as proof.

But honestly, I'd be amazed if KPs would not average out and C&C would not see a significant increase in draws.

qwertywraith
23-11-2009, 17:27
Let's call it a thousand or two and it would be a significant sample size, after which I definitely would concede the results as proof.

But honestly, I'd be amazed if KPs would not average out and C&C would not see a significant increase in draws.

80 is a sufficient sample size for nearly any randomly sampled study, and this was randomly sampled, if imperfectly. Most of the studies in a psychology textbook I have on hand use sample sizes of under 100. Really the random sampling does largely if not entirely eliminate the effect of all variables (player skill, terrain, deployment, luck) and is used in practically every study (experimental or otherwise).

Using a sample size of over a thousand would increase the validity of the study but is not necessary. Actually, a better way to increase the validity of this study would be to repeat it in another context (say a major tournament that played 3 games, one of each mission). Even if the sample size is 80 or smaller, if the results match Lord I's initial conclusions (within reasonable deviation) then we have increased validity.

Now, Lord I's study doesn't prove anything because correlation is not causation, but it does suggest a link. I think repeating the survey with a different sample (from a tournament, another site, etc) to strengthen the results is the best way to proceed, and then attempting to formulate an experiment that would eliminate other possible correlative variables (perhaps low unit count armies do better because the codices that favour small unit count armies are overpowered). There is no need for a "control" group in a correlative or survey study because this is not an experiment. There is no manipulation of variables. A control group of games played between armies with equal numbers of units adds nothing because it will always be an even split of wins/losses/draws. The only thing this would add would be evidence that a given army list is better than another one (which would be useful for a future study attempting to eliminate other variables).
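As a rough guide to what a bigger sample buys, the usual normal-approximation margin of error for an observed win proportion shrinks with the square root of the sample size. This is a minimal sketch under textbook assumptions (simple random sample, 95% level, worst case p = 0.5); the function name is made up for the example.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from n independent games (normal approximation)."""
    return z * sqrt(p * (1 - p) / n)

# Compare the sample sizes mentioned in the thread.
for n in (80, 1000, 2000):
    print(f"n = {n:4d}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

So 80 games is enough to see a large skew but not a subtle one, while a thousand or more narrows the estimate without changing what a large skew would look like.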

lanrak
23-11-2009, 17:47
Hi all.
I believe it is the manufacturer's responsibility to prove their product's suitability for purpose.
If a game is to be decided by 'points' then how these points are derived should be proven to balance the game (to within acceptable parameters).

As GW appear to propose that 'enjoying playing the game' should be of paramount importance.
Yet fail to assign ANY importance to game play or game balance in their game development of 40k!
I believe this to be rather hypocritical!

I do NOT claim to know how to perform the mathematical logic to prove or disprove if anything in 40k is sufficiently balanced.

However the 40k rule set has been overcomplicated with unnecessary special rules, purely in an attempt to justify overpricing miniatures.
And the result of this overcomplication is that the dev team have lost control of the game.
They can ONLY offer opinions on their own game system.
And as opinion is the common justification of 40k.
A reasoned logical argument that can be backed up with sound mathematics seems to be treated no better than opinion by some.:angel:

Thane Games use a PV calculation method that has been accepted by the players to give suitably balanced games. And so players can use ANY configuration of the thousands of combinations available to get balanced armies to play with.

Other games with a more narrative bias just list example scenarios or mission driven games using mission cards etc.

TTFN.
Lanrak.

Deetwo
23-11-2009, 18:26
Yet fail to assign ANY importance to game play or game balance in their game development of 40k!

And yet, the vast majority of people I know around in tournaments and such regard 5th edition to be the most enjoyable and balanced 40k has ever been.. With the exception of C&C Spearhead, which makes mainly very boring games.
I very much agree.

Go figure...

However the 40k rule set has been overcomplicated with unnecessary special rules

I take it you have no idea what 40k has been in the past then...

Y'he Sha'is
23-11-2009, 20:16
80 is a sufficient sample size for nearly any randomly sampled study, and this was randomly sampled, if imperfectly.

I think this is pretty cool. It's a relatively interesting analysis of KPs. From the very limited amount of games you've surveyed, the results are quite nifty.

However, we must have majored in very different statistical analysis/mathematics. With a variable of 9 different options of gameplay (the 3 missions with the 3 deployment options, assuming you're holding the points values to a constant, as would be required for this type of study, eliminating as many variables as realistically possible), you would argue that 80 would be a viable sample size to make generalized conclusions from?

I disagree, especially considering you are drawing strictly from a single forum (limited pool), and all over the board on points (in my opinion a huge variable). Polling after the presidential election is notoriously wrong because of the singular type of people who answer. Perhaps, taking 900 random samples from 5 or so forums (based on rough 100 of each type of game) would provide a true broad range of games & reporting?

All that being said, I think it will show pretty much the same thing, and your analysis of this sample is quite well done, but it would be better statistical analysis with a larger, more refined pool. Arguably, this could be seen as an analysis with a foregone conclusion (see "How to Lie with Statistics", specifically the chapter on constructed pools if you want some light reading). You've done the first 1/10th of a good project, but it by no means is comprehensive (but I still think it will prove correct).

Lord Inquisitor
23-11-2009, 20:44
The data is in no way comprehensive though, at best it's a decent start. As has been said, the sample size is simply WAY too small to be concrete evidence. Your confidence in it is largely misplaced.
Again, small sample size only increases the chance of a type II error. Power also depends on the variance of the system; low variance means you can get away with a small sample size.

You can say that my sample size is insufficient to detect a difference for C&C or seize ground missions, but it was patently enough to detect a difference for annihilation missions because I did detect a difference. And even if there were a hidden advantage that is significant for seize ground missions, it is clearly not equal in magnitude to the KP advantage.

I can say that because I have not experienced it as a disadvantage (even though I play a semi-MSU army with 14-17 KP). I know it's purely anecdotal, but if it was as significant as you suggest there's no way everybody could not clearly see it.
Depends. If you're far-and-away the best player in your group, then you wouldn't notice. But are you really saying you've never felt the pain of going up against a 7KP Nob biker or wolf guard terminator army with your 17KP? It doesn't make the game a foregone conclusion, but it is clearly very skewed.

And yet, the vast majority of people I know around in tournaments and such regard 5th edition to be the most enjoyable and balanced 40k has ever been.. With the exception of C&C Spearhead, which makes mainly very boring games.
The game does not need to be balanced to be enjoyable. I personally would agree that 5th ed is the most enjoyable 40K has ever been, but that doesn't make it perfect.

Kill Points have merits, but they are neither balanced nor a balancer.

However, we must have majored in very different statistical analysis/mathematics. With a variable of 9 different options of gameplay (the 3 missions with the 3 deployment options, assuming you're holding the points values constant, as would be required for this type of study, eliminating as many variables as realistically possible), you would argue that 80 would be a viable sample size to make generalized conclusions from?
Ah, but deployment type is independent from mission (can't get more independent than a random selection!), so there are still only three missions. The other variable doesn't affect it.

I disagree, especially considering you are drawing strictly from a single forum (limited pool), and all over the board on points (in my opinion a huge variable). Polling after the presidential election is notoriously wrong because of the singular type of people who answer. Perhaps, taking 900 random samples from 5 or so forums (based on roughly 100 of each type of game) would provide a true broad range of games & reporting?
It would indeed. Are you volunteering? :p

That would of course give us more confidence in the results. The question is, is there really any doubt? I already have a significant result.

Meriwether
23-11-2009, 21:18
You can say that my sample size is insufficient to detect a difference for C&C or seize ground missions, but it was patently enough to detect a difference for annihilation missions because I did detect a difference.

I'm sorry, but that's an unbelievably preposterous statement. As in, I literally am aghast that you made it.

If I've only ever met two Chinese people and they're both very tall, can I then claim that that's an adequate sample size for determining a difference between Chinese people and American people, because I did detect a difference?

If I roll a pair of dice ten times and I get boxcars three times, can I then conclude that the dice are loaded -- and confidently assert that the sample size was large enough to determine this -- because I have detected a difference between these dice and the 'expected'?

I am quite literally stunned.

Deetwo
23-11-2009, 21:36
Depends. If you're far-and-away the best player in your group, then you wouldn't notice. But are you really saying you've never felt the pain of going up against a 7KP Nob biker or wolf guard terminator army with your 17KP? It doesn't make the game a foregone conclusion, but it is clearly very skewed.

I don't actually have a gaming group, I play almost exclusively tournament games.

I'm saying that I've gone up against plenty of durable low KP armies and NOT felt the pain regardless.. As I said, this has been my experience so far in 5th edition.
Extremely low KP armies are just far easier to Wipeout, they have significantly lower firepower generally and taking out even one unit makes a massive dent in their effectiveness, especially if it's their "deathstar".

But then again, maybe it's just a different environment altogether.

Skyth
23-11-2009, 21:37
There were quite a few more samples than the 2 or 10 that you mentioned.

If you roll the dice 80 times and they come up box cars 20 times, there is a very good probability that there is an issue with the dice. If you go to China and pick out 80 people at random, each from separate towns/villages/etc. to find their heights, there is a very good probability that you can derive the average height of a Chinese person. If you analyze 80 random games of Warhammer for number of kill points vs who wins, you have a very good probability of...

Deetwo
23-11-2009, 21:45
If you roll the dice 80 times and they come up box cars 20 times, there is a very good probability that there is an issue with the dice.

there is a very good probability that you can derive the average height of a Chinese person.

Or that it's a statistical anomaly, which will average out with a sample size of appropriate volume.

Meriwether
23-11-2009, 21:53
There were quite a few more samples than the 2 or 10 that you mentioned.

Of course there are -- the exaggeration was there to prove a point.

Using the fact that a difference was detected is not a credible argument that the sample size was adequate.

In my "Probability" lab, we often don't even start getting reasonable bell curves on 2d6 until we hit around 300 rolls. Indeed, some years the distribution looks downright Laplacian in the 300-400 roll range, and we need a thousand or more to correct it.
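For anyone who wants to see how slowly a 2d6 histogram settles, here is a minimal Python sketch. The function names (`two_d6_pmf`, `max_deviation`) and the fixed seed are assumptions made for this example, not anything taken from the lab described above. It simulates a number of rolls and reports the largest gap between the observed and the exact frequency of any total.

```python
import random
from collections import Counter
from fractions import Fraction

def two_d6_pmf():
    """Exact probability of each 2d6 total (2..12)."""
    counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
    return {total: Fraction(n, 36) for total, n in counts.items()}

def max_deviation(n_rolls, seed=0):
    """Largest absolute gap between simulated and exact frequencies."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    rolls = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))
    pmf = two_d6_pmf()
    return max(abs(rolls[t] / n_rolls - float(p)) for t, p in pmf.items())

if __name__ == "__main__":
    for n in (80, 300, 1000, 10000):
        print(n, round(max_deviation(n), 4))
```

Changing the seed gives a feel for how much the result wobbles from run to run at each sample size.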

I would argue rather vehemently that 40K games are rather more complex than 2d6 rolls. LI's insistence that the variables are "independent" and thus don't do anything to the confidence of the conclusion is entirely unconvincing. The statement that the sample size is adequate simply because it detected a difference is not only unconvincing, it's absolutely ludicrous.

qwertywraith
23-11-2009, 22:30
Polling after the presidential election is notoriously wrong because of the singular type of people who answer. Perhaps, taking 900 random samples from 5 or so forums (based on rough 100 of each type of game) would provide a true broad range of games & reporting?

Yet polling before the presidential election accurately predicts the results within a few percentage points based on about 1000 respondents. How can such a small sample size account for the entire population of America? The answer, random sampling tends to even out all variables of race, religion, age, gender, and party affiliation.
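The "few percentage points from about 1000 respondents" figure can be checked with the usual normal-approximation margin of error. This is a rough sketch that assumes a simple random sample and a worst-case proportion of 0.5; the function name `margin_of_error` is made up for the example.

```python
import math
from statistics import NormalDist

def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the normal-approximation interval for a proportion.

    Assumes a simple random sample of size n (an assumption, not a given).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(1000), 3))  # about 3 percentage points
print(round(margin_of_error(80), 3))    # about 11 percentage points
```

Under those assumptions 1000 respondents gives roughly plus or minus 3 points, while 80 gives roughly plus or minus 11.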

Of course there are -- the exaggeration was there to prove a point.

But it doesn't prove a point. 2 is a small number and obviously doesn't work well for random sampling. Your point about thousands of rolls for 2D6 is fine for what you do, but isn't necessary here. Where then do we draw the line? Well, 2 is obviously too few and several thousand is more than we need. With every report beyond the first the reliability of this report goes up.

Did you know if you randomly asked 30 people what their birthday was, odds are that 2 would be on the same day? It won't work always, but if it works 95% of the time we can make VERY strong conclusions.
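The birthday figure is easy to work out directly. A minimal sketch, assuming 365 equally likely birthdays and no leap years (the function name is just for illustration):

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

print(round(shared_birthday_probability(30), 3))
```

With 30 people the chance of at least one shared birthday comes out at about 0.71 under these assumptions.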

I would argue rather vehemently that 40K games are rather more complex than 2d6 rolls. LI's insistence that the variables are "independent" and thus don't do anything to the confidence of the conclusion is entirely unconvincing. The statement that the sample size is adequate simply because it detected a difference is not only unconvincing, it's absolutely ludicrous.

People are also much more complex than 2D6 rolls, yet experiments and survey based studies are done all the time with sample sizes much smaller than this which draw conclusive, reliable, and repeatable results. More is better, but more is not necessary, especially if the results are repeatable. I would also note that most psychological surveys are done on limited pools (people at university), yet have far reaching consequences. As we can see from THIS THREAD, people of all ages, professions, and nationalities post here. Perhaps Warseer attracts a certain kind of person (dun dun dun) but if that's the case then an analysis of other forums' battle reports, and battle reports from a tournament would strengthen Lord I's hypothesis.

As for the mission variables, controlling them would improve the strength of this analysis but it still isn't necessary, as aberrations like a 400 point game involving 10 and 12 KP per side would be evened out by the analysis of 80 games. Now perhaps a future experiment or study could analyse the advantages and disadvantages of the various missions/deployment types at individual points levels. Who knows, perhaps armies with a lot of KP are not disadvantaged at 1000 points, or 2500 points.

I'm sure that Lord I can get some tournament results in the near future which will be useful. Maybe some people who have GW stores nearby can go there and record the relevant information from the 40K game night for a few weeks.

Bunnahabhain
23-11-2009, 22:41
Yet polling before the presidential election accurately predicts the results within a few percentage points based on about 1000 respondents. How can such a small sample size account for the entire population of America? The answer, random sampling tends to even out all variables of race, religion, age, gender, and party affiliation.

You think serious political polling uses random samples? No. Any assertion that it does discredits the rest of your statements

It uses semi random sampling, within defined groups, who are partly self selecting, and partly selected by the pollers, with intensive analysis to ensure as far as possible, the responses are weighted and biased to reflect the population, and the various groups and factions within it.
Billions of dollars, and a vast amount of high calibre intellectual effort go into this.

We're not quite on that scale...

Meriwether
23-11-2009, 23:38
Yet polling before the presidential election accurately predicts the results within a few percentage points based on about 1000 respondents. How can such a small sample size account for the entire population of America? The answer, random sampling tends to even out all variables of race, religion, age, gender, and party affiliation.

As Bunnahabhain stated, all this really does is illustrate that you don't really know what you're talking about. (Or at the very least, that you are willing to make statements founded on nothing but confident ignorance and try to pass them off as truth). Sloppy, man.

(One of the largest polling companies in the world -- Harris Interactive -- is right in my back yard, and many a friend has worked there. They employ a whole hell of a lot of people to ensure that those polls are *not* random, and that they are *not* leaving an evening out of the variables to chance.)

Where then do we draw the line?

That's not a trivial, nor a rhetorical question. Work would have to be done to determine both validity and reliability of the measurements taken -- a lot of work that hasn't been done.

Did you know...

Yes.

but if that's the case then an analysis of other forums' battle reports, and battle reports from a tournament would strengthen Lord I's hypothesis.

Bah. They would strengthen his *methodology*. We wouldn't know what they would do to his hypothesis until we analyzed them.

That's a big part of my point... What he did was a good start, but it's fundamentally flawed in a whole heaping lot of ways, and a lot would need to be done to iron out all the nitty gritties before a scientific conclusion could be made on the matter.

qwertywraith
23-11-2009, 23:38
You think serious political polling uses random samples? No. Any assertion that it does discredits the rest of your statements

It uses semi random sampling, within defined groups, who are partly self selecting, and partly selected by the pollers, with intensive analysis to ensure as far as possible, the responses are weighted and biased to reflect the population, and the various groups and factions within it.
Billions of dollars, and a vast amount of high calibre intellectual effort go into this.

We're not quite on that scale...

For modern political polls that's true, but random sampling of even small numbers of people works well enough for scientific results. The "correcting measures" of modern polling companies (which take into account race etc) may help correct for variables in small samples, but still are not necessary for largely accurate results provided the sample size is large enough and truly random. The first Gallup poll used random sampling successfully with only a . While the Literary Digest polled millions of people and incorrectly predicted the landslide defeat of FDR (largely because most of FDRs supporters didn't have telephones, so the random sampling was flawed).

And I maintain that small sample sizes are all that are necessary for most scientific studies performed today. Yes, larger studies are done and require much more money, but often the simplest methods are the best. As for the people involved, they're smart, but also well trained. Lord I seems to have a handle on this, and even if it's flawed, it's a start.

As for me being wrong about political polls. That is probably the case, and I am willing to concede the point, but it's illogical to believe that 1 error means everything is wrong. That does not follow, but I have seen many statements on warseer to the effect of "this is wrong, so everything else you say is also wrong." Actually, usually it's phrased more condescendingly. In any case, people can be wrong about one thing, or make a mistake, and still have "credit".

(One of the largest polling companies in the world -- Harris Interactive -- is right in my back yard, and many a friend has worked there. They employ a whole hell of a lot of people to ensure that those polls are *not* random, and that they are *not* leaving an evening out of the variables to chance.)

But you didn't work there?

Political polling has billions of dollars because it is a billion dollar industry, and every polling company uses different methods. Many of these methods are not scientific, and whenever you manipulate raw data to try and get the result you want you threaten the validity of your results. I'm not saying it doesn't work, but it isn't scientific.

That's not a trivial, nor a rhetorical question. Work would have to be done to determine both validity and reliability of the measurements taken -- a lot of work that hasn't been done.

You're right, it isn't rhetorical. The answer is: a sample size large enough to provide repeatable results 95% of the time. I do not believe the sample size need be very large. As for the other variables, I don't think they are as big a problem as you think. Yes game size, # of objectives in seize ground, etc. are all variables that it would be nice to control for, but I don't think for a quick and dirty archive based research project that they are skewing the data too much. Maybe it skews it beyond 95% repeatability, but Lord I may not be very far from the mark. You may be right about the reliability of the measurements (as inaccurate or even false reports may be posted). As for validity, the data does measure win/loss ratios based on the disparity of KP. Other variables may be present, but this is the primary variable being analyzed.

Lord Inquisitor
23-11-2009, 23:49
If I've only ever met two Chinese people and they're both very tall, can I then claim that that's an adequate sample size for determining a difference between Chinese people and American people, because I did detect a difference?
But have you done a statistical analysis on this?

I think we've crossed wires somewhere here, it is a matter of power. If you have a statistically significant result (assuming your test meets all of the appropriate statistical assumptions, which I freely admit mine doesn't), then you have had sufficient power to detect a difference. By definition! Otherwise what's the point in doing the analysis?

If your sample size is two, you are not going to have enough power to detect a difference.

The power calculations (at least post-hoc) only come into play if you do NOT have a statistically significant result - either this is because there really isn't any difference, or because you didn't have enough power to detect the difference (due to insufficient sample size, too much variation or too small a minimum detectable difference).

I really think you've misunderstood me here somewhere. I'm talking about statistical power, not extrapolating wildly from a small sample size.
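To make the power argument concrete, here is a rough Python sketch of the power of an exact two-sided binomial test of "wins and losses are equally likely". It is an illustration only: the true win rate of 0.3 is a made-up value, draws are ignored, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_critical_value(n, alpha=0.05):
    """Largest k with 2 * P(X <= k | p = 0.5) <= alpha, or None if there is none."""
    cumulative, critical = 0.0, None
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, 0.5)
        if 2 * cumulative > alpha:
            break
        critical = k
    return critical

def power(n, true_p, alpha=0.05):
    """Chance of rejecting the 50:50 hypothesis when the real win rate is true_p."""
    k_lo = lower_critical_value(n, alpha)
    if k_lo is None:
        return 0.0  # no outcome is extreme enough to reject at this alpha
    k_hi = n - k_lo
    return sum(binom_pmf(k, n, true_p) for k in range(n + 1) if k <= k_lo or k >= k_hi)

# true_p = 0.3 is an assumed effect size, chosen only for illustration
for n in (2, 10, 35, 100):
    print(n, round(power(n, 0.3), 3))
```

With n = 2 there is no result extreme enough to reject at alpha = 0.05, so the power is exactly zero, which is the sense in which a sample of two cannot detect a difference.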

I am quite literally stunned.
Literally? I think you need to stop facepalming yourself so hard :D;)

I don't actually have a gaming group, I play almost exclusively tournament games.

I'm saying that I've gone up against plenty of durable low KP armies and NOT felt the pain regardless.. As I said, this has been my experience so far in 5th edition.
Extremely low KP armies are just far easier to Wipeout, they have significantly lower firepower generally and taking out even one unit makes a massive dent in their effectiveness, especially if it's their "deathstar".

But then again, maybe it's just a different environment altogether.
Well fair enough then. I too play almost exclusively at tournies these days, but I still see the KP-differential issue cropping up from time to time. Of course, at tournies any competitive player worth his salt has optimised his army for KP missions, which is something we agree on.

Using the fact that a difference was detected is not a credible argument that the sample size was adequate.
A significant difference. I'll see if I can dig out my stats textbook when I get home and quote chapter and verse to try and persuade you on this. Remember that achieving a significant result fundamentally requires a big enough sample size to see it!

Meriwether
24-11-2009, 00:03
I have seen many statements on warseer to the effect of "this is wrong, so everything else you say is also wrong."

Our contributions here have been of the "the study is flawed, so the conclusion does not follow from the data (although it might still be right)" variety. We haven't been telling anyone to shut the hell up...

...but on the subject of credibility, making blatantly false statements *does* bring credibility into question, and will make people much less likely to take any of your statements at anything even vaguely approaching face value. We all make claims that are basically impossible to verify (like for example our identities, professions, etc.), and so a certain level of trust is necessary to have reasonable conversations. Blatantly false statements damage that trust, damage the credibility of those who make them, and make reasonable dialogue more difficult.

But have you done a statistical analysis on this?

The point was very exaggerated, LI.

Your statistics aren't sound if your methodology for collecting the data isn't sound. ...even if your math is right.

And, at least as far as I saw from your posts here, I haven't seen that you have demonstrated statistical significance on the 80 samples. That doesn't mean that you are "extrapolating wildly", it means you are expressing unearned confidence in results.

Literally? I think you need to stop facepalming yourself so hard :D;)

I actually did sit there gaping like a fish for a good five seconds or so, then re-read what you had said several times, before typing "I'm stunned".

RCgothic
24-11-2009, 02:23
Ok, so moving away from the statistical analysis for a moment, we have the data:

Larger army, W/L/D

In seize: 7/7/4
In Capture and Control: 7/9/8
In Annihilation: 11/24/3

In each case the expected value is 50:50 win/lose, with an unknown quantity of draws. Just by eyeballing the data, there is no statistically significant deviation from perfect balance for Seize or C&C within this sample. (ie no good reason to assume anything other than perfect balance).

The questions that remain then:

Is the Annihilation result, 11/24/3, statistically significant when we were expecting an equal number of wins/losses in a perfectly balanced system?
Yes, it shows very strong significance.

How confident are we in this result?
Reasonably. It agrees with theory, but we don't know if the sample size is large enough, or suitably randomly selected. The sample size can be pretty low if you're prepared to accept +/-10% with 80% confidence. Assuming perfectly random sampling and independence of other variables of course.
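Rather than eyeballing, the three rows can be put through an exact two-sided binomial test against a 50:50 split. This is a minimal sketch, not the analysis used earlier in the thread: it assumes draws can simply be dropped and that games are independent, and the function name is made up for the example.

```python
from math import comb

def two_sided_binomial_p(wins, losses):
    """Exact two-sided p-value for H0: a decisive game is a 50:50 coin flip."""
    n = wins + losses
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)  # symmetric null, so double the smaller tail

# Larger-army wins and losses from the figures above; draws are dropped (an assumption).
results = {
    "Seize Ground": (7, 7),
    "Capture and Control": (7, 9),
    "Annihilation": (11, 24),
}
for mission, (wins, losses) in results.items():
    print(mission, round(two_sided_binomial_p(wins, losses), 3))
```

Under those assumptions Seize Ground and Capture and Control come out far above 0.05, while Annihilation comes out at roughly p = 0.04.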

Gornak
24-11-2009, 05:51
Interesting approach, Lord Inquisitor. Increasing the sample size could make it very interesting - perhaps a team of volunteers (ie us), each collecting data from a number of websites (or other sources)?

I agree that the point you make appears solid. One curious aspect of the data not yet remarked upon is the distribution of missions:

SG - 18
C&C - 24
Ann - 38

If you like, a chi-squared test assuming an equal number of each missions should arise by chance returns p = 0.019.
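For anyone wanting to reproduce that figure, a short sketch of the goodness-of-fit calculation follows. It assumes each mission should turn up equally often; with three categories there are two degrees of freedom, where the chi-squared upper tail has the closed form exp(-x/2), so no stats library is needed.

```python
import math

def chi_square_equal_split(counts):
    """Goodness-of-fit statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

missions = {"SG": 18, "C&C": 24, "Ann": 38}
stat = chi_square_equal_split(list(missions.values()))

# 3 categories -> 2 degrees of freedom, where P(X > x) = exp(-x / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 2), round(p_value, 3))
```

This gives a statistic of 7.9 and p of about 0.019, matching the value quoted above.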

Whether the discrepancy is a reporting bias, a sampling bias, or even a gaming bias has yet to be determined, but if your method of random sampling hasn't excluded more SG and C&C missions than Annihilation, why are more people playing or reporting this mission than the others?

I would also be interested to see a more detailed breakdown of 'larger armies'; one more unit or three times as many? I would expect differences to become more pronounced at higher discrepancies, but maybe this is too complicated for the amount of data that we would be able to collect.

Damn! I've been lurking for years, and now I've blown it! :)

CthulhuDalek
24-11-2009, 06:19
Congratulations Gornak on your first post!

You also make a really good point.

I think it might be useful to also set a standard points level and use a handful of armies that can be represented well in the analysis (My candidates would be 5th ed themed armies, Guard, SM, SW, Orks, and even Chaos), also considering that those armies currently can provide the most diverse types of lists, imho, as well as that they include KP and Objectives within the ideology of their codex's design.

Another thing I would suggest if further studies are done, in addition to making sure that there are an equal number of reports chosen, that a second set of lists and games should be done where players discount the idea of killpoints OR objectives in list design.

Xelloss
24-11-2009, 12:53
Whether the discrepancy is a reporting bias, a sampling bias, or even a gaming bias has yet to be determined, but if your method of random sampling hasn't excluded more SG and C&C missions than Annihilation, why are more people playing or reporting this mission than the others?
A common thing in my GWLGS is players not bothering to roll dice and directly playing annihilation (but there are a lot of young players there, maybe it's not the norm).

MVBrandt
25-11-2009, 14:16
The fact that so many people are so tied to a stubborn position that they'll explain away the stats presented as a "statistical anomaly" is kinda silly. Occam's Razor.

Deetwo
25-11-2009, 14:27
The fact that so many people are so tied to a stubborn position that they'll explain away the stats presented as a "statistical anomaly" is kinda silly. Occam's Razor.

Actually, the possibility of a statistical anomaly.
The conclusion might be right for all we know, but the fact is that we don't know for sure... And even then it might also only be right in a very specific context, even though the purpose is to make a blanket statement.

MVBrandt
25-11-2009, 14:32
I will again reiterate: Occam's Razor.
When someone takes the time to go through 80 data samples, and come to a fairly legitimate conclusion based on those samples, running it off as a statistical anomaly is foolish at worst, and more likely rude.

If you want to counter analyze the stats, or come up with an analysis of 80 OTHER test samples, go for it. Just don't crash on someone else's decent effort using a fairly sizable sample realistically speaking.

Again, harkening back to someone's example of rolling a pair of dice 80 times, and rolling boxcars 20 of those times. The truth is those dice are probably bad dice ... something is wrong with them weight-wise. Going "OR IT'S AN ANOMALYZZZZZ" is intended to either blow off a real argument, or is just plain stupid.

Hellooo, Gennaro. You're at a DINOSAUR PARK, you're hearing BIG THUDS ... and yet you think maybe it's the power trying to come back on ...

Deetwo
25-11-2009, 16:38
When someone takes the time to go through 80 data samples, and come to a fairly legitimate conclusion based on those samples, running it off as a statistical anomaly is foolish at worst, and more likely rude.

And simply because somebody took the effort makes the results valid? :D
It's the legitimacy of reaching a conclusion (of any confidence) based on a fairly small sample size that is under question here.

Again, harkening back to someone's example of rolling a pair of dice 80 times, and rolling boxcars 20 of those times. The truth is those dice are probably bad dice ... something is wrong with them weight-wise.

Rolling dice 80 times is nowhere near enough to produce any realistic averages.

If you want to counter analyze the stats, or come up with an analysis of 80 OTHER test samples, go for it. Just don't crash on someone else's decent effort using a fairly sizable sample realistically speaking.

Looks like I might have to do just that at some point... I AM actually very interested in concrete proof in this matter.

viking657
25-11-2009, 16:58
It's all very interesting, but all I know is that when I play my Guard on kill points, unless I wipe out the other small elite army I can't win, even more so when it's Dawn of War because I often lose a turn of heavy shooting.

Meriwether
25-11-2009, 17:29
I will again reiterate: Occam's Razor.
When someone takes the time to go through 80 data samples, and come to a fairly legitimate conclusion based on those samples, running it off as a statistical anomaly is foolish at worst, and more likely rude.

You clearly have no experience with peer review. Welcome to grown-up science, kid.

Skyth
25-11-2009, 17:41
Rolling dice 80 times is nowhere near enough to produce any realistic averages.
.

Sure it is...Especially when what you get by those rolls is WAY off of what you should (20 results of a 1 in 36 chance out of 80 rolls).

If you are looking at the number of results from 2-5, 6-8, and 9-12 to see if it's balanced, 80 rolls should be sufficient to see if there's a pattern.
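The boxcars examples can be put in numbers with an exact binomial tail. This is a small sketch assuming fair dice (boxcars with probability 1/36 per roll) and independent rolls; the function name is illustrative only.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_boxcars = 1 / 36  # double six on fair dice

# 3 or more boxcars in 10 rolls, and 20 or more in 80 rolls
print(binom_upper_tail(3, 10, p_boxcars))
print(binom_upper_tail(20, 80, p_boxcars))
```

The first comes out at a little over 0.002 and the second is smaller than one in ten billion, so under these assumptions neither result would be expected from fair dice.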

CthulhuDalek
25-11-2009, 18:37
Everyone should take another look at Gornak's point.

Post #91.

*Edit* In a way it could imply that there are more people posting how they won in killpoints with their elite armies, because it was such an unusual occurrence, or "strange feat" that they'd post it out of incredulity! Or the opposite, that they'd only post their losses when it was a rare occurrence, etc.

Lord Inquisitor
25-11-2009, 18:54
Your statistics aren't sound if your methodology for collecting the data isn't sound. ...even if your math is right.
You're moving the goalposts here Meri.

Assuming that there is no systematic bias in my samples, my statement is entirely correct. Deetwo has been criticising my sample size - again, insufficient sample size only increases the chance of a Type II error, which is why you typically do a post-hoc power analysis after getting a non-significant result.

That's all I said, I really don't understand your horror. Now, if you wish to pick apart my methodology that's entirely another matter, but I was responding to the "sample size is insufficient to draw conclusions" argument. If you have a significant result, then the sample size has been taken into account in the calculation, and it was, by definition, sufficient for that given alpha level. This is a fundamental property of parametric statistical analyses.

And, at least as far as I saw from your posts here, I haven't seen that you have demonstrated statistical significance on the 80 samples. That doesn't mean that you are "extrapolating wildly", it means you are expressing unearned confidence in results.
You have not explained why there is an issue in any kind of logical way.

"Uncontrolled variables/sources of error"
Providing there is no systematic bias these only serve to increase the variance of the system - which increases the Type II error risk, but again, if you get a statistically significant result this should be valid.

Let me give you an example from my stats book, "Biostatistical analysis"
Example 10.1 talks about an experiment with nineteen pigs divided into four groups fed differing food, and trying to determine if there is a difference in the average weight between the four groups.

Now, in the discussion on power:

p.192 "One may also strive to increase power by decreasing S^2, which may be possible by using experimental subjects that are more homogeneous. For instance, if the nineteen pigs used in Example 10.1 were not all of the same age and breed, and not all maintained at the same temperature, there might well be more weight variability within the four dietary groups than if all nineteen were the same in all respects except diet."

The point I wish to make is that, while controlling for these variables increases your power, it isn't required to make a valid conclusion. If you can get a significant result, this is entirely valid without controlling for extraneous variables.

Now, if there are systematic biases or non-independent variables, then this IS a serious problem. The only one anyone has identified has been Gornak (which I will respond to below).

"Lack of a control"

Fair point. I will endeavour to add a control group to the data, although I don't think this invalidates the study as it compares each group to the expected value.

"Causal vs correlative data"

Fine, it makes no difference to my argument if we phrase it as a correlation rather than a causal effect. Nevertheless, I have made a causal prediction and the data (if correlative) matches that prediction.

"Insufficient sample size"

Only increases Type II error, is not a reason by itself to invalidate a significant result.

Now, I want to convince you, so what I'll do is reformat the data, add a control, remove any potential biases (i.e. any non-random mission reports, see below) and redo the analyses as an ANOVA or a series of T-tests, either of which I should be able to do a power analysis on. Would that help to persuade you?

Interesting approach, Lord Inquisitor. Increasing the sample size could make it very interesting - perhaps a team of volunteers (ie us), each collecting data from a number of websites (or other sources)?
Yes, if anyone would be willing to try and collect information from other websites, that would help increase the sample size. What I was thinking was taking the data in the same manner that I did from other forums (Dakkadakka or wherever) and collecting ~5 pages from each (or as many as you can be bothered). I'll collate the data and use each forum as a data point for an ANOVA analysis. Although I would say we'd need at least 5 forums.

I agree that the point you make appears solid. One curious aspect of the data not yet remarked upon is the distribution of missions:

SG - 18
C&C - 24
Ann - 38

If you like, a chi-squared test assuming an equal number of each missions should arise by chance returns p = 0.019.

Whether the discrepancy is a reporting bias, a sampling bias, or even a gaming bias has yet to be determined, but if your method of random sampling hasn't excluded more SG and C&C missions than Annihilation, why are more people playing or reporting this mission than the others?
You are right! I noted the disproportionate number of annihilation missions, but I hadn't considered the implications - which of course, really throws a spanner in the works.

Bizarre, of course, because one presumes that the mission is selected by dice roll. It seems unlikely that this is always the case, however! What I'll do is go through and toss out any data where the report doesn't make it clear, either explicitly or implicitly, that the mission was determined by dice roll.

I would also be interested to see a more detailed breakdown of 'larger armies'; one more unit or three times as many? I would expect differences to become more pronounced at higher discrepancies, but maybe this is too complicated for the amount of data that we would be able to collect.
This occurred to me, and that data is in the original data file I collected. However, I don't have a clue how to analyse a binomial (win/loss) against a discrete variable (unit count differential). If anyone can do such an analysis on the original data, I'd love to see it!
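One standard option for a win/loss outcome against a numeric predictor is logistic regression. Below is a minimal sketch fitted by Newton's method in plain Python. The differentials and outcomes are made-up placeholder numbers, NOT the thread's data, and draws would have to be dropped or handled separately.

```python
from math import exp

# HYPOTHETICAL placeholder data: unit-count differential (as a fraction)
# and whether the smaller army won (1) or lost (0). Not real results.
differentials = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.40]
smaller_won = [0, 1, 0, 0, 1, 0, 1, 1, 0, 1]

def fit_logistic(xs, ys, iterations=25):
    """Fit P(win) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

intercept, slope = fit_logistic(differentials, smaller_won)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

A positive slope would mean the smaller army's chance of winning rises with the differential. This sketch gives no standard errors or p-values, and it will not converge if the wins and losses are perfectly separated by differential.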

I can provide some raw data to eyeball:

If we exclude all games that have less than a 10% differential in unit count:

Games W/L/D with regard to the smaller player:
Seize Ground: 3/6/4
C & Control: 8/6/8
Annihilation: 18/9/3

If we exclude all games that have less than a 20% differential in unit count:

Games W/L/D with regard to the smaller player:
Seize Ground: 2/2/1
C & Control: 7/2/7
Annihilation: 11/4/2

If we exclude all games that have less than a 30% differential in unit count:

Games W/L/D with regard to the smaller player:
Seize Ground: 0/2/1
C & Control: 5/2/4
Annihilation: 6/2/0

Now, I'm not going to draw any sweeping conclusions from this, as the sample size does get extremely small, but there's the expected trend for annihilation missions (the proportion of games won increases as the differential increases). Interestingly, C&C appears to have a consistent advantage for the larger army, but obviously the sample size is too small to know for sure.
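To make the eyeballing a little easier, here is a minimal sketch that turns the W/L/D figures above into the smaller player's share of decisive games at each cut-off. Dropping draws is an assumption made here for simplicity; the numbers themselves are copied straight from the lists above.

```python
# W/L/D for the smaller player, keyed by minimum unit-count differential (%).
results = {
    "Seize Ground": {10: (3, 6, 4), 20: (2, 2, 1), 30: (0, 2, 1)},
    "C & Control": {10: (8, 6, 8), 20: (7, 2, 7), 30: (5, 2, 4)},
    "Annihilation": {10: (18, 9, 3), 20: (11, 4, 2), 30: (6, 2, 0)},
}

def decisive_win_share(wins, losses, draws):
    """Share of non-drawn games won by the smaller player (draws ignored)."""
    decisive = wins + losses
    return wins / decisive if decisive else None

shares = {
    mission: {cutoff: decisive_win_share(*wld) for cutoff, wld in by_cutoff.items()}
    for mission, by_cutoff in results.items()
}

for mission, by_cutoff in shares.items():
    row = ", ".join(f">={cutoff}%: {share:.2f}" for cutoff, share in by_cutoff.items())
    print(f"{mission}: {row}")
```

Each cut-off includes the games from the stricter cut-offs, so the three columns are not independent samples.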

Damn! I've been lurking for years, and now I've blown it! :)
Ha ha! Well it was the best 'first post' I've seen in a long time!

A common thing in my GWLGS is players not bothering to roll dice and directly playing annihilation (but there are a lot of young players there, so maybe it's not the norm).
This seems a very reasonable hypothesis.

Skyth
25-11-2009, 20:16
What about going second in objective missions with a larger army?

Deetwo
25-11-2009, 21:00
I went ahead and did a bit of reading through battle reports as well. Got numbers for 40 games, using the same conditions you did.

Seize Ground:
Smaller win - 4
Larger win - 3
Draw - 3

Capture and Control:
Smaller win - 3
Larger win - 3
Draw - 5

Annihilation:
Smaller win - 6
Larger win - 13
Draw - 0

Trust them or not :D
I'll just say I did this in good faith anyway.

It's kind of weird how many more Annihilation games there seem to be...
Must be a factor that is not accounted for, considering all three missions have the same chance of coming up.
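For what it's worth, a quick exact sign test on the Annihilation numbers above (6 smaller wins against 13 larger wins) can be sketched as follows. It assumes the games are independent and that a 50/50 split is the right null hypothesis, neither of which is guaranteed for hand-picked battle reports.

```python
from math import comb

# Annihilation results from the tally above.
smaller_wins, larger_wins = 6, 13
decisive_games = smaller_wins + larger_wins

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n trials against p = 0.5."""
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = two_sided_binomial_p(smaller_wins, decisive_games)
print(f"p = {p_value:.3f}")  # p = 0.167
```

With only 19 decisive games, a 6 to 13 split is not significantly different from a coin flip at the usual 5% level.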

Lord Inquisitor
25-11-2009, 21:53
That's ... surprising, to say the least, if that's unbiased data.

Can you post the source data as a tab-delimited text file (assuming you compiled it as an Excel sheet), and can you give us info on where you collected this data from?

Zeroth
Meriwether
25-11-2009, 22:11
You're moving the goalposts here Meri.

Well, yes and no. I have many, many objections to this "study". My point here was that even if your math is right and it is showing some kind of statistically significant variation, you still can't claim that you've shown anything relevant until you've looked at all of the other possible flaws and accounted for them.

That's all I said, I really don't understand your horror.

I realize that. That's part of the horror.

You have not explained why there is an issue in any kind of logical way.

No, not strongly. I was naming some objections without fully explaining them.

You either have, or do not know whether or not you have, all kinds of errors (what you appear to be calling 'biases', although I would call a bias a 'systematic error', whereas you may also have some number of 'random errors' as well) in this study. Case in point:

You are right! I noted the disproportionate number of annihilation missions, but I hadn't considered the implications - which of course, really throws a spanner in the works.

It has already been mentioned (casually by myself and others) that there might be particular reasons why certain battle reports were posted. You have no idea how representative your data set is of 'typical' 40K games. No idea whatsoever if you are seeing the ordinary or the fantastic. All that can be said of these games is that people thought them of sufficient note to post.

Even controlling your data set to games where the mission was clearly chosen by a dice roll, while a step in the right direction, isn't nearly enough to clean up the potential biases in your data sets.

I am quite certain that, with a little effort, you could find at least a dozen things like this that will strongly impact the confidence with which you can make conclusions.

Deetwo
25-11-2009, 22:33
That's ... surprising, to say the least, if that's unbiased data.

Can you post the source data as a tab-delimited text file (assuming you compiled it as an Excel sheet), and can you give us info on where you collected this data from?

Actually, I just took a piece of paper and a pen and went through pages 11-14 or so of the battle reports :)
I just wanted to see if the numbers would start to shift if I went a bit further from where you stopped.

There were definite trends to be seen though. A LOT of the reports I went through had Daemons or IG in them.
And the disproportionate volume of annihilation seems to really suggest we aren't really looking at a realistic representation at all.