Another Statistical Measure
Another Statistical Measure
Hi,
I've been thinking about the statistical measure in Wesnoth that displays actual damage vs. expected damage overall (taken and delivered). Generally, this is a good measure; however, if it were used for a real statistical study of some kind (I'm not being SUPER serious about that, just emphasizing that multiple measures are useful for a good overall analysis), I don't think it would pass. The main reason is that it gives no indication of where the damage is going or how effective it is. The simplest example is healing: healing units skew the data by producing total damage values that are not representative of the gold spent on units (if healing did not exist, then this might work).
My thought is that it would be interesting to have a second measure that displays information on expected kills. I'm not exactly positive how the math would work on this, but here's the idea:
1. If a unit could not die because the maximum damage of the attacking unit is unable to kill it, no information is recorded.
2. If a unit is generally expected to die (say 90%) and does, then a record of that expectation and fulfillment is made.
3. If a unit is generally expected to die and does not (failure on a 90% death chance), this would shift the graph sharply in the other direction, but this is where my statistical knowledge gets fuzzy on how to do this.
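One way the math for the three rules above could work is to treat every attack that has a nonzero chance to kill as a single weighted coin flip: the expected number of kills is the sum of the kill chances, and the spread is the sum of p * (1 - p). The sketch below is only an illustration under that assumption; `KillLuck`, `record`, `surplus`, and `z_score` are hypothetical names and not part of Wesnoth, and the kill chance per attack is assumed to be supplied by the game's own damage calculation.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class KillLuck:
    """Hypothetical tracker of expected vs. actual kills for one side."""
    expected: float = 0.0  # sum of kill chances over recorded attacks
    actual: int = 0        # kills that actually happened
    variance: float = 0.0  # sum of p * (1 - p) over recorded attacks

    def record(self, kill_chance: float, killed: bool) -> None:
        # Rule 1: if the defender could not die, record nothing.
        if kill_chance <= 0.0:
            return
        # Rules 2 and 3: record both the expectation and the outcome.
        self.expected += kill_chance
        self.actual += int(killed)
        self.variance += kill_chance * (1.0 - kill_chance)

    @property
    def surplus(self) -> float:
        """Kills above (positive) or below (negative) expectation."""
        return self.actual - self.expected

    @property
    def z_score(self) -> float:
        """Surplus measured in standard deviations (0 if nothing recorded)."""
        return self.surplus / sqrt(self.variance) if self.variance > 0 else 0.0


luck = KillLuck()
luck.record(0.0, False)  # ignored: the unit could not have died
luck.record(0.9, True)   # expected kill that happened: +0.1
luck.record(0.9, False)  # expected kill that failed: -0.9
luck.record(0.1, True)   # long-shot kill that happened: +0.9
print(round(luck.surplus, 3), round(luck.z_score, 3))
```

With this bookkeeping, a failed 90% kill moves the running total by -0.9 while a successful one moves it by only +0.1, which is the "sharp shift" described in point 3.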
Why this matters: Whereas the overall stats of a game can even out in the long run, huge strokes of luck in certain situations (killing a unit you didn't expect to, or failing to kill one you expected to) affect considerably more than just the immediate damage statistics. Failing to kill a unit that was reasonably expected to die, for example, can profoundly affect a battle: positioning is off, other units don't have access, the unit you thought was safe is now stranded, etc., and moreover, the attacked unit may even get away, heal up, and come back again. The opposite is true as well (taking out a unit I never could have imagined killing and then scoring more convenient hits because of it). None of this is seriously captured in the current statistical analysis: the proposed method would give a more global "picture" of what is actually *happening*, which is a more precise (if still imperfect) overall assessment.
Re: Another Statistical Measure
I think it might be too confusing for mainline, but if add-ons were given easier access to damage statistics, that might be a possibility.

 Developer
 Posts: 489
 Joined: April 24th, 2016, 4:18 pm
Re: Another Statistical Measure
Sounds to me like an awful lot of work to test the RNG. Most likely the desire is to disprove the claims that it is flawed. Such an effort seems fruitless, and may even backfire. I'm not opposed to some UMC trying, but it should not go into mainline.
I forked real life and now I'm getting merge conflicts.
Re: Another Statistical Measure
Tad_Carlucci wrote (January 13th, 2020, 3:26 pm): "Sounds to me like an awful lot of work to test the RNG. Most likely the desire is to disprove the claims that it is flawed. Such an effort seems fruitless, and may even backfire. I'm not opposed to some UMC trying, but it should not go into mainline."
I'm not sure I understand your point. Whereas in Chess the matter of skill is unquestionable because randomness does not exist, there is always a random element in Wesnoth, and so unless that randomness is removed, the most accurate possible measure of the RNG statistics is important in order to know precisely what happened. I don't see why accurate measurements can "backfire" or really even why this needs to be explained, frankly.

Re: Another Statistical Measure
The issue is not technical, it's psychological. No amount of "proof" will convince someone who believes the RNG is flawed. Given the nature of randomness, your method is guaranteed to produce some highly skewed results which will be looked upon as "proof" the RNG is biased, even though such results are, in fact, a required part of the proof that it is not biased. That is the backfire ... your proof of non-bias will be taken as proof of bias.
I forked real life and now I'm getting merge conflicts.

 Posts: 435
 Joined: February 7th, 2015, 1:27 am
 Location: Germany
Re: Another Statistical Measure
Tad_Carlucci wrote (January 13th, 2020, 6:00 pm): "The issue is not technical, it's psychological. No amount of "proof" will convince someone who believes the RNG is flawed. Given the nature of randomness, your method is guaranteed to produce some highly skewed results which will be looked upon as "proof" the RNG is biased, even though such results are, in fact, a required part of the proof that it is not biased. That is the backfire ... your proof of non-bias will be taken as proof of bias."
I think Gidoza's suggestion wasn't another "RNG sux mimimi" or "RNG is dysfunctional" thread, but rather about the statistic comparing actual to expected hit RNG in a given game, which is a good (or not so much; that's the point of contention) indicator for multiplayer games of how much luck factored into that one specific match.
(I think it's shown under menu > statistics or with hotkey s)
Under bloodred skies  an old man sits 
In the ruins of Carthage  contemplating prophecy.

Re: Another Statistical Measure
The point is, at some point, someone is going to see something happen, calculate the odds of that happening as 1-in-10-million, and proceed to claim that it proves something is fishy with "luck" (aka the RNG) when, instead, it proves such longshot occurrences can happen, and this actually disproves one way the RNG might be broken.
Put another way, to ensure such longshots never occurred we would need to break the RNG so it would no longer be random.
I forked real life and now I'm getting merge conflicts.
 Pentarctagon
 Forum Administrator
 Posts: 4126
 Joined: March 22nd, 2009, 10:50 pm
 Location: Earth (occasionally)
Re: Another Statistical Measure
I mean, that already happens occasionally anyway with people complaining about a mage missing 6 attacks in a row or the like.
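To put a rough number on that kind of complaint, here is a small sketch that computes how likely a run of consecutive misses is. It assumes a fixed 70% chance to hit (the figure usually quoted for magical attacks in Wesnoth) and independent rolls; `prob_miss_streak` is a hypothetical helper written for this post, not anything from the game's code.

```python
def prob_miss_streak(n_attacks: int, streak: int, hit_chance: float) -> float:
    """Chance of seeing at least one run of `streak` consecutive misses
    somewhere within `n_attacks` independent attacks."""
    miss = 1.0 - hit_chance
    # state[i]: probability that the current run of misses has length i
    # and no full streak has been seen so far.
    state = [1.0] + [0.0] * (streak - 1)
    seen = 0.0
    for _ in range(n_attacks):
        nxt = [0.0] * streak
        nxt[0] = sum(state) * hit_chance      # a hit resets the run
        for i in range(streak - 1):
            nxt[i + 1] = state[i] * miss      # a miss extends the run
        seen += state[streak - 1] * miss      # this miss completes the streak
        state = nxt
    return seen


# Assumed 70% to-hit chance; the streak length of 6 is taken from the post above.
print(prob_miss_streak(6, 6, 0.7))     # exactly six attacks, all missing
print(prob_miss_streak(1000, 6, 0.7))  # at least one such run in 1000 attacks
```

Six misses in six specific attacks is rare (0.3 to the sixth power), but over hundreds or thousands of attacks across many games, a run like that showing up somewhere becomes far less surprising.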
99 little bugs in the code, 99 little bugs
take one down, patch it around
2,147,483,648 little bugs in the code
Re: Another Statistical Measure
I think you should record any nonzero chance of death (even 0.0001% or lower), as otherwise your data won’t show the difference (of luck? / skill?) when you kill a unit at a low percentage (even if that was a 0.1% chance of happening).
If results are recorded in brackets for every chance-to-kill percentage, you’ll be able to see, for instance, whether the failure rate is higher than normal in the 70% to 80% range.
The real problem is that it won’t record any information before units become vulnerable to dying, so you can get situations like a Peasant dying to a 1 HP Lich... basically, it won’t highlight the "details" without additional info.
There are strategies that use death to win; with this in consideration, the information can’t always be “accurate” if strategies or tactics contain a mix of saving and sacrificing units.
I think it may need other supporting/additional features, because it seems incomplete, though it would be nice to see this information somewhere.
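The bracket idea above could look something like the following sketch, which groups attacks by their chance to kill and compares the predicted kill rate with the observed one in each bracket. Everything here is hypothetical: `bracket_report` is an illustrative name, the chances are assumed to be given in percent (0 to 100), and the sample data is made up.

```python
from collections import defaultdict


def bracket_report(attacks, width=10):
    """Group (kill_chance_percent, killed) pairs into brackets of `width`
    percentage points and compare expected vs. actual kill rates."""
    buckets = defaultdict(lambda: [0, 0.0, 0])  # attacks, sum of chances, kills
    for chance, killed in attacks:
        if chance <= 0:
            continue  # a kill was impossible, so nothing is recorded
        # Lower edge of the bracket; a 100% chance falls into the top bracket.
        low = min(int(chance // width) * width, 100 - width)
        bucket = buckets[low]
        bucket[0] += 1
        bucket[1] += chance
        bucket[2] += int(killed)
    return {
        low: {
            "attacks": n,
            "expected_rate": total / n / 100.0,
            "actual_rate": kills / n,
        }
        for low, (n, total, kills) in sorted(buckets.items())
    }


# Made-up sample: (chance to kill in percent, whether the unit died).
sample = [(75, True), (72, False), (78, True), (79, False),
          (0, False), (0.1, True), (100, True)]
report = bracket_report(sample)
for low, row in report.items():
    print(f"{low}%+ bracket:", row)
```

Note that any nonzero chance is kept, however small, so a 0.1% kill still shows up in the lowest bracket.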
Re: Another Statistical Measure
Tad_Carlucci wrote (January 13th, 2020, 10:08 pm): "The point is, at some point, someone is going to see something happen, calculate the odds of that happening as 1-in-10-million, and proceed to claim that it proves something is fishy with "luck" (aka the RNG) when, instead, it proves such longshot occurrences can happen, and this actually disproves one way the RNG might be broken.
Put another way, to ensure such longshots never occurred we would need to break the RNG so it would no longer be random."
In seriousness, I don't understand your point. You can't claim 1-in-10-million chances and a broken RNG if you have more data. On the other hand, if a 1-in-10-million stroke of luck occurs in my favour, I at least have enough honesty as the victor to declare that if it had been a game of Chess in an otherwise similar situation, I would not have won.
Of course the stats are likely in most situations to balance out; of course small deviations will give many an excuse to cry foul. But since you bring it up, I can't blame anyone who would want a Chess-like Wesnoth without any probability at all.
Re: Another Statistical Measure
Mawmoocn wrote (January 14th, 2020, 11:43 am): "I think you should record any chance of death (0.0001% or lower but not 0) as your data won’t show the difference (of luck? / skill?) if you kill a unit at lower percentage (even if that was 0.1% chance of happening).
If there are recorded brackets for every (chance to kill) percentage, you’ll see whether it’s partially accurate if the failure rate is higher than normal for the 70% to 80% range.
The real problem is it won’t record any information before they’re vulnerable to die, so you can get situations like when a Peasant dies to a 1 HP Lich... basically it won’t highlight the "details" without additional info.
There are strategies that use death to win; with this in consideration, information can’t always be “accurate” if strategies or tactics contain a mix of saving and sacrificing units.
I think it may need other supporting/additional features, because it seems incomplete and it can be nice to see this information somewhere."
I appreciate this rebuttal, and you make a particularly good point with the intentional death strategy. In principle one could say that you want to measure a player's intentions, but measuring intentions is a little... difficult...