Luck in multiplayer

General feedback and discussion of the game.

Moderator: Forum Moderators

User avatar
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Luck in multiplayer

Post by iceiceice »

pyrophorus wrote:Curiously, tossing a coin is exactly the abstract model this test suite uses. And of course yes, applying the test suite to a large set of coin tosses can tell you if this coin is fair or not.
I think you misunderstand the purpose of the tests.

Probability theory is a mathematical discipline that was developed many hundreds of years ago. It is very well understood. If you want to explain to a child how to determine if a coin is fair... you can do very simple tests with a pen and a piece of paper, counting heads and tails. You can explain about the law of large numbers, and how the numbers must converge. Understanding the error is something that has to wait until high school. But this procedure has not been superseded by the tests at NIST. An adult who wanted to determine if a coin is fair would do essentially the same thing. A scientist who wanted to determine if their experimental results agree with the "null hypothesis" would do fundamentally the same thing.
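To make that concrete, the pen-and-paper procedure is basically this (a throwaway sketch, not any standardized test; the numbers are made up):

[code]
import random

def coin_check(flips):
    # Count heads and compare the deviation from n/2 to the standard
    # deviation of the head count for a fair coin, which is sqrt(n)/2.
    n = len(flips)
    heads = sum(flips)
    sigma = (n ** 0.5) / 2
    z = (heads - n / 2) / sigma   # how many standard deviations from "fair"
    return heads, z

# Simulate 100000 tosses of a fair coin.
# |z| much bigger than 3 would be strong evidence of bias.
flips = [random.randint(0, 1) for _ in range(100000)]
heads, z = coin_check(flips)
print(heads, "heads out of 100000, z =", round(z, 2))
[/code]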

Pseudorandomness is a mathematical discipline, like cryptography, that has only been conceived of and developed in the last 70 years or so. The concept of pseudorandomness is fundamentally distinct from the concept of randomness.

NSIT is a university in India. Although you have written NSIT every time, I'm going to assume that you are actually referring to the tests from the National Institute of Standards and Technology (NIST) in the United States.

http://csrc.nist.gov/groups/ST/toolkit/rng/index.html

These are merely a standard battery of tests, to screen candidate pseudorandom generators for suitability in a wide variety of applications. The website even contains an explicit note.

"These tests may be useful as a first step in determining whether or not a generator is suitable for a particular cryptographic application. However, no set of statistical tests can absolutely certify a generator as appropriate for usage in a particular application, i.e., statistical testing cannot serve as a substitute for cryptanalysis. The design and cryptanalysis of generators is outside the scope of this paper."

So this suggestion that we should use the tests to determine if a coin is fair... pretty much comes from nowhere. The tests at NIST aren't a revision or addendum to probability theory. Using the NIST tests to detect bias in a coin would be both inappropriate, and a bit like killing a fly with an atomic bomb.

If you want to properly understand pseudorandom generation, and the related issue of randomness extraction, for the purposes of using randomized algorithms, then you should do some online research or look in a textbook. I'm not going to sit here and try to write a survey of the (fascinating) field to type into the forums, as much as I like the subject.

Edit:
pyrophorus wrote: Cascading generators will not give you automatically more randomness. It's just incantation.
Here's the simplest justification I can give of the practice.

Suppose I have a generator G, which guarantees that, when given truly random input, it will generate output that no simple test can distinguish from truly random.

If I have a second such generator G2, and I seed G with a (properly seeded) instance of G2, and the result suddenly becomes flawed, in the sense that there is now a simple test that distinguishes the result from random... then in fact what I have is an algorithm to distinguish G2 from random. The algorithm is, take the output of G2, use it to seed a copy of G, then run the test. In reality, both generators are relatively simple, and such a test could surely be simplified much further. So it would mean that G2 is defective, contrary to assumptions.

This is basically a synopsis of an argument used in formal theories of pseudorandomness and cryptography. It is unsound to do this recursive seeding indefinitely, because the security parameter and error tolerance of the generator get worse each time you do it. But doing it even a few times is going to be fine. (Edit: assuming you start with a decent generator.)

Edit: Here's another way to say it.

Random number generators are algorithms whose purpose is to fool other algorithms. If the seeding generator is good enough to fool the combination of the other generator plus your algorithm, then cascading the generators is fine. If not, then it isn't. I think in our case cascading would work just fine. I don't have a formal proof in mind for this, but after thinking about it for a bit, that's my intuition. It's driven particularly by the fact that we reseed from the source generator so frequently -- reseeding at every attack means that barely anything is drawn from G on a single seed, so the pseudorandom properties of G2 dominate the results.
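If it helps, here is a toy sketch of the cascade I have in mind (not Wesnoth's actual code; G and G2 are just the names from above, and the parameters are made up): a master generator G2 hands out seeds, a fresh copy of G is seeded from it for every "attack", and only a handful of values are drawn per seed.

[code]
import random

g2 = random.Random(12345)       # G2: the "master" generator (toy stand-in)

def per_attack_generator():
    # Seed a fresh copy of G from G2. If this cascade were distinguishable
    # from random by some simple test, that same test (plus this seeding
    # step) would be a distinguisher against G2 itself.
    seed = g2.getrandbits(32)
    return random.Random(seed)  # G: the generator actually used for the attack

# Simulate a few "attacks": each gets a freshly seeded G and draws
# only a handful of values from it.
for _ in range(5):
    g = per_attack_generator()
    print([g.randint(1, 100) for _ in range(3)])
[/code]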

Even if it's informal in our case, I don't think it's fair to say it's just an unprincipled incantation. It's reasoning by analogy, and in some cases there really is rigorous justification.

Edit: Another thing. I'm not sitting here trying to prove formally that our actual random number generator system works, in the sense that it successfully fools a large and explicit collection of tests; that would be a waste of my time. I'm just doing basic reality checks to try to answer questions like "will frequent reseeding cause bias in this generator", because that's something you can do in an ad hoc fashion.

But if you are doubting whether the puzzle pieces actually fit together, they do... if you want some kind of justification for why /dev/random can be assumed to be nearly random, I think a historically accurate place to start is the leftover hash lemma. I don't know what implementation is actually used in /dev/random, or what the state of the art in practice is, but I think you can rest assured that the bits you draw from /dev/random are indistinguishable from random bits in an extremely strong sense, for a desktop computer operating in a normal environment. If you drew millions and millions of samples from /dev/random and made a histogram, and it wasn't very nearly uniform... I think that represents a security vulnerability that you should report to your Linux distribution.
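If anyone wants to run that reality check themselves, something along these lines would do it (just a sketch; it reads from /dev/urandom so it doesn't block, and the sample size is arbitrary):

[code]
# Histogram a few million bytes from the kernel's random device.
# Reading /dev/urandom rather than /dev/random so the read doesn't block.
with open("/dev/urandom", "rb") as f:
    data = f.read(4000000)

counts = [0] * 256
for b in data:
    counts[b] += 1

expected = len(data) / 256
worst = max(abs(c - expected) for c in counts)
print("expected per byte value:", expected, "worst deviation:", worst)
# For a uniform source the worst deviation should be on the order of a few
# hundred here, nowhere near the ~15600 expected count itself.
[/code]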

Edit:
gnombat wrote: True, any pseudorandom number generator is going to have only a finite number of possible sequences; it just seems that the current Wesnoth implementation is unnecessarily reducing the amount of entropy in the generator (and hence the number of possible sequences) by re-seeding it. The generator itself has 32 bits of state, so it could potentially generate 2^32 possible sequences, but every time it gets re-seeded this drops to 2^15.
Strictly speaking you are of course right, and we could use larger seeds. (Although, when you say that the entropy "drops", I think that's not quite right; it's more that it only ever has 15 bits of entropy. Using the generator doesn't increase the entropy. If someone else were secretly using the same generator and didn't tell you about it, that might increase the entropy from your perspective, if the number of times they call the generator is random.)
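The "at most 2^15 sequences" point is easy to see directly (a sketch using Python's built-in generator as a stand-in for ours):

[code]
import random

# With only a 15-bit seed there are at most 2**15 = 32768 possible output
# streams, no matter how much internal state the generator carries.
streams = set()
for seed in range(2 ** 15):
    g = random.Random(seed)
    streams.add(tuple(g.getrandbits(32) for _ in range(4)))
print(len(streams), "distinct 4-draw streams from", 2 ** 15, "possible seeds")
[/code]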

But I think RNG defects could only be an important issue if they affect the game state. The precise order of the hits and misses seems a bit like a cosmetic detail to me.

For instance, suppose Wesnoth were a game for little kids, and on the title screen there were little homes for elves and dwarves, appearing randomly, and if you click on them they say random things. If that's not fully randomly seeded... who cares?

From a purist's point of view the animation of the attack sequence is just cosmetic; the only thing that matters is the outcome... the HP of the parties. As long as that is correctly distributed, I wouldn't say it's a problem. The numbers shown in the "damage calculations" window should be accurate, ideally to negligible error. The order of hits and misses can matter when one party is a few hits from death or something, but even with this consideration I don't see that more than 2^15 seed values could be strictly necessary.

Edit: To be clear, nothing has been said in this thread that indicates that those numbers are inaccurate to non-negligible error. All I'm saying is, if there are identifiable defects, we should fix them, even if we think they are small enough to be imperceptible... because there can always be other problems we don't know about that magnify them, and it's quite difficult to test this stuff.
forceOfHabit
Posts: 8
Joined: May 5th, 2014, 7:16 pm

Re: Luck in multiplayer

Post by forceOfHabit »

iceiceice,

I agree, a tool for evaluating luck would be roughly as hard to write as an AI, and much of the logic would be relevant to an AI. It would be a little simpler because:
  • You don't have to deal with recruit/recall logic. There is some luck in the traits generated on recruitment, but realistically that luck factor would probably never be quantified.

  • You probably don't have to solve quite as complex a pathing problem, especially in a beta version. You will eventually have to consider pathing because if you want to evaluate how "lucky" a kill/non-kill was, you have to have some measure of the value of the opportunities enabled/denied.
I'm still thinking about how I might go about this. I think for a beta I will try something along the lines of looking only at single combat between two units. It's easy to construct the possible outcomes (in terms of hp remaining, effects (like poison or slow), xp after the encounter (especially relevant if the unit levels up or not), anything else?) and the probabilities of each outcome. What's harder is assigning a utility to each outcome: how does the sole survivor being a poisoned Elvish Archer with 6 HP for one team compare to the sole survivor being an Orc Assassin with 7 HP and 32/34 xp for the other?
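The enumeration step would look roughly like this (a throwaway sketch with made-up numbers: one-sided attacks only, no retaliation or specials, and it says nothing about the hard utility question above):

[code]
from itertools import product

def outcome_distribution(strikes, p_hit, damage, defender_hp):
    # Enumerate every hit/miss pattern for a one-sided exchange and
    # collect the probability of each resulting defender HP total.
    dist = {}
    for pattern in product([0, 1], repeat=strikes):
        prob = 1.0
        for hit in pattern:
            prob *= p_hit if hit else (1 - p_hit)
        hp = max(0, defender_hp - damage * sum(pattern))
        dist[hp] = dist.get(hp, 0.0) + prob
    return dist

# e.g. 3 strikes, 60% chance to hit, 8 damage each, against a 20 HP defender
print(outcome_distribution(3, 0.6, 8, 20))
[/code]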

Anyway, I'm not really asking for an answer to that question (although any suggestions/discussion are welcome). I'm trying to move this part of the thread here

http://forums.wesnoth.org/viewtopic.php?f=10&t=40406

so I don't intrude on the PRNG discussion.
TheCripple
Posts: 103
Joined: March 19th, 2011, 3:30 am

Re: Luck in multiplayer

Post by TheCripple »

iceiceice wrote:Understanding the error is something that has to wait until high school.
It's traditionally left until high school, but it doesn't need to wait that long. I've taught math of similar difficulty to elementary students before, and they are generally able to grasp the concepts just fine. Actual implementation of the techniques and such, less so, but that's more a lack of practice than anything.
User avatar
pyrophorus
Posts: 533
Joined: December 1st, 2010, 12:54 pm

Re: Luck in multiplayer

Post by pyrophorus »

iceiceice wrote:
pyrophorus wrote:Curiously, tossing a coin is exactly the abstract model this test suite uses. And of course yes, applying the test suite to a large set of coin tosses can tell you if this coin is fair or not.
I think you misunderstand the purpose of the tests.
Yes, of course... I'm an ignorant idiot, and this is why all this tedious teaching is needed.
<snip>
iceiceice wrote: Edit:
pyrophorus wrote: Cascading generators will not give you automatically more randomness. It's just incantation.
Here's the simplest justification I can give of the practice.

Suppose I have a generator G, which guarantees that, when given truly random input, it will generate output that no simple test can distinguish from truly random.

If I have a second such generator G2, and I seed G with a (properly seeded) instance of G2, and the result suddenly becomes flawed, in the sense that there is now a simple test that distinguishes the result from random... then in fact what I have is an algorithm to distinguish G2 from random. The algorithm is, take the output of G2, use it to seed a copy of G, then run the test. In reality, both generators are relatively simple, and such a test could surely be simplified much further. So it would mean that G2 is defective, contrary to assumptions.
This is wrong, or more accurately, you can't say anything if you reseed the generator too often. Tests (simple or complex ones) only give significant results on rather large sequences. This is what I have been saying from the beginning, and what you admit just a little further on.
iceiceice wrote: This is basically a synopsis of an argument used in formal theories of pseudorandomness and cryptography. It is unsound to do this recursive seeding indefinitely, because the security parameter and error tolerance of the generator get worse each time you do it. But doing it even a few times is going to be fine. (Edit: assuming you start with a decent generator.)
IMO, you're focusing on the generator's randomness properties as if they were enough. They are necessary, but they are not sufficient. Careless use of a good generator can result in badly randomized sequences. I gave one example in a previous post, and it's not difficult to find others.
User avatar
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Luck in multiplayer

Post by iceiceice »

pyrophorus wrote:Yes, of course... I'm an ignorant idiot, and this is why all this tedious teaching is needed.
You are asking technical questions that only properly have technical answers. If you don't want to hear them from me, fine, I won't waste my time.
pyrophorus wrote: This is wrong, or more accurately, you can't say anything if you reseed the generator too often.
If you think I'm wrong, you can easily prove it by running a numerical experiment or providing a citation. But otherwise, what you have said doesn't convince me.
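To be concrete about the kind of numerical experiment I mean (a sketch; the reseed interval and bucket count are arbitrary): draw the same number of values from a single long-running generator and from a cascade that reseeds after every couple of draws, and compare how uniform the two histograms come out.

[code]
import random

N, BUCKETS = 1000000, 20

def chi_square(counts, expected):
    return sum((c - expected) ** 2 / expected for c in counts)

# (a) one long-running generator
single = random.Random(1)
counts_a = [0] * BUCKETS
for _ in range(N):
    counts_a[single.randrange(BUCKETS)] += 1

# (b) a cascade that reseeds a fresh generator from a master every 2 draws
master = random.Random(2)
counts_b = [0] * BUCKETS
drawn = 0
while drawn < N:
    g = random.Random(master.getrandbits(32))
    for _ in range(2):
        counts_b[g.randrange(BUCKETS)] += 1
        drawn += 1

# With 20 buckets, both chi-square values should hover around 19 if
# neither scheme introduces bias.
expected = N / BUCKETS
print("single stream:   ", round(chi_square(counts_a, expected), 1))
print("reseeded cascade:", round(chi_square(counts_b, expected), 1))
[/code]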
User avatar
pyrophorus
Posts: 533
Joined: December 1st, 2010, 12:54 pm

Re: Luck in multiplayer

Post by pyrophorus »

iceiceice wrote:
pyrophorus wrote:Yes, of course... I'm an ignorant idiot, and this is why all this tedious teaching is needed.
You are asking technical questions that only properly have technical answers. If you don't want to hear them from me, fine, I won't waste my time.
I wasn't asking anything, at least not for this kind of commonplace lecture about randomness.
iceiceice wrote:
pyrophorus wrote: This is wrong, or more accurately, you can't say anything if you reseed the generator too often.
If you think I'm wrong, you can easily prove it by running a numerical experiment or providing a citation. But otherwise, what you have said doesn't convince me.
It's easy: imagine I send you the result of tossing a coin twice, and it's two heads. Can you say anything about this coin? No, because there's nothing strange about this result. Actually, it's a rigged coin with two heads. But now if I send you the result of one hundred trials (which will obviously be one hundred heads), you will of course suspect something, because the probability of getting this result with a normal coin is extremely low (about (1/2)^100, or roughly 8 × 10^-31).
It's exactly the same problem with PRNGs: they can produce sequences very close to the expected distribution, but it's obvious that short sequences can deviate widely from it. If you reseed very often, you get a lot of short sequences, and nothing proves that they will fit the expected distribution as well as a single sequence of the same total size would. Probably, in the long run (thousands of trials), all this will converge, but how fast? You can encounter large local fluctuations, who knows? And this is exactly what the OP and other users are complaining about: runs of bad luck. Of course, as Daravel himself says, "bad luck" is not a simple hit/miss ratio problem. But it is something like "being very improbable". The question is not only a random generation problem, of course, but I don't think favouring fluctuations in the random system can improve things.
Now, this:
iceiceice wrote:If you drew millions and millions of samples from /dev/random and made a histogram, and it wasn't very nearly uniform... I think that represents a security vulnerability that you should report to your Linux distribution.
suggests to me that you have something else in mind when you speak of randomness, namely that it is unpredictable, impossible to guess. In cryptography applications, the PRNG is called once or twice, and the uniformity of the distribution guarantees that all values are equally probable, so no probability argument can help the adversary guess the result. But this doesn't mean successive calls to the PRNG will give you a uniform sequence.
PRNGs are like knives: they can have many uses. Making balanced decisions in a game and creating secret numbers have nothing in common, except that both use randomness; they rely on different properties of it.

Friendly,

EDIT:
By the way, I stumbled on this on the site of the PRNG I use in my map generator:
"I am a student interested in RNGs for crypto applications. I stumbled upon your mt19937b PRNG on the web and I am testing it to check whether it's suitable or not to key generation. Previously I tried other PRNGs but unfortunately they weren't good enough, mainly because (I guess) of short cycles. The problems arise when you try to periodically reseed the PRNG: even if you start the key-generation process with a different seed every time, sooner or later you'll have the PRNG falling into an already-passed state eventually obtaining the same key as before... Are the short cycles to blame?. Anyway this didn't happen so far with your mt19937b generator and I would be glad to know the reason of this improvement, namely: are you sure that reseeding your PRNG with *different* seeds (32 bit) I'll get a totally different output stream every time?"
http://www.math.sci.hiroshima-u.ac.jp/~ ... /user.html
This guy seems to be less confident than you about reseeding PRNGs.