[Beowulf] OT: public random numbers?

Fri Aug 26 05:29:06 PDT 2011

On Fri, 26 Aug 2011, Robert G. Brown wrote:

> Let's try a bit of "paper fiddling".  The expected number of filled slots
> is (this is actual code, not pseudocode, for n slots):
>
>  nlogn = log10(n)*n;
>  expected = (n - n*pow(1.0-1.0/(1.0*n),nlogn));
>
> The reasoning is enormously simple.  The probability of a slot being
> empty after one pull is (1 - 1/n). After nlogn pulls, it is p_e = (1 -
> 1/n)^nlogn.  The probability of a slot being filled is thus 1 - p_e, and
> given n slots n - n*(1-1/n)^nlogn of them "should" be filled, within
> random noise, n*(1-1/n)^nlogn of them "should" be empty.

Silly me.  All of the anonymous slots are at least asymptotically
independent (not necessarily obvious, but true from symmetry, I think,
subject to the weak constraint that the total population of all of the
slots has to add up to the number of trials so there are probably n-1
degrees of freedom in the Pearson test).  We have p and q.  The
distribution is binomial and of course I know the binomial distribution
and its sigma.  I can easily build any one of several tests on top of
this (simple binomial or even multinomial, since I effectively have the
hit frequency for n slots and it should BE the binomial distribution),
and in fact have two or three already that are very similar to this on a
smaller scale.

It's what comes of hacking out -- sorry, "fiddling" out -- quick
solutions and tests late at night when you're tired and ought to be
sleeping.  A bit of coffee makes a world of difference...:-)

I'll have to think a bit about it and make sure that this isn't already
done, better, in e.g. STS, but it might yet see the light of day as an
actual dieharder test.

BTW, I'm not replying to your space alien ET post (to the Beowulf list
in reply to an already OT discussion of martingales that arose out of a
discussion of good RNGs and seeding strategies sorry y'all but hey, at
least it is entertaining?) simply because my jaw is sore from hitting
the ground so many times while reading it.  Those are some top-quality
hallucinogens, yes they are...

We will now return to your regularly scheduled discussion of boring
things like bandwidth, memory reliability, parallel algorithms and the
like, you know, on-topic stuff.  But if any of y'all ever need to test
rngs or flame schemes to "win" non-zero-sum games by means of
"strategy", you know who to call...;-) Somewhere upstairs I have this
nifty book on game theory and in a pinch I can even trot out an actual
game matrix and analyze outcomes algefiddlingbraically!

    rgb

P.S. -- Vincent, all of these simple problems were solved by
mathematicians and statisticians so very, very, long ago, beginning with
the work of Pascal and Fermat (there are names to conjure with, eh?)
solving the problem posed by the Chevalier de Mere regarding an even bet
on double sixes happening at least once in 24 throws: actual probability
of double sixes per throw are (of course) 1/36, probability of no double
six in 24 throws are (35/36)^24, odds of at least one are therefore 1 -
(35/36)^24 = 0.4914038761 -- all paper fiddling, mind you -- a result
that is eerily reminiscent of the solution to your problem, but with
fewer slots.  So at even odds it is -- barely -- a sucker's bet.  But a
margin of 0.86% is enough to empty even the deepest pockets, over time.

Now all you have to do is advance your actual knowledge of statistics
beyond that realized by an idly rich French nobleman in 1654 (who still
was wise enough to recognize that it wasn't an even bet and consulted
the best of the best of the minds of his day to prove it).

You have a mere 357 years to go...:-)

P.P.S -- If "all rngs" were really as bad as you assert, does it not
stand to reason that "all Monte Carlo computations" that use them would
all get egregiously incorrect results?  And yet they don't.  In fact, in
problems (like the Ising model in 2D) where known solutions exist, they
agree basically perfectly with the theoretical solution, and of course
it is easy to compare a wide range of integrals and Markov process
outcomes with theory.  So if you used your simple common sense you would
construct a mental argument like:

"Either

I, in my brilliance, have discovered an egregious flaw in all random
number generators used by all of those STUPID computer scientists,
mathematicians, and physicists for decades to do their long and complex
computations that no doubt all got equally egregiously wrong answers;

Or

Those computer scientists, mathematicians, and physicists are actually
pretty smart and aggressively check their work (and each other's work)
with a strong incentive to discover problems.  It is rather probable
that any such egregious error would have been long ago discovered;
therefore there is almost certainly a serious error in my own
reasoning."

Seriously, dude.  Ask yourself "Am I really smarter and better informed
than Pascal, Fermat, Laplace, Bayes, not to mention all of those
contemporary humans who have been devoting entire well-educated careers
to random numbers as if all of modern e-commerce depended on them (it
does) or is it just barely possible that I've made a mistake?"  Come on,
you can do it.  I know it is difficult for you, but try..

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu