[Beowulf] OT: public random numbers?

Fri Aug 26 09:53:14 PDT 2011

On Aug 26, 2011, at 2:57 PM, Robert G. Brown wrote:

> On Fri, 26 Aug 2011, Vincent Diepeveen wrote:
>
>> EVERY PROGRAMMER IS DOING THIS TO USE RANDOM NUMBERS IN THEIR  
>> PROGRAM.
>
> Bullshit.  "Every programmer" isn't dumb as a post.  Or wasn't my
> argument clear enough?  Do you need me to actually post the code  
> for how
> the GSL -- written by at least some of these programmers -- do this?
> Here, I'll try again.  This time I'll use smaller numbers and make an
> actual table of the outcomes:  Imagine only two lousy random bits,
> enough to make 00, 01, 10, 11 (or 0,1,2,3).  Here is the probability
> table:
>
> r =   0    1    2    3
> ------------------------
> p = 0.25 0.25 0.25 0.25
>
> Let us generate N samples from this distribution.  Our expected
> frequency of the occurrence of all of these numbers is:
>
> r =     0      1      2      3
> ---------------------------------
> Np = 0.25*N 0.25*N 0.25*N 0.25*N
>
> Is this all clear?  If I generate 100 random numbers, the expected
> number of 3's is 0.25*100 = 25.  Now apply mod 3  the outcomes are  
> now:
>
> r =     0      1      2      3
> r%3 =   0      1      2      0
> ---------------------------------
> Np = 0.25*N 0.25*N 0.25*N 0.25*N
>
> You now sum the number of outcomes for each element in the mod 3  
> table,
> since we have two values of r that make one value of r%3 and frequency
> clearly aggregates as the outcomes are independent.
>
> r%3 =   0      1      2
> ---------------------------------
> Np = 0.50*N 0.25*N 0.25*N 0.25*N
>
> It is therefore twice as likely that two random bits, modulus 3, will
> produce a zero.
>

If you have a domain of 0..3 where a generator generates and your  
modulo n is
just n-1, obviously that means it'll map a tad more to 0.

Basically the deviation one would be able to measure in such case is  
that if
we have a generator that runs over a field of say size m and we want  
to map
that onto n entries then we have the next formula :

m = x * n + y;

Now your theory is basically if i summarize it that in such case the  
entries
0..y-1 will have a tad higher hit than y.. m-1.

However if x is large enough that shouldn't be a big problem.

If we map now in the test i'm doing onto say a few million to a  
billion entries,
the size of that x is a number of 40+ bits for most RNG's.

So that means that the deviation of the effect you show above the  
order of
magnitued of 1 /  2^40 in such case, which is rather small.

Especially because the 'test' if you want to call it like that, is  
operating in the
granularity O ( log n ), we can fully ignore then the expected  
deviation granularity O ( 2 ^ 40 ).

>>
>> Apologies for the caps. I hope how important this is. You're  
>> claiming all programmers
>> use random numbers in a faulty manner?
>
> They don't.  Only you do.  Everybody else takes a uniform deviate and
> scales it by the number of desired integer outcomes, checking to make
> sure that they don't go out of bounds and thereby e.g. get an  
> incorrect
> endpoint frequency.  The gsl code is open source and it takes two
> minutes to download it and check (I just timed it).  Go on, look.  the
> file is rng/rng.c in the gsl distro directory, the function name is
> gsl_rng_uniform_int.  No modulus.
>
> The exception is (obviously) when the range is a power of 2.  In that
> case ONLY, r%n where r is a binary uint and n is a power of 2 will
> (obviously) equally balance the table above.  Personally I'd use >>  
> and
> shift the bits because it is faster than mod, but suit yourself, after
> you've learned what you are doing.
>
>>
>> This is important enough to further discuss about it.
>>
>> As nearly always you need random numbers from within a given  
>> domain say 0.. n-1
>> So projecting a RNG onto that domain is pretty crucial. How would  
>> you want to do that in a correct manner?
>>
>> In the slot test in fact a simple AND is enough.
>
> No, as I've just proven algebraically.  The correct manner for  
> general n
> is the gsl code, but in rough terms it is n*r/r_max (with care used to
> avoid roundoff errors at the ends as noted).  If you've been using
> modulus, all your results are crap.
>
> Look, the reason God invented the GSL and made it open source is so
> numb-nuts and smart people alike wouldn't have to constantly reinvent
> the wheel, badly.  Use it.  Don't question it -- you obviously aren't
> competent to.  Just use it.  If you want a random integer from 0 to n,
> use gsl_rn_uniform_int.  If you want this for e.g. mt19937 don't write
> the latter, set up the gsl to use it to generate your ints.  Learn to
> use it carefully, use it correctly, but use it.
>
>> It's not interesting to discuss - but yes this strategy makes  
>> money in casino's,
>> you just get thrown out of the casino and end up at the blacklist  
>> if you do.
>
> You are clearly too stupid to be allowed out of the house without a
> caretaker.  I'm not going to walk you through the proof that this  
> isn't
> so as it is openly published and I've already referenced a step my  
> step
> analysis that you can't be bothered, apparently, to actually read.   
> I'll
> just reiterate the previous offer -- I, too, am happy to buy a  
> roulette
> wheel and you can come over and bet Martingale against me all day.   
> Just
> one 0, no limits and no quitting, infinite credit on both sides, we  
> play
> until it is obvious to you that you are losing, have lost, will always
> lose, and the longer you play the more that you will lose.  Loser buys
> the winner a case of truly excellent beer.
>
> Look, why don't you fix your random number code and try again, since
> your simulations are obviously trash.  It isn't difficult to show this
> with simulations, once you actually code them correctly, but I have to
> go and don't have time to do it for you.
>
>    rgb
>
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>