[Beowulf] OT: Calculating Extraterrestrial Life - was public random numbers?

Thu Aug 25 18:55:04 PDT 2011

On Aug 25, 2011, at 2:11 PM, Robert G. Brown wrote:

> On Thu, 25 Aug 2011, Vincent Diepeveen wrote:
>
>> I noticed that most generated semi-random numbers with software  
>> generators,
>> had the habit to truely adress a search space of n always in O (n  
>> log n).
>>
>> So if you draw from most software RNG's a number and do it modulo n,
>> with n being not too tiny, say quite some millions or even  
>> billions, then every
>> slot in your 'hashtable' will get hit at least once by the RNG,  
>> whereas data
>> in reality simply happens to not have that habit simply.
>>
>> So true random numbers versus generated noise is in this manner easy
>> to distinguish by this. Now i didn't study literature whether some  
>> other chap
>> some long time ago already had invented this. That would be most  
>> interesting
>> to know.
>
> Some other chap named George Marsaglia (and to some extent another  
> chap
> named Donald Knuth) have already invented this.  A number of tests of
> the tails of random number generators are already in dieharder.  All
> "good" modern rngs pass these tests.
>
> The Martingale betting system you are looking at is even older (at  
> least
> Marsaglia and Knuth are still alive).  It dates back to the 18th
> century, and is well known to be flawed for a variety of reasons, not
> the least of which is that gamblers don't have the infinite wealth
> necessary to make this >>even<< a zero-sum strategy and casinos have
> betting limits that de facto make it impossible to pursue the  
> requisite
> number of steps and in roulette in particular have 0 and/or 00  
> slots and
> aren't zero-sum to begin with.  You can read a decent analysis of
> outcomes based on the presumed binomial distribution of a zero-sum  
> game
> here:
>
>   http://en.wikipedia.org/wiki/Martingale_%28betting_system%29
>
> Your test below is interesting, though.  The only real problems I can
> see with actually using it in dieharder are:
>
>   a) One would need a theoretical estimate of the distribution of
> filling given n log n draws on an n-slotted table (for largish n).   
> That
> is, for a perfect rng, what SHOULD the distribution of success/failure
> be.
>
>   b) One would then need the CDF for this distribution, to be able to
> turn the results of N trials (of n log n pulls each) into a p-value
> under the null hypothesis -- the probability of obtaining the  
> particular
> number of successes and failures presuming a perfectly random  
> generator.
>
> That way dieharder could apply it rigorously to its 70 or 80 embedded
> rngs or to any user's outboard generator.  There probably is  
> theoretical
> statistical support for the PD and/or CDF -- you're analyzing the  
> tails
> of a poissonian process -- but finding it or doing it yourself (or
> myself), aye, that's the rub.  One cannot just say "high degree of
> certainty that it is an RNG" (by which one means that the rng in
> question fails the test for randomness) in the test.  HOW high?   
> Perfect
> rngs or perfectly random processes will sometimes fill your table, but
> how often?  How can you differentiate an "accident" when one does from
> an actual failure?  All of those questions require a more rigorous
> theory and quantitative result embedded in a test that can be
> systematically cranked up to more clearly resolve failures until they
> are unambiguous, not marginal maybe yes maybe no.
>
> I suspect that the failures this test would reveal are already more  
> than
> covered in dieharder, in particular by the bit distribution tests and

Thanks for your kind words - you'll realize that, seeing all the  
theories you quote
below where i simply have few knowledge from and definitely not the  
time for
to investigate (yet), you're talking way above my level of knowledge  
there.

Instead of going deep into mathematical theories i would find it more  
appropriate
to ponder on the feasability to calculate the existance of  
extraterrestrial life.

Now i realize a lot of efforts go into recognizing messages from  
outer space.

Yet we can also speculate on some things.

First of all i'd like to make a statement on extra terrestrial life  
and a viewpoint there.
One viewpoint i've seen promoted is that some scientist(s) claim we  
should
hide ourselves for extraterrestrial life.

I fully disagree there to some extend. If there is extraterrestrial  
life that is more advanced than our
society, obviously they also could have build weapons to totally  
selfdestruct and would already
have killed themselves if they would have been agressive forms of life.

If they were succesful in reproducing themselves, just like humankind  
is right now,
they would have burned up all resources at their own planet, caused  
massive
extinctions. It is difficult to defend the statement that all mass  
extinctions were caused by meteorites only -
for such statement one would need a proof for every single mass  
extinction being caused by a specific
meteorite; the extinction could just have been that a certain  
succesful species dominated the planet a tad too much
and didn't get clever enough to selfcontrol nor selfregulate to an  
extend that the planet didn't entirely
die. After some millions of years life restores itself then on the  
planet.

So if there would be such an intelligent life elsewhere more advanced  
than our society
is, they sure would want to communicate in a manner that information  
could get read
through different galaxies.

However for the most primitive life forms that are succesful to  
dominate a planet one
would want to hide this information for until such society reaches a  
specific level.

One would only want intelligent life to decypher
such extraterrestrial form of communication by another extra  
terrestrial form of life,
where the form of life is of a sustainable peaceful level.

I would argue such a lifeform would not form a threat to anyone, as  
they already
have proven to not be a threat to their own planet. So if the  
knowledge of this
society is high enough to be able to control all that, one would also  
be able to argue
that belonging to that high level of development,
would belong a specific level of math. A level strong enough to  
decypher the form
of communication that gets used to communicate between the different  
very intelligent
lifeforms in existance through the galaxies.

 From the fact that there is not a systematic form of contact with  
extraterrestrial life
we can already deduce that humankind still has to develop itself  
further from a species
that burns up all its resources, especially causing too much output  
of CO2 (the latest
report i'll have to check out
is that the increased CO2 level increases the amount of CO2 absorbed  
by the
oceans causing it to get more sour, causing plankton, start of the  
foodchain, to
not develop its skeleton enough, which for sure in the long run will  
cause mass
extinction).

Now we might not be advanced enough yet to decypher extraterrestrial  
communication,
so i wonder whether we might be able to recognize somehow that there  
is information getting
communicated using a form of encryption that we simply cannot  
decypher yet, based upon
comparing it versus how our RNG's work. Some of them run for example  
over a primefield,
others have a distribution too perfect.

If we get from space radiation measurements back, and we test them  
for belonging in a specific
class or type of randomness versus non randomness; how does that  
compare with if we have a source
of radiation ourselves that's comparable to that and its randomness  
classification?

Obviously the algorithm i gave is just one specific form of algorithm  
to measure a perfect distribution -
as you already indicated there are many other tests invented already.

In how far have those been applied to what could be encrypted  
communication from extraterrestrial life
to other extraterrestrial life (like us if we manage to survive as  
species and develop further to a
peaceful level that can sustain itself for a longer period of time).

So summarized what i wonder about is in how random number theory can  
contribute to detecting
extraterrestrial life (of course with a specific statistical  
significance to it).

This of course in combination with experiments conducted that allow  
us to first classify how a specific form of
possible communication system would behave normally spoken according  
to the randomness classification system,
versus the classification on how the measured possible form of  
communication compares to that.

Such classification system would need to be very sophisticated to  
have any chance of detecing extraterrestrial life
i'd guess, as we can't just naively assume that all they could come  
up with is encrypting things over a primefield using
smallish primes which in our world already only is allowed to be used  
upto secret level.

Regards,
Vincent

> the monkey tests, but I'm not terribly happy with the monkey tests and
> would be perfectly thrilled to have a simpler to compute test that
> revealed precisely this sort of flaw, systematically.  And it doesn't
> hurt at all to have partially or fully redundant tests as long as the
> test themselves are rigorously valid.  If you can find or compute the
> CDF for your test below, I'd be happy to wrap it up and add it to
> dieharder, in other words.  One can always SIMULATE a CDF, of course,
> but that requires a known good generator and sort of begs the question
> if you don't think that e.g. AES or threefish or KISS are good
> generators that would actually pass your test.
>
> Even hardware/quantum sources of random bits are suspect -- they often
> are generated by a process that leaves in the traces of an underlying
> distribution.  I'm not convinced that >>any<< process in the real  
> world
> is >>truly<< random.  Physics is ambiguous on the issue -- the quantum
> description of a closed system is just as deterministic as the  
> classical
> one, and Master equation unpredictability on open subsets of a large
> closed system reflects entropy/ignorance, not actual randomness (hence
> Einstein's famous "doesn't play dice" remark).  But lots of this are
> sufficiently random that one cannot detect any failure of randomness,
> modern crypto class generators being a prime example.
>
>    rgb
>
>>
>> In semi pseudo code, let's take an array of size a billion as an  
>> example,
>> though usually a few million is more than ok:
>>
>> n = 2^30; // 2 to the power 30
>>
>> Function TestNumbersForRandomness(RNG,n) {
>> declare array hashtable[size n];
>>
>> guessednlogn = 2 * (log n / log 2) * n;
>>
>> for( i = 0 ; i < n ; i++ )
>>   hashtable[i] = FALSE;
>>
>> ndraws = filledn = 0;
>> while( ndraws  < guessednlogn ) {
>>    randomnumber = RNG();
>>    r = randomnumber % n; //     randomnumber =  r  (mod n)
>>    if( hashtable[r] == FALSE ) {
>>       hashtable[r] = TRUE;
>>       filledn++;
>>       if( filledn >= n )
>>         break;
>>
>>   }
>>   ndraws++;
>> }
>>
>> if( filledn >= n )
>>    print "With high degree of certainty data generated by a RNG\n");
>>  else
>>    print "Not so sure it's a RNG\n";
>>
>> }
>>
>>
>>
>>
>>
>> Regards,
>> Vincent
>>
>>
>>
>>
>>> -- both unpredictable and
>>> flat/decorrelated at all orders, and even though there aren't really
>>> enough of them for my purposes, I've used them as one of the (small)
>>> "gold standard" sources for testing dieharder even as I test  
>>> them.  For
>>> all practical purposes threefish or aes are truly random as well and
>>> they are a lot faster and easier to use as gold standard generators,
>>> though.
>>> I don't quite understand why the single site restriction is  
>>> important --
>>> this site has been up for years and I don't expect it to go away  
>>> soon;
>>> it is quite reliable.  I don't think there is anything secret  
>>> about how
>>> the numbers are generated, and I'll certify that the numbers it  
>>> produces
>>> don't make dieharder unhappy.  So 1 is fixable with a bit of  
>>> effort on
>>> your part; 6 I don't really understand but the guy who runs the  
>>> site is
>>> clearly willing to construct a custom feed for cash customers, if  
>>> there
>>> is enough value in whatever it is you are trying to do to pay for
>>> access.  If it's just a lottery, well, lord, I can think of a  
>>> dozen ways
>>> to make numbers so random that they'd be unimpeachable for any  
>>> sort of
>>> lottery, both unpredictable and uncorrelated, and they don't any  
>>> of them
>>> require any significant amount of entropy to get started.
>>> I will add one warning -- "randomness" is a rather stringent
>>> mathematical criterion, and is generally tested against the null
>>> hypothesis.  Amateurs who want to make random number generators  
>>> out of
>>> supposedly "random" data streams or fancy algorithms almost  
>>> invariably
>>> fail, sometimes spectacularly so.  There are a half dozen or more
>>> really, really good pseudorandom number generators out there and  
>>> it is
>>> easy to hotwire them together into an xor-based high entropy  
>>> stream that
>>> basically never repeats (feeding it a bit of real entropy now and  
>>> then
>>> as it operates).  I would strongly counsel you against trying to  
>>> take
>>> e.g. weather data and make something "random" out of it.  Unless you
>>> really know what you are doing, you will probably make something  
>>> that
>>> isn't at all random and may not even be unpredictable.  Even most
>>> sources of "quantum" randomness (which is at least possibly "truly
>>> random", although I doubt it) aren't flat, so that they carry the
>>> signature of their generation process unless/until you manage to
>>> transform them into something flat (difficult unless you KNOW the
>>> distribution they are producing).  Pseudorandom number generators  
>>> have
>>> the serious advantage of being amenable to at least some theoretical
>>> analysis (so you can "guarantee" flatness out to some high
>>> dimensionality, say) as well as empirical testing with e.g.  
>>> dieharder.
>>> HTH,
>>>
>>>     rgb
>>>> Thanks,
>>>> David Mathog
>>>> mathog at caltech.edu
>>>> Manager, Sequence Analysis Facility, Biology Division, Caltech
>>> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
>>> Duke University Dept. of Physics, Box 90305
>>> Durham, N.C. 27708-0305
>>> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin  
>>> Computing
>>> To change your subscription (digest mode or unsubscribe) visit  
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>