Random Post (is it? :P)

Hey all,

I recently discussed how the Fibonacci sequence is used to generate random sequences; you can read it here: Fibonacci and Randomness. As promised at the end of that post, I will elaborate on it further, but based on the feedback I received, I thought it would be better to first write a post on what we actually mean by a random sequence or random numbers. So here we go. But hey, why so serious? Have some fun first …

(Comic credit: xkcd.com)

It is worth saying up front that we talk about generating a random sequence, not a single random number. Since the sequence follows no regularity, the next number in it should, ideally, be unpredictable. In practice, achieving this is a daunting task and may not be possible at all, though one can get enough randomness to work with. I will mainly discuss algorithmically random and statistically random sequences; there are various other classifications and definitions that are not covered here.

A statistically random sequence is one that contains no recognizable patterns or regularities, e.g. the digits of π.

An algorithmically random sequence is one whose binary digits appear random to any algorithm. This definition cannot be applied equally well to finite strings of characters, but it is naively applied to them in practice. (from Wikipedia)

Let’s start with statistically random sequences. As readers might have guessed from the slippery word ‘recognizable’, these are not random sequences in the truest sense. They need not be; one can argue that such a thing is not even achievable, since complete disorder is impossible. This also implies that long sequences may contain subsequences that are not random at all! That brings us to the distinction between local and global randomness. Global randomness cares about the overall picture, while local randomness works with a minimum sequence length within which a random distribution can be approximated. Long stretches of the same digits therefore reduce local randomness, even if the sequence was generated by a truly random process (the little sketch below illustrates this). So how do we test it?
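To make the local-versus-global point concrete, here is a small Python sketch (my own toy illustration, not part of any standard test suite): it generates a bit stream with Python's random module and measures the longest run of identical bits inside fixed-size windows. Even a "globally" random stream will occasionally show windows with suspiciously long runs.

import random

def longest_run(window):
    """Length of the longest run of identical bits in a window."""
    best = run = 1
    for prev, cur in zip(window, window[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

# A "globally" random bit stream...
bits = [random.getrandbits(1) for _ in range(10_000)]

# ...inspected locally, in non-overlapping 100-bit windows.
window = 100
runs = [longest_run(bits[i:i + window]) for i in range(0, len(bits), window)]
print("longest run of identical bits in any 100-bit window:", max(runs))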

The first tests were published by M. G. Kendall and Bernard Babington Smith. They were hypothesis tests built on the idea that in a random sequence every digit has an equal chance of occurring and the various patterns are distributed equiprobably. The original four tests were (a rough Python sketch of the first one follows the list):

  1. frequency test – since all digits are equiprobable, each should occur roughly the same number of times.
  2. serial test – the same check for pairs of digits: 00, 01, and so on.
  3. poker test – a test on groups of 5 digits, modelled on the hands in a game of poker.
  4. gap test – looks at the gaps between two zeroes; e.g. 00 has a gap of 0 and 023940 a gap of 4. The different gap lengths should occur with roughly their expected frequencies.
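To make the frequency test concrete, here is a minimal Python sketch of a chi-square version at the 5% level. This is my own illustrative take, not Kendall and Smith's original procedure; the critical value 16.919 is the standard chi-square threshold for 9 degrees of freedom at 5%.

import random
from collections import Counter

def frequency_test(digits, critical=16.919):
    """Chi-square frequency test: are the digits 0-9 roughly equiprobable?

    critical = chi-square value for 9 degrees of freedom at the 5% level.
    """
    n = len(digits)
    expected = n / 10
    counts = Counter(digits)
    chi_sq = sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))
    return chi_sq, chi_sq < critical

digits = [random.randint(0, 9) for _ in range(10_000)]
stat, passed = frequency_test(digits)
print(f"chi-square statistic = {stat:.2f}, passes at the 5% level: {passed}")

The serial, poker, and gap tests follow the same pattern: count how often each pattern occurs and compare the counts against their expected frequencies with a chi-square statistic.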

If a given sequence passes all these tests within an acceptable limit (normally 5%), it is locally random. Kendall and Smith distinguished local randomness from true randomness because many of the long sequences they examined contained long runs of the same digit. As time progressed the tests became more sophisticated: the diehard tests, developed by George Marsaglia, are a battery of statistical tests that measure the quality of a random number generator. They were first published in 1995 on a CD-ROM of random numbers. There are several other popular tests, but since this post would otherwise compete with the Pan-American Highway in length I am skipping them, though not before discussing the monkey tests (which are among the diehard tests). The infinite monkey theorem states that, given enough time, a monkey typing randomly will, as part of its output, produce all of Shakespeare's plays. The monkey tests borrow this image: treat overlapping stretches of bits in the sequence as 'words' and look at which words appear. The number of words that never appear should follow a known distribution. I agree I am leaving this largely unexplained, but I will make sure to cover it in the near future.
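As a rough illustration of the overlapping-words idea (a toy sketch only; the real diehard monkey tests compare the count of missing words against a precisely known distribution), one can count how many of all possible k-bit words never occur in a bit stream:

import random

def missing_words(bits, word_len=10):
    """Count how many of the 2**word_len possible overlapping words never appear."""
    seen = set()
    for i in range(len(bits) - word_len + 1):
        seen.add(tuple(bits[i:i + word_len]))
    return 2 ** word_len - len(seen)

bits = [random.getrandbits(1) for _ in range(100_000)]
print("10-bit words that never appear:", missing_words(bits))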

In the algorithmic case there are different forms, or better, notions of randomness, because there are different classes of algorithms. The most common of them is Martin-Löf randomness, or 1-randomness, named after Per Martin-Löf, a Swedish logician and mathematical statistician. There are stronger and weaker forms, but this one is so common that, when no clarification is given, the term ‘random’ usually refers to it. It will be discussed thoroughly in the next post.

Just to give you an idea, the original definition was in terms of constructive null covers. What are those? Again, this is too small a space to explain them from scratch, but this might clear some of the fog: a set of measure zero is a set of reals, a subset of the unit interval, with the property that we can cover it with coverings that are arbitrarily small. More precisely, Martin-Löf defines a set X to be of measure zero (constructively) if there is a computable sequence of coverings A_n, each of measure at most 2^(-n), all of which cover X, i.e. ∀n [ X ⊆ A_n ]. (Puzzled? Leave it; it may get clearer in the next post.)
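For readers who want it spelled out, here is the standard textbook form of the definition (assuming the usual 2^{-n} bound on the covers, which is how "arbitrarily small" is typically made precise):

\[
  X \text{ is constructively null}
  \iff
  \exists\, (A_n)_{n\in\mathbb{N}} \text{ computable}:\;
  \forall n\; \bigl[\, X \subseteq A_n \ \wedge\ \mu(A_n) \le 2^{-n} \,\bigr]
\]

A sequence is then Martin-Löf random exactly when it does not belong to any such constructively null set.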

How is this related to algorithms, math nerd? Did someone ask that? Well, all of this rests on what is computable by some Turing machine. Martin-Löf's idea was to use the theory of computation to formally define randomness: the constructive null covers are required to be enumerable by a Turing machine, and a sequence counts as random when it escapes every such cover. Believe it or not, this was a path-breaker.

And the Legend of randomness ends here……………….

Oh did you believe that? –  how un-random you are 😛
