Like most discussions of randomness, this one went straight to philosophy! Randomness is a pet interest of mine and I registered just to write this long post. If you're not interested in what randomness really means in computer science / computational theory, this giant wall of text is going to be very boring!
In mathematics there are varying definitions of "random", depending on what you care about. The first thing to note is that a number is not random. Only processes/infinite sequences/etc. are random. Any finite selection - whether one number or a million numbers - is never "random".
A random number generator, therefore, does not produce a random number. It produces the next number in a random sequence.
This, in the simplest terms, is why you shouldn't change your bets even when the dice are "hot". Whatever pattern of dice rolls you just saw isn't what we call random. The process of rolling dice is a random process. What you saw was just a selection from that process.
The biggest, baddest definition of randomness is an algorithmically random sequence, also called Martin-Löf randomness. There are several equivalent definitions, all of which rely on fairly abstract computational theory. But basically it comes down to...
You wrote a program (in whatever language you like) that takes a finite sequence of numbers. When run, it will tell you the minimum number of lines of Basic code it would take to display the input sequence. For example, I choose 10 numbers at "random" and your program reports "You need at least 30 lines of Basic to print those numbers." I give you a different set of numbers, maybe 25 this time, and your program reports "You need at least 40 lines of Basic to print those numbers."
I start giving you sets of numbers to plug into your program. I made up these sets; who knows where they came from. Some of the sets I give you are short and sweet: 1 2 3. Here's a three-line Basic program that prints that out:
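Code:
print 1
print 2
print 3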
Other sets I give you are long, but simple: 1 2 3 ... 100. There are 100 numbers in that set, but it only takes a 5-line Basic program with a loop to print it out.
Code:
let x = 1
do while x <= 100
  print x
  x = x + 1
loop
These Basic programs are compressed versions of the input. In other words, the Basic program is the same size or shorter (in lines of code) than the input. In the first case, it was the same size. In the second, the Basic code had only 5 lines, but the input set had 100 numbers.
The difference between the size of the set and the size of the smallest program that will print out that set is called c. In the first example, c = 3 - 3 = 0. In the second, c = 100 - 5 = 95. Intuitively, c measures how well a set of numbers can be compressed in a specific programming language. If you can't do better than a particular c value, then the set you tried to compress is called c-incompressible.
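(In more standard notation - and I'm fairly sure this matches the textbooks, which measure program length in bits rather than lines of Basic - if C(x) is the length of the shortest program that prints x, then x is c-incompressible when C(x) >= |x| - c. In other words, no program saves you more than c over just writing x out directly.)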
Now, rather than just making up sets of numbers, you challenge me to use some infinite sequence of numbers and pull sets from that. I can give you short sets, long sets, massively long sets, but so long as they come from some magic sequence in my pocket, you'll type them as input into your program and find out how short of a Basic program we can write to output them.
As an example, let's use the infinite sequence 1, 2, 3, ... (the counting numbers). If I pull out the first 3 numbers (like our first example) your program reports c = 0. If I pull out the first 100 numbers (like our second example) your program reports c = 95. Because my sequence is so simple, it doesn't take much code even if I pull out really long sets.
Now I challenge you. "Is there anything in my sequence that you can compress with a c bigger than 500?" You say "Sure! Take the first 600 numbers from your sequence. My program reports that you can write a 5-line Basic program - in fact, the same program we used above, with the 100 changed to 600 - to output that set. c = 600 - 5 = 595, and that's bigger than 500."
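Just to make the "5 lines" concrete, that tweaked program would look like this (same as before, only the loop bound changes):
Code:
let x = 1
do while x <= 600
  print x
  x = x + 1
loop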
Clearly, no matter what c I challenge you with, you can find a set of numbers taken from my sequence that you can compress better than c.
Now I'm annoyed, so I throw away that simple sequence and come up with something ludicrously complicated. In fact, it's so complicated that you can't see any discernible pattern. Good thing you have your magic program that will tell you how short a Basic program we can write. So I challenge you with c = 100,000, and you start looking for really, really long sets from the sequence that can, according to the program, be printed by programs short enough that the difference between the set size and the program size is greater than 100,000.
Unfortunately, there are infinitely many sets you can pull from my infinite sequence. If you happen to find one that results in c > 100,000, then we're done - you beat my challenge. I can pick an even bigger c and try again. But if you run this thing for all eternity, and we hire a mathematics professor to prove, on paper, that it is impossible to beat my challenge, then my infinite sequence is algorithmically random.
To restate it: you're allowed to compress any starting set from my infinite sequence, and you're allowed to go at it for all eternity. If it is impossible to ever find a single set whose shortest Basic program is more than 100,000 lines shorter than the set itself, then the sequence is random.
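(Stated more formally - and I'm recalling this from textbooks, so forgive the shorthand - the usual version uses prefix-free Kolmogorov complexity K, the bit-counting cousin of our "lines of Basic": an infinite sequence S is algorithmically random exactly when there is a constant c such that K(first n digits of S) >= n - c for every n.)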
-
Now some small notes.
The choice of Basic was arbitrary. If an infinite sequence is algorithmically random for some programming language and some c value specific to that programming language, then it's algorithmically random for all programming languages. Each will have a different c value.
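(The reason, as best I remember it, is the invariance theorem: for any two general-purpose languages there's a fixed constant d, depending only on the pair of languages, such that the shortest-program lengths they assign to any given set differ by at most d - roughly, |C_Basic(x) - C_other(x)| <= d. So switching languages only shifts c by a constant.)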
In reality, we don't have handy little programs that figure out the smallest Basic program to output some set of numbers - in fact, it's provably impossible to write such a program. Instead we use our big monkey brains and computer science degrees to prove, on paper, limits on compression.
It turns out to be really, really tough to find algorithmically random sequences. I know exactly one, and I know that at least two others were discovered. Algorithmically random sequences are really tough to describe - even vaguely, in English. It's part of the fact that they're incompressible, and describing something in English is, in a way, compression.
At the same time, however, it turns out that almost every infinite sequence that can exist in the world of math is algorithmically random. The basic foundations of mathematics support an uncountably infinite number of algorithmically random sequences, but we barely know what any of them are.
-
If anyone is interested in an example I can describe the "halting sequence" - a relatively simple example of a truly random sequence. I can also talk about sequences that pass this test not for Basic, but for systems with less computational "power". Not what we think of as computers, that is. I can provide lots of examples of those. We call them "relatively random" compared to whatever weak "computer" we're comparing against, like an abacus.
Alternatively, if I just blew up half your thread with my first post on this forum and you're angry about it, I can keep quiet.
But that, philosophy geeks, is real randomness! There are sequences of numbers so complex that they don't contain even a single sub-sequence that can be compressed by more than some fixed amount, no matter what, by any computer, ever.
(Finally, this is a hobby area of interest for me, and if you are a real expert feel free to correct the many mistakes I probably made!)