Pseudorandom number generators

It's a well-known programming fact that computers do exactly and only what you tell them to do (ignoring, possibly, the incidence of cosmic rays).  This fact makes it impossible for a computer to produce a truly random number since, given a program's particular starting state and inputs, the computer will always produce the same output - so if one were to know the starting state and inputs one would always be able to predict the output state.  Truly random processes cannot be predicted in this way.

Despite this, computers are used constantly to produce "random" behavior, from online poker websites to monte carlo simulators to photo screensavers that always show the wrong photo at the wrong time.  To generate these to-human-eyes-random effects, programmers use what are known as pseudorandom number generators, which algorithmically produce numbers which are hard to predict if you do not know the initial inputs (known as seeds).

One of the best and most well-known pseudorandom number generators is the Mersenne Twister (here's a link to the github repository of a guy who niftily built a twister in ruby and python).  The Mersenne Twister is prized among pseudorandom number generators for having an extremely long period - the length of the sequence of numbers it produces before repeating itself.   The Mersenne Twister is used to produce random numbers in languages like C++ and Ruby.  While theoretically it is possible to produce a pseudorandom number generator that never repeats itself (over a long enough sequence I mean, the same way that the decimal representation of a transcendental number never repeats its digits - you can always find a period large enough to not have repetition of whatever sequence you want), as far as I am aware all pseudorandom generators eventually either repeat themselves or devolve into predictable number sequences.

In fact, since most pseudorandom generators use their outputs as inputs to produce their next results, given that a pseudorandom number generator produces a finite output (and all real-life uses require this), this guarantees that there are a finite number of possible outputs, which guarantees that such a pseudorandom generator will repeat itself, since there are only a finite number of possible sequences for a finite collection of possible outputs.

One thing I am working on is deriving how best to get particular random numbers from a pseudorandomly generated number.  For instance, the Mersenne Twister produces enormously large numbers - how should one then use it to derive, say, a random number between 1 and 24?  One simple way is simply to take the modulo of the number and the size of the random numberspace you want to select from (in this case, 24).  This method does produce slightly uneven results if the index of the possible numbers the pseudorandom generator can produce does not evenly divide the selected numberspace - for instance, if your pseudorandom generator produces results from 1 to 25, taking the modulo of the result will make a 1 twice as likely as any other number since 1 modulo 24 and 25 modulo 24 both produce 1.  However, since most pseudorandom generators generally produce a huge range of outputs (the Mersenne Twister produces 219937 − 1 possible numbers, for instance), this should be functional enough for most uses.

Comments

Popular posts from this blog

The Sorting Hat

Kadane's Algorithm

Loose Equality, the Null Operator, and Other Oddities