I.
Take an immortal monkey, have it bang on a typewriter from now until the end of time, and eventually it will produce the complete works of Shakespeare. The saying goes something like that. Humans aren’t really good at imagining infinite things, so while the saying appeals to some intuition we have about rare events eventually happening given a long enough time frame, it’s hard to imagine how randomly typing on a keyboard would ever produce a coherent work of art. Even after an infinite amount of time.
You can read the Wikipedia article about this theorem, and almost surely you’ll find a bunch of gobbledy-gook and precise computations that are meaningless to the casual reader. “Okay, blah-blah-blah, the theorem is true, I’m getting a Starbucks.” I can relate; much of my undergrad felt that way. Sorry, casual reader, this post is more gobbledy-gook, but I think it’s intuitive gobbledy-gook.
We can reframe the infinite-monkey theorem as follows: Consider a random string of $n$ letters, where $n$ is a positive integer. Let $p_n$ be the probability that the $n$ letters have a substring of consecutive letters that spells out some phrase we care about. For example, take $n = 10$, and let our phrase be the word “fun”; then $p_{10}$ is the probability that 10 random letters spell out the word “fun” somewhere inside. We can see pretty clearly that if $n < 3$, then $p_n = 0$, since there aren’t enough letters to actually spell our word “fun”. But when $n \geq 3$, then $p_n$ is meaningful, since there is a non-zero probability this could happen. There is at least one way a random sequence of 10 letters could spell out “fun” somewhere in the middle. In other words, if you give your toddler a bunch of scrabble tiles and ask her to spell out some random words for grandma, there is a non-zero probability your toddler is about to embarrass you.
So, this lets us conclude that for $n \geq 3$, we have $p_n > 0$. However, this probability is very small. To see this, let’s think about how many different strings of length $3$ we could make. Since there are 26 letters in the alphabet, we have 26 choices for the first letter, 26 for the second letter, and 26 for the third letter, to give $26^3 = 17{,}576$ total choices. But there is only one way to spell “fun”, so the probability for $n = 3$ ends up being $p_3 = \frac{1}{17{,}576}$. This is basically a lottery.
However, if $n = 4$, then we can spell “fun” in several ways. We could spell it as “Xfun” or “funX”, where X is just some random letter. How many total ways can we do this? Well, for “Xfun”, the first letter is irrelevant, so there are 26 possible choices there. Then there is only one choice for the next letter, one choice for the third letter, and one choice for the last letter (“f”, “u”, “n”, respectively). So we end up with $26 \cdot 1 \cdot 1 \cdot 1 = 26$ ways to do this. In the second case, where X comes last, we also have 26 ways to spell “fun”. (No string gets counted twice, since a 4-letter string can’t contain “fun” in both positions at once.) Then we have $26 + 26 = 52$ possible ways to spell “fun” out of $26^4 = 456{,}976$ possible strings. So the probability ends up being $p_4 = \frac{52}{26^4} = \frac{1}{8{,}788}$.
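As a sanity check, we can brute-force these small cases by enumerating every possible string. This is a minimal sketch (the helper name `p_exact` is my own, not from the post):

```python
from itertools import product
from string import ascii_lowercase

def p_exact(n, phrase="fun"):
    """Exact probability that a random n-letter string contains `phrase`,
    computed by enumerating all 26**n strings (only feasible for tiny n)."""
    hits = sum(phrase in "".join(s)
               for s in product(ascii_lowercase, repeat=n))
    return hits, hits / 26**n

print(p_exact(3))  # 1 hit: only "fun" itself qualifies
print(p_exact(4))  # 52 hits: the "Xfun" and "funX" cases
```

The enumeration confirms the hand counts above: exactly 1 hit out of 17,576 strings for $n = 3$, and 52 hits out of 456,976 for $n = 4$.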
Huh.
So even though the increase in probability is minuscule, there is still an increase. Now, if we had a monkey typing for an infinite amount of time, we are essentially letting $n \to \infty$. Does it now follow that $p_n \to 1$? In other words, if we let $n$ get bigger and bigger, does the probability that we see the word “fun” appear somewhere in the $n$-string get better and better?
Actually, yes.
Intuitively we might see why this is true. The larger $n$ is, the more ways we can have the word “fun” appear somewhere in our string, which is pretty fun itself. But does it work for a phrase of arbitrary length? Can it? Hamlet is on the order of 130,000 letters; does it make sense to talk about this?
Yes. We can prove this.
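Before the proof, we can watch this happen empirically. Here is a quick Monte Carlo sketch (the function name, trial count, and sample sizes are my own choices for illustration) that estimates the probability a random string of length n contains “fun”:

```python
import random
import string

def p_estimate(n, phrase="fun", trials=50_000, seed=0):
    """Monte Carlo estimate of the probability that a random
    n-letter string contains `phrase`."""
    rng = random.Random(seed)
    hits = sum(
        phrase in "".join(rng.choices(string.ascii_lowercase, k=n))
        for _ in range(trials)
    )
    return hits / trials

for n in (10, 100, 1000):
    print(n, p_estimate(n))
```

Even at n = 1000 the estimate is still small (a few percent), but it climbs visibly as n grows.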
II.
We will prove that if we have some fixed string of $k$ letters (we’ll call this a phrase) and a random string of $n$ letters, then the probability that the $n$-letter string contains the fixed sub-string of $k$ letters converges to 100% as $n$ is allowed to grow to infinity. That is, we will show that $p_n \to 1$ as $n \to \infty$, where $p_n$ is the aforementioned probability.
Let $k$ be a positive integer and consider a fixed phrase of $k$ letters in some order we care about. Define $p_n$ as the probability that a random string of $n$ letters contains a sub-string of $k$ consecutive letters spelling out our desired phrase. We know that $0 \leq p_n \leq 1$, at least, but we can also show that it is an increasing function of $n$: if we have $n + 1$ letters, the phrase could appear in the first $n$ letters with probability $p_n$. If not in the first $n$ letters, then the last letter of the phrase would have to be the $(n+1)$th letter, which happens with some probability $q_{n+1} \geq 0$. Since these events are mutually exclusive, we add them and get:

$$p_{n+1} = p_n + q_{n+1} \geq p_n.$$
So, our sequence of probabilities is an increasing sequence. From elementary real analysis, if we can find a sub-sequence contained in $(p_n)$ which converges to $1$, then the entire sequence converges to $1$.
What kind of sub-sequence does this? Consider $p_{mk}$, the probability of the desired phrase appearing in a string of length $mk$, where $m$ is a positive integer. If $r$ is the probability that a randomly selected string of length $k$ does not contain our desired phrase, then $r = 1 - p_k$. We can split up our string of length $mk$ into $m$ strings of length $k$, so the event that our desired phrase appears in at least one of these $m$ blocks has probability $1 - r^m$. Any of those blocks containing the phrase means the full string contains it, so $p_{mk} \geq 1 - r^m$.
Since $0 \leq r < 1$, letting $m \to \infty$ implies $r^m \to 0$. By the squeeze theorem applied to $1 - r^m \leq p_{mk} \leq 1$, we get $p_{mk} \to 1$ as $m \to \infty$. But $(p_{mk})$ was a sub-sequence of $(p_n)$, so $p_n \to 1$.
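To put concrete numbers on the argument: for the phrase “fun” we have $k = 3$, and the block-splitting bound says a random string of length $3m$ contains “fun” with probability at least $1 - r^m$, where $r = 1 - \frac{1}{26^3}$ is the chance a single 3-letter block misses. A short illustrative script (the values of $m$ are arbitrary):

```python
# Lower bound from the proof: a string of length 3*m contains "fun"
# with probability at least 1 - r**m, where r = 1 - 1/26**3 is the
# chance that one 3-letter block is not exactly "fun".
r = 1 - 1 / 26**3
for m in (10_000, 100_000, 1_000_000):
    print(f"n = {3 * m:>9,}: p_n >= {1 - r**m:.6f}")
```

The bound crawls at first (it takes tens of thousands of letters before it’s even a coin flip) but is essentially 1 by a few million letters, which matches the “eventually, but only eventually” feel of the theorem.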
III.
Notice that we actually never specified $k$. It was left as some arbitrary positive integer. If $k$ is the length of the complete works of Shakespeare, the proof still works, so we’ve actually proved that our immortal typing monkey can write Shakespeare on a long enough time frame. In fact, the monkey can type any particular text of finite length (if the text is infinite, then it’s not clear what happens next).
This proof is a generalization of an exercise I saw in the (excellent) book A Walk Through Combinatorics. Turns out counting is pretty fun. Who knew?
***
- Even though the book is eventually written in the limit, the probabilities for any finite $n$ are extremely small. We probably can’t expect it to happen, even if we had an astronomical number of monkeys typing until the heat death of the universe. Still neat, though.