## Sunday, June 30, 2019

### The law of small numbers

I was listening to a podcast when I heard about an interesting probability result in the same vein as the Monty Hall Problem.  The new problem is this: Flip a coin 100 times and record the results.  Now pick random flips in the set and see if the next 3 flips are all heads; if so we call this a streak.  Repeat until you find a streak of 3.  Now what is the probability that the 4th flip is also heads?  Is it 50% like we would expect?  It turns out to be closer to 46%, which is not very far from 50%, but is also a clear trend.

You can download the paper here, and I recommend you read through the introduction, which is pretty easy to follow.  I think does a good job of explaining what is going on.  Since no one will do that, here is a table from the paper which helps give some intuition.

This represents every possible outcome from flipping a coin 3 times and looking for a 'streak' of 1 heads.  There are eight total possible outcomes, all equally likely.   In the first two, the streak of 1 heads never happens, or happens on the last flip where there is no following flip to look at.  Those are thrown away and ignored.  In the other six possible outcomes we do get a streak, at least once, and earlier than the last flip.  The underlined flips represent the possible candidates for the flip that is following a streak.  If we pick the preceding streak, then the underlined flips will be the one we are trying to predict.  In three out of the six outcomes with a streak, the following flip will not be heads.  In two out of the six outcomes the following flip will always be heads.  And in the remaining possible outcome it could be either head or tails with 50/50 probability depending on which streak you pick.

If you list out all the possible outcomes from any combination of streak length and total flips, you can see that some number of the heads flips are 'consumed' by the streaks themselves.  Those flips can never be following a streak, because they are part of the streak needed to define the streak.  On the other hand, the tails have no restrictions, they are all available to occur in the flip immediately following a streak.  There are simply more tails available to go in the candidate position.  The effect gets smaller as you decrease the streak length or increase the total number of flips in a set.

I found this very surprising, so I wanted to test it out.  I wrote a Ruby script to simulate various coin flips and look for streaks of different lengths, and output the results.  I then decided to rewrite it in a compiled language so it would be faster.  I decided to try out Go, as I've never used it before and I was hoping for something with a bit more syntactic sugar than C.

https://github.com/StephenWetzel/coin-flips-go

Here are the results of a bunch of combinations of streak lengths and numbers of flips from the Go program:
Looking for a streak of length  1 in    10 total flips. Performed 10000 rounds, and   9973 were successful, found 45.29% continued the streak.
Looking for a streak of length  1 in   100 total flips. Performed 10000 rounds, and  10000 were successful, found 49.43% continued the streak.
Looking for a streak of length  1 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.91% continued the streak.
Looking for a streak of length  2 in    10 total flips. Performed 10000 rounds, and   8203 were successful, found 38.16% continued the streak.
Looking for a streak of length  2 in   100 total flips. Performed 10000 rounds, and  10000 were successful, found 47.72% continued the streak.
Looking for a streak of length  2 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 50.15% continued the streak.
Looking for a streak of length  3 in    10 total flips. Performed 10000 rounds, and   4797 were successful, found 34.88% continued the streak.
Looking for a streak of length  3 in   100 total flips. Performed 10000 rounds, and   9995 were successful, found 45.84% continued the streak.
Looking for a streak of length  3 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.78% continued the streak.
Looking for a streak of length  4 in    10 total flips. Performed 10000 rounds, and   2152 were successful, found 35.83% continued the streak.
Looking for a streak of length  4 in   100 total flips. Performed 10000 rounds, and   9637 were successful, found 40.61% continued the streak.
Looking for a streak of length  4 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.21% continued the streak.
Looking for a streak of length  5 in    10 total flips. Performed 10000 rounds, and    985 were successful, found 37.36% continued the streak.
Looking for a streak of length  5 in   100 total flips. Performed 10000 rounds, and   7860 were successful, found 38.66% continued the streak.
Looking for a streak of length  5 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 48.91% continued the streak.
Looking for a streak of length  6 in    10 total flips. Performed 10000 rounds, and    388 were successful, found 35.82% continued the streak.
Looking for a streak of length  6 in   100 total flips. Performed 10000 rounds, and   5190 were successful, found 35.24% continued the streak.
Looking for a streak of length  6 in  1000 total flips. Performed 10000 rounds, and   9996 were successful, found 46.68% continued the streak.
Looking for a streak of length  7 in    10 total flips. Performed 10000 rounds, and    140 were successful, found 40.71% continued the streak.
Looking for a streak of length  7 in   100 total flips. Performed 10000 rounds, and   2997 were successful, found 33.83% continued the streak.
Looking for a streak of length  7 in  1000 total flips. Performed 10000 rounds, and   9761 were successful, found 42.40% continued the streak.
Looking for a streak of length  8 in    10 total flips. Performed 10000 rounds, and     52 were successful, found 36.54% continued the streak.
Looking for a streak of length  8 in   100 total flips. Performed 10000 rounds, and   1634 were successful, found 33.60% continued the streak.
Looking for a streak of length  8 in  1000 total flips. Performed 10000 rounds, and   8365 were successful, found 38.27% continued the streak.
Looking for a streak of length  9 in    10 total flips. Performed 10000 rounds, and     17 were successful, found 47.06% continued the streak.
Looking for a streak of length  9 in   100 total flips. Performed 10000 rounds, and    784 were successful, found 33.04% continued the streak.
Looking for a streak of length  9 in  1000 total flips. Performed 10000 rounds, and   6037 were successful, found 35.80% continued the streak.
Looking for a streak of length 10 in    10 total flips. Performed 10000 rounds, and      0 were successful, found NaN% continued the streak.
Looking for a streak of length 10 in   100 total flips. Performed 10000 rounds, and    381 were successful, found 30.71% continued the streak.
Looking for a streak of length 10 in  1000 total flips. Performed 10000 rounds, and   3615 were successful, found 33.91% continued the streak.