## Decimal Expansion of Pi: Interesting Observations, Part III (Discovery)

Let’s consider the distribution of digits in the decimal expansion of pi. The claim is that the digits behave, by and large, like a random sequence, so they should be distributed roughly uniformly. Since there are 10 digits altogether, each digit should account for roughly 10% of the overall digit count. Scanning through the first billion digits of pi, we do in fact observe a share of about 10% for each digit (if you have any questions, see the prior posts in this series to get up to speed).

An earlier finding, that pi’s decimal expansion has a somewhat shifted distribution of completion run lengths, made me wonder what the distribution of digits would look like as a function of completion run length. To summarize the idea: you scan pi’s decimal expansion (the first billion digits) until every digit from 0 to 9 has been seen at least once; this is what I call a ‘completion run’. There will be many completion runs, each with its own length, and from the prior posts it follows that completion runs of 100 digits or more are rather rare. However, the fact that a completion run is long doesn’t necessarily imply that the distribution of digits within the run is anomalous. If it is, that would be a curiosity. Shall we take a look?
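To make the definition of a completion run concrete, here is a minimal sketch in Python. It assumes the runs are taken back to back (a new run starts right after the previous one completes) and that a trailing incomplete run is discarded; the function name `completion_run_lengths` is just an illustrative choice, not the code used for the earlier posts.

```python
def completion_run_lengths(digits):
    """Return the lengths of back-to-back completion runs in a digit string.

    A run ends as soon as all ten digits 0-9 have been seen at least once.
    Assumption: `digits` contains only the characters '0'..'9'.
    """
    lengths = []
    seen = set()        # distinct digits seen in the current run
    run_length = 0      # number of digits scanned in the current run
    for d in digits:
        seen.add(d)
        run_length += 1
        if len(seen) == 10:
            # Run is complete: record its length and start a new one.
            lengths.append(run_length)
            seen.clear()
            run_length = 0
    # A trailing run that never saw all ten digits is dropped.
    return lengths
```

Fed the decimal digits of pi as a string, this would return one length per completion run, which is the quantity the earlier posts looked at.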

So I actually scanned the first billion digits of pi (as before) and constructed their overall distribution. Then I took all the completion runs longer than 100 digits and constructed the digit distribution over them as well. Then I compared the two.
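The comparison just described could be sketched roughly as follows. This is a hedged illustration, not the original analysis script: the helper names (`split_completion_runs`, `digit_shares`, `share_shift_in_long_runs`) and the `threshold` parameter are assumptions, and runs are again taken back to back with any incomplete tail dropped.

```python
from collections import Counter

DIGITS = "0123456789"

def split_completion_runs(digits):
    """Cut a digit string into consecutive completion runs (substrings)."""
    runs, seen, start = [], set(), 0
    for i, d in enumerate(digits):
        seen.add(d)
        if len(seen) == 10:
            runs.append(digits[start:i + 1])
            seen.clear()
            start = i + 1
    return runs

def digit_shares(text):
    """Fraction of `text` taken up by each digit (all zeros if text is empty)."""
    counts = Counter(text)
    total = sum(counts[d] for d in DIGITS)
    if total == 0:
        return {d: 0.0 for d in DIGITS}
    return {d: counts[d] / total for d in DIGITS}

def share_shift_in_long_runs(digits, threshold=100):
    """Per-digit share inside runs longer than `threshold`, minus the overall share."""
    overall = digit_shares(digits)
    long_text = "".join(r for r in split_completion_runs(digits) if len(r) > threshold)
    long_shares = digit_shares(long_text)
    return {d: long_shares[d] - overall[d] for d in DIGITS}
```

A positive value for a digit means it is over-represented inside the long runs relative to the whole sequence; a negative value means it is under-represented. Running the same function on pseudo-random digits gives a control to compare against.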

First impressions: nothing unusual. As predicted, each digit accounts for roughly 10% in both the overall distribution and the distribution over long completion runs. But then I noticed a curious percentage change that was definitely unexpected for me: within abnormally long completion runs, the share of digit 0 has increased by the same amount as the share of digit 9 has declined. The same holds for digits 1 and 8 respectively, but not for the other digits. Wow.

This is not something I expected, especially because random controls show no such redistribution when I compare completion run lengths. Generally, there is no reason for one digit to decline in proportion in exact correspondence with another digit’s increase, especially when there are 9 other digits to choose from. It’s a redistribution, so if there was a certain count of 0s to begin with, some of them can be converted to 1s, 2s, etc., but not necessarily 9s. Why would 0s and 9s be connected? Same for 1s and 8s. I obviously don’t have an explanation for this observation, so it gets curiouser and curiouser.

By the way, while I’ve been focusing on ‘empirical’ analysis of the decimal expansion of pi in terms of what I dubbed ‘completion runs’, I found out that these runs are connected to a well-known mathematical problem, the ‘coupon collector’s problem’, which asks for the expected length of such a run. The solution is rather simple: it predicts that if one has to collect all 10 coupons (in our case, digits), each equally likely on every draw, it will take about 29 attempts on average (10 × (1 + 1/2 + … + 1/10) ≈ 29.3) before all 10 have been seen at least once.
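For reference, the coupon collector figures can be computed directly. The sketch below assumes an idealized model in which digits are independent and uniformly distributed (an assumption about the model, not a proven property of pi); the function names are illustrative. The expected run length is n·(1 + 1/2 + … + 1/n), and the chance that a run is longer than t digits follows from inclusion-exclusion over the digits still missing after t draws.

```python
from fractions import Fraction
from math import comb

def expected_run_length(n=10):
    """Expected number of draws to see all n equally likely symbols: n * H_n."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

def prob_run_longer_than(t, n=10):
    """P(a completion run needs more than t draws), by inclusion-exclusion.

    The k-th term accounts for some specific set of k symbols all being
    absent from the first t draws, which has probability (1 - k/n)^t.
    """
    return sum(
        (-1) ** (k + 1) * comb(n, k) * (1 - k / n) ** t
        for k in range(1, n + 1)
    )

if __name__ == "__main__":
    print(float(expected_run_length()))   # roughly 29.29
    print(prob_run_longer_than(100))      # on the order of 3 in 10,000
```

Under this model a run longer than 100 digits is indeed rare, which is consistent with the observation from the earlier posts, and it gives a baseline against which pi’s actual run lengths can be compared.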

Here is the link:

http://en.wikipedia.org/wiki/Coupon_collector%27s_problem