## Analytics: Signal Detection – Part 05 – Solution (Benford’s Law)

There is a phenomenon called Benford’s law: the distribution of leading digits in accounting figures and many other measurements from real-world datasets is far from uniform (the digit 1 leads about 30% of the time, much more often than one would think). So I decided to build a dataset with a synthetic distribution of digits. You would think that every digit has an equal chance of being present. But if you combine all values for any particular signal into one string and look at the digit distribution within that string, you’ll notice that there is one (and only one) digit that stands out from the rest. The tableau below shows this analysis on the previously posted dataset. This should immediately yield the correct answer: signal T is associated with Q, R with K, etc.

http://en.wikipedia.org/wiki/Benford%27s_law
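To make the "combine all values into one string and count the digits" step concrete, here is a minimal sketch in C++. It is not the code that generated the puzzle: the function names `digit_counts` and `most_frequent_digit` are made up for illustration, and it assumes the values of one signal are already available as strings.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: count how often each digit 0-9 occurs across
// all values of one signal. Non-digit characters (decimal points,
// signs, spaces) are ignored.
std::array<std::size_t, 10> digit_counts(const std::vector<std::string>& values) {
    std::array<std::size_t, 10> counts{};  // all ten counters start at zero
    for (const std::string& value : values) {
        for (unsigned char ch : value) {
            if (std::isdigit(ch)) {
                ++counts[static_cast<std::size_t>(ch - '0')];
            }
        }
    }
    return counts;
}

// Hypothetical helper: return the most frequent digit.
// Ties resolve to the smaller digit.
int most_frequent_digit(const std::array<std::size_t, 10>& counts) {
    std::size_t best = 0;
    for (std::size_t d = 1; d < counts.size(); ++d) {
        if (counts[d] > counts[best]) {
            best = d;
        }
    }
    return static_cast<int>(best);
}
```

Run once per signal, this gives ten counters to compare against the roughly 10% share each digit would get if all digits were equally likely. A digit whose share sits well above that is the one that stands out.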

Now, the question is, how are you supposed to figure out this hidden feature? Looking at the raw numbers, I certainly can’t spot an anomalous distribution of digits with the naked eye. What makes it so hard to see is the fact that we rarely think of digit distribution (hence the mysterious overtone of the previous post). Your boss sends you a dataset and asks you to figure out what is wrong with it, and your first instinct (as a business analyst) is to look at the digit distribution? Ha ha, unlikely, unless you’re reading books like ‘Alien IQ Test’ by Clifford Pickover.

So, is it possible to arrive at the solution through one of the standard analysis routes? I don’t know, you tell me…

```
/*
************************************************************
Author: Monsi Terdex;
Date: 05/18/2013
Description:
- Benford's law puzzle
************************************************************
*/
#include
#include
#include
#include
#include
#include
```