Verifying uniform distribution of random numbers

Jeremy jepri at
Tue Apr 9 15:35:00 EST 2002

<snip>

>The approach I have in mind is to declare an array of 65024 "bins" (one per
>valid IP), and count the number of times that each IP occurs. Then do some
>comparison of actual count in each bin against the expected value (1598 or
>- I'll probably tweak the number of IPs to make it match).

A chi-squared test would be appropriate here, however...

>The question is "how close is close enough for each bin?", given that there
>are no confidence levels in the IETF draft, so "something reasonable" would have
>to be assumed. If anyone has played with this, I'd be keen to hear some

In science, a significance level of anywhere from 10% down is considered 'good
enough' for various definitions of good enough.  In biology classes 5% was the
target, with 1% being desirable.  If you were putting your reputation on the
line, 0.1% or 0.01% would be nice.
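
For what it's worth, a chi-squared check of the bin counts might look roughly
like the sketch below.  The function names and the alpha parameter are made up
for illustration, and the cutoff uses the normal approximation to the
chi-squared distribution (mean df, variance 2*df), which is only reasonable
when there are a lot of bins.  The demo uses 1000 bins rather than 65024 to
keep it quick.

```python
import math
import random
from statistics import NormalDist


def chi_squared_statistic(counts):
    """Pearson chi-squared statistic against a uniform expectation."""
    total = sum(counts)
    expected = total / len(counts)  # same expected count for every bin
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_uniform(counts, alpha=0.05):
    """Return True if the counts are consistent with uniformity at level alpha.

    Assumption: chi-squared with df degrees of freedom is treated as
    approximately Normal(df, 2*df).  This is a rough, large-df shortcut,
    not an exact test.
    """
    df = len(counts) - 1
    z = (chi_squared_statistic(counts) - df) / math.sqrt(2 * df)
    return z < NormalDist().inv_cdf(1 - alpha)


# Demo: fill 1000 bins with about 100 draws each from Python's own generator.
rng = random.Random(2002)
bins = 1000
counts = [0] * bins
for _ in range(bins * 100):
    counts[rng.randrange(bins)] += 1
print(looks_uniform(counts, alpha=0.05))
```

A smaller alpha makes the check more forgiving: it takes a bigger deviation
from the expected counts before the generator gets rejected.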

However you have less stringent needs - you just have to make sure that your
program will pick different IPs each time so it doesn't get into a never-ending
fight over, say, a few thousand IPs.

Remember that starting at 1 and counting to 1000 gives you a uniform distribution.

If it's really, really, really important to get a perfectly uniform distribution,
just record the numbers you have already tried, and make sure you don't pick
them again.  This will ensure a uniform distribution regardless of your random
number generator.
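
A minimal sketch of that bookkeeping, assuming the candidates are numbered
0 to pool_size - 1 (the name pick_untried and the numbering are mine, not
anything from the draft):

```python
import random


def pick_untried(tried, pool_size, rng=random):
    """Pick a random index in [0, pool_size) that is not already in `tried`.

    The chosen index is added to `tried` before it is returned, so it can
    never be handed out twice.
    """
    if len(tried) >= pool_size:
        raise ValueError("every candidate has already been tried")
    while True:
        candidate = rng.randrange(pool_size)
        if candidate not in tried:
            tried.add(candidate)
            return candidate


# Usage: 65024 is the number of valid IPs mentioned in the question.
tried = set()
first = pick_untried(tried, 65024)
second = pick_untried(tried, 65024)
print(first != second)  # the second pick can never repeat the first
```

Once every candidate has been handed out, each one has come up exactly once,
however lumpy the underlying generator is.  The retry loop does get slow when
the pool is nearly exhausted; shuffling a list once and walking through it
would avoid that.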

The most important thing I learned from stats was 'cheat wherever possible'.
