Nemo's bash challenge for the day

Sun Mar 31 18:16:28 EST 2002

On 2002.03.31 12:48 Michael Still wrote:
> On Sun, 31 Mar 2002, Martijn van Oosterhout wrote:
> 
> > On Fri, Mar 29, 2002 at 10:14:17AM +1100, Michael Still wrote:
> > > PS: How good is the bash random? See the attachment output.count for a
> > > summary of 100,000 numbers between 1 and 10 being generated with the
> > > default seed. It's probably good enough for most people.
> >
> > > 1: 9964
> > > 2: 10009
> > > 3: 9978
> > > 4: 10015
> > > 5: 9997
> > > 6: 10020
> > > 7: 10011
> > > 8: 10008
> > > 9: 10006
> > > 10: 9992
> >
> > You do realise that if they were all exactly 10,000 that would be a definite
> > sign of non-randomness!
> 
> Ummm, surely for an extremely large sample you would expect to see an
> equal frequency for all of the options?

I had this concept explained to me by my biology lecturer, who claimed 
that he was drinking in a country pub on Anzac Day while the patrons 
were playing two-up.  The spinner (the guy who throws the coins in the 
air) threw 20 double headers in a row and was promptly beaten to within 
an inch of his life for cheating.  The odds of two normal coins doing 
this are (I think) 1 in 4^20 - pretty unlikely.

This is the basic idea in statistics (apart from the beating people 
bit).  If you flip a coin ten times and get ten heads, you would 
suspect a trick coin.  But there is a 1 in 2^10 chance of getting ten 
heads with a completely normal coin.
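
To put actual numbers on those odds, here is a small sketch in plain 
shell arithmetic.  The helper name pow_int is my own invention, and it 
assumes your shell does 64-bit integer math (bash does).

```shell
#!/bin/sh
# pow_int BASE EXP: print BASE raised to the power EXP.
# Hypothetical helper for illustration; assumes 64-bit shell integers.
pow_int() {
    result=1
    i=0
    while [ "$i" -lt "$2" ]; do
        result=$((result * $1))
        i=$((i + 1))
    done
    echo "$result"
}

# Ten heads in a row with one fair coin: (1/2)^10.
echo "ten heads in a row:    1 in $(pow_int 2 10)"
# Twenty double headers with two fair coins: (1/4)^20.
echo "twenty double headers: 1 in $(pow_int 4 20)"
```

That works out to 1 in 1024 for the ten heads and 1 in 1099511627776 
for the twenty double headers.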

Every statistic someone gives you should come with the calculated 
chance of that measurement occurring completely by accident.  If 
someone hands you a statistic without this "confidence number", the 
statistic is basically useless.

The math for your question gets a bit hairy.  You are correct that for 
an *infinite* sample, you would expect the same proportions every 
time.  Since it isn't possible to take an infinite sample, some scatter 
around 10,000 per number is exactly what a random source should 
produce, so Martijn is closer to being 'right'.
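
One rough way to attach a number to the counts in output.count is a 
chi-squared statistic: for each number, square the difference between 
its count and the expected 10,000, add the squares up, and divide by 
10,000.  This is only a sketch, not a full significance test.  The 
function name chi_squared_milli and the trick of working in thousandths 
(shell arithmetic is integer-only) are my own choices.

```shell
#!/bin/sh
# chi_squared_milli EXPECTED COUNT...: print the chi-squared statistic
# multiplied by 1000, since shell arithmetic has no fractions.
# Assumes every bucket has the same expected count.
# Hypothetical helper for illustration.
chi_squared_milli() {
    expected=$1
    shift
    sum=0
    for observed in "$@"; do
        diff=$((observed - expected))
        sum=$((sum + diff * diff))
    done
    echo $((sum * 1000 / expected))
}

# Counts for 1..10 taken from the quoted output.count summary.
milli=$(chi_squared_milli 10000 \
    9964 10009 9978 10015 9997 10020 10011 10008 10006 9992)
printf 'chi-squared = %d.%03d\n' $((milli / 1000)) $((milli % 1000))
```

This prints a chi-squared of 0.278.  For ten equally likely buckets the 
statistic averages about 9 (one less than the number of buckets), so 
these counts sit closer to 10,000 than chance alone would usually put 
them.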