[clug] Spambayes

David Gibson david at gibson.dropbear.id.au
Sat May 24 14:47:57 EST 2003

On Fri, May 23, 2003 at 12:49:26PM +1000, Mark Triggs wrote:
> Nemo -earth native- <nemo at cheeky.house.cx> writes:
> > I'm increasingly finding I'm putting thought into this... SO, straw poll
> > time. 
> >
> > Q: What spamfiltering system do people use?
> Brett has already mentioned his success using a Bayesian filter, but
> I'll just quickly throw in my two cents anyway :o). I'm using bogofilter
> (another Bayesian filter) and have found that within a short
> amount of time it catches most of my spam with very few false positives.
> > Q: Is it trainable?
> Yep.
> > Q: If it is, do you update it with ALL messages once classified as
> > spam/ham, or only with false pos/neg results? (or not at all? Or
> > something else?)
> I trained mine by feeding it the last 10 ham messages from each of the
> mailing lists I'm subscribed to, and all of the spam I currently had
> sitting in my spam box (somewhere between 30 - 40 messages). Over the
> next week it missed perhaps one or two spam messages and gave one
> false positive (mail from an Outlook user), but since then it seems to
> have Just Worked.
> Once you're confident that it is unlikely to catch false positives, you
> can have it update its lists of spam/ham words automatically as it
> receives new messages, so little manual intervention is required.

Again, that's something you want to be really cautious about: if it
ever does start getting the wrong idea, it will train on its own
mistakes, reinforcing them and drifting even further off track.
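
To make that feedback loop concrete, here is a minimal sketch in
Python. It is a toy token-count classifier with add-one smoothing, not
the actual algorithm used by bogofilter or Spambayes; the names
TinyBayes and classify_and_self_train, the 0.5 threshold, and the
sample tokens are all made up for illustration.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy Bayesian token filter (illustrative only, not bogofilter's method)."""

    def __init__(self):
        # Per-class count of messages containing each token, keyed by is_spam.
        self.counts = {True: Counter(), False: Counter()}
        self.messages = {True: 0, False: 0}

    def train(self, tokens, is_spam):
        self.counts[is_spam].update(set(tokens))
        self.messages[is_spam] += 1

    def spam_probability(self, tokens):
        # Sum per-token log-odds with add-one smoothing; equal priors assumed.
        log_odds = 0.0
        for tok in set(tokens):
            p_spam = (self.counts[True][tok] + 1) / (self.messages[True] + 2)
            p_ham = (self.counts[False][tok] + 1) / (self.messages[False] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


def classify_and_self_train(filt, tokens, threshold=0.5):
    """Unsupervised update: the filter's own verdict is fed back as the label."""
    is_spam = filt.spam_probability(tokens) > threshold
    filt.train(tokens, is_spam)
    return is_spam


filt = TinyBayes()
filt.train(["viagra", "free", "html"], is_spam=True)
filt.train(["free", "offer", "html"], is_spam=True)
filt.train(["patch", "kernel", "list"], is_spam=False)
filt.train(["meeting", "list", "kernel"], is_spam=False)

# A legitimate HTML mail that the filter narrowly gets wrong.
legit = ["html", "meeting"]
before = filt.spam_probability(legit)
classify_and_self_train(filt, legit)   # the mistake is trained in as spam
after = filt.spam_probability(legit)
print(before, after)                   # the second score is higher than the first
```

With nobody correcting the first mistake, every similar message pushes
the score for those tokens further toward spam. Training only on
messages a human has confirmed, or at least reviewing the auto-trained
ones now and then, breaks the loop.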

David Gibson			| For every complex problem there is a
david at gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
