rweir at ertius.org
Fri May 23 01:02:55 EST 2003
On Thu, May 22, 2003 at 12:29:33PM +1000, Nemo -earth native- wrote:
> On Wed, May 21, 2003 at 12:15:19PM +1000, David Gibson did utter:
> > >
> > > Personally, I'd be interested to see a comparison of the relative
> > > effectiveness of bogofilter, spambayes, spamassassin, etc... for
> > > example, train each system on the same block of spam and ham, and then
> > > test each on a new block of email known to contain both ham and spam...
> I'm increasingly finding I'm putting thought into this... SO, straw poll
> Q: What spamfiltering system do people use?
spamassassin, version 2.53.
> Q: Is it trainable?
> Q: If it is, do you update it with ALL messages once classified as
> spam/ham, or only with false pos/neg results? (or not at all? Or
> something else?)
I carefully weeded out some mailing list folders, and created a fairly
large spam folder (~2500 messages) and roughly three times as much ham.
I ran that through sa-learn (which took aaaages), and then just let
spamassassin start sorting my mail. It did a fairly good job right off
the bat (this was immediately after 2.50 was released, I think), with no
false positives and very few false negatives.
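For anyone wanting to repeat that kind of initial run, here is a rough
sketch, not the exact commands used above. It assumes the weeded spam
and ham each live in a single mbox file; the function name train_initial
and the SA_LEARN override are made up for illustration. The flags are
the 2.5x-era sa-learn ones (later releases spell --no-rebuild and
--rebuild as --no-sync and --sync).

```shell
#!/bin/sh
# Sketch of a one-off initial training run (hypothetical helper).
# SA_LEARN can be overridden, e.g. SA_LEARN=echo, for a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# train_initial SPAM_MBOX HAM_MBOX
# Learn both folders while skipping the slow rebuild step each time,
# then run the rebuild once at the end.
train_initial() {
    spam_mbox=$1
    ham_mbox=$2
    $SA_LEARN --no-rebuild --spam --mbox "$spam_mbox" || return 1
    $SA_LEARN --no-rebuild --ham --mbox "$ham_mbox" || return 1
    $SA_LEARN --rebuild
}
```

Called as something like "train_initial ~/Mail/spam ~/Mail/ham" (paths
are placeholders). Deferring the rebuild to the end is what the sa-learn
documentation suggests when learning several folders in a batch.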
To continue training it, I have these key bindings for mutt: 'y' feeds a
message as spam to sa-learn, and moves it to =spam/generic-spam/ (my
spam goes into several different folders depending on what marked it as
such). 'Y' feeds a message to sa-learn as ham. It's mostly for
symmetry, and also as a quick fix if I'm too fast with 'y'.
macro index 'y' "<enter-command>unset wait_key\n<pipe-entry>sa-learn --no-rebuild --single --spam > /dev/null 2>&1 &\n<enter-command>set
macro pager 'Y' "<enter-command>unset wait_key\n<pipe-entry>sa-learn --no-rebuild --single --ham > /dev/null 2>&1\n<enter-command>set
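Since both macros pass --no-rebuild, the rebuild step is skipped on
every keypress and has to happen some other way. A minimal sketch of a
helper that could be run now and then (by hand or from cron) follows.
The name rebuild_bayes and the SA_LEARN override are assumptions for
illustration, not part of the setup described above.

```shell
#!/bin/sh
# Hypothetical helper: run the rebuild step that --no-rebuild skipped.
# SA_LEARN can be overridden, e.g. SA_LEARN=echo, for a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

rebuild_bayes() {
    $SA_LEARN --rebuild
}
```

How often to call rebuild_bayes is a judgment call; once a day is
probably plenty for a single mailbox.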
Rob Weir <rweir at ertius.org> | mlspam at ertius.org | http://www.ertius.org/
GPG keys: 1024D/1E73B7CD, 4096R/3ABDE5EC | Do I look like I want a CC?
Words of the day: India INS blackjack bluebird infowar wire transfer BCCI