[clug] spamsum usage in the real world

Nemo clug at nemo.house.cx
Tue Nov 14 07:52:36 GMT 2006


Hey all

I've been tackling a small spam problem of late (ironically, some of it
claims to be able to get rid of any small problems I might have ;).

Currently we're just running spamassassin (via spamd/spamc) over all
messages (we have many customers and don't want to teach them how to
train a Bayesian filter), but this is a chunky performance hit.

So I was thinking of throwing tridge's spamsum into the mix - ie, find
some old addresses on the domain which gather a heap of email that can
be assumed to be receiving spam only, and use those to generate spamsum
signatures - which can then be used as a check on all incoming messages
before the resource-hungry spamassassin gets to see them. 
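
To make the idea concrete, here is a rough sketch of that pipeline. It does
not call spamsum: Python's difflib.SequenceMatcher stands in for the
signature comparison, and the names (THRESHOLD, classify, heavy_filter) and
the 0.8 cutoff are made up for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff; a real setup would tune this against saved mail.
THRESHOLD = 0.8


def similarity(a, b):
    """Stand-in for a spamsum signature comparison; returns 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def looks_like_spampot_mail(body, spampot_bodies):
    """True if the body closely resembles anything the spampot received."""
    return any(similarity(body, known) >= THRESHOLD for known in spampot_bodies)


def classify(body, spampot_bodies, heavy_filter):
    """Run the cheap check first; only fall through to the heavy filter
    (spamassassin in the plan above) when the cheap check finds no match."""
    if looks_like_spampot_mail(body, spampot_bodies):
        return "spam"
    return heavy_filter(body)
```

The point of the ordering is that heavy_filter is never invoked for a
message the cheap check already caught.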

Has anyone else done anything like this - or indeed, heard of spamsum
being used in the wild at all?  A search of the electric google showed
up many spamsum references, but nothing about people actually *using* it. 

For what I'm planning, this seems to be the rundown of pros and cons...

Pro:
* avoid spamassassin heaviness on any messages caught by spamsum - which
is a much lighter-weight utility

Con:
* For any email spamsum doesn't recognise, it's an extra process
spawned, in part negating the resource-saving aspect of running spamsum
at all
* The spam honeypot (spampot) addresses are assumed to only receive
spam. However, they could also receive: 
    [a] personal email 
    [b] chainmail funnies
    [c] newsletters
    ...[a] won't matter since its spamsum signature is unlikely to match
    anything else anyway. [b] and [c] could be a potential source of
    false positives, however.
* Receiving the spampot messages takes up bandwidth and resources that
could be saved by rejecting them earlier. They would also use drive space
(since I'd want to save them for a few days minimum)
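
The first con can be put into rough numbers. If every message goes through
the cheap check and only the misses go on to the heavy filter, the average
cost per message is the cheap cost plus the miss rate times the heavy cost.
A small sketch of that arithmetic follows; the function names are
hypothetical and the costs are in whatever unit you like (CPU ms, say), not
measurements of spamsum or spamassassin.

```python
def expected_cost(hit_rate, prefilter_cost, heavy_cost):
    """Average per-message cost when the pre-filter runs on everything
    and the heavy filter runs only on messages the pre-filter missed."""
    return prefilter_cost + (1.0 - hit_rate) * heavy_cost


def break_even_hit_rate(prefilter_cost, heavy_cost):
    """Fraction of mail the pre-filter must catch before the combined
    setup costs no more than running the heavy filter alone."""
    return prefilter_cost / heavy_cost
```

For example, if the cheap check cost a quarter of what the heavy filter
does, it would need to catch at least a quarter of all incoming mail just
to break even.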



Have I missed anything? My thoughts are that the pros will outweigh the
cons and make for more email happiness all around. 

Thoughts appreciated. I feel like I'm heading into some unknown
territories here :)

.../Nemo

