SPAM on List...

Martin Pool mbp at samba.org
Thu Dec 12 01:40:01 EST 2002


On 10 Dec 2002, jw schultz <jw at pegasys.ws> wrote:
> First let me say that Martin (and any other list managers)
> is doing pretty well.  Although there was a brief rise in
> the volume of spam leaking through during the transition
> it has settled down quite nicely.  This is an arms war and
> I don't expect perfection.  Kudos!

Thanks!

> I can almost second that.  That seems to hold true for the
> last couple of months.  Perhaps html is already blocked.
> I do know that some valid mail may come in with
> Content-Type: Multipart/Alternative where one is text/plain.
> Although I don't like the waste of bandwidth I could see
> accepting that.  It is the stuff that is only html that
> should definitely be bounced.

I've wondered about installing something like mimedefang to handle
these things.  It would be nice to get rid of TNEF attachments too.

I won't start this until we have some experience with the new
stopspam-bogofilter setup.

There are some complications:

 As Tim points out, some people don't control whether their mailer
 sends HTML or not.  So we would need to fall back to html->text
 conversion, rather than bouncing such messages.  That in turn means
 HTML-only content is not a good way to detect spam.

 Some people need to send patches/log files/whatever to the lists as
 attachments. 

 "What's not there can't break."  Unless it's clearly useful, it
 shouldn't be installed.

Given that some people can't change their HTML setup (not under their
control or too clueless) I'm not sure if notification messages are
useful.
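
A minimal sketch of the fallback described above, assuming Python's
standard email and html.parser modules rather than mimedefang or
anything actually installed on the list server.  It prefers a
text/plain part (as in Multipart/Alternative), skips attachments so
patches are left alone, and only converts HTML when nothing else is
available.  The names html_to_text and plain_body are hypothetical.

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Crude html->text conversion: drop tags, collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def _decoded(part):
    """Decode one MIME part to str, tolerating unknown charsets."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "us-ascii"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        return payload.decode("latin-1", errors="replace")


def plain_body(msg):
    """Return (text, was_converted), or (None, False) if no text body."""
    html = None
    for part in msg.walk():
        if part.is_multipart():
            continue
        # Leave attachments (patches, log files) out of the body search.
        if part.get_content_disposition() == "attachment":
            continue
        ctype = part.get_content_type()
        if ctype == "text/plain":
            return _decoded(part), False
        if ctype == "text/html" and html is None:
            html = _decoded(part)
    if html is not None:
        return html_to_text(html), True
    return None, False
```

This conversion keeps script and style text and loses all layout, so
it is only a starting point.  The was_converted flag is there so a
filter could record that a conversion happened without treating it as
evidence of spam.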

> The other clear indicator that comes up more often here
> seems to be non-English messages.  Care has to be taken not
> to block just because of a few words but if the message is
> mostly non-English or is in a charset incompatible with
> English it should be bounced.

The previous bouncer did explicitly block non-Latin character sets.
However, there was a nasty failure mode which caused some non-junk
messages to be blocked.  People writing from (say) China may be using
a mail client that sends messages in a Chinese character set.  Some of
those character sets contain Latin characters, so they may have in
fact been writing a purely English message, or perhaps an English
message with a part-Chinese sig block.

Discarding these messages was incorrect; what was worse was that the
old system gave no indication of how to fix the problem and the
messages were dropped without review. :-(
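
To illustrate the failure mode, here is a rough sketch that judges a
message by its decoded content instead of by its charset label.  It
is not what the old bouncer or the current setup does.  The function
names and the 0.8 threshold are hypothetical, and counting ASCII
characters is only a crude stand-in for "mostly English".

```python
def ascii_fraction(payload, charset):
    """Fraction of non-whitespace characters that are plain ASCII."""
    try:
        text = payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: fall back to a codec that cannot fail.
        text = payload.decode("latin-1", errors="replace")
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0
    return sum(1 for c in chars if ord(c) < 128) / len(chars)


def looks_mostly_english(payload, charset, threshold=0.8):
    """True if the decoded body is at least `threshold` ASCII.

    The charset label alone is never grounds for rejection; only the
    decoded text is examined.
    """
    return ascii_fraction(payload, charset) >= threshold


# An English message with a short Chinese sig, sent as GB2312.
english_with_sig = (
    "The patch fixes the --delete bug in rsync.\n-- \n\u4f60\u597d"
).encode("gb2312")

# A message that really is written in Chinese.
chinese_only = "\u4f60\u597d\u4e16\u754c".encode("gb2312")

print(looks_mostly_english(english_with_sig, "gb2312"))
print(looks_mostly_english(chinese_only, "gb2312"))
```

Under these assumptions the first message passes and the second does
not, even though both carry the same charset label.  Anything that
fails such a check would still be better held for review than dropped
silently.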

As an amusing example of going too far in the other direction, a
certain government body has "XXX" as a blackword in their mail filter,
and a single occurrence is enough to cause the messages to bounce.  Of
course people pretty regularly write "XXX" for "don't care" values...
And let's not even think about byte sex. :-)

-- 
Martin 


