Sam Trenholme's webpage

How I successfully stopped spam from flooding my mailbox

Note: For some reason, this is one of the first pages which come up when people do a search for "Spam Assassin". Spam Assassin is actually here. Like Spam Assassin, the software here is targeted at Unix and Linux users; Windows users can only take advantage of this software if they use a Unix- or Linux-friendly internet service provider. Mac users running Mac OS X who have a static IP and are using one of the Unixy MTAs (Sendmail, Postfix, Exim, etc.) can also take advantage of this software, but doing so requires fairly extensive Unix knowledge.

Spam fighting for the rest of us

This web page was originally geared for Unix users who want to keep the !@#$ spam out of their mailbox. Thankfully, while some of the most effective spam filtering techniques are not accessible to the average Windows user, there is some common sense which people can use to keep the spam out of their mailbox.

The first rule is not to make your email address public. Just as telephone telemarketers use the phone book to find people to call, spammers use Usenet postings, email addresses on web pages, whois contact info (when someone gets their own domain, such as my domain, an email address needs to be made public; spammers love spamming these), and other public resources which may have your email address on them to find email addresses to spam. I suggest having two email addresses; one should be public, and used as a clearing house for spam. The other one should only be shared with people you know face-to-face, such as family, friends, and loved ones. This way, spam does not get in the way of email that comes from people who really matter in your life. This is similar to having an unlisted number.

Sneakemail has a service which allows your public email address to constantly change; when a given email address is placed on too many spammers' mailing lists, one can disable the email address in question, replacing it with a new public email address.

Another solution is to use a webmail provider with a good anti-spam filter. Yahoo, for example, provides a free webmail service with a filtering system called SpamGuard, which filters out 90% of the spam and places it in a special "junk mail" folder.

As an aside, more than one person has investigated the "Make lots of money working at home" scams. In almost all cases, the people perpetrating these scams are involved with a company called Herbalife; more information.

Spam filtering for Unix users

Kiwi is a spam filter which, when used in conjunction with some other scripts, removes 95-98% of the spam from my inbox. Kiwi is different from other spam filters; it is based on the same premise that Sneakemail is based on, namely that the key to getting rid of spam is making sure that the spammers do not get one's email address.

Kiwi does this by adding an encrypted token to the email address. The token holds a very short encrypted message, such as "this email address was sent out in a personal mail on March 9, 1999" or "this email address was generated with the short encrypted message 'sldot' (short for slashdot)". This script can be used in conjunction with an outgoing mail filter which does the following:

  • For email sent to people I know, the outgoing email filter changes the from: email address to a secret, unfiltered email address.
  • For email sent to people I have a working relationship with, the outgoing email filter changes the from: email address to my work email address.
  • For everyone else, the from: email address is one that times out after 90 days.
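
To make the three rules above concrete, here is a minimal sketch in Python. It is not the actual outgoing filter; the addresses, the contact lists, and the function name are all made up for illustration, and the expiry date is left readable in the address instead of being encrypted.

```python
from datetime import date, timedelta

# Hypothetical addresses and contact lists, for illustration only.
SECRET_ADDRESS = "private@example.org"
WORK_ADDRESS = "me@work.example.com"
FRIENDS = {"mom@example.net"}
COLLEAGUES = {"boss@work.example.com"}
LIFETIME = timedelta(days=90)


def choose_from_address(recipient, today):
    """Pick the from: address for one outgoing message."""
    recipient = recipient.lower()
    if recipient in FRIENDS:
        return SECRET_ADDRESS
    if recipient in COLLEAGUES:
        return WORK_ADDRESS
    # Everyone else sees an address that stops working after 90 days.
    # A real filter would encrypt the expiry date; it is readable here.
    expires = today + LIFETIME
    return f"me-{expires:%Y%m%d}@example.org"
```

For example, choose_from_address("stranger@example.com", date(1999, 3, 9)) gives an address that expires on June 7, 1999.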

In order to have an email address to fill out web forms with, register software with, put on my whois contact info, and so on, the program can also encrypt a simple five-letter message, which very tersely describes how this email address was made available.

In addition, it is possible to encrypt a timestamp which indicates that the email address in question was posted to Usenet.
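
The general idea of an address that carries a short tag and a date can be sketched as follows. Note that this is not how Kiwi does it: Kiwi encrypts the message, while this sketch swaps in a simpler technique, signing a readable tag with a keyed hash (HMAC) so that the tag cannot be forged. The key, the domain, the address layout, and the function names are all assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical key; keep the real one private
DOMAIN = "example.org"     # hypothetical domain


def _signature(payload):
    """Short keyed hash of the payload, as hex digits."""
    digest = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:10]


def make_address(tag, issued):
    """Build an address carrying a short tag and an issue date (YYYYMMDD)."""
    payload = f"{tag}.{issued}"
    return f"{payload}.{_signature(payload)}@{DOMAIN}"


def read_address(address):
    """Return (tag, issued) if the token is genuine, otherwise None."""
    local, _, domain = address.partition("@")
    parts = local.split(".")
    if domain != DOMAIN or len(parts) != 3:
        return None
    tag, issued, sig = parts
    expected = _signature(f"{tag}.{issued}")
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    return tag, issued
```

An address made with make_address("sldot", "19990309") can later be passed to read_address to find out where and when it was handed out; a spammer who edits the tag gets None back.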

I also have some Perl scripts which are crude versions of what SpamAssassin does: they look at the email and determine if it looks like spam or not. If it looks vaguely like spam, the email is placed in a special spam mailbox. These scripts can be used for email addresses which I want to keep, but which are on spammers' email distribution lists.
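
A content filter of this kind can be very small. The sketch below is in Python rather than Perl and is not a copy of my scripts; the patterns, the point values, and the threshold are invented for illustration.

```python
import re

# Hypothetical rules: (pattern, points). Real rule sets are much larger.
RULES = [
    (re.compile(r"make (lots of )?money", re.I), 2),
    (re.compile(r"work(ing)? (at|from) home", re.I), 2),
    (re.compile(r"click here", re.I), 1),
    (re.compile(r"!!!"), 1),
]
THRESHOLD = 3  # assumed cut-off; tune it against real mail


def spam_score(text):
    """Add up the points of every rule that matches the message text."""
    return sum(points for pattern, points in RULES if pattern.search(text))


def choose_mailbox(text):
    """Return the mailbox the message should be filed in."""
    return "spam" if spam_score(text) >= THRESHOLD else "inbox"
```

Mail that scores at or above the threshold goes to the spam mailbox instead of being deleted, so a false positive can still be found later.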

This is a system which I have used in some form for four years (presently, I do not have a domain which can receive email to do this with). This program not only eliminated essentially all of the spam my inbox received, it has also given me a lot of insight into where spammers are getting their email addresses.

First of all, spammers love harvesting email addresses from Usenet. It got so bad that I had to add a feature to my Kiwi software so that it would send all email sent to addresses posted to Usenet through a heavy SpamAssassin-style filter, and disable Usenet email addresses after two weeks.
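
The Usenet rule boils down to a date comparison. A minimal sketch, assuming the date the address was posted has already been recovered from the address; the function name and the return values are made up.

```python
from datetime import date, timedelta

USENET_LIFETIME = timedelta(days=14)  # "two weeks", as described above


def route_usenet_mail(issued, today):
    """Decide what to do with mail sent to an address posted to Usenet."""
    if today - issued > USENET_LIFETIME:
        return "reject"        # the address has been disabled
    return "heavy-filter"      # still live, but filter aggressively
```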

Second of all, spammers love harvesting email addresses from the web, but only in places where there are a lot of email addresses. The email address I had available on Slashdot was a spam magnet. Once, I made an email address available on the list of crypto software available; within 24 hours of my email address being placed on this web page, I was receiving pornographic spam. The email address which I had available on one of those high school alumni web pages also received the occasional pornographic spam.

I had a special web page which would encrypt the IP of whoever requested my email address into the address itself; this was used so that I could determine where spammers were doing their web harvesting. This email address rarely got spam; when it did, I would post to where the email address was harvested; the Apache logs of who accessed this particular web page were permanently stored.

Spammers also like sending email to whois contact email addresses; it got so bad that I had to make a special spam filter for these addresses that only accepted email from email addresses which would send legitimate "your domain has been updated" traffic (as I recall, and, since I manage my domains through netwiz).
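
A whitelist like this is only a few lines. In the sketch below the sender addresses are placeholders and not the real ones I used. Keep in mind that a from: header is easy to forge, so this only stops spammers who do not bother to guess the right sender.

```python
from email.utils import parseaddr

# Placeholder senders; substitute the addresses your registrar mails from.
ALLOWED_SENDERS = {
    "hostmaster@registrar.example",
    "billing@registrar.example",
}


def accept_whois_mail(from_header):
    """Accept mail to a whois contact address only from known senders."""
    _, address = parseaddr(from_header)
    return address.lower() in ALLOWED_SENDERS
```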

Looking back, it was probably a little overzealous to time out the addresses on email which I myself sent to people I knew. I rarely had a spammer try to email me using an email address that was in the from: line of an email I sent out; the only times they did was when the email in question was mirrored on a web page. When I set up a spam filtering system again, I will have the from: address on email sent to people I don't know be an encrypted form of the email address I sent the email to; if any of these addresses starts getting spam, I will simply disable the address in question. Otherwise, the address will be unfiltered.

Email addresses which are made available on web forums and other web pages which will have a lot of email addresses need to be run through a SpamAssassin-type filter, or simply require a three-way interaction for people to send email:

  1. Someone sends email to, say, a Slashdot email address
  2. They get an autoreply: In order to send me email, please send mail to this address; where "this address" is a special email address which is an encrypted form of their from: address
  3. They can now send email to the special address. In case they are a spammer, the email address can be disabled.
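
The three steps above can be sketched roughly as follows. This is an illustration and not working mail server code: it uses a keyed hash of the sender's address where the text calls for an encrypted form, and the key, the domain, and the function names are assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical key
DOMAIN = "example.org"     # hypothetical domain
disabled = set()           # special addresses that started getting spam


def special_address_for(sender):
    """Derive the per-sender address from the sender's from: address."""
    digest = hmac.new(SECRET_KEY, sender.lower().encode("utf-8"), hashlib.sha256)
    return f"reply-{digest.hexdigest()[:12]}@{DOMAIN}"


def handle_incoming(sender, recipient, public_address):
    """Return the action to take for one incoming message."""
    if recipient == public_address:
        # Steps 1 and 2: mail to the public address only gets an autoreply.
        return "autoreply: please resend to " + special_address_for(sender)
    if recipient in disabled:
        return "reject"
    if recipient == special_address_for(sender):
        # Step 3: the special address only works for the sender it was made for.
        return "deliver"
    return "reject"
```

If a special address starts getting spam, adding it to the disabled set shuts it off without affecting anyone else.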

Spammers are well aware of the existence of some filters and blacklists. For example, at first a bcc: filter (i.e. my email address has to be in a To: or Cc: line) could eliminate all spam. Spammers started learning that this was a fairly common filter; as a result, once broadband became available, some spammers went to the trouble of putting in a "To:" header.
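
For reference, a bcc: filter of this kind takes only a few lines with Python's standard email package. This is a sketch, with a made-up function name, and as noted above it no longer catches spammers who fill in the To: header.

```python
from email import message_from_string
from email.utils import getaddresses


def addressed_to_me(raw_message, my_address):
    """True if my address appears in a To: or Cc: header of the message."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _, addr in getaddresses(headers)}
    return my_address.lower() in recipients
```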

Spammers also know that a lot of people use blacklists such as the RBL; the RBL was an extremely inefficient spam filter. An open mail relay spam filter would probably be more effective; spammers like sending email through open relays. A DUL filter is somewhat effective, but a lot of UNIX power users like using a local copy of postfix/sendmail/qmail to send out email; a better solution is to make a DUL IP worth 1 point in a SpamAssassin-style filter.
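
Here is a small sketch of the difference between rejecting on a DUL listing and merely counting it as one point. The address range, the threshold, and the function names are invented; a real filter would look the IP up in a DNS-based list instead of a hard-coded range.

```python
import ipaddress

# Hypothetical dial-up range, standing in for a real DUL lookup.
DIALUP_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
THRESHOLD = 3  # assumed cut-off for the overall score


def dul_points(sender_ip):
    """One point if the sending IP is in a listed dial-up range, else zero."""
    ip = ipaddress.ip_address(sender_ip)
    return 1 if any(ip in net for net in DIALUP_RANGES) else 0


def is_spam(sender_ip, content_points):
    """Combine the DUL point with points from content rules."""
    return content_points + dul_points(sender_ip) >= THRESHOLD
```

This way a power user mailing from a dial-up IP still gets through as long as the message itself looks clean, while a borderline message from the same range is pushed over the threshold.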

Spam, alas, is a fact of life on today's internet. Thankfully, it can be filtered easily enough if one takes care to keep their "real" email address secret; can be used for public email addresses (it is currently a free service).

I hope this helps people develop effective spam filters.