Personal survey of anti-spam tools

by Michael Alderete on 1/7/2005 · 12 comments

In the three or four years I’ve been fighting unwanted e-mail messages with better tools than the Delete key, I’ve tried almost a dozen different tools. This is a quick (ha!) survey of the ones I’ve used, and why I don’t (or do) still use them.

My very first anti-spam tool was something called Mailfilter. I used it for my personal e-mail on Mac OS X, wrote about it here, and almost immediately afterwards lost a non-spam message to an aggressive keyword match. That was the end of Mailfilter. I can’t even remotely recommend it, as it’s just not intelligent enough (strict, single-expression matching), and has zero safety net.

My next attempt at a solution was a utility called SpamFire. Like Mailfilter, it is a “pre-filter,” which means it would run before my e-mail client, download my mail, and skim out the spam. Unlike Mailfilter, it actually saved the trapped messages, so if it made a mistake, I could recover the message. It had plenty of other differences from Mailfilter, which I wrote about previously, and which made it so useful that it became the first anti-spam tool I paid for. But in the end I switched to a different tool because SpamFire was separate from my e-mail client, and that made it cumbersome to use.

In the meantime, I had spam coming into my e-mail at work, and at the recommendation of a co-worker, I installed Cloudmark’s SpamNet (now called SafetyBar), an add-in for Microsoft Outlook. It worked reasonably well, but then they went and started charging money for it. It didn’t work well enough, and I didn’t get enough spam at my work address, for it to be worth paying for, so I stopped using it.

I replaced it with another plug-in for Outlook called SpamBayes. Even more effective than SpamNet, it confirmed for me the value of having a tool that plugs into and works directly with your e-mail client. It made dealing with spam seamless, almost effortless (given the lower volume of spam than my home e-mail addresses). Unlike SpamNet, it’s Open Source, and also unlike SpamNet, it doesn’t depend on a third-party server to run. If I still used Outlook (or had a “work” e-mail address), I would still be using SpamBayes. Effective, easy to use, and free. By far the best anti-spam solution I’ve found for Outlook on Windows. Highly recommended.

My happiness with SpamBayes for Outlook led me to search for an anti-spam tool for my personal use which integrated fully with my e-mail client, Eudora for Mac OS X. I found Spamnix, which wrapped the Open Source tool SpamAssassin in a Eudora plug-in. I fell instantly in love and dropped SpamFire. This was the second anti-spam tool I paid for.

Spamnix was great for a number of reasons. Its interface within Eudora took the form of a new mailbox, and a couple of simple menu commands. Incoming mail judged likely to be spam was shunted off to the new mailbox; the menu commands let you rescue items that were not spam, etc. It was simplicity itself to use, and fairly effective at trapping spam.

The failing of Spamnix was that it was based on an earlier version of SpamAssassin which, while it had some great regular expressions and other traps for catching spam, didn’t include any Bayesian filtering at all. This meant that spam filtering was good, but never improved. However, it took a new version of Eudora for me to realize that this mattered. (Spamnix has since been upgraded substantially, and uses the latest SpamAssassin with Bayesian filtering; but I’ve not tried it, as I moved on to other things, which you can continue to read about.)

SpamWatch was a plug-in shipped with Eudora 6, which I paid for, making it the third anti-spam tool for which I shelled out money. It provided me with Bayesian filtering in my Mac OS X mail environment for the first time. It was a revelation. The first time I downloaded mail after upgrading, SpamWatch caught a half-dozen spam messages, and Spamnix caught several dozen. A quick menu choice fed the missed spam to SpamWatch, and forever afterwards the ratio was reversed. I’ll never again think the same way about learning systems; Bayesian filtering really becomes dramatically more effective the more you use it. I had learned this with SpamBayes at my work e-mail, but I simply didn’t get enough spam at that address for it to really shine.
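To make the "learning" part concrete, here is a toy naive Bayes filter in Python. It is only a sketch of the general technique, not how SpamWatch, SpamBayes, or SpamSieve work internally; the class name, method names, and sample messages are all made up for illustration, and it assumes spam and good mail are equally likely to begin with.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes spam filter (illustrative only, hypothetical API)."""

    def __init__(self):
        # Number of training messages containing each word, per label.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Feed one message to the filter; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(set(text.lower().split()))

    def spam_probability(self, text):
        """Return a 0..1 score; an untrained filter always answers 0.5."""
        log_odds = 0.0
        for word in set(text.lower().split()):
            # Add-one smoothing so unseen words do not zero out the score.
            p_spam = (self.word_counts["spam"][word] + 1) / (self.message_counts["spam"] + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (self.message_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


spam_filter = TinyBayesFilter()
untrained_score = spam_filter.spam_probability("cheap pills")

# Feeding it examples is the equivalent of the "quick menu choice" above.
spam_filter.train("cheap pills online", "spam")
spam_filter.train("lunch meeting tomorrow", "ham")
trained_score = spam_filter.spam_probability("cheap pills")
```

Before any training the filter has no opinion, so every message scores 0.5. Each message you feed it shifts the per-word odds, which is why this kind of filter keeps getting better the more mail it sees.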

SpamWatch was so effective that it soon spelled the end for Spamnix. I just didn’t need a second spam filter (and the attendant delay in getting my mail while it processes incoming messages) when the first filter was so good. So I uninstalled Spamnix, and life was good.

SpamWatch actually improved my effectiveness in fighting spam, not just by being a better filter. The new version of Eudora added a new field for mail messages, the Junk Probability, a 1-100 score on how likely a message was judged to be spam. I quickly learned to process my Junk mailbox by first sorting by the Junk Probability, and then scanning for false positives. By sorting first, the messages at the beginning of the list were a lot more in need of review than messages at the end, making it possible to skim more rapidly over items that were certainly spam.
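The review trick amounts to an ascending sort on the score. A minimal Python sketch, assuming a hypothetical `JunkMessage` record with a `junk_probability` field standing in for Eudora's column (the subjects and scores are invented):

```python
from dataclasses import dataclass


@dataclass
class JunkMessage:
    subject: str
    junk_probability: int  # hypothetical stand-in for Eudora's Junk Probability


def review_order(messages):
    """Lowest score first, so the likeliest false positives lead the list."""
    return sorted(messages, key=lambda m: m.junk_probability)


junk_mailbox = [
    JunkMessage("V1agra cheap!!!", 99),
    JunkMessage("Re: dinner on Friday?", 52),
    JunkMessage("You have won a prize", 97),
]
ordered = review_order(junk_mailbox)
```

Anything near the top of `ordered` deserves a real look; the tail of the list can be skimmed.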

I don’t really remember why I decided to give SpamSieve a try. I do remember noticing that SpamWatch seemed to have hit a plateau, and wasn’t improving any more. I had friends who used and liked SpamSieve, and were getting better stats than I was with SpamWatch, but the interface between SpamSieve and Eudora was, for a long time, through AppleScript, and the integration wasn’t smooth enough. Probably when the version that used a Eudora plug-in came out, I decided what the heck, it won’t cost anything to try it a couple days, and see how it does.

It was not the revelation that SpamWatch was, but it only took a training pass at my archived spam and Inbox (a few thousand messages between them), and a couple of days’ use proved that it was noticeably better than SpamWatch. After a week I was sold, and opened my wallet, a fourth time, taking my anti-spam expenditures over $100. (Compared to the problem I have with spam, that’s not a lot of dough; it’s money well spent. But it chafes when you realize that the spam problem is entirely due to greedy amoral scumbags who have polluted the e-mail highways and byways to make a few cents per pound of pollutant spread.) I also disabled SpamWatch, since SpamSieve entirely replaced it.

Well, not entirely, not at first. Although SpamSieve had a more sophisticated and accurate Bayesian engine at its core, there was one thing that it didn’t do well, which was assign a Junk Probability to each message. This meant that I lost a very effective tool in my Junk folder processing: the ability to sort the caught messages from least to most likely spam.

I e-mailed Michael Tsai, the developer of SpamSieve, explaining how I used that feature of SpamWatch. He was highly responsive, and agreed that my approach sounded genuinely useful. Unfortunately, at that time SpamSieve’s engine’s algorithms and formulas didn’t really generate a score that could be usefully assigned to Eudora’s Junk Probability column. The math just didn’t work that way. He said he’d look into it, and that perhaps in the future it might do this.

Fortunately for me, the future did eventually come to pass, and SpamSieve now does assign useful probability scores to processed mail. Its final fault gone, and with the best scoring engine and perfect integration with Eudora, it is by far the best tool I’ve used for managing the amount of spam I get, and I’m happily using it today. Compatible with virtually every e-mail client running on Mac OS X, it totally deserves its reputation as being the best anti-spam tool available on my platform of choice. Highly, highly recommended.

After all that, you’d think I’d be wrapping this posting up. You and I both wish that was true. But no. I’ve got five more tools to write about.

First is Thunderbird, the stand-alone e-mail client that evolved out of the Mozilla suite. It’s available for Mac OS X, Windows, and Linux, and other platforms, too, I think. It’s a nice enough e-mail client, and one of the best for sending and receiving HTML e-mail, if you’re the sort of person who likes that kind of thing.

It has gotten quite a bit better since I evaluated it, to the point where I installed it on Rochelle’s PC, as a replacement for Netscape Communicator, which she had been using since the mid-‘90s. It has a built-in Bayesian spam filter, which works well enough. It’s definitely not as accurate as other Bayesian classifiers I’ve used; SpamBayes and SpamSieve are both quite a bit better, and I think Eudora’s SpamWatch might be a little better. But it’s more than good enough for Rochelle, who gets an order of magnitude fewer spam messages than I do.

Another advantage of Thunderbird is that you’re not using Outlook, which is the number one attack vector for viruses and worms. It’s a good tool, and if you’re stuck on Windows and can’t do better, it’s definitely worth switching to. Recommended.

Another e-mail client I haven’t used much is the bundled OS X e-mail client, Mail.app. It doesn’t work the way I do, and doesn’t seem to be well-suited to managing the volume of mail I get on a daily basis. Supposedly it has good junk mail controls (which use latent semantic analysis), but in the testing I did with it on a secondary e-mail account, it didn’t seem that good, somewhere around the Spamnix level of accuracy. That is, very good, but not excellent. If you like Mail.app, since you can use SpamSieve with it, that’s what I’d recommend doing, and blow off the built-in junk mail controls.

Finally I come to the server-based tools. I’ve saved these for last, not because I tried them last, but because most people won’t be able to use them themselves. Unless you run your own mail server, most of these are impossible to use.

The first server-side solution I tried, and used for quite some time, was what are known as RBLs. RBL stands for “Real-time Blackhole List”, and it works like this: as mail is received, the mail server sending the message to your server is looked up on a list of known-bad e-mail servers (ones that are known to send spam). If the foreign server is on the list, the e-mail is rejected, in real-time.
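Lists of this kind are commonly published over DNS: the receiving server reverses the octets of the sender's IP address, appends the list's zone, and does an ordinary lookup. If the name resolves, the sender is listed. Here is a rough Python sketch of that convention. The zone `dnsbl.example.org` and both function names are placeholders I made up, and I am assuming an IPv4 sender; a real mail server does this inside its own configuration rather than in a script.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the list has an entry for this address.

    Any resolution failure (including a network outage) counts as
    "not listed", so this sketch fails open and lets the mail through.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# 192.0.2.99 is a documentation-only address; the zone is a placeholder.
query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

Treating every lookup failure as "not listed" is a deliberate choice here: it means a DNS hiccup lets some spam in instead of bouncing good mail.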

In theory this should be extremely effective, and indeed it does cut down on spam considerably. But not completely, and not without collateral damage. There are just too many servers out there that are sending spam, most of them home PCs that have been infected with a worm or virus that converted them into “zombies” that send out millions of spam e-mails. It’s impossible to keep the lists current.

It’s also impossible to keep them accurate. There’s no way to maintain the lists with perfect accuracy; mistakes are inevitable, due to both ignorance and malice on the part of people submitting candidates. In the end I had too many messages rejected that were OK, and I had to turn the RBLs off. (I may turn them on again; I go back and forth on the “damn the consequences” philosophy…)

I also tried MIMEDefang, which was a wrapper for SpamAssassin and Vipul’s Razor, with a plug-in for my mail server, Sendmail. I ended up losing mail to this solution, when it would generate errors under load. The failure was intermittent, and impossible to reproduce on demand. Since the mail would just get dropped on the floor, completely lost, I decided I couldn’t afford to try to track down the issue, and simply uninstalled all of it. I’ll surely try another wrapper for SpamAssassin at some point, when I’ve converted my mail server to Postfix, for which it is easier to write plug-ins, and should therefore be easier to write reliable plug-ins.

The last server-side tool I’ve been using doesn’t live on my server. Maybe that’s why it’s worked so well. I wrote about the amount of spam that was coming to my oldest e-mail address at pobox.com, and how the new filters they rolled out saved the address from deletion. I’m still really happy with the results of that service, and plan to keep my pobox.com address for the foreseeable future. Recommended.

Brian Mikol January 7, 2005 at 3:53 pm

Thanks again for the very thorough and worthwhile review Michael! While out of the multitude of email addresses that I have, yahoo is the worst, your recommendations are very much appreciated.

Nice to see you got commenting working too! :-)

Alderete January 7, 2005 at 7:36 pm

Brian, yes, I got comments working. It was a problem in the Kubrick wp-comments.php file, which I’m working around by using the default WP 1.2.2 version. Hope to fix the problem eventually, and use the nicer Kubrick comments form…

Alderete January 7, 2005 at 8:44 pm

OK, there were some errors with the wp-comments.php file that comes in the Kubrick distribution, which I have now fixed. So you’re now seeing the comments form as the designer intended.

trench January 7, 2005 at 10:31 pm

Glad you figured things out. Nice site!

Joe January 9, 2005 at 8:29 pm

Oh man. JunkMatcher doesn’t get a mention. For people like me who use Mail.app, it’s really excellent. The combination of JunkMatcher and Mail’s internal spam filter have been very, very good to me.

http://junkmatcher.sf.net/

Alderete January 10, 2005 at 8:55 am

Joe, glad you found something that works for you! There are certainly a lot of products out there, and that’s a good thing; diversity means the spammers have a lot harder time trying to break through. I haven’t used JunkMatcher myself, and since this is just a survey of the tools I’ve personally used, I didn’t mention it.

samiam May 28, 2005 at 7:21 am

You can also try the avoidance route; there is a website called spamhole.com (there are several clones around as well) that creates a temporary forwarding email address that lasts a specified time and then dumps it when the time is up.

It’s simpler than installing software if you’re not too technically inclined.

Ganesh October 6, 2005 at 6:21 am

Came across ur review of antispam tools for desktop/server. would suggest you try openprotect (http://openprotect.com) - it’s OSS, and free if you don’t want KAV and commercial support. it supports sendmail, postfix, qmail and exim and installs in a jiffy !

La Vie Viennoise November 1, 2005 at 6:11 pm

Nice roundup. I might give SpamSieve a try, although SpamWatch is doing pretty well for me.

While on the subject of spam, there is a great service called Emailias, which allows you to generate a unique email address for every (dubious) web transaction/newsletter. These addresses look like aldoblog@emailias.com or applepro@emailias.com.

What’s neat about emailias is that you know exactly who has sold or leaked your email address when the spam starts coming in and you can turn that address off at any point without having to register for a dozen ezines (if you have only one disposable address).

I have hundreds of these emailiases now, they all work great without any downtime. I have had to disable about ten in the three years I’ve been using the service. And been delighted to be able to do so when the occasion arises.

Charley November 16, 2005 at 9:03 am

Another vote for JunkMatcher - it uses the SpamBayes engine so far as I know, and has been working great for me. All things considered I prefer to use open-source stuff whenever possible. I just wish there were a plugin this good to use on my windows machine at the office.

matt May 18, 2006 at 7:01 pm

Hey, I noticed you don’t talk about email address encoders. I saw one at http://www.addressmunger.com that was pretty nice.
