latent-semantic-analysis

Personal survey of anti-spam tools

by Michael Alderete on 1/7/2005 · 12 comments

In the three or four years I’ve been fighting unwanted e-mail with something better than the Delete key, I’ve tried almost a dozen different tools. This is a quick (ha!) survey of the ones I’ve used, and why I don’t (or do) still use them.

My very first anti-spam tool was something called Mailfilter. I used it for my personal e-mail on Mac OS X, wrote about it here, and almost immediately afterwards lost a non-spam message to an aggressive keyword match. That was the end of Mailfilter. I can’t recommend it at all: it’s just not intelligent enough (strict, single-expression matching), and it has no safety net.

My next attempt at a solution was a utility called SpamFire. Like Mailfilter, it is a “pre-filter,” which means it runs before my e-mail client, downloads my mail, and skims out the spam. Unlike Mailfilter, it actually saves the trapped messages, so if it made a mistake, I could recover the message. It had plenty of other differences from Mailfilter, which I wrote about previously, and which made it so useful that it became the first anti-spam tool I paid for. But in the end I switched to a different tool, because SpamFire was separate from my e-mail client, and that made it cumbersome to use.


Latent semantic analysis is not Bayesian filtering

by Michael Alderete on 5/4/2003

Macworld recently ran an article about anti-spam tools for Mac OS X, which oversimplified the field by reducing it to just three kinds of filter: Boolean, points-based, and Bayesian.
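For anyone who hasn’t met those three categories, here is a rough sketch of how they differ. This is toy Python of my own, not how any of the reviewed products actually work: the word lists, weights, threshold, and the `BayesFilter` class are all made up for illustration.

```python
import math
import re

def tokens(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def boolean_filter(text, banned=("viagra", "mortgage")):
    """Boolean rule: any single banned word marks the message as spam."""
    return any(word in banned for word in tokens(text))

def points_filter(text, weights=None, threshold=5.0):
    """Points-based rule: add up hand-assigned weights, compare to a threshold."""
    weights = weights or {"viagra": 4.0, "free": 2.0, "mortgage": 3.0, "meeting": -3.0}
    score = sum(weights.get(word, 0.0) for word in tokens(text))
    return score >= threshold

class BayesFilter:
    """Naive Bayes with add-one smoothing, learned from labeled messages."""

    def __init__(self):
        self.counts = {"spam": {}, "ham": {}}    # word -> occurrences per label
        self.totals = {"spam": 0, "ham": 0}      # total words seen per label
        self.messages = {"spam": 0, "ham": 0}    # messages seen per label

    def train(self, text, label):
        self.messages[label] += 1
        for word in tokens(text):
            self.counts[label][word] = self.counts[label].get(word, 0) + 1
            self.totals[label] += 1

    def is_spam(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_msgs = self.messages["spam"] + self.messages["ham"]
        scores = {}
        for label in ("spam", "ham"):
            # Smoothed log prior plus the sum of smoothed log likelihoods.
            log_p = math.log((self.messages[label] + 1) / (total_msgs + 2))
            denom = self.totals[label] + len(vocab) + 1
            for word in tokens(text):
                log_p += math.log((self.counts[label].get(word, 0) + 1) / denom)
            scores[label] = log_p
        return scores["spam"] > scores["ham"]

bayes = BayesFilter()
bayes.train("free viagra now", "spam")
bayes.train("cheap mortgage free offer", "spam")
bayes.train("meeting notes attached", "ham")
bayes.train("lunch meeting tomorrow", "ham")

legit = "Notes on my mortgage meeting"
print(boolean_filter(legit), points_filter(legit), bayes.is_spam(legit))
```

The Boolean rule flags the legitimate message on one keyword alone, the points-based rule lets the word “meeting” offset “mortgage”, and the Bayesian filter decides from whatever mail it was trained on rather than from rules someone wrote by hand.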

Two additional categories are distributed recognition, such as the Distributed Checksum Clearinghouse (DCC) and Vipul’s Razor, and latent semantic analysis. I don’t know of any distributed recognition products for the Mac (there’s a very good one for Windows Outlook, SpamNet by Cloudmark), but there certainly is a latent semantic analysis tool — Apple’s Mail in Jaguar!
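The idea behind distributed recognition can be sketched in a few lines. This is only a toy under my own assumptions: the `Clearinghouse` class and its threshold are hypothetical, and it uses an exact SHA-256 hash of the normalized body, whereas the real services use their own (fuzzier) checksums and network protocols.

```python
import hashlib
import re

def fingerprint(body):
    """Hash a whitespace- and case-normalized body (exact match only)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class Clearinghouse:
    """Hypothetical shared service that counts how often a message is reported."""

    def __init__(self, bulk_threshold=3):
        self.reports = {}                  # fingerprint -> number of reports
        self.bulk_threshold = bulk_threshold

    def report(self, body):
        """Record one sighting of this message and return the running count."""
        key = fingerprint(body)
        self.reports[key] = self.reports.get(key, 0) + 1
        return self.reports[key]

    def is_bulk(self, body):
        """A message reported by enough recipients is treated as bulk mail."""
        return self.reports.get(fingerprint(body), 0) >= self.bulk_threshold

hub = Clearinghouse(bulk_threshold=3)
for copy in ("Buy now!", "buy   NOW!", " Buy now!\n"):
    hub.report(copy)
print(hub.is_bulk("BUY NOW!"), hub.is_bulk("Lunch on Friday?"))
```

The point is that nothing is learned from the content itself: a message counts as spam simply because many other people have already received or reported the same thing.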

The simplification (or oversight) is relatively understandable. From an end-user perspective, there’s no meaningful difference between Bayesian filtering and latent semantic analysis, even though the math is very different. It’s not clear which will prove better at filtering out spam, even though in the article Mail’s filtering did the best. Seems like it’s good to have both in the fight!

While I’m posting about it, I should note that the article was written prior to the release of my new favorite anti-spam tool, Spamnix, so Spamnix isn’t in the roundup. From my own experience with Mac OS anti-spam tools, I think it would have done well in the evaluation, with the caveat that it only works with Eudora. Perhaps Geoff Duncan, or someone else at TidBITS, will review it soon and confirm that guess. I know they like Eudora at TidBITS; they literally wrote the book!
