January 11, 2003

One of the points

One of the points that Jeremy Bowers tries to make is that once spammers learn how to get past Bayesian-like spam filters, there is nothing further that we can do. I disagree.

If a spammer's goal is just to get past a spam filter, then I have no doubt that he will find a way to do that, no matter how good the filter is. But a spammer's goal is not just to get past a spam filter; it's to motivate some kind of action from the recipient. In most cases, the desired action is ultimately to get the recipient to buy a product or service. That's why a message like "Here's the link we talked about:", while it will get past every spam filter based on content, may not be a popular form of spam. The response rate may be so low that it can't justify the cost of sending it. Therefore, if Bayesian-like spam filters become widely deployed, we will certainly have won a battle against spammers.

But there is still something more we can do in the way of filtering. First of all, I think we should find a way to filter based on the IP addresses that the HTTP URLs in a message resolve to. An IPv4 address is a very small bit of information, only 4 bytes. A UDP-based server could probably handle a large load, both storing the known spam IP addresses and responding to queries about them. A mail filter could then check each URL's address against that server and reject any message that points at a known spam site.
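To make the idea concrete, here is a rough sketch in Python of what such a check might look like. It is only an illustration, not a finished filter: the function names (extract_hosts, pack_ipv4, is_spam) are made up for this example, the blocklist is a plain in-memory set standing in for the UDP lookup server, and the resolver is passed in as a parameter so the sketch can be tried without touching DNS.

```python
import ipaddress
import re
import socket
from urllib.parse import urlsplit

# Rough pattern for HTTP(S) URLs in a message body (an assumption; real
# mail needs more careful parsing, e.g. of HTML parts).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(body):
    """Return the distinct host names found in the URLs of a message body."""
    hosts = []
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL; skip it
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def pack_ipv4(addr):
    """Pack a dotted-quad IPv4 address into its 4-byte form."""
    return ipaddress.IPv4Address(addr).packed


def resolve_with_dns(host):
    """Look up the IPv4 addresses for a host; return [] if lookup fails."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def is_spam(body, blocked, resolve=resolve_with_dns):
    """True if any URL in the body resolves to a blocked 4-byte address.

    `blocked` is a set of packed addresses; in a deployed system this
    membership test would be a query to the shared lookup server.
    """
    for host in extract_hosts(body):
        for addr in resolve(host):
            if pack_ipv4(addr) in blocked:
                return True
    return False
```

For example, with blocked = {pack_ipv4("192.0.2.10")}, a message whose only link resolves to 192.0.2.10 would be flagged no matter how innocent its text looks. Storing each address in packed form keeps every blocklist entry at exactly 4 bytes.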

Keep in mind that even if we don't filter out every spam message, we can still have a major impact on spam if we just make it difficult to respond to. Clicking a URL in a message is just too easy a way to respond. If we start filtering based on the URLs in a message, then we can take that option away from spammers.

There are other actions that a spammer can try to motivate, such as replying to a message. There are ways to filter based on the reply-to address, too.
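As a rough illustration of the reply-to idea, the sketch below uses Python's standard email package to pull out the address a reply would go to and compare it against a set of known spam addresses. The function names and the blocklist are assumptions made up for this example, and falling back to the From header when there is no Reply-To is my own simplification.

```python
from email import message_from_string
from email.utils import getaddresses


def reply_addresses(raw_message):
    """Return the lowercased addresses a reply to this message would go to.

    Uses Reply-To when present, otherwise falls back to From.
    """
    msg = message_from_string(raw_message)
    headers = msg.get_all("Reply-To") or msg.get_all("From") or []
    return [addr.lower() for _name, addr in getaddresses(headers) if addr]


def has_blocked_reply_address(raw_message, blocked):
    """True if any reply address appears in the (hypothetical) blocklist."""
    return any(addr in blocked for addr in reply_addresses(raw_message))
```

As with URLs, the point is not to judge the text of the message but to cut off the channel the spammer needs the recipient to use.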

Posted by Doug Sauder at January 11, 2003 10:37 AM