(Un) Serving Spam

PCQ Bureau

05 Jun 2003 05:05 IST

Filtering at your email client keeps your Inbox uncluttered, but it doesn’t solve the bigger problem of wasted Internet bandwidth, because the spam has already been downloaded by then. It therefore becomes important to fight spam at your mail server. The most basic technique is the rules-based filtering that’s built into most mail servers. Unfortunately, this is not the most effective method, for two reasons. One, its accuracy is fairly low, and two, keeping the filtering rules updated would be a nightmare. That’s why one must use third-party mail filters with more sophisticated filtering mechanisms. Both commercial and free mail-filtering software is available.

How Mail Filters Work

Classifying mail as spam may seem simple to a human, but automating the process in software with 100% accuracy is not possible. However, spam has various telltale signs that can identify it with an acceptable level of accuracy. These signs include information contained in the mail headers and body text. When filters do content analysis, they scan for text that is generally present in spam. Their algorithms look for words and phrases commonly found in spam, such as ‘weight loss’ and ‘debt relief’. However, this sort of exact matching can also lead to inaccurate results, as such hit words can also be present in regular mail. More sophisticated algorithms work in terms of probability. The likelihood of a word being present in spam is estimated by statistically analyzing large amounts of spam. The probabilities of the individual words in a mail are then combined, and if the result crosses a predetermined threshold, the mail is classified as spam. This increases accuracy and reduces the number of false positives (legitimate mail marked as spam), but it still cannot catch all spam.
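The probabilistic approach described above can be sketched roughly as follows. This is only an illustration: the word probabilities and the threshold are made-up values, and the combining rule (a naive Bayes-style product with equal priors) is an assumption, not the formula of any particular filter.

```python
import math
import re

# Hypothetical per-word spam probabilities, as if estimated by analyzing
# a large corpus of spam and legitimate mail. The numbers are invented.
WORD_SPAM_PROB = {
    "weight": 0.90,
    "loss": 0.85,
    "debt": 0.92,
    "relief": 0.88,
    "meeting": 0.10,
    "report": 0.15,
}
THRESHOLD = 0.9  # assumed cut-off above which mail is treated as spam


def spam_probability(text):
    """Combine per-word probabilities, naive Bayes style with equal priors."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    log_spam = 0.0
    log_ham = 0.0
    for word in words:
        p = WORD_SPAM_PROB.get(word)
        if p is None:
            continue  # unknown words carry no evidence in this sketch
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # Equivalent to prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


def is_spam(text):
    return spam_probability(text) > THRESHOLD
```

With these numbers, a mail that mentions ‘weight’ alongside ‘report’ and ‘meeting’ stays below the threshold, whereas an exact-match filter keyed on the word ‘weight’ would have flagged it.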

Bayesian filtering is currently a hot topic of research in this field. SpamAssassin, a Perl-based filter, uses such heuristics.

There are also filters that are distributed in nature; a fairly popular example is Razor (on this month’s DVD). This software generates a checksum of each mail and sends it to a server, where it is checked against a database of known spam. This technique is theoretically more reliable than blacklists, but it depends on the number of users accessing and updating the database. Since the concept is relatively new, the databases available online cannot yet detect all spam.
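The checksum idea can be illustrated with a small sketch. This is not Razor’s actual protocol or signature scheme: the in-memory set stands in for the shared server database, and the normalization step and the function names are assumptions made for the example.

```python
import hashlib

# Stand-in for the shared, server-side database of spam checksums.
# In a real distributed filter this would live on a remote server.
known_spam_checksums = set()


def mail_checksum(body):
    """Checksum of the mail body after simple whitespace/case normalization."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def report_spam(body):
    """A user reports a mail as spam; its checksum joins the database."""
    known_spam_checksums.add(mail_checksum(body))


def is_known_spam(body):
    """Check an incoming mail against the checksums reported so far."""
    return mail_checksum(body) in known_spam_checksums
```

The sketch also shows why coverage depends on the user base: a mail is caught only if someone has already reported an identical (after normalization) copy.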

Using Blacklists

As the name suggests, these lists contain the names of known spammers. A mail server looks up these lists and blocks mail from the spammers named in them. Although effective, a blacklist could also end up blocking legitimate mail sent to your server. It’s therefore imperative to maintain the integrity of the blacklist. Blacklists are available on the Net as paid and/or free services, accessible via DNS records or as Real-time Blackhole Lists queried directly by the mail server. The lists are populated with contributions of mail identified as spam by list users. DNS Blackhole Lists, or DNSBLs, filter spam by using Domain Name System (DNS) records as a database of policies relating to an IP address or domain name, which can be used to decide whether or not to accept (or label) email. This type of filtering is used almost exclusively on the server side. Examples of popular blacklists on the Net are available at

http://www.india.cauce.org/index.html?blacklist
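A DNSBL lookup is just an ordinary DNS query. By the usual convention, the octets of the sender’s IPv4 address are reversed and prepended to the list’s zone name; if the name resolves, the address is listed. The sketch below assumes that convention, and the zone name used in the usage note is a placeholder, not a real service.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the blacklist zone has a record for this address."""
    try:
        resolver(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list.
        return False
```

For example, checking 192.0.2.99 against a hypothetical zone dnsbl.example.org means querying 99.2.0.192.dnsbl.example.org. The resolver parameter is there only so the lookup can be swapped out for testing.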

Placing the Filters

Mail filtering software can be set up in two places: either between your mail server and its clients, or between your SMTP server and the mail senders. The first setup prevents spam from reaching the clients, and the second prevents clients from sending out spam. The first is suitable for an organization, while the second is more suitable for an ISP. In both cases, the filtering software can be placed on a separate machine, since filtering is a resource-intensive process.

The SMTP setup can be used where clients with dynamic IP addresses try to send spam. For instance, suppose a spammer gets an ordinary dial-up account from an ISP and tries to use it to send spam through the ISP’s SMTP server. In this case, the client gets a different dynamic IP address every time it connects to the ISP, so blocking by address is of little use. The filtering software therefore monitors all mail going through the SMTP server and filters out spam. We’ve covered how this can be set up on Windows in a separate article in this story. Tools are also available for the UNIX environment. These are barebones in terms of interface but extremely powerful in their configurability.

Beyond Filters

Spam fighting does not stop at installing filters and configuring them to work with your mail server. Precautions also need to be taken while configuring the mail server itself. The biggest problem is the presence of mail servers with open relay enabled. These allow third-party users to relay mail through them. Such relaying loads the mail server and can hide the identity of the spammer. Open relays were useful when a direct route to particular servers didn’t exist, since mail could be relayed to those servers through them. Spammers sometimes also try to bypass an ISP’s mailing system (which may catch spam) by running SMTP servers directly on their dynamic-IP PCs and sending mail directly to victims. In such cases, ISPs have to take a stand and filter all packets on port 25 (the default for SMTP) except those directed to the ISP’s own SMTP server.
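The two precautions above boil down to simple policy checks. The sketch below is a minimal illustration, not the configuration of any particular mail server or firewall; the domain, network range, server address and function names are all hypothetical.

```python
import ipaddress

# Hypothetical values for illustration only.
LOCAL_DOMAINS = {"example.com"}                            # domains we deliver for
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # our own clients
ISP_SMTP_SERVER = ipaddress.ip_address("192.0.2.25")       # the ISP's mail server


def may_relay(client_ip, recipient):
    """Closed-relay policy: accept mail for local domains from anyone,
    but relay to outside domains only for trusted clients."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in LOCAL_DOMAINS:
        return True  # final delivery, not relaying
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def allow_outbound_packet(dst_ip, dst_port):
    """ISP-side rule: block port 25 traffic unless it goes to the ISP's server."""
    if dst_port != 25:
        return True
    return ipaddress.ip_address(dst_ip) == ISP_SMTP_SERVER
```

An open relay is, in effect, a server whose may_relay check always returns True.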

Ankit Khare
