If
you have an e-mail account, you’ll soon find yourself getting mail that
you haven’t asked for, and don’t want in your inbox–in other words,
spam. What is Spam? In 3D "meatspace", it’s a luncheon meat
manufactured by Hormel Corp (which also owns www.spam.com).
Spam, on the Web, is unsolicited, unwanted e-mail, frequently sent in bulk,
usually advertising some commercial proposition. Most of the spam you
probably get, and what this article deals with, is BUCE (Bulk Unsolicited
Commercial E-Mail).
If you have a Linux (or *nix)
box, you have a set of powerful tools to stop all this spam from cluttering
your inbox. These tools are even more useful to you if you run a production
mail server and want to stop spam from reaching your users.
The three cardinal rules of spam fighting are:
- Prevention is better than cure. Arm yourself against spam.
- Filter spam before it reaches your mailbox.
- Complain to the spammer’s ISP and get him shut down.
Prevention
Protect yourself, and prevent spammers from harvesting your address. Don’t
expose your primary e-mail address where a spammer can get it and add it to
his list. This includes places like Slashdot (/.), IRC, Usenet, mailing
lists, and Web-based bulletin boards–in short, anywhere public online.
Instead, take one of these steps.
- Use a "throwaway" address (say abcde@hotmail.com) when posting. If you
  find that this address is getting spammed, you can just throw it away and
  switch to another address. To be on the safe side, when you’re posting to
  Usenet, Slashdot, or other Web-based boards, "munge" your address to
  something like abcde@hotmail.com.nospam. Spammers, who use robots to crawl
  the Web searching for mail IDs and burn the entire list onto a CD, won’t
  be able to mail you at the munged address, while a human reader can still
  work out the real one.
- If you run your own domain, use "expiring" mail addresses–addresses which
  will be valid for a particular period–say a week, a month, or a year–and
  will then cease to exist. Such an address can be something like
  me-mar31-apr30@mydomain.com. In case you don’t have your own domain, check
  out www.mailexpire.com, which provides a similar service for free.
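The two address tricks above are simple string manipulations. Here is a minimal Python sketch of both. The function names, the ".nospam" suffix default, and the year in the test dates are illustrative assumptions, not part of any mail tool.

```python
from datetime import date

# Month abbreviations spelled out so the result does not depend on locale.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def munge(address: str, suffix: str = ".nospam") -> str:
    """Append a marker that a person can strip off but a naive robot will not."""
    return address + suffix


def demunge(address: str, suffix: str = ".nospam") -> str:
    """Recover the real address from a munged one (what a human reader does)."""
    return address[: -len(suffix)] if address.endswith(suffix) else address


def expiring_address(user: str, domain: str, start: date, end: date) -> str:
    """Build an address that encodes its own validity period in the local part."""
    def stamp(d: date) -> str:
        return f"{MONTHS[d.month - 1]}{d.day:02d}"
    return f"{user}-{stamp(start)}-{stamp(end)}@{domain}"


if __name__ == "__main__":
    print(munge("abcde@hotmail.com"))
    print(expiring_address("me", "mydomain.com", date(2001, 3, 31), date(2001, 4, 30)))
```

Note that this only generates the addresses. Making an expiring address actually stop working is up to the mail server for your domain.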
However, both these
measures have a major drawback–you have to keep changing your e-mail
address. If your ISP uses sendmail, you have another option–plussed
addresses.
Plussed addresses are
available with newer versions of sendmail (8.8 and above). Just add a plus
sign and any string you want after the username and before the
"@", and the mail will still be delivered properly. For instance,
me+foo_bar@myisp.com will reach me–when choosing the mailbox, sendmail
ignores everything from the plus sign up to the "@". However, before you
start using plussed addresses in your e-mail, send yourself a test mail with
a plussed address and check whether it reaches you. Plussed addresses are
useful for discovering just where a spammer harvested your mail ID from: give
each list or site its own tag, and the tag on incoming spam tells you the
source.
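To make the tracing idea concrete, here is a small Python sketch that splits a plussed address into its parts and looks the tag up in a table of where each tag was handed out. The TAG_SOURCES table and both function names are hypothetical; you would keep your own record of which tag went where.

```python
from typing import NamedTuple, Optional


class PlussedAddress(NamedTuple):
    user: str
    tag: str      # empty string when the address carries no plus tag
    domain: str


# Hypothetical record of which tag was given to which list or site.
TAG_SOURCES = {
    "lih": "Linux India Help mailing list",
    "foo_bar": "foo_bar Web board",
}


def split_plussed(address: str) -> PlussedAddress:
    """Split user+tag@domain into (user, tag, domain)."""
    local, _, domain = address.rpartition("@")
    user, _, tag = local.partition("+")
    return PlussedAddress(user, tag, domain)


def harvested_from(address: str) -> Optional[str]:
    """Return where the tag was handed out, or None if the tag is unknown."""
    return TAG_SOURCES.get(split_plussed(address).tag)


if __name__ == "__main__":
    print(split_plussed("me+foo_bar@myisp.com"))
    print(harvested_from("you+lih@yourdomain.com"))
```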
For example, if you subscribe
to the Linux India Help mailing list, subscribe to it as you+lih@yourdomain.com
(and make sure you set your mail client to post messages to the list using
only this identity, or the list will bounce your mails). Both PINE and Mutt
allow you to use different identities when posting. This is done through
roles in PINE and folder hooks in Mutt (see box "Roles in PINE and Dave
Null"). Another advantage is that if you start getting lots of spam to
a plussed address, you can just send mail reaching that address to Mr Dave
Null (aka /dev/null).
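Sending a burned plussed address to Dave Null amounts to a one-line delivery rule. In practice this is usually done in the mail delivery agent. The Python sketch below only illustrates the decision, and it assumes that the plussed address shows up in the To or Cc header. BURNED_TAGS and delivery_target are made-up names for this example.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical set of tags that now attract nothing but spam.
BURNED_TAGS = {"lih"}


def delivery_target(raw_message: str, inbox: str = "inbox") -> str:
    """Return "/dev/null" if any To/Cc recipient carries a burned plus tag."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, addr in getaddresses(headers):
        local = addr.rpartition("@")[0]   # part before the "@"
        tag = local.partition("+")[2]     # part after the "+", may be empty
        if tag in BURNED_TAGS:
            return "/dev/null"
    return inbox


if __name__ == "__main__":
    spam = "From: x@example.com\nTo: you+lih@yourdomain.com\nSubject: Buy now\n\nbody\n"
    print(delivery_target(spam))
```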
To understand what logrotate
can do, first ask yourself what you want to do with your log files. The
table "Planning for a log processing and archiving policy" might
help you to start. The first row lists the processing and reporting to be
done, while the first column lists the files on which the processing is to
be done. Put down the different log files in column 1, tick off the
processing you want for each, and you can come up with a policy for using
logrotate.
Let me briefly explain what
each column implies. A "yes" on column 2 indicates that you want
to retain the log file as a record, so it’s best kept compressed.
Similarly, a "yes" in column 3 indicates that you merely want to
scan the file, look for the unusual, and then discard it. You might want to
mail this file to yourself or to the relevant administrator. Column 4 says
that you want to discard the file straightaway. In the sysadmin world, this
obviously doesn’t qualify as best practice. Columns 5 and 6
mention the actions you want to perform before and after you do the log
processing. Column 7 is for an e-mail address to which errors during log
processing are to be reported, and column 8 indicates how often you want the
processing to be done. Note that you can trigger rotation either on a time
threshold, with a granularity of a day, or on a file size threshold.
This table is not exhaustive or mandatory in nature–it’s merely
an example of how you would go about the policy-making exercise. So, don’t
implement this, as is, as a policy. Evolve one to suit your needs.
If you’re ready with a
table such as the one above, you have a policy. You can now use logrotate to
implement this policy.
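One way to picture the step from table to tool is to treat each row of the policy table as a record and print it out as a logrotate stanza. The Python sketch below does that for a few of the columns. LogPolicy, its field names, and render are illustrative assumptions; the directives it emits (the frequency keyword, rotate, compress, mail, postrotate and endscript) are ordinary logrotate keywords.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogPolicy:
    path: str                          # column 1: the log file
    frequency: str = "weekly"          # daily, weekly or monthly
    keep: int = 4                      # rotated copies to retain
    compress: bool = False             # retain as a compressed record
    mail_to: Optional[str] = None      # mail the expiring log to this address
    postrotate: Optional[str] = None   # command to run after rotation


def render(policy: LogPolicy) -> str:
    """Turn one policy row into the text of a logrotate stanza."""
    lines = [f"{policy.path} {{",
             f"    {policy.frequency}",
             f"    rotate {policy.keep}"]
    if policy.compress:
        lines.append("    compress")
    if policy.mail_to:
        lines.append(f"    mail {policy.mail_to}")
    if policy.postrotate:
        lines += ["    postrotate",
                  f"        {policy.postrotate}",
                  "    endscript"]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(LogPolicy("/var/log/wtmp", frequency="monthly", keep=1)))
```

The remaining columns, such as the pre-processing action, could be added to the record in the same way.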
The policy is specified in a configuration file, using a simple script-like
language built from keywords specific to logrotate. The syntax is intuitive
and easy to understand. By default, most logs are rotated four times,
uncompressed, before they’re removed from the system. This explains the
presence of files with the extensions .1, .2, .3 and .4 in the /var/log
directory. Take the file /var/log/messages as an example. After a certain
time period or after a certain file size is reached (as specified in
/etc/logrotate.conf), this file is renamed to messages.1 and an empty file
called messages is created to take in the new log input. At each later
rotation, the existing copies are shifted up by one (messages.1 becomes
messages.2, and so on), and once four rotated copies exist the oldest is
removed.
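The renaming scheme is easy to see in a few lines of code. The following Python sketch mimics the numbering behaviour described above on an arbitrary file. It is an illustration of the idea, not how logrotate itself is implemented, and the function name rotate is our own.

```python
import os
import tempfile


def rotate(log_path: str, keep: int = 4) -> None:
    """Shift log.N to log.N+1, drop the oldest copy, and start an empty log."""
    oldest = f"{log_path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)                      # the copy beyond the limit goes
    for n in range(keep - 1, 0, -1):           # highest number first
        src = f"{log_path}.{n}"
        if os.path.exists(src):
            os.rename(src, f"{log_path}.{n + 1}")
    if os.path.exists(log_path):
        os.rename(log_path, f"{log_path}.1")   # current log becomes .1
    open(log_path, "w").close()                # empty file for new log input


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    log = os.path.join(workdir, "messages")
    for generation in range(6):
        with open(log, "w") as f:
            f.write(f"generation {generation}\n")
        rotate(log)
    print(sorted(os.listdir(workdir)))
```

After more than four rotations, only messages and messages.1 through messages.4 remain, with messages.1 always the most recent rotated copy.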
Let’s look at a portion of
the configuration from /etc/logrotate.conf from a standard install. The
first line mentions the name of the file for which the policy is laid out.
Notice the intuitive keywords–"monthly" indicates that the
rotation cycle is monthly, "create" specifies the permissions and
ownerships to be used when the old file is moved to another name and an
empty file is created. "rotate 1" indicates that one rotated
logfile will be retained:
/var/log/wtmp {
    monthly
    create 0664 root utmp
    rotate 1
}
Here’s a portion of the
file /etc/logrotate.d/apache–the policy for processing apache log files.
The keyword missingok implies that if the log file isn’t found, continue
processing the rest. Notice the command in between the keywords postrotate
and endscript. This command is executed after log processing is done.
Surprisingly, you don’t find any other instructions such as the frequency
of rotation or the number of rotations, as in the previous case. When there’s
no explicit mention made, the definitions in the global configuration file
will apply.
/var/log/httpd/access_log {
    missingok
    postrotate
        /usr/bin/killall -HUP httpd 2> /dev/null || true
    endscript
}
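The way the global settings fill in whatever a per-file section leaves out can be modelled as a simple dictionary merge. The Python sketch below is only a mental model. The key names are our own, and the "weekly" default frequency is an assumption about a typical /etc/logrotate.conf, while the four uncompressed rotations follow the defaults described earlier.

```python
# Assumed global settings; "weekly" is a guess at a typical default.
GLOBAL_DEFAULTS = {"frequency": "weekly", "rotate": 4, "compress": False}


def effective_policy(global_defaults: dict, local: dict) -> dict:
    """Settings in a per-file section override the global ones; the rest inherit."""
    merged = dict(global_defaults)   # copy, so the defaults stay untouched
    merged.update(local)
    return merged


if __name__ == "__main__":
    # The apache section sets neither frequency nor rotation count.
    print(effective_policy(GLOBAL_DEFAULTS, {"missingok": True}))
    # The wtmp section overrides both.
    print(effective_policy(GLOBAL_DEFAULTS, {"frequency": "monthly", "rotate": 1}))
```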
logrotate is typically run
once a day by cron. If you are logged in as superuser, you would see an
entry similar to the one below in the crontab file:
0 0 * * * /usr/sbin/logrotate /etc/logrotate.conf
With this entry, the utility runs every day at midnight, reading its policy
from the configuration file named on the command line. You can run it more
often if you need to.
A good start towards
minimizing disk storage space would be to uncomment the compress option in
/etc/logrotate.conf, so that all the rotated log files are kept compressed.
Avinash
Shenoy is a systems and network administrator at the NCBS, Bangalore,
and Gopi Garge is a technology
consultant with Exocore Consulting <www.exocore.com>