Cure Your IT Services With Monit

PCQ Bureau

Monitoring software sends alerts (typically e-mails) when one or more services
fail. The time taken to fix the problem is a performance indicator for the
system administrator, and it becomes even more critical when the problem has
disrupted deployed, live services. What if the monitoring software itself could
fix problems as and when they arise? Consider what a system administrator does
to fix an issue: he or she may execute a couple of commands, edit configuration
files or restart services. Given all these actions as a batch, software can
execute them against the affected system to fix the problem. This is where
Monit comes in. Besides monitoring the system, Monit can be instructed to take
custom actions on specific alerts corresponding to problems or failures in
services.

Monit is described on its website, http://mmonit.com/monit, as a free Open
Source utility for managing and monitoring processes, files, directories and
file systems on a UNIX system. So, we decided to check its utility. The setup
and examples in this article have been executed using Monit version 5.0.3 on a
machine running Fedora 12.

Installation and basic configuration

Installing Monit is as simple as issuing the following:

yum install monit

Open the file named monit.conf found in /etc/monit and append the following
lines to it:

set alert admin@foo.com

set mailserver localhost

Replace admin@foo.com with the e-mail address to which you want to send
alerts. If no SMTP server such as sendmail or postfix is running on the machine
that runs Monit, replace localhost with the name or IP address of a machine
running an SMTP server on your network.

To specify more than one recipient, append lines such as the
following to the monit.conf file:

set alert admin1@foo.com

set alert admin2@foo.com

Start the Monit service as follows:

service monit start
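
The edits to monit.conf described above can also be scripted. The following is a minimal sketch and not part of the original setup: the function name add_monit_alerts is hypothetical, and the path of the configuration file is supplied by the caller. It appends a set mailserver line and one set alert line per recipient, skipping any line that is already present.

```shell
#!/bin/bash
# Hypothetical helper: append mail settings to a Monit configuration file.
# Usage: add_monit_alerts CONF_FILE MAILSERVER ADDRESS [ADDRESS...]
add_monit_alerts() {
    local conf server line addr
    conf="$1"
    server="$2"
    shift 2

    # Add the mail server line only if it is not in the file yet.
    line="set mailserver $server"
    grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"

    # One "set alert" line per recipient, again without duplicates.
    for addr in "$@"; do
        line="set alert $addr"
        grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
    done
}
```

For example, calling add_monit_alerts with the path of monit.conf, localhost, admin1@foo.com and admin2@foo.com produces the three lines shown above. Running monit -t afterwards checks the syntax of the control file before the service is started or reloaded.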

Subsequently, we can set up the various services to monitor
by writing Monit configuration file(s) in the directory /etc/monit.d/. Some
examples are as follows:

1. Monitoring a service

Let's take the Apache web server as an example. If the Apache server is
overloaded with connections, not responding or has died, the following Monit
configuration can detect it.

check host Apache with address localhost

if failed url

http://localhost/index.php

then alert

Save the above configuration in a file named apache in the
directory /etc/monit.d. Then reload Monit as:

service monit reload

Assuming that Monit is running on the same machine as the
Apache web server, the above tries to fetch the URL http://localhost/index.php.
If the fetch fails, Monit sends an e-mail alert to the address(es) configured
in monit.conf (as explained above). The e-mail alert looks something
like this:

Subject: monit alert -- Connection failed Apache

Connection failed Service Apache

Date: Tue, 16 Feb 2010 17:32:33 +0530

Action: alert

Host: laptop.it4enterprise.net

Description: failed, cannot open a connection to INET via TCP
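
When such an alert arrives, it can help to repeat the check by hand. The following is a rough sketch, similar in spirit to the URL test but not a reproduction of Monit's own logic: url_ok is a hypothetical helper name, curl is assumed to be installed, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/bash
# Manual probe, similar in spirit to Monit's "if failed url" test.
# url_ok is a hypothetical helper; it relies on curl being installed.
url_ok() {
    # -f          treat HTTP error responses (4xx/5xx) as failures
    # -sS         stay quiet, but still print an error message on failure
    # -o          discard the page body, only the result matters
    # --max-time  give up after 5 seconds (arbitrary value for this sketch)
    curl -fsS -o /dev/null --max-time 5 "$1"
}

if url_ok "http://localhost/index.php"; then
    echo "Apache answered"
else
    echo "Apache did not answer; this is the situation Monit alerts on"
fi
```

If the probe succeeds while Monit still reports a failure, the mail server or Monit configuration is a more likely culprit than Apache itself.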

But wait, didn't we say that Monit can fix the problem as well
as alert? In this case, Monit should perhaps restart the Apache web server.
For this, modify the configuration in /etc/monit.d/apache as follows:

check host Apache with address localhost

start program = "/etc/init.d/httpd restart"

if failed url

http://localhost/index.php

then restart

Issue service monit reload. And voila, Monit will send the
alert as before, but this time it will also restart Apache. What's more, in a
few seconds it will send an e-mail confirming that the service has resumed.
The e-mail looks as follows:

Subject: monit alert -- Connection succeeded Apache

Connection succeeded Service Apache

Date: Tue, 16 Feb 2010 17:47:26 +0530

Action: alert

Host: laptop.it4enterprise.net

Description: connection succeeded to INET via TCP

2. A failover setup

For this example, let's assume Monit is running on a separate machine (say
192.168.2.1) and monitoring an application server. To check whether the server
machine is up and running, a Monit configuration file looks as follows:

check host appserver with address 192.168.2.100

if failed icmp type echo count 3

then alert

Save the above configuration in a file named appserver in
the directory /etc/monit.d. Then reload Monit. The above configuration checks
whether the application server (called appserver) with IP 192.168.2.100 is
alive, i.e. responding to ping. If not, it will send an alert as follows:

Subject: monit alert -- ICMP failed appserver

ICMP failed Service appserver

Date: Tue, 16 Feb 2010 18:12:07 +0530

Action: alert

Host: laptop.it4enterprise.net

Description: failed ICMP test

So far so good. But what will be the remedial action? Let's
assume that there is a backup/mirror server running at 192.168.2.101. What if
we create a network alias on it with the IP set to 192.168.2.100? Subsequently,
all traffic to 192.168.2.100 (which is not accessible) will land at the backup
server. To do this by hand, we would SSH into the 192.168.2.101 server and
issue the following command:

ifconfig eth0:0 192.168.2.100

But we will automate this via a shell script which will be
used by the appserver Monit configuration. The shell script is as follows:

#!/bin/bash

ssh root@192.168.2.101 "ifconfig eth0:0 192.168.2.100"

Save the above in a file named add_alias.sh in the
directory /opt. Give the file executable permissions as:

chmod +x /opt/add_alias.sh

The SSH password prompt can be suppressed by using SSH keys
(generated by ssh-keygen). Refer to the tutorial at http://rcsg-gsir.imsb-dsgi.nrc-cnrc.gc.ca/documents/internet/node31.html
for a password-less SSH login. Next, modify the file /etc/monit.d/appserver as
follows:

check host appserver with address 192.168.2.100

if failed icmp type echo count 3

then exec "/opt/add_alias.sh"

Note the use of “exec” to execute an external command or
shell script. One can argue that something like Heartbeat is better suited to
such a failover setup. While this is true, we are showcasing that if you have
Monit installed, you can use it for this purpose too.
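
Since Monit runs the script unattended, it is worth making it safe to run more than once and unable to hang on a password prompt. The following is a sketch of a more defensive variant, not the script used in this article: the function names and the DRY_RUN switch are hypothetical, while the addresses and the eth0:0 alias are the ones used above.

```shell
#!/bin/bash
# Sketch of a more defensive add_alias.sh (not the article's script).
BACKUP="root@192.168.2.101"   # backup server, as in the article
ALIAS_IP="192.168.2.100"      # address of the failed application server
IFACE="eth0:0"                # alias interface, as in the article

# Command to run on the backup server: add the alias only if the
# address is not already configured on the alias interface.
remote_command() {
    echo "ifconfig $IFACE 2>/dev/null | grep -qF '$ALIAS_IP' || ifconfig $IFACE $ALIAS_IP"
}

add_alias() {
    if [ -n "$DRY_RUN" ]; then
        # Hypothetical DRY_RUN switch: show the command instead of running it.
        echo "ssh -o BatchMode=yes -o ConnectTimeout=10 $BACKUP \"$(remote_command)\""
    else
        # BatchMode=yes makes ssh fail instead of asking for a password;
        # ConnectTimeout stops it from waiting forever on a dead host.
        ssh -o BatchMode=yes -o ConnectTimeout=10 "$BACKUP" "$(remote_command)"
    fi
}
```

To use it from Monit, end the script with a call to add_alias. Running it with DRY_RUN=1 set prints the ssh command so that it can be reviewed before the real failover is wired in.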

3. When the storage space goes low

Suppose one of the servers is using a storage volume that is about to be
exhausted, having reached, say, 99% usage. Monit can monitor this and also
assign a larger volume for storage. The Monit configuration is as follows (the
threshold shown is 30%; set it to the usage level at which you want the switch
to happen):

check filesystem storage with path /mnt/storage

if space usage > 30%

then exec /opt/assign_premium_storage

The assign_premium_storage script can be something like the
following:

#!/bin/bash

# mount the premium (larger) volume on /mnt/storage2

mount -t cifs //storage-server/premium-volume /mnt/storage2 -o username=storageadmin,password=secret

# copy all the files from the current storage to the new storage

cp -R /mnt/storage/* /mnt/storage2

# umount the storage

umount /mnt/storage

umount /mnt/storage2

# mount the new premium storage onto the old location to make it totally
# transparent to the user

mount -t cifs //storage-server/premium-volume /mnt/storage -o username=storageadmin,password=secret

In the above script, it is assumed that the network storage
volume is mounted at /mnt/storage.
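
Before relying on the rule, it is useful to see what usage figure the volume currently reports. The following is a small sketch using df; usage_percent and over_threshold are hypothetical helper names, and the number df prints is assumed to be close to, not necessarily identical with, the figure Monit computes.

```shell
#!/bin/bash
# Hypothetical helpers for checking file system usage by hand.

# Print the usage of the file system holding the given path as a bare
# number, e.g. 42 for 42%. df -P selects the portable POSIX output format,
# where the fifth column of the second line is the capacity.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the usage of the given path is above the given limit.
over_threshold() {
    local used
    used=$(usage_percent "$1")
    [ -n "$used" ] || return 2   # df gave no usable figure
    [ "$used" -gt "$2" ]
}

# Same path and 30% limit as in the Monit rule above.
if [ -d /mnt/storage ] && over_threshold /mnt/storage 30; then
    echo "/mnt/storage is above 30%: Monit would run the exec action"
fi
```

Running usage_percent /mnt/storage on its own prints the current figure, which helps in choosing a sensible threshold for the rule.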

With the power of custom scripts in hand, you can use Monit
even for intricate tasks. For instance, when the load of virtual machines in a
cluster goes high, it can automatically provision another virtual machine to
share the load!
