Failsafe Cluster for critical apps

PCQ Bureau

Failsafe clusters are becoming critical components of the datacenter in the
effort to maintain uptime. A failsafe cluster is simply a set of servers that
keep a continuous watch on each other; if one goes down, another takes over its
job, so to the outside world the work continues uninterrupted. For instance, if
a web server in a failsafe cluster is running with the IP address 10.10.10.1
and it goes down, another machine in the cluster is alerted immediately. That
machine takes over the address 10.10.10.1 and starts acting as the failed
server, so web pages continue to be served.
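
The takeover idea can be sketched in a few lines of shell. This is only a conceptual illustration, not how Heartbeat is implemented: the function names, the alias device eth0:0, the one-second probe interval and the limit of three missed probes are all assumptions made for the example.

```shell
#!/bin/bash
# Conceptual sketch of IP takeover. Heartbeat does this far more robustly.

SERVICE_IP="10.10.10.1"   # address served by the active machine (from the example above)
MAX_MISSES=3              # assumption: consecutive failed probes before takeover

# Succeeds if the active server answers one ping within one second.
peer_alive() {
    ping -c 1 -W 1 "$SERVICE_IP" >/dev/null 2>&1
}

# Claims the service address on this machine.
# Needs root; eth0:0 is an assumed alias device name.
take_over() {
    ifconfig eth0:0 "$SERVICE_IP" up
}

# Probes the peer repeatedly and takes over after MAX_MISSES
# consecutive failures. A successful probe resets the counter.
watch_peer() {
    local misses=0
    while [ "$misses" -lt "$MAX_MISSES" ]; do
        if peer_alive; then
            misses=0
        else
            misses=$((misses + 1))
        fi
        sleep "${PROBE_INTERVAL:-1}"
    done
    take_over
}
```

Calling watch_peer on the standby machine would start the loop. Nothing runs until then, so the functions can be read or sourced safely.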

[Figure: Configuring the network interfaces of your failsafe cluster]

This article shows how to set up such a cluster using PCQLinux 2006, which
includes the required software and prerequisites.

Before you start

In this article, we explain how to create a two-node failsafe cluster using
Heartbeat. The first thing you need is two machines: one to be the master node
and the other to be the backup node. We set the fully qualified domain name of
our master node to 'master.pcquest.local' and that of the backup node to
'backup.pcquest.local'. You can network the two machines for failsafe
clustering using two network cards in each machine and a cross-wired CAT 5
cable. This setup is just for demonstration; as you add more machines to the
cluster, you will need a more elaborate clustering network. Two of the network
cards, one on each machine, connect the nodes to the main network, while the
other two connect the nodes to each other through the cross-wired cable. This
direct link carries the heartbeat, the signal by which each node senses that
the other has gone down so that it can take over.

Once the hardware is in place, install PCQLinux 2006 on both machines, choosing
the installation type 'Advanced Installation'. On the next screen, select the
Fail Safe Cluster option and continue the installation normally. The
installation asks for only the first and second CDs.

Let's say the network card on the master node that connects to the other node
over the cross-wired cable (eth1) has the IP address 10.10.10.1, and the
corresponding card on the backup node has 10.10.10.2. The network cards
connected to the main network can either take their IP addresses from a DHCP
server or be given addresses manually. We gave 192.168.3.27 to the public
network interface (eth0) on the master node and 192.168.3.28 to the public
network interface on the backup node.

Now configure an alias for the public network card of the master node (the one
connected to the main network) and give it another IP address. To do so, start
the redhat-config-network applet and add a new LAN card. When asked, select the
same network card that is connected to your main network. The new entry is
given a device name such as eth0:0. Assign it an IP address such as
192.168.3.199, making sure the address is free and not used by any other
machine on the network. This is the IP address of your cluster, and your
services will respond on it.
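
For readers who prefer the command line to the applet, the alias can also be described in a Red Hat style ifcfg file. The sketch below is an assumption-laden illustration: the function name is hypothetical, the netmask 255.255.255.0 is not stated in the article, and the file is written to a scratch directory rather than to /etc/sysconfig/network-scripts, where such files normally live.

```shell
#!/bin/bash
# Writes an alias definition in the Red Hat ifcfg style.
# Arguments: target directory, alias device, IP address, netmask.
write_alias_config() {
    local dir="$1" device="$2" ip="$3" netmask="$4"
    cat > "$dir/ifcfg-$device" <<EOF
DEVICE=$device
BOOTPROTO=static
IPADDR=$ip
NETMASK=$netmask
ONBOOT=yes
EOF
}

# Example run against a scratch directory so nothing on the system changes.
# 255.255.255.0 is an assumed netmask; the article does not give one.
scratch_dir=$(mktemp -d)
write_alias_config "$scratch_dir" "eth0:0" "192.168.3.199" "255.255.255.0"
cat "$scratch_dir/ifcfg-eth0:0"
```

Before assigning the address, a quick `ping -c 2 192.168.3.199` from another machine is one simple way to check that nothing else on the network already answers on it.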

Configuring the cluster

This involves modifying three files: /etc/ha.cf, /etc/haresources and
/etc/authkeys. The default versions of these files are installed in
/usr/share/doc/heartbeat-2.0.2/, so first copy them from there to /etc and then
start modifying them. The first file, ha.cf, holds the settings for your nodes.
Modify the following directives in this file on both machines:

node master

node backup

deadtime 30

warntime 10

bcast eth0

auto_failback on
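
To show what each directive does, here is the same set written out with comments by a small shell function. The function name and the scratch file are assumptions for illustration. Two caveats apply: stock Heartbeat packages usually read these files from /etc/ha.d/, so check where your installation expects them, and the article lists eth0 for bcast even though the cross-wired link described earlier is eth1.

```shell
#!/bin/bash
# Writes the directives from the article into a file, with explanatory comments.
write_ha_cf() {
    local target="$1"
    cat > "$target" <<'EOF'
# Machines in the cluster; each name must match `uname -n` on that machine.
node master
node backup
# Seconds of silence before a node is declared dead.
deadtime 30
# Seconds of silence before a "late heartbeat" warning is logged.
warntime 10
# Interface on which heartbeats are broadcast (eth0 as listed in the article).
bcast eth0
# Hand the resources back to the master when it returns.
auto_failback on
EOF
}

# Example run against a scratch file; print only the active directives.
scratch_file=$(mktemp)
write_ha_cf "$scratch_file"
grep -v '^#' "$scratch_file"
```

Keeping warntime well below deadtime, as the article does, means the log shows late heartbeats before a failover is actually triggered.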

If you have more than one backup node in the cluster, then in the ha.cf file
on each machine, enter the full list of machines in the cluster:

node master

node backup

node backup_two

node backup_three



and so on

Next, open /etc/haresources and add the following line:

master.pcquest.local 192.168.3.199 httpd smb

This file should be identical on both machines. Only one line is added; it
gives the DNS name of the master node, the IP address of the cluster and the
services that will run over Heartbeat. Your network now has a failsafe cluster
of Apache and Samba running on 192.168.3.199 (the IP allocated to the alias of
the network card connected to the main network). Finally, copy the authkeys
file from the documentation directory to the /etc directory on both machines
and restart them.
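
The article copies authkeys unchanged, but Heartbeat expects this file to name an authentication method and a shared key, and it refuses to start unless the file is readable by its owner only (mode 600). A minimal sketch, assuming the sha1 method and a placeholder secret; the function name and the scratch file are hypothetical:

```shell
#!/bin/bash
# Writes a minimal authkeys file: "auth N" selects key number N,
# and the matching line gives the method and the shared secret.
# Arguments: target file, shared secret.
write_authkeys() {
    local target="$1" secret="$2"
    # Restrict permissions before the secret is written.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$secret" > "$target" )
    # Heartbeat requires owner-only access to this file.
    chmod 600 "$target"
}

# Example run against a scratch file; the secret is a placeholder to replace.
scratch_keys=$(mktemp)
write_authkeys "$scratch_keys" "ReplaceWithYourOwnSecret"
stat -c '%a' "$scratch_keys"
```

The same file, with the same secret, has to be present on every node, otherwise the nodes reject each other's heartbeats.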

Testing the cluster

Start up the servers and, once they are fully functional, shut down your master
node and wait for 30 seconds (the deadtime set in ha.cf). Then go to the backup
node and run the ifconfig command. You will find that the alias eth0:0 has been
created and the cluster IP (192.168.3.199) is assigned to it.
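
The check on the backup node can also be scripted. The helper below is a hypothetical sketch: it reads ifconfig-style text on standard input and succeeds only if the cluster IP appears in it as a whole word.

```shell
#!/bin/bash
CLUSTER_IP="192.168.3.199"   # cluster address used throughout the article

# Reads ifconfig-style text on stdin; succeeds if the cluster IP is present.
# -F treats the address as a literal string, -w requires a whole-word match.
has_cluster_ip() {
    grep -qwF "$CLUSTER_IP"
}

# Usage on the backup node after the master has been shut down:
#   ifconfig eth0:0 | has_cluster_ip && echo "backup holds the cluster IP"
```

Running the same check on the master after it comes back up is a simple way to confirm that auto_failback has returned the address to it.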

Note: In this article we haven't discussed how to configure individual services
such as SMB and httpd. For details on them, read the article 'Setting up
Network Services', page 56, PCQuest, March 2004. The only thing to keep in mind
is that every service needs exactly the same configuration on all the nodes.
Also, this setup fails over only the IP settings of the server, so you need
some mechanism to replicate all the crucial data to every cluster node as well.
For instance, if you run a Samba file server over Heartbeat for failsafe
operation, the backup node must hold an exact replica of the shares' directory
structure and user rights, so that when the master node fails the data is also
available on the backup node(s). For this you can either run rsync from crond
or use any other backup mechanism. For more details on rsync, read the August
2003 issue of PCQuest.
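
As a rough sketch of the rsync-over-crond approach, the function below builds a crontab line that mirrors a share directory to the backup node. The function name, the ten-minute interval and the /srv/samba/shares path are assumptions for illustration, and the job presumes that the master can log in to the backup node over SSH without a password.

```shell
#!/bin/bash
# Builds a crontab line that mirrors a directory to another host with rsync.
# Arguments: minute interval, source directory, destination host, destination directory.
make_sync_cron_line() {
    local every="$1" src="$2" host="$3" dest="$4"
    # -a preserves permissions, timestamps and (when run as root) ownership;
    # --delete removes files on the backup that no longer exist on the master.
    printf '*/%s * * * * rsync -a --delete %s/ %s:%s/\n' "$every" "$src" "$host" "$dest"
}

# Example: mirror an assumed share directory every ten minutes.
make_sync_cron_line 10 /srv/samba/shares backup.pcquest.local /srv/samba/shares
```

The printed line can be added to root's crontab on the master with `crontab -e`. Because of --delete, it is worth trying the rsync command by hand with the -n (dry run) option first.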
