
High Performance Clustering Part 4

PCQ Bureau

This time, instead of doing a single topic on clustering, we've actually done two. One is based on our good old 20-node cluster and is about a topic we promised to bring to you two months ago: building a storage cluster, which is basically a NAS running across multiple commodity machines. While the idea sounds adventurous, its setup is equally challenging, so get ready for some adrenaline-pumping action. Our second topic is a preview of an upcoming technology called Microsoft Compute Cluster 2003. This is Microsoft's first initiative to bring out software for creating HPC clusters out of commodity hardware. The final release is expected sometime in June this year.

Direct Hit!
Applies to: CIOs
USP: Build a clustered NAS using Openfiler and openMosix, and a compute cluster of 64-bit PCs with MS Windows Compute Cluster 2003 Beta
Links: www.microsoft.com/windowsserver2003/hpc, www.openfiler.com
Google keywords: Compute + Cluster + MS, openfiler, openmosix

Building a clustered NAS



We'll use one of the most popular Open Source NAS operating systems, called OpenFiler, and build a custom kernel for it that carries an openMosix patch. This will let us add multiple compute nodes to the NAS box, which can speed up any kind of batch processing on the device. You could also aggregate the storage of all the boxes into the NAS device using MFS, but we won't advise that because it reduces data redundancy significantly. However, if you have hardware or software RAID configured on all the cluster nodes, you can try that out as well.

The installation







The first time we did an article on OpenFiler was about a year and a half ago, and at that time it was so primitive that the only way of installing it was to first do a RedHat 8.0 installation and then manually install a huge set of RPMs on top of it. But today OpenFiler has its own customized distribution. This distribution is based on CentOS and comes on a single CD. You can download it from http://www.openfiler.com/download/. Installing it is pretty easy and the installer looks very similar to that of PCQLinux/Fedora. All you have to do is boot the machine with the CD and follow the graphical installer. One thing you have to keep in mind is that for OpenFiler to work properly you will have to create the partitions manually. Basically you have to create three partitions, but before that remember to delete all other existing partitions. The partitions you have to create are /boot, swap and /. While creating the partitions, make sure that both /boot and / are formatted as ext3.


While preparing the partitions for Openfiler, make sure that you have all the partitions exactly in this format

The size of the /boot partition should be at least 128 MB, the swap partition should be double the amount of RAM you have in your machine, and the / (root) partition should be at least 4 GB. Remember that this partition is just for the OS to work from; the partitions and disks that will be shared over the NAS should be separate.
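For instance, on a test machine with 1 GB of RAM, a layout along the following lines would satisfy these requirements (the device names and exact sizes here are only our assumptions for illustration):

/dev/sda1   /boot   128 MB    ext3
/dev/sda2   swap    2048 MB   swap
/dev/sda3   /       8 GB      ext3

Any further disks or partitions that you plan to serve out over the NAS would sit outside this layout.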

After you are done with the partitioning, follow the rest of the installation just as you would a PCQLinux installation. It will take around an hour, depending on your machine's speed.


Now start the following services:

#service postgresql restart
#service httpd restart
#service openfiler start
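If you also want these services to come up automatically after a reboot, you can enable them at boot time. This is a minimal sketch, assuming the standard Red Hat-style chkconfig tool that the CentOS-based Openfiler distribution ships:

#chkconfig postgresql on
#chkconfig httpd on
#chkconfig openfiler on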

Now your first OpenFiler node is up and running. The next thing to do is to build an openMosix kernel on top of this OpenFiler node.


Installing OpenMosix 



The OpenFiler installer CD doesn't come with the kernel source, and to install OpenMosix on top of it you will need to recompile the kernel. So first of all you have to find out the exact version of the kernel that's running on the OpenFiler OS. For this, run the following command:

#uname -r

This will show you the exact version of the OpenFiler kernel you are using. Now what you have to do is first download the source of that exact kernel version. You can find the source at http://kernel.org. After you're done with this, you have to download the openMosix tarball, which is basically a kernel patch. One thing you have to make sure of here is that you download the openMosix patch for the exact kernel version that ships with your OpenFiler.
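As a rough sketch of how the downloads would look (the version placeholders are deliberately left as-is, so substitute the version reported by uname -r; the exact patch file name on the openMosix SourceForge area is our assumption):

#wget http://www.kernel.org/pub/linux/kernel/v2.4/linux-2.4.x.xx.tar.gz
#wget http://prdownloads.sourceforge.net/openmosix/openMosix-2.4.x.xx-x.gz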


The next step is to recompile the kernel with the openMosix kernel patch. For this, follow the steps given below.

Step 1--Uncompress the kernel source tarball by running the following commands:

#cp linux-2.4.x.xx.tar.gz /usr/src
#cd /usr/src
#tar -zxvf linux-2.4.x.xx.tar.gz




Step 2--Get all the settings and configuration from the old kernel into this new kernel source tree by running the following commands:

#cd linux-2.4.x.xx
#make mrproper
#make oldconfig
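Note that make oldconfig picks up its settings from a .config file in the source tree, and make mrproper removes any existing .config. So, to really carry over the configuration of the kernel Openfiler is currently running, copy it in before running make oldconfig. A minimal sketch, assuming the Red Hat/CentOS convention of keeping the running kernel's configuration under /boot:

#cp /boot/config-`uname -r` .config
#make oldconfig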




Step 3--Now patch the kernel with the openMosix kernel patch by running the following commands:

#gunzip openmosix2.x.xx
#patch -p1 < openMosix2.x.xx-x







Step 4--The next step is to compile the kernel with the patch and all the old configurations from the Openfiler kernel. For that, run the following commands (answer the new openMosix-related questions that make oldconfig throws up):

#make oldconfig
#make xconfig
#make dep
#make clean
#make bzImage
#make modules
#make modules_install
#make install








For creating new volumes with Openfiler, click on the 'Create' link and provide the volume details 

And this should complete the installation part. Now reboot the machine. While rebooting, you will see a new entry in your GRUB menu; select this entry and boot the machine normally.
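If you want to confirm that the new kernel entry has actually been added before you reboot, you can list the menu entries from the GRUB configuration. A quick check, assuming the Red Hat-style /boot/grub/grub.conf location that the CentOS-based Openfiler distribution uses:

#grep title /boot/grub/grub.conf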

Now one node is ready with OpenFiler running an openMosix-patched kernel. The next thing you have to do is install the userland and openmosixview tools on top of this machine, after which you will be able to add new nodes as well as monitor your OpenFiler cluster.

To do so, download the userland tools from http://prdownloads.sourceforge.net/openmosix/openmosix-tools-0.2.4-1.i386.rpm and download openmosixview from http://www.openmosixview.com/download/openmosixview-1.5-redhat90.i386.rpm.

Install them with the following commands:

#rpm -ivh openmosixview-1.5-redhat90.i386.rpm
#rpm -ivh openmosix-tools-0.2.4-1.i386.rpm
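As a quick check that both packages went in cleanly, you can query the RPM database:

#rpm -qa | grep -i openmosix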

Installing other nodes



Now your first node of the cluster is ready. For adding new nodes, you have two options: either boot the other cluster nodes using the openMosix live CDs, or install any flavor of Linux that has the same kernel version as the OpenFiler node and install the openMosix kernel patch (or the matching openmosix-kernel RPM) on top of it.
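Once the nodes are up, they need to know about each other. If you don't want to rely on openMosix's autodiscovery, the openmosix-tools package lets you describe the cluster statically in /etc/openmosix.map, with one line per node in the form 'node-number IP-address range-size'. A minimal sketch for a three-node setup, where the node numbers and IP addresses are purely our assumptions:

1   192.168.1.10   1
2   192.168.1.11   1
3   192.168.1.12   1

After editing the file on each node, load it with:

#setpe -w -f /etc/openmosix.map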

Monitor the cluster 



To monitor your cluster, start an X session on any cluster node that has the openmosixview RPM installed, and run the following command:

#openmosixview

This will pop up a window showing a list of all the servers in the cluster, along with the RAM and processor utilization of each.

From here you can configure the load-balancing efficiency of the cluster. Each

node has a slider bar that you can move to do the load balancing. When you do

this, openMosix will automatically adjust the processes running on each machine.

It will automatically migrate them to another node if the existing node is

overloaded. You can also migrate the processes manually from one node

(processor) to another by selecting the processes option from the toolbar. It

will show you a list of the processes that are running. By double-clicking on

any process you'll get another window in which you can choose the node you

want to migrate the process to.
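If you prefer the shell over the GUI, the openmosix-tools package also provides command-line equivalents: mosmon gives a text-mode load monitor, and migrate lets you push a process to another node by hand. A small sketch (the PID and node number here are hypothetical):

#migrate 1234 2

This asks openMosix to move process 1234 to cluster node 2.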

Only if your Windows 2003 server is a domain controller can you enable the option for installing a Head node

Using Openfiler



Now your Openfiler cluster is ready to manage your storage network. To start working with it, fire up any Web browser and open the link https://yourmachinename:445, where 'yourmachinename' stands for the host name of the machine where Openfiler is running. Remember to use https, as Openfiler works over a secured SSL connection. On the first screen it will ask you for a username and password; use 'openfiler' as the username and 'password' as the password. After logging in you will find a neat and easy-to-use storage-management workspace. From here you can manage the existing volumes and create new volumes. You can also manage users and set disk quotas for these users. These users can be of any form, such as LDAP, CIFS, Kerberos and NIS.

In the open source and Linux world, there have been quite a few clustering solutions available, and of different types, such as load-balancing and HPC (parallel processing) clusters. Load-balancing clusters are those that can run any batch operation over a cluster infrastructure and don't need any special modification to the software, while HPC clusters need customized applications that are aware of MPI libraries/APIs and can then run accordingly.

MS Compute Cluster



Microsoft has released its first commodity hardware-based compute cluster environment. The software is still in beta, so we decided to give it an early look. This is mainly an HPC (parallel processing) type of cluster and needs its applications to be aware of the MPI API.

The MS Compute Cluster Pack provides support for MPI2

libraries. It also contains an integrated job scheduler and the cluster resource

management tools. MPI is a standard application programming interface (API) and

specification for message passing. It has been designed specifically for

high-performance computing scenarios executed on large computer systems or on

clustered commodity computers.

It uses MS MPI, which is an MPI implementation derived from the Argonne National Laboratory's open source MPI2 implementation that is widely used by existing HPC clusters. MS MPI is compatible with the MPICH2 reference implementation and other MPI implementations, and supports a full-featured API with more than 160 function calls.
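To give a feel for how this is used in practice, the Compute Cluster Pack's scheduler can be driven from the command line as well as from its GUI console. A hedged sketch of submitting an MPI job from the Head node (the executable name and processor count are hypothetical, and the exact switches may well change between this beta and the final release):

job submit /numprocessors:4 mpiexec myapp.exe

Here the scheduler reserves four processors across the cluster, and mpiexec, MS MPI's process launcher, starts the MPI application on them.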

MS Visual Studio 2005 also includes support for developing HPC applications, such as parallel compiling. And this, we think, is the best part and will turn out to be a great plus point for Microsoft's compute cluster initiative, because a developer now gets a familiar interface for developing HPC programs and can subsequently deploy and run them in a familiar environment.

Requirements



All you require is the two-CD set of Microsoft Compute Cluster 2003, which includes a specialized Windows 2003 x64 version called MS Windows 2003 Compute Cluster Edition, and the Compute Cluster Pack.

Windows Server 2003 Compute Cluster Edition is a specialized 64-bit Windows Server OS, based on the 64-bit edition of Windows Server 2003, meant to support high-performance software. It is a full version of the Windows Server 2003 64-bit operating system; however, it is not intended to be used as a general-purpose server. You can also install the Compute Cluster Pack on top of any 64-bit edition of Windows 2003 Server.

Installation



The installation of MS CC consists of two parts: creating a Head node and creating the Compute nodes. The Head node acts as a server and all the management is done from this node. It is very easy to install. All you have to do is take a machine that has a 64-bit processor and install MS Windows 2003 CC Edition on it. Subsequently, you can either make it a domain controller by running dcpromo.exe or add it to an existing domain. After this is done, install the MS CC Pack on it: insert the Compute Cluster Pack CD and let the autorun start. When you start the installation of the Compute Cluster Pack, it will ask whether you want to install the components for a Head node, a Compute node or an Administrative node. When the wizard starts, select the first option, which says 'Create a New Compute Cluster with this node as a Head node', and proceed with the installation wizard.

This wizard will install around seven components, out of which three will be installed from the CD and the remaining four from the Internet (downloaded first and then installed). It will take around 15 to 20 minutes to complete the installation. When the installation is over, you can go to Start>Programs>Microsoft Compute Cluster Pack and click on Microsoft Compute Cluster Pack. Here, you will see a window showing the total number of processors available in the cluster and the number of jobs running. But at this point, as there are no nodes connected to the cluster, the software will only see the resources available on the Head node.

In this window you can see a summary of your cluster, such as the number of CPUs, total jobs, etc

Installing Compute Nodes



After you are finished with the creation of the Head node, the next thing you have to do is install and add the Compute nodes. For this, you will need another set of 64-bit machines (the number will depend on how big you want your cluster to be; for our test we created a two-node cluster) with Windows 2003 Server Edition or Compute Cluster Edition installed on each of them.

Now you have to install the Compute Cluster Pack on all these machines. For this, you can either use an RIS server or do it manually. We created a small two-node cluster, so we preferred to install it manually. For this, place the CC Pack CD in the CD drive of the node machine and let autorun start. When the wizard window pops up, select the second option, which says 'Join this cluster to an existing Compute Cluster as a Compute node.' Selecting this option will enable the 'Enter the name of the Compute Cluster Head node' option. Here, enter the fully qualified domain name of the Head node and follow the wizard to complete the installation.

Adding the Compute Nodes



After the installation is over, you have to add the nodes to the cluster. To achieve this, go to the Head node and open the Cluster Management console. Now, from the Node Management pane, click on the Add Nodes button. On the 'Before You Begin' page, click Next. On the Select Add Node Method page, select manual addition. It will now ask you for the FQDN of the node machines. After you provide the necessary details it will ask for the Administrator password, and once you are done, the nodes will be installed. The node you have added will be displayed on the Node Management page as 'Pending for Approval.' Approve or reject the node for inclusion in the cluster by right-clicking that node and choosing Accept.

This month we have seen how to build a commodity cluster using the MS Compute Cluster Pack. Next month, we will see how to build an MPI-compliant application using Visual Studio and run it on the cluster.

Anindya Roy
