
Running Linpack on Win Compute Cluster

PCQ Bureau

We first talked about Microsoft's Compute Cluster Suite in April last year. A lot has changed in the world of HPC since then, and the Cluster Suite has likewise undergone many changes. At that time, we had just three 64-bit machines in our labs, and we used all of them to create the MS Compute Cluster. The interface was so difficult that we couldn't even create a test MPI (Message Passing Interface) job and submit it to the cluster properly. But of course, that was the first public beta of MS CCS, and it was a bit too much to expect full-blown functionality from it.


Today we have the new Compute Cluster Suite SP1, and also twenty 64-bit machines at our disposal. So this time, we decided to build a much bigger cluster of 15 compute nodes, each with a dual-core CPU, plus one head node, using MS CCS SP1, and then test it with some standard industry benchmarks.

We'll first talk about how to build such a cluster, then discuss how to port High Performance Linpack (HPL) to Windows, and finally run it on all the nodes in a distributed manner to see what kind of performance it delivers.

Linpack is a benchmark that measures floating-point performance (in FLOPS) and comes in different variants. One such variant is HPL, or High Performance Linpack. It is an industry-standard benchmark for measuring the performance of supercomputers and is used by top500.org for ranking the world's 500 fastest supercomputers.
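To get a feel for what the FLOPS figures later in this article mean, here is a back-of-the-envelope estimate of the theoretical peak (Rpeak) of the cluster described in the Setup section below: 15 compute nodes, each with a dual-core 1.8 GHz Core 2 Duo. The figure of 4 double-precision FLOPs per core per cycle is our assumption for a Core 2 class CPU (one 2-wide SSE add plus one 2-wide SSE multiply per cycle), not a number from this article.

```python
# Rough theoretical peak (Rpeak) of the cluster described in this article.
# Assumption: 4 double-precision FLOPs per core per cycle for a Core 2 core.
nodes = 15            # compute nodes (head node excluded)
cores_per_node = 2    # dual-core CPUs
clock_hz = 1.8e9      # 1.8 GHz
flops_per_cycle = 4   # assumed DP FLOPs/cycle per core

rpeak_gflops = nodes * cores_per_node * clock_hz * flops_per_cycle / 1e9
print(f"Theoretical peak: {rpeak_gflops:.0f} GFLOPS")
```

Real HPL runs will land well below this figure, especially over commodity Gigabit Ethernet interconnects.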

While installing Microsoft

Compute Cluster Pack, you will see this screen. Select the first option to

use a node as the Head node

The Setup



MS CCS doesn't run on 32-bit hardware: both the Head node and the Compute nodes require x64 machines (only the client utilities can be installed on a 32-bit machine). The 15 nodes that we used for setting up MS CCS each had an Intel Core 2 Duo 1.8 GHz processor and 512 MB RAM. For the head node, we took a dual Xeon machine with 1 GB RAM. The 15 nodes were meant to process the computing jobs, whereas the Head node managed the jobs and the cluster as a whole.

To interconnect the cluster, we used a Gigabit Ethernet network. All nodes, of course, had Gigabit Ethernet cards and were PXE boot enabled. PXE-enabled cards allow an OS to be installed remotely, and come in handy when using the Windows remote deployment server to do a bulk installation of OSes on multiple machines. All nodes were headless and connected to an IP KVM for centralized management.

Wanna be part of this series?

What we plan to do is a comparative shootout of commodity clustering architectures in this series, where we will benchmark all the clustering solutions one after the other, each month, on a similar set of physical hardware. At the end of the series, we will compile a full-fledged summary of which is the best clustering solution, covering all the aspects such as usability, support, performance, and cost. But that's a long way to go, and right now we can't be sure where this story will head, because of the huge amount of resources, time and skilled manpower it requires. So, we are looking for contributors who can help us with this story. For more discussion on the topic, we've already started a thread on our forums. You can join in and discuss at http://forums.pcquest.com/forum/viewtopic.php?t=6148.

Installing the Head Node



The first thing you need, of course, is a copy of MS Windows Server 2003 Compute Cluster Edition; you can download a 180-day trial version from http://tinyurl.com/3ysqz5. For this download to be successful, you will require a Microsoft .NET Passport.

Install it on the machine you want to use as the Head node. The same OS can

be used for creating the Head node as well as the Compute nodes. After

installing the Head node, create an isolated domain for the cluster.


If you already have another domain controller on this network, you can make the head node an additional domain controller. We created an isolated domain controller for our setup: we ran the dcpromo command and followed the dcpromo wizard. Just make sure that while creating the domain, you also install and create a local DNS server on the Head node. This will help you when you deploy MS CCS.

Now, install a DHCP server on this machine so that the remote deployment server can work properly. (Configuring a DHCP server is beyond the scope of this article; we are assuming you know how to configure basic services like DHCP on Windows Server 2003.) One word of caution: if you plan to provide an Internet connection to your cluster (which is a good idea, as you will get regular updates and downloads easily), then configure it using Windows Internet Connection Sharing (ICS) and not Remote Access Server (RAS). I am not sure of the reason, but MS CCS recommends ICS over RAS, and we also ran into trouble when we tried to run it with RAS.

Once you are done configuring all the necessary services, i.e. ADS, DNS, DHCP and ICS, download the latest x64 version of the Compute Cluster Pack SP1 from http://tinyurl.com/2rjwt4.


When installation starts, a wizard pops up, which is pretty much self-explanatory. All you have to do is select the 'Create a new compute cluster' option. Follow the wizard to install all the components required to make the machine a Head Node.

Go to the Programs menu and you will find a new entry with two applications: Compute Cluster Job Manager (used for submitting and managing cluster jobs) and Compute Cluster Administrator (used to configure the cluster and cluster nodes).

This is the 'admin' window of

Compute Cluster Pack. All the installation and management tasks happen from

this single interface

Configuring the Cluster



This process involves three major tasks:

  1. Configuring the network topology
  2. Installing and adding nodes
  3. User management

Since ours is a test cluster, we won't put much emphasis on the user management part; rather, we will focus on configuring the network and the nodes.

Configure the Network



To configure the cluster, go to Program Files > Microsoft Compute Cluster Pack and start the Compute Cluster Administrator. Under the 'To do List' pane, select the 'Configure Compute Cluster Topology' option.

This will open up a wizard. From the drop-down menu, select 'Compute Nodes Isolated on Private Network' and proceed to the next step.

The wizard will then ask you to select the network cards connected to the public and private networks, one by one. Select the right option and then click on Finish. After this, disable the firewall, which is acceptable considering that ours is a test setup. For this, click on the 'Manage Windows Firewall Settings' option, which opens the standard Firewall manager window, and disable the firewall from there. Remember, if you are building the cluster on a production network, choose your security policy options accordingly.

This is the place where you

configure the network topology of your cluster. The next likely option will be

'Compute node loaded'

Installing Nodes



Click on the 'Install RIS' link and install Remote Installation Services. Then click on the 'Manage Image' option, which opens a wizard. In the next step, select the 'Add a new Image' option and click on Finish. This starts the standard RIS wizard, which asks for the folder where it will create the RIS root directory.

Make sure that you select a partition other than the system partition for this folder; otherwise you won't be able to install Windows 2003. Give the folder a name such as RemoteInstall and proceed. The wizard will then ask for the location of the CD whose image is to be created for remote installation. Place the Windows Server 2003 Compute Cluster Edition CD in the CD drive of the Head node and specify the drive letter in the wizard. Click on 'Next' and proceed till the wizard completes and the image building process starts. This process takes around 10 to 15 minutes.

Once it is done, your RIS is ready, and you can now turn on and boot all your Compute nodes over the network to start an unattended remote installation. This process is quite simple, so we won't discuss it in detail.

In the 'Compute Cluster

Administrator' window, you can check the status of the nodes. To check the

exact resource utilization of any node, use the System Monitor option

Adding Nodes



Till now, only the OS has been installed on the Compute nodes. To make the whole setup work properly, you have to install a few more components. For this, go to each node one by one and run the Compute Cluster Pack on it, this time selecting the option to join an existing compute cluster instead of creating a new one (i.e. don't make the machine a Head node). This installs all the required components, though in some cases it might also need to download some updates from the Internet during installation. So make sure that you have a connection handy, in case it's required.

Once this is done, you can add the nodes to the Head node. For this, first join all the nodes to the cluster domain and reboot them. Now go to the Head node and open the Compute Cluster Administrator. From the 'To do list', select the Add Nodes option, which opens a wizard. It will ask the kind of deployment that you want; select 'Manual Deployment' and click on 'Next'. In the next step, type in the FQDN of each node one by one and add it using the Add button. Then close the wizard by clicking on Finish. The FQDNs will be something like Node00x, where x is the number of the node.
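Since the Manual Deployment wizard wants each node name typed in, it can help to generate the full list once and paste from it. A minimal sketch, assuming the Node00x naming pattern above and a hypothetical domain suffix cluster.local (substitute your cluster domain's actual name):

```python
# Generate the FQDNs to paste into the Add Nodes wizard.
# 'cluster.local' is a hypothetical domain suffix; replace it with
# the name of the isolated domain you created for the cluster.
domain = "cluster.local"
fqdns = [f"Node{i:03d}.{domain}" for i in range(1, 16)]  # Node001..Node015
for name in fqdns:
    print(name)
```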

In MSCCS you can execute a task

directly through the command prompt by running the mpiexec command. To

submit a task, you have to go through the Task Properties window

Porting Linpack for Windows Compute Cluster



Here, we will see how one can port (re-compile) the Linpack source on Windows and run it in the Windows Compute Cluster environment. We tried it and used it to benchmark the Microsoft Compute Cluster we had created, but we faced a problem: Linpack is mainly used for testing Linux-based clusters, and trust me, porting it to run on MS CCS was no child's play. In this article, we will see how, with the help of some tools and libraries, you can recompile the HPL source files on your Windows architecture and run it on top of MS CCS.

Prerequisites



The list of prerequisite SDKs and libraries is quite long, but the first thing you need is MS Visual Studio 2005. Install it on one of the nodes of your MS Compute Cluster. The compiler should be installed on one of the nodes because that ensures you are compiling your application on the same hardware architecture it will run on, and as a result you'll get better performance.

After this, download both AMD's and Intel's math kernel libraries. Download and install the file called 'acml3.5.0-64-win64' from http://tinyurl.com/2k6tny. Also download and install Intel's Math Kernel Library using the following link: http://tinyurl.com/2p9m8f.

Now install the MS Compute Cluster SDK from http://tinyurl.com/3yjyg9; just make sure you download and install the 64-bit version. The installation is now done, but for Linpack to work properly you'll have to perform some nasty tricks. This is because the makefile that we are going to use for compiling Linpack has a lot of hard-coded path names.

To begin with, create a folder called 'scratch' at C:\ on the node where you have installed all the above-mentioned components. Then go to the folders where you installed ACML and MKL. By default, they will be under the Program Files folder if you did not specify any other path. Go to the AMD folder first and rename the ACML3.5.x folder to ACML3.0.0. Similarly, go to Intel's folder and rename 9.1.x to 8.0.1. With that, the hacking part is done and we are ready to work on the actual files.
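If you'd rather script the two renames than do them by hand, a short sketch follows. The install paths in the list are assumptions (defaults vary by version); point them at wherever ACML and MKL actually landed on your node.

```python
import os

def apply_renames(renames):
    """Rename each (src, dst) directory pair whose source actually exists."""
    done = []
    for src, dst in renames:
        if os.path.isdir(src):
            os.rename(src, dst)
            done.append(dst)
    return done

# Assumed install locations -- adjust to your machine before running.
renames = [
    (r"C:\Program Files\AMD\acml3.5.0", r"C:\Program Files\AMD\acml3.0.0"),
    (r"C:\Program Files\Intel\MKL\9.1", r"C:\Program Files\Intel\MKL\8.0.1"),
]
apply_renames(renames)
```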

Applications built using Visual

Studio with the Manifest option enabled can't be run using MS CCS. Therefore,

disable that option before you compile Linpack

Compiling Linpack



Now download the latest version of HPL from http://tinyurl.com/2mopw8. Unzip it so that the hpl folder ends up under the C:\scratch folder.

On Linux, Linpack uses the make command for compilation, but the makefiles are generally written for various Linux distros and not for Windows. So now you have to grab a makefile for Windows. To make the task easier, a .vcproj file for Linpack lets us compile it directly in VS 2005. You can download all the required components from our forum; the link is http://forums.pcquest.com/forum/viewtopic.php?t=6154&highlight=.

Go to this link and download the xphl_port.zip file. Unzip it under the C:\scratch\hpl folder and copy HPL_timer_walltime.c to the C:\scratch\hpl\testing\timer folder. A file with the same name will already be sitting in that folder, so replace the old one with the new one while copying.

Double-click on the xhpl.vcproj file to open it as a VC++ project in VS 2005. You have to build the project, but one more thing is required first. While compiling an exe, VS 2005 embeds the manifest file inside it, and this is not recognized by the mpiexec command that you will eventually use to run Linpack. So, you have to tell VS 2005 not to embed the manifest file while compiling. To do so, go to the Property page of the xhpl project, click on Manifest Tool > Input and Output, and change the value of 'Embed Manifest' from Yes to No. Now close this window, go to the Build menu, and click on the Build Project option to compile Linpack. The exe will be created at C:\scratch\hpl\bin\64\xhpl.exe.

Once you have submitted the job,

you can then view the status of the job under the Job Monitor window

Running XHPL



To run XHPL, you have to use the Compute Cluster Job Manager. For this, go to Program Files > Microsoft Compute Cluster Pack, and then to the File > Submit Job menu. This opens a window; provide a descriptive job name here and go to the Processors tab. Then select the number of processors that you want your job to use from the cluster. Remember, the number you provide should be equal to the number of cores, not the number of physical processors.

Now go to the Tasks tab and, in the Command Line field, type in the command you want to run. If it's an MPI process that you are going to run (which Linpack is), the command will be something like 'mpiexec xhpl.exe'. To add the task, click on the Add button. Tasks that have been added get listed under the task list.

Select the task and click on the Edit button. Here, provide the working directory and the input and output file names. The working directory is essentially the shared location where xhpl.exe sits, specified as a UNC path to that share.

The output file can be any file where you want Linpack's output to go; by default it is hpl.out. The input file is of course the HPL.dat file. Provide these values and submit the job for execution.

This will start the xhpl process on all the nodes. If it fails, you may have to modify the HPL.dat file in the bin folder. This is the file where you set all the runtime settings for xhpl, and from here you can also tune XHPL for performance. Tuning XHPL is a tedious job, and it is not possible to cover it in these two pages.
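As a starting point for that tuning, two common rules of thumb (general HPL advice, not from this article) are: size the problem N so that the matrix fills roughly 80% of aggregate RAM, and pick the most square process grid P x Q with P <= Q. A sketch applying them to our 15 compute nodes with 512 MB each and 30 cores:

```python
import math

# Rule-of-thumb starting values for HPL.dat.
# Heuristics: use ~80% of aggregate RAM for the matrix (8 bytes per
# double), and pick the most square P x Q grid with P <= Q.
nodes, ram_per_node = 15, 512 * 2**20   # 512 MB per compute node
cores = nodes * 2                        # one MPI process per core

total_ram = nodes * ram_per_node
n = int(math.sqrt(0.8 * total_ram / 8))  # largest N that fits in 80% of RAM
n -= n % 64                              # round down to a multiple of NB=64

# Most square process grid with P <= Q and P*Q == cores
p = max(d for d in range(1, int(math.sqrt(cores)) + 1) if cores % d == 0)
q = cores // p

print(f"N = {n}, P x Q = {p} x {q}")
```

The NB=64 block size is just a common starting value; the TUNING notes bundled with the HPL source cover the remaining parameters.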

While writing this article, I am still trying to figure out how to get the best performance out of our cluster by tuning XHPL. So far, I have achieved some 46 GFLOPS, but there is still a long way to go. Once I am done with this tuning, I will talk about how to tune XHPL in detail next month. Till then, you can refer to the article hosted at http://tinyurl.com/23q98y.
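To put those 46 GFLOPS in perspective, we can compare them against an assumed theoretical peak for the cluster. The figure of 4 double-precision FLOPs per core per cycle used below is our assumption for a Core 2 class core, not a number from the article.

```python
# Efficiency of the measured result against an assumed theoretical peak:
# 15 nodes x 2 cores x 1.8 GHz x 4 DP FLOPs/cycle (assumed) = 216 GFLOPS.
rmax = 46.0                  # measured HPL result (GFLOPS)
rpeak = 15 * 2 * 1.8 * 4     # assumed peak (GFLOPS)
efficiency = 100 * rmax / rpeak
print(f"Efficiency: {efficiency:.1f}% of assumed peak")
```

Efficiencies well under half of peak are typical for untuned HPL runs over Gigabit Ethernet, which is why the tuning promised above matters.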
