October 9, 2006
High performance computing (HPC) is invariably associated with rocket science.
We tend to assume the technology is reserved for fields such as weather
forecasting, genome mapping or simulating chemical reactions. But that's not
the complete picture. Without taking the argument too far, just pick up our
last six or seven issues, where we have covered half a dozen different ways to
use HPC in an enterprise. Today, HPC is no longer meant only for research labs
and universities; it has become a key enabler for business applications.

As per top500.org's latest report, 51% of
the top 500 supercomputers worldwide are being used across different
industry verticals

Still not convinced? Let's do some number crunching. The website
www.top500.org ranks the top 500 supercomputers across the globe, updating its
list through a survey every six months. As per its latest report, 51% of the
top 500 supercomputers are deployed across industry verticals, while those
used in academics and research together account for around 41%. The point we
are trying to make here is that with declining hardware and software costs,
and with more than one technology to choose from, the setup, management and
application of HPC have become easier. It is gradually entering every
vertical. Be it banking and finance, image processing and rendering, gaming
and entertainment, automobiles or medical sciences, HPC is providing
enterprises the much needed competitive edge to do more in less time. In this
article we take you through some of the latest technologies in this field.

HPC architectures
To begin with, let's talk about HPC technology architectures. The most common
architectures used today are MPP and clusters.

MPP: MPP, or massively parallel processing, is quite similar to SMP, or
symmetric multiprocessing, which we see today in a normal multiprocessor
server. Even hyperthreaded processors are a form of SMP. In both cases, the
processing units are tightly coupled, which means the interconnect is more
sophisticated and, in most cases, internal. The difference between the two is
that in SMP systems all CPUs share the same memory, while in MPP systems each
CPU has its own memory. This implies that the application must be divided in
such a way that all executing segments can communicate with each other, which
makes MPP systems difficult to program. On the other hand, this architecture
spares MPP systems the bottleneck found in SMP systems, where all the CPUs
attempt to access the same memory at the same time.
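To make the shared-versus-private memory distinction concrete, here is a minimal Python sketch (an analogy using only the standard library, not real SMP or MPP code): threads share one address space the way SMP CPUs share memory, while separate processes, like MPP nodes, each work on a private copy and must pass results back over an explicit channel.

```python
import multiprocessing as mp
import threading

def smp_style(results):
    # Threads share memory: the append is visible to the parent thread.
    results.append("from-thread")

def mpp_style(data, queue):
    # A child process gets its own copy of `data`; changes stay local,
    # so the result must be sent back explicitly, here via a queue.
    data.append("local-only")
    queue.put(sum(range(10)))

def demo():
    shared = []
    t = threading.Thread(target=smp_style, args=(shared,))
    t.start(); t.join()

    parent_data = []
    q = mp.Queue()
    p = mp.Process(target=mpp_style, args=(parent_data, q))
    p.start()
    result = q.get()      # the only data that crosses the process boundary
    p.join()
    return shared, parent_data, result

if __name__ == "__main__":
    print(demo())  # the thread's change is visible, the process's is not
```

The need for that explicit queue is exactly why MPP applications are harder to write: every piece of data the segments exchange has to be sent deliberately.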
Today, MPP sits at the core of a majority of high-end supercomputers. But
there's a catch. You can't build an MPP-based HPC with commodity PCs. To do
so, you need to approach vendors such as IBM or Cray, and because of this, the
cost involved is quite high. The benefits are fewer interconnect bottlenecks
and an architecture that is less complex to manage.

HPC solution providers
Cray: www.cray.com
IBM: www.ibm.com/servers/deepcomputing
Intel: www.intel.com/go/hpc
IS: www.interactivesupercomputing.com
NEC: www.hpce.nec.com
SGI: www.sgi.com/products/servers
Sun: www.sun.com/servers/hpc/index.jsp

Clustering: The other technology is clustering, or rather high performance
clustering. The major difference between MPP and clustering is that in
clustering the processing units, referred to as nodes, are loosely coupled,
and the interconnect is mostly external, such as a standard high-speed LAN,
Myrinet or InfiniBand. We discuss these interconnect technologies later. A
good thing about such an HPC is that it can be built on commodity hardware and
networking equipment, which brings down the cost, and plenty of software,
applications and middleware is available to build it. Clustered HPCs are
divided into SSI and PVM based clusters. We have discussed them quite
extensively in our previous articles. Here's a quick recap.

1. SSI based clusters: Single System Image (SSI) is a clustering
technology that makes the nodes on a network work like a single virtual
machine with multiple processors. The best thing about SSI is that
applications need no modification to run on this virtual machine. However,
this convenience brings certain drawbacks as well. SSI works very well when
you run many tasks simultaneously on the virtual machine, for instance,
converting hundreds of media files from one format to another. In such a
situation, the SSI cluster migrates the tasks evenly across all machines in
the cluster and completes the job significantly faster than a single machine
would.

On the other hand, if you deploy a single job or thread that requires a large
amount of number crunching, an SSI cluster will not give you any performance
improvement, because it cannot divide a single task into multiple threads and
spread them across the nodes of the cluster. One example of SSI clustering
middleware is openMosix. The charm of SSI based clusters is that you can
deploy any standard (Linux) application on the cluster without modifying it.
So, for enterprises that want to migrate an existing application (mainly batch
processing) to a cluster, but don't want to invest in re-creating it with
PVM/MPI support, or are using a third-party application whose code they can't
access, SSI based clusters are the best solution.
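As a rough illustration of this behaviour (plain Python, not openMosix; the `convert` function is a made-up stand-in for a media converter), a pool of worker processes plays the role of the nodes: each worker picks up whole, independent tasks, and no single task is ever split internally.

```python
from multiprocessing import Pool

def convert(name):
    # Stand-in for converting one media file; each call is a whole,
    # independent task, like one WAV-to-OGG conversion on an SSI node.
    return name.replace(".wav", ".ogg")

def batch_convert(files, workers=4):
    # The pool spreads complete tasks across workers, the way an SSI
    # cluster migrates whole processes; it never parallelizes inside one.
    with Pool(workers) as pool:
        return pool.map(convert, files)

if __name__ == "__main__":
    files = [f"track{i}.wav" for i in range(8)]
    print(batch_convert(files))
```

With only one file in the list, adding workers gains nothing, which mirrors why a single monolithic job sees no speedup on an SSI cluster.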

2. PVM clusters: Parallel Virtual Machine (PVM) is the other
clustering technology. It differs from SSI in that you need to recompile or
rebuild the application you want to run on the cluster with PVM/MPI support.
This means you cannot run an existing application, without modification, on
this cluster.

The commonly used clustering middleware here is OSCAR. A major benefit of such
a cluster is that when you run a single application needing huge
number-crunching capability on it, the application itself takes care of thread
management and job migration between nodes.

What would you use a PVM cluster for? Scientific applications, for one, are
best suited to PVM clusters. If you want to build a cluster that can do genome
mapping, for example, then PVM is the best choice. Similarly, data modeling
and forecasting jobs also run best on such a cluster.
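A toy sketch of this programming style in plain Python (real codes would link against PVM or an MPI library; `parallel_sum_squares` is our own illustrative name): the application itself splits one big job into chunks, hands a chunk to each worker, and gathers the partial results, which is exactly the work SSI cannot do for you.

```python
from multiprocessing import Process, Queue

def worker(chunk, out):
    # Each "node" computes a partial result for its chunk and sends it
    # back as an explicit message, in the PVM/MPI spirit.
    out.put(sum(x * x for x in chunk))

def parallel_sum_squares(n, nodes=4):
    data = list(range(n))
    step = (len(data) + nodes - 1) // nodes   # chunk size per node
    out = Queue()
    procs = [Process(target=worker, args=(data[i:i + step], out))
             for i in range(0, len(data), step)]
    for p in procs:
        p.start()
    total = sum(out.get() for _ in procs)     # gather partial sums
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(parallel_sum_squares(1000))
```

Note that the splitting and gathering logic lives in the application itself, which is why PVM/MPI codes must be written (or rewritten) for the cluster.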

HPC at work

We tested an SSI framework based cluster. For this, we built an 18-node
openMosix cluster and compared it against a standard dual Xeon 2.4 GHz
server with 1 GB RAM. The cost of this server was nearly equal to the cost
of our cluster. We compared both using two different tests, and the results
we got were really exciting:

1. With the server fully loaded and the cluster under only 10% load
(simultaneously converting 75 WAV files to OGG), the job took about 50%
less time on the cluster.

2. With both the server and the cluster fully loaded with the same load
(zipping and tarring 55 MB files in batches of 100, 150 and so on up to
300), the cluster gave 6 times better performance.

Interconnects
After architecture, the next most important thing in an HPC is the
interconnect. Generally, if you choose an MPP based architecture, you don't
need to bother about the interconnect, as it's already there in the system.
But if you're going for a cluster based approach, you have to choose the right
interconnect. Below, we discuss some of the key interconnect technologies used
in loosely coupled clusters.
Myrinet: Myrinet is a high-speed LAN system designed by Myricom to be used as
an interconnect between multiple machines, forming computer clusters. One of
the benefits of Myrinet is that it has much less protocol overhead than
standard interconnects such as Ethernet. As a result, it provides better
throughput, less interference and lower latency, making it one of the most
popular interconnects for clusters.
A standard Myrinet setup consists of two fiber-optic cables (one for upstream
and the other for downstream) per node, plus low-overhead switches and
routers. Fourth-generation Myrinet can deliver 10 Gbps. But speed is not the
only reason for its popularity. The other benefit is very low latency compared
to a normal LAN, achieved by letting the application running on the cluster
talk to the NIC's firmware and bypass the OS, sending messages directly onto
the network. Other key features of Myrinet are heartbeat monitoring, flow
control and error control on each link.

InfiniBand: InfiniBand is a point-to-point, bi-directional serial link used to
connect processors with high-speed peripherals such as disks. It supports
several signaling rates. Initially, InfiniBand was used to connect servers
with remote storage, networking devices and other servers, but it later came
to be used inside servers for inter-processor communication (IPC) in parallel
clusters. The serial connection's signaling rate is 2.5 Gbit/s in each
direction per connection, and InfiniBand also supports double and quad data
rates of 5 and 10 Gbit/s respectively.

Links can also be aggregated in units of 4 or 12, called 4x or 12x. A
quad-rate 12x link can carry 120 Gbit/s raw, or 96 Gbit/s of useful data.
Other benefits include greater performance, lower latency, easier and faster
sharing of data, built-in security and quality of service, and improved
usability (the new form factor is far easier to add, remove or upgrade than
today's shared-bus I/O cards). But again, this is not a commodity product, and
to deploy such a setup you need to hire specialists.
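The raw-versus-useful figures quoted above follow from InfiniBand's 8b/10b line coding, which puts 10 bits on the wire for every 8 bits of data. A quick back-of-the-envelope check in Python:

```python
def ib_rates(signal_gbps=2.5, speed_multiplier=1, lanes=1):
    """Raw and useful bandwidth for an InfiniBand link, in Gbit/s."""
    raw = signal_gbps * speed_multiplier * lanes
    useful = raw * 8 / 10   # 8b/10b coding: 8 data bits per 10 line bits
    return raw, useful

# Quad-rate (4x the base signaling rate), 12 aggregated lanes:
print(ib_rates(speed_multiplier=4, lanes=12))   # (120.0, 96.0)
# Single base-rate lane: 2.5 Gbit/s raw, 2.0 Gbit/s useful.
print(ib_rates())
```

The same 20% coding overhead applies at every rate, so 2.5 Gbit/s of signaling carries only 2 Gbit/s of data.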

Gigabit LAN: Now, this technology is known to everyone. Yes, it is the
standard Gigabit Ethernet connection used in standard LANs, and it can also
serve as a cluster interconnect. All you need for such a topology are standard
Gigabit switches/routers and CAT5 enhanced (CAT5e) UTP cables.

Being a technology that works with commodity products, it is one of the most
common interconnects for small and mid-sized HPC systems. The cost of
deploying it is very low, and it can actually run on your existing
infrastructure with minimal or no modification. Compared to the other
interconnects, however, it has drawbacks such as relatively high latency and
no QoS or HA built into the hardware.
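To get a feel for why latency matters, you can time small-message round trips yourself. This Python sketch measures the average round-trip time over the loopback interface (a real benchmark would of course cross the actual interconnect, and loopback numbers will be far lower than any LAN's):

```python
import socket
import threading
import time

def echo_server(srv):
    # Accept one connection and echo everything back until it closes.
    conn, _ = srv.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

def measure_rtt(rounds=100):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))     # pick any free port
    srv.listen(1)
    threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

    cli = socket.create_connection(srv.getsockname())
    start = time.perf_counter()
    for _ in range(rounds):
        cli.sendall(b"ping")       # small message, like a cluster control packet
        cli.recv(64)
    rtt = (time.perf_counter() - start) / rounds
    cli.close(); srv.close()
    return rtt                     # average seconds per round trip

if __name__ == "__main__":
    print(f"avg RTT: {measure_rtt() * 1e6:.1f} microseconds")
```

Multiply such a per-message cost by the millions of messages a tightly coupled job exchanges, and the appeal of low-latency interconnects like Myrinet becomes obvious.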

Final verdict
Broadly speaking, we have two options for an HPC deployment: a specialized
deployment, or one made up of commodity hardware, software and interconnects.
The decision is completely yours, and it depends on the type of work you want
to do.
If you want to run common applications on top of a cluster, an SSI based
commodity cluster will be fine for you. If you have a substantial amount of
unutilized processing power on your network, a commodity cluster will also do.

But if you need to run specially designed apps (most likely a single job
requiring a huge amount of processing power) with hardware-level failover and
rapid scalability, and you don't have the in-house expertise, then you should
approach a vendor to do the deployment for you.

Setting up a commodity cluster

How much does it cost to set up a high performance cluster? The answer depends on the number of nodes you want to deploy. Here is what it cost us to deploy a 20-node cluster:

Item       Configuration                                   Qty  Unit Cost (Rs)  Total (Rs)
Nodes      P4 2.4 GHz, 40 GB HDD, 256 MB RAM, CD drive      20  12,000          240,000
Switch     24-port, Gigabit                                  1  25,000          25,000
Monitors   14" color                                         1  4,500           4,500
Keyboards  101-key standard                                  2  200             400
                                                  Sub Total                     269,900

Option I: Low cost
           Angel racks                                       2  2,500           5,000
           Power strips, 15 A                                5  150             750
           Ethernet cabling, 50 m                            1  500             500
                                                  Sub Total                     6,250

Option II: High cost
           Server racks, installed                           2  30,000          60,000

The cost does not include cooling and power solutions. Also, depending on the make of the rack used, your costs could go up by another half a lakh or so for this setup. One monitor is always connected to the cluster manager machine, while the other is used for troubleshooting. To keep costs down, we did not use a KVM switch. Instead, we used rdesktop and SSH on Linux (rdesktop for Linux to Windows and SSH for Linux to Linux) for desktop sharing, along with the Remote Desktop client for Windows to Windows and PuTTY for Windows to Linux management. Doing away with the KVM switch, however, meant a few trips to the cluster to physically connect the monitor and keyboard for troubleshooting.
