
Supercomputing With Linux

PCQ Bureau

As conventional computing races towards the physical

limits of hardware–the speed of light, the uncertainty principle, unpredictable

quantum effects–researchers all over the world are working on methods that will let

them continue increasing available computing power. Moore’s Law, the famous assertion

that the computing power available at a fixed price point doubles every eighteen months, shows

no sign of being repealed.


Though DNA, optical and quantum computers are under various

stages of development, and very exciting breakthroughs are being made almost every day, we

are not likely to find these highly experimental computational techniques powering the

machines on our desktops any time soon. But a more prosaic approach, clustering commodity PCs running Linux into a single parallel machine called a Beowulf, is already delivering supercomputer-class results.

In October 1996, a Beowulf system hit the 1 gigaflop mark

on a space science problem. Recently a 16-node Beowulf system, at a total system cost of

under $50,000 (the price of a high-end workstation), exceeded 1.25 gigaflops sustained

performance on a gravitational N-body simulation of 10 million particles–a

price-performance breakthrough of significant importance for a wide range of industrial

applications, including aerospace.

Beowulf’s foremost design objective was to deliver the

highest parallel processing performance for the lowest price. That explains the designers’

choice of mass-market commodity components, dedicated processors (rather than scavenging

cycles from other workstations, as some other systems do), and a high-throughput System

Area Network (SAN)–anything from fast Ethernet or multiple Ethernets combined into a

single virtual network, all the way to a 64-node, 128-processor hypercube with 8-node

clusters connected internally through fast Ethernet, and the meta-nodes connected through

a 1.2 gigabit Myrinet crossbar.

Donald Becker, one of the developers of Beowulf, is

responsible for many of the network interface drivers available for Linux, especially for

high-throughput devices. The project has led to the development of much high-performance

networking code that has now been incorporated into the Linux source tree.


The Beowulf software environment is called

Grendel. It’s

implemented as a set of add-on programs to the Red Hat Linux distribution, and includes

several independently developed parallel programming environments, providing message

passing interfaces with the PVM (Parallel Virtual Machine) and MPI (Message Passing

Interface) libraries. Beowulf also integrates the BSP (Bulk Synchronous Parallel)

libraries which reduce the need for explicit message passing code to be added to serial

applications and parallel applications developed to work on shared memory architectures–resulting

in faster porting and development times. System V IPC and p-threads are also supported.
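The message-passing model that PVM and MPI provide boils down to nodes exchanging explicit send and receive operations: a master scatters work, workers compute, and results are gathered back. On a real Beowulf this would be written against the MPI C bindings across the network; the sketch below is my own stand-in using Python threads and queues, so it runs anywhere without a cluster, but the scatter/compute/gather shape is the same.

```python
# Sketch of the message-passing model behind PVM and MPI. Threads and
# queues stand in for cluster nodes and the network; real Beowulf code
# would use the MPI C bindings instead.
import threading
import queue

def worker(rank, inbox, outbox):
    """Each 'node' receives a chunk of work and sends back a partial sum."""
    chunk = inbox.get()               # receive a message from the master
    outbox.put((rank, sum(chunk)))    # send the partial result back

def scatter_gather_sum(data, nworkers=4):
    """Scatter the data across workers, gather partial sums, reduce."""
    inboxes = [queue.Queue() for _ in range(nworkers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(r, inboxes[r], results))
               for r in range(nworkers)]
    for t in threads:
        t.start()
    for r in range(nworkers):         # scatter: one slice of the data per worker
        inboxes[r].put(data[r::nworkers])
    total = sum(results.get()[1] for _ in range(nworkers))  # gather and reduce
    for t in threads:
        t.join()
    return total

print(scatter_gather_sum(list(range(100))))  # prints 4950, same as a serial sum
```

The BSP libraries mentioned above hide exactly this plumbing, which is why they speed up porting serial and shared-memory code.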

With so many programming environments available, Beowulf is

probably one of the most flexible parallel processing systems available. This greatly aids

in porting applications from traditional parallel supercomputers. In the past, code was

often developed and debugged on an inexpensive Beowulf system and then transferred to a

large distributed-memory machine, like an Intel Paragon, a Cray T3D or a Thinking Machines

CM5. Recently, however, a duo of scientists ported their highly optimized N-body codes the

other way–from a CM5 to a Beowulf. The port was accomplished with a simple recompile.

In a Beowulf cluster, every machine is responsible for

running its own copy of the kernel, and nodes are generally autonomous at the kernel

level. However all the kernels participate in a number of global namespaces, such as the

Global Process ID space. All processes running on a Unix machine have a unique Process ID

or PID to identify them. In a parallel distributed processing system, you’d want to

have unique process IDs across an entire cluster, spanning several kernels. Beowulf

provides at least two mechanisms to support this: one internal, and one inherited from

PVM.
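Why cluster-wide PIDs matter is easy to see: two nodes can each assign local PID 1234 to different processes. One toy way to build a global ID space (my own illustration, not Beowulf's actual mechanism) is to pack a node number into the high bits of the identifier:

```python
# Toy global process ID scheme (illustrative only, not Beowulf's actual
# mechanism): reserve the high bits of the ID for a node number, so IDs
# stay unique even when two nodes reuse the same local PID.
NODE_BITS = 8          # up to 256 nodes in this sketch
PID_BITS = 16          # local PIDs assumed to fit in 16 bits

def global_pid(node, local_pid):
    """Combine a node number and a local PID into one cluster-unique ID."""
    assert 0 <= node < (1 << NODE_BITS)
    assert 0 <= local_pid < (1 << PID_BITS)
    return (node << PID_BITS) | local_pid

def split_gpid(gpid):
    """Recover (node, local_pid) from a global ID."""
    return gpid >> PID_BITS, gpid & ((1 << PID_BITS) - 1)

# The same local PID on two different nodes yields two distinct global IDs
assert global_pid(0, 1234) != global_pid(1, 1234)
assert split_gpid(global_pid(3, 1234)) == (3, 1234)
```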


Linux uses the /proc pseudo-filesystem to present

system and process information in real time through a familiar ‘filesystem’

interface. A basic /proc filesystem presents a subdirectory for each process on the

local processor. The Linux implementation extends /proc to present almost all

system information in this format. Integrating /proc filesystems across a Beowulf

cluster would allow standard Unix tools for process management and job control to work

transparently on the virtual machine without the need for any modifications, because most

of these utilities get all of their information from the /proc

filesystem. Work is

underway on a unified /proc filesystem for Beowulf.

The Linux kernel also provides a filesystem-like interface

into the virtual memory system. That makes it simpler to create page-based shared memory

systems that allow the entire memory of a cluster to be accessed transparently from any

node. Other environments being added to Beowulf to implement distributed shared memory

include the page-based NVM (Network Virtual Memory). Beowulf also supports parallel

filesystem libraries like PVFS (Parallel Virtual File System) and PPFS (Portable Parallel

File System) to allow high-performance disk I/O. NASA Goddard’s 16 Pentium Pro

cluster contains 100 GB in distributed disk with a whopping 1 Gbit per second memory to

disk bandwidth.
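The /proc interface described above is simple enough to read by hand: every numeric directory entry is a process, and files like /proc/[pid]/stat hold its details. The sketch below enumerates the local node's processes this way; a unified Beowulf /proc would present every node's processes through this same interface, which is why unmodified ps-style tools would just work.

```python
# Enumerate processes the way ps does: each numeric entry under /proc
# is a process, and /proc/<pid>/stat holds its PID and command name.
import os

def list_processes(proc="/proc"):
    """Return (pid, command) pairs for every process visible under proc."""
    procs = []
    if not os.path.isdir(proc):          # not a Linux /proc: nothing to show
        return procs
    for entry in os.listdir(proc):
        if not entry.isdigit():          # skip /proc/meminfo, /proc/net, ...
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                data = f.read()
        except OSError:                  # process exited while we were reading
            continue
        # the command name sits in parentheses and may itself contain spaces
        name = data[data.find("(") + 1 : data.rfind(")")]
        procs.append((int(entry), name))
    return procs

for pid, comm in sorted(list_processes())[:5]:
    print(pid, comm)
```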

Beowulf stacks up comparably with machines that can cost a lot

more. A Beowulf has equaled the performance of IBM SP2s with comparable nodes at one-tenth

the price. Beowulf systems running on the latest generation of Intel or DEC Alpha chips

perform comparably with supercomputers like CDAC’s Param, Intel’s Paragon, and

the Cray T3D (which have similar, parallel internal architectures). For high-performance

distributed computing, you can’t go far wrong with Beowulf.

The American federal government has always been paranoid

about letting high-performance computing technology out of the country. But it’s

already losing its war against strong cryptography exports, and with Beowulf it’s no

longer a very big deal for anyone with the motive, the means and the skill to put together

a supercomputer that the CIA and the NSA will consider a national security threat. Even if

all you really want to do is set another record at the Oscars.

Next month: How you can play around with

multiprocessing and increase the computing throughput in your organization using PC

clusters.
