High performance computing can be achieved using a cluster of machines instead of a single powerful machine. Installing a cluster is not for beginners: this article assumes more than a passing familiarity with Linux. For an in-depth understanding of clusters, the concept of cluster server and nodes, high performance computing, and the OSCAR package, refer to the May 2002 issue of PCQuest.
PCQLinux 8.0 uses OSCAR to set up a high performance cluster. OSCAR simplifies the setting up of a cluster and includes utilities and libraries for cluster control and high performance cluster programming. The supercomputer that we are about to build consists of a single server and a number of nodes. We deploy programs on the server, which in turn distributes their processing amongst the nodes in the cluster.
The server will require about 2 GB of hard disk space, while each node requires at least 1 GB. Plug the server and nodes into a separate network, using a hub or, preferably, a switch. Boot the server off PCQLinux 8.0 CD1. When the “Install Type” screen is shown, select the supercomputer option. Subsequently, when the network configuration screen is shown, make sure you specify a hostname manually (for example, server.cluster.net). Assign an IP address (say, 192.168.2.1) and the corresponding netmask (255.255.255.0). By default, the supercomputing install type installs only the required packages. However, you can select additional packages to install on the server from the package selection screen, if needed.
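To see what the example address and netmask imply for the rest of the cluster, here is a minimal Python sketch. It uses only the standard `ipaddress` module. The function name `describe_cluster_network` is hypothetical, and the values are the article's example ones, not something OSCAR requires.

```python
import ipaddress

# Example values from the article; adjust them for your own network.
SERVER_IP = "192.168.2.1"
NETMASK = "255.255.255.0"


def describe_cluster_network(server_ip, netmask):
    """Return the cluster network and the host addresses left over for nodes."""
    iface = ipaddress.ip_interface(f"{server_ip}/{netmask}")
    network = iface.network
    # Every usable host address except the one taken by the server itself.
    node_addresses = [h for h in network.hosts() if h != iface.ip]
    return network, node_addresses


if __name__ == "__main__":
    network, nodes = describe_cluster_network(SERVER_IP, NETMASK)
    print("Cluster network:", network)
    print("Addresses available for nodes:", len(nodes))
    print("First node address:", nodes[0])
```

With the example values, the server sits on the 192.168.2.0/24 network and the first free address is 192.168.2.2, which matches the address used for the first node later in this article.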
Preparing to run OSCAR
Once PCQLinux 8.0 is installed on the server, start X Window and launch a terminal window (select System Tools>Terminal from the start menu). All subsequent commands must be issued in this terminal window.
In the terminal window, change to the directory /opt/oscar-2.1 and issue the following command:
OSCAR requires PCQLinux RPMS to be present in a directory called /tftpboot/rpm. Insert and mount PCQLinux CD1 and issue the following command:
and follow the onscreen instructions. Next, insert and mount CD2 and run the copyrpms script again as above. The copyrpms script creates the directory /tftpboot/rpm and copies the PCQLinux RPMs that OSCAR requires from the CDs to this directory.
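Before moving on, it can be worth confirming that the RPMs actually landed in /tftpboot/rpm. The following Python sketch is only a sanity check written for this article, not part of OSCAR or of the copyrpms script. The helper name `count_rpms` is hypothetical, and the sketch does not know which packages OSCAR needs, so it can only report how many .rpm files are present.

```python
from pathlib import Path

RPM_DIR = Path("/tftpboot/rpm")  # directory named in the article


def count_rpms(directory):
    """Count the .rpm files directly inside directory (0 if it does not exist)."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for p in directory.iterdir() if p.is_file() and p.suffix == ".rpm")


if __name__ == "__main__":
    found = count_rpms(RPM_DIR)
    if found == 0:
        print(f"No RPMs found in {RPM_DIR}; run copyrpms for CD1 and CD2 first.")
    else:
        print(f"{found} RPM files found in {RPM_DIR}.")
```

A count of zero after running copyrpms for both CDs suggests the CDs were not mounted where the script expected them.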
Cluster setup
Issue the command:
After a couple of minutes, the OSCAR graphical wizard will show up.
As shown in the screenshot, the wizard accomplishes the cluster setup in steps. After each step a message pops up, reporting the success (or failure) of the step.
Skip steps 1 and 2 and click on “Install OSCAR Server Packages”. Next, click on the button “Build OSCAR Client Image”.
If the nodes have SCSI hard disks, click to choose the partition file and select the file named sample.disk.scsi.
Click on the “Build Image” button.
Click on the button ‘Define OSCAR clients’. For “Number of hosts”, type in the number of nodes that you want to plug into the cluster. Subsequently, click on the ‘Add clients’ button.
Set up the nodes
Next, click on the button ‘Set up Networking’. In the right frame you will see a tree-like structure. We need to assign the MAC (Media Access Control) addresses of the nodes to the listed IP addresses. This can be done by booting the nodes using an autoinstall floppy. To create the floppy, click on the button ‘Build AutoInstall Floppy’. This will launch a new terminal window. Press Enter, insert a blank floppy in the server and press ‘y’ to continue. After the terminal window disappears, click on the button ‘Collect MAC addresses’ in the OSCAR window. Insert the floppy in one of the node machines and power it on. The machine will boot from the floppy. Press Enter at the boot: prompt. After some time, the MAC address of the node will show up in the left frame. Suppose we want to assign the IP address 192.168.2.2 to this node. Click on the MAC address in the left frame and on ‘oscarnode1.cluster.net’ in the right frame. Then, click on ‘Assign MAC to node’.
Switch off the node machine. Now boot the second node machine from the same floppy. As before, the MAC address of the second node will appear in the left frame. Assign it to oscarnode2.cluster.net.
Repeat the above process for other nodes. When done, click on the button ‘Stop collecting’ on the OSCAR window.
After shutting down all the node machines, click on the button ‘Configure DHCP Server’. Then click on the close button in the ‘MAC address collection’ window.
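The MAC-to-node assignment done in the wizard boils down to a table of hostname, MAC address and IP address that the DHCP server then uses. The Python sketch below illustrates that idea only. It assumes nodes are named oscarnodeN.cluster.net and numbered sequentially from 192.168.2.2, as in the example above. The MAC addresses and both function names are hypothetical, and the output uses the general ISC dhcpd host-declaration syntax, which may differ from the file OSCAR actually writes.

```python
import ipaddress


def build_host_entries(macs, first_ip="192.168.2.2", domain="cluster.net"):
    """Pair MACs, in collection order, with oscarnodeN names and sequential IPs."""
    start = ipaddress.ip_address(first_ip)
    entries = []
    for index, mac in enumerate(macs):
        name = f"oscarnode{index + 1}.{domain}"
        entries.append((name, mac.lower(), str(start + index)))
    return entries


def format_dhcp_host(name, mac, ip):
    """Render one dhcpd-style host declaration, labelled with the short hostname."""
    label = name.split(".")[0]
    return (
        f"host {label} {{\n"
        f"    hardware ethernet {mac};\n"
        f"    fixed-address {ip};\n"
        f"}}"
    )


if __name__ == "__main__":
    # Hypothetical MAC addresses, for illustration only.
    collected = ["00:50:56:AA:BB:01", "00:50:56:AA:BB:02"]
    for entry in build_host_entries(collected):
        print(format_dhcp_host(*entry))
```

Because the DHCP server hands out addresses by MAC, the order in which you boot the nodes during collection does not matter, as long as each MAC is assigned to the intended node name.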
Starting node install
Important NOTE: The following step will wipe out any existing data on the hard disk of the node.
Boot the first node machine again from the floppy. The node will now install PCQLinux 8.0 from the network. When done, the message “I have done for … seconds. Reboot me already” will be shown.
Take out the floppy and reboot the node machine. This time it should boot from the hard disk. If everything has gone well, you will boot into PCQLinux 8.0. Repeat the process for the other nodes.
Click on ‘Complete Cluster Setup’ on the server and then on ‘Test cluster Setup’. All tests except the MPICH (via pbs) test should succeed (refer to the box: MPICH is not supported).
Adding and deleting nodes
To add a new node later on, launch the OSCAR wizard as:
from the terminal window. Click on “Add OSCAR Clients”. Click on “Define OSCAR Clients” (in the window that pops up). Here, change the starting number to one more than the number of nodes already in the cluster. That is, if your cluster already has 10 nodes, type in 11 for the starting number. Similarly, change the starting IP address to the first address not yet assigned to a node. For the “Number of hosts”, type in the number of new nodes. Click on “Add Clients”. The “Setup Networking” step is the same as explained above. Finally, click on the “Complete Cluster Setup” button.
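The arithmetic for the starting number and starting IP address can be sketched as follows. This Python example assumes that the existing nodes were numbered from 1 and given consecutive addresses starting at 192.168.2.2, with no gaps. If your cluster was set up differently, check the actual assignments instead. The function name `next_client_range` is hypothetical.

```python
import ipaddress


def next_client_range(existing_nodes, new_nodes, first_node_ip="192.168.2.2"):
    """Work out the values to enter when adding nodes to an existing cluster.

    Assumes node N currently holds the address first_node_ip + (N - 1).
    """
    start_ip = ipaddress.ip_address(first_node_ip) + existing_nodes
    return {
        "starting_number": existing_nodes + 1,
        "starting_ip": str(start_ip),
        "number_of_hosts": new_nodes,
    }


if __name__ == "__main__":
    # A cluster that already has 10 nodes and is getting 3 more.
    for field, value in next_client_range(10, 3).items():
        print(f"{field}: {value}")
```

For a cluster with 10 existing nodes, this gives a starting number of 11 and, under the stated assumption, a starting IP address of 192.168.2.12.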
To delete one or more nodes, click on “Delete OSCAR Clients”. In the pop-up window, select one of the nodes to delete and click on “Delete Clients”. Repeat the process to delete more nodes.
Henceforth, using the libraries installed on the cluster, you can start developing or executing cluster-aware applications on the server. To get you started, OSCAR installs PVM (Parallel Virtual Machine), PBS (Portable Batch System), the Maui PBS Scheduler, LAM Message Passing Interface and C3 (Cluster Command and Control). For more information, refer to the URL http://oscar.sourceforge.net.