Tech Explained

De-mystifying Grid Technologies

PCQ Bureau

03 Apr 2007 18:29 IST

Computing has changed drastically since the days of the first computer. In the 60s and 70s, mainframes handled all processing for government, scientific, and organizational needs. Then came desktops, or 'microcomputers.' Almost in parallel, networking concepts began to develop, and it did not take long before clusters and grids were implemented. In this article, we look at the computational extremes achieved by taking clusters a step further. Yes, we are talking about the Computational Grid, still in its infancy yet very promising. Read on to find out what it is, how it works, and most importantly, which way it is heading.

What is a Grid?

The name and the concept are both derived from the electric power grid. Put briefly, a grid is a way to share computational power and data storage over the Internet. Just as with the electric grid, you don't have to worry about where the power is coming from. The computational grid brings all the resources under it together into one entity. This pool of resources can then be used for high-end computation and, with the storage of the participating systems combined, provides a very large yet cheap storage option. While some might define a grid as a 'collection of clusters' or in other ways, we prefer to stick to the definition above without tying it to any specific structure.

Now let us get down to a more elaborate definition. Grid computing can best be defined as a form of distributed computing that works by sharing computing, application, data, storage, or network resources across dynamic and geographically dispersed organizations or computers. This is why we say that 'a collection of clusters' is not an appropriate definition: clusters don't work by bringing together systems located geographically apart. We will look at the differences between grids and clusters in detail a little later.

Grid technologies promise to change the way organizations tackle complex computational problems. However, the vision of large-scale resource sharing is not yet a reality in many areas. Grid computing is an evolving field, where the standards and technology needed to enable it are still being developed.

Need for a grid

Science has advanced by leaps and bounds and has grown more dependent on computational power for research and analysis. A decade ago, a single powerful machine was enough to analyze whatever data, say, a pharma researcher had. Things have changed a lot since then, especially in areas such as medical research, nuclear physics, and molecular studies. For example, the data that scientists download from satellites monitoring activity in the outer layers of the atmosphere comes to approximately 200 GB daily. Consider the processing power you would need to take in the data recorded over, say, a week (about 1.4 TB at that rate) and perform computations on it. It has to be huge. This is one of the reasons scientists wanted a system with enough power and near-infinite storage to run computations on the data they accumulate. Scenarios like this led to the need for the Computational Grid. The rest, as they say, is history.

Grid architecture

Much like the electric grid from which the idea of the Computational Grid came, the architecture is a layered one. The topmost layer consists of grid applications, which might be scientific, engineering, or commercial applications, or even web portals. The next layer is the grid programming environment and tools, which provides libraries, runtime interfaces, compilers and, most importantly, parallelization tools. Next comes a layer that is largely vendor-specific in implementation: the grid middleware. It is in charge of resource management, scheduling services, job submission, storage access, and information services across the entire grid. The middleware can be further divided into two sub-layers, which some treat as two separate layers. The user-level middleware takes care of the first two of these tasks (resource management and scheduling), while the core grid middleware handles the rest. Since the grid uses the Internet as its communication, computation, and in fact storage infrastructure, and connects to clusters and grids across geographies, a security layer becomes indispensable. Also referred to as the security infrastructure, this layer provides authentication and secure communication. The bottommost layer is the 'grid fabric,' which is nothing but the existing 'network of networks' and its components: clusters running various operating systems, storage devices, databases, and even specific devices such as sensors.

Grid Architecture (layers, top to bottom)

Grid application: Science, engineering, commercial applications, Web portals

Grid programming environments and tools: Languages, interfaces, libraries, compilers, parallelization tools

User-level middleware (resource aggregators): Resource management and scheduling services

Core grid middleware: Job submission, storage access, info services, trading, accounting

Security infrastructure: Single sign-on, authentication, secure communication

Grid fabric: PCs, workstations, clusters, networks, software, databases, devices

How it works

At the heart of the grid is what we call the broker. At a rather abstract level, the grid works as follows. Once a job is submitted to the grid, the broker discovers the resources that the user can access through 'Grid Information Servers.' It then negotiates with grid-enabled resources or their 'agents' using middleware services, maps the job's tasks to those resources (known as scheduling in the grid context), and then stages the data to be processed or the application to be run. This last step is referred to as 'deployment' in the grid context. While the application executes, the broker monitors its progress and handles changes in the grid structure and resource failures. Finally, it collects the results.
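
The broker's steps can be sketched in a few lines of Python. This is only an illustration of the flow described above, not the API of any real grid middleware: the Resource and Broker classes, the round-robin scheduling, and the use of a plain list as the 'information server' are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A grid-enabled resource as listed by an information server (hypothetical model)."""
    name: str
    allowed_users: set = field(default_factory=set)
    alive: bool = True

    def run(self, task):
        # Stand-in for remote execution: fail if the node has dropped out.
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return task()


class Broker:
    def __init__(self, information_server):
        # In this sketch the information server is just a list of Resource objects.
        self.information_server = information_server

    def discover(self, user):
        """Find the resources this user is allowed to access."""
        return [r for r in self.information_server if user in r.allowed_users]

    def schedule(self, tasks, resources):
        """Map tasks to resources (round-robin, purely illustrative)."""
        if not resources:
            raise RuntimeError("no accessible resources")
        return [(task, resources[i % len(resources)]) for i, task in enumerate(tasks)]

    def submit(self, user, tasks):
        """Deploy each task, react to resource failures, and collect the results."""
        resources = self.discover(user)
        results = []
        for task, resource in self.schedule(tasks, resources):
            try:
                results.append(resource.run(task))
            except ConnectionError:
                # Resource failure: retry on another resource that is still alive.
                fallback = next((r for r in resources if r.alive), None)
                if fallback is None:
                    raise RuntimeError("all accessible resources have failed")
                results.append(fallback.run(task))
        return results


nodes = [
    Resource("node-a", {"alice"}),
    Resource("node-b", {"alice"}, alive=False),  # simulates a failed node
    Resource("node-c", {"bob"}),
]
broker = Broker(nodes)
print(broker.submit("alice", [lambda: 1 + 1, lambda: 2 * 3]))  # prints [2, 6]
```

Here the second task is first mapped to the failed node-b and then rerun on node-a, which is the kind of failure handling the broker is expected to take care of.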

Grid Vs Cluster computing

In a grid environment, we have a loosely coupled architecture of systems connected mostly over a wide area network or the Internet. The job is more or less the same as that of a computational cluster: harnessing the resources of multiple idle machines. But a grid need not leverage only the processing power of those machines. You can instead create a data grid, which creates and manages distributed data storage and is still called a grid.

The other key feature that differentiates a grid from a cluster is its decentralized model, where you generally don't have a controller in place and every node works independently. The nodes can also be heterogeneous in terms of operating systems and hardware architecture.

One example of grid computing is the well-known SETI@home project to search for extraterrestrial intelligence. A centralized telescope captures radio signals from space; the captured data is then split into small packets and sent to several million computers connected to the Internet. The nodes process these packets in their idle time and return the results to a data center. This way, high processing power is obtained by utilizing the idle time of computers spanning the globe.

In this example you can clearly see that the architecture is decentralized and loosely coupled. It is also highly heterogeneous, because over the Internet one can't control which OS or architecture a node will be using.
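
The packet-based approach in this example can be illustrated with a small Python sketch. It is not how SETI@home is actually implemented: the function names, the 'peak value' analysis, and the volunteer labels are hypothetical and serve only to show data being cut into work units and handed to heterogeneous nodes.

```python
def split_into_work_units(samples, unit_size):
    """Cut a captured signal into fixed-size packets (work units)."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze(unit):
    """Placeholder for the analysis a volunteer node runs in its idle time."""
    return max(unit)


def run_on_volunteers(units, volunteers):
    """Hand out units to volunteer nodes in turn and gather results by unit index."""
    results = {}
    for index, unit in enumerate(units):
        node = volunteers[index % len(volunteers)]
        results[index] = (node, analyze(unit))
    return results


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
units = split_into_work_units(signal, 4)
# The labels stand for nodes with different operating systems and architectures.
report = run_on_volunteers(units, ["linux-x86", "windows-x86", "mac-ppc"])
print(report)
```

Each work unit is independent of the others, which is what lets nodes with different operating systems and hardware process them without coordinating among themselves.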

Clusters, on the other hand, use a single server or controller to manage, distribute, and aggregate the processes, with one or more client nodes connected in a tightly coupled environment such as a high-speed LAN or a specialized high-speed interconnect such as Myrinet. Unlike grid computing, where each client computer can run its own OS, a cluster is controlled and managed by a single OS running across all its computers, making it highly homogeneous. The server provides various files to the clients for execution, and applications run on the clients using parallel processing algorithms. The clients are little more than dumb terminals, in most cases with no display or input devices connected to them. The server is the single interface for the entire system, where all input and output takes place. To the user, the entire setup appears as a single system. Clusters of this kind are commonly known as SSI, or Single System Image, clusters.
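
The server-and-clients pattern can be sketched as follows. This is a minimal illustration under stated assumptions: a local thread pool stands in for the cluster's client nodes, and the function names and the sum-of-squares job are made up for the example. A real cluster would distribute the chunks over its interconnect, for instance with an MPI library.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(bounds):
    """Work done by one client node: sum of i*i over [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def master_sum_of_squares(n, workers=4):
    """Server side: split [0, n) into chunks, distribute them, aggregate the results."""
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    # The thread pool is only a stand-in for the client nodes of a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))


print(master_sum_of_squares(1000))  # prints 332833500
```

The caller sees one function call and one result, which mirrors how the server presents the whole cluster to the user as a single system.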

Beowulf clusters, which are built from commodity 'off the shelf' computer parts running free operating systems such as Linux, are an example of this kind of cluster. They provide very cost-effective parallel processing.

Grid in the Enterprise

Enterprises have their own complex applications and huge repositories of data, which also require high, if not mammoth (as with scientific data), computational power to analyze. Not surprisingly, vendors such as Sun Microsystems, Oracle, Fujitsu, and Informatica have started implementing grid-based solutions to tackle diverse issues. For example, Sun and Informatica provide grid-computing-based solutions for the data-centric needs of organizations, including data integration on a grid. Using a grid for data-centric needs brings major advantages such as high availability, automatic recovery, adaptive load balancing (where load is balanced according to the situation at hand), and sessions on the grid. Similarly, Oracle's grid implementations cover a wide range of services for the enterprise.

The most interesting of these is the grid solution for SOA runtime governance and SOA infrastructure monitoring. This is interesting because, as we have gone on record saying, SOA implementations more often than not bring together a variety of systems, components, and applications under one roof. Implementing grid control for SOA runtime governance would make runtime recording of service requests, monitoring of complex process flows, and similar tasks easier and more manageable, thanks to the computational power that a grid provides. Beyond this, Oracle's grid solution also supports identity management and the otherwise cumbersome task of application server cluster deployment.

With the grid making steady progress into enterprises, primarily to smooth out the management or deployment of very large implementations, these technologies can surely address many more pain areas if they are carefully matured over time. After all, who would not want their processes, analytics, or even data needs to be free of limits on computational power or storage?

Emerging trends

Let's now consider some of the latest trends in this sphere.

P2P Grid: One of the newest technologies in grid computing is the P2P grid. We talked about it in detail in the December issue of PCQuest and also showed how to implement it. A P2P grid is essentially a grid that runs over P2P connections. Both the data transfer and the CPU cycle migration are done over P2P. Currently, the framework being used runs on the Gnutella network.

Figure: Between Nov 97 and Feb 06, the PrimeNet Grid handled 11,579,649,914 P90 machine-hours. Its throughput rate can be characterized by a fitted exponential trend line.

Although there is no full-fledged application available that can leverage this concept, you can use an application called GPU (downloadable from http://gpu.sf.net). It is still in alpha and can only run some test applications such as image rendering and net crawling.

But imagine what will happen when this technology matures. Anyone with a machine and an Internet connection will be able to become part of a public grid and share processing power the same way we share MP3s and other music files today. We will then truly be able to achieve Internet computing, or rather Internet supercomputing.

Grid management: You must have heard of, and read in PCQuest about, many types of grids and clusters, such as the heterogeneous grid platforms Condor and Globus, or simpler clustering middleware such as the SSI-based OpenMosix and MPI-based ones like Oscar and Flash Mob. If you search the Net, you will find quite a few different kinds of grid products available. Some have a graphical front end to monitor the nodes, and some don't. Let's take a classic example, OpenMosix.

Figure: In a matter of 5 minutes, we were able to connect to a P2P grid with 10 GB of RAM and 10 GFlops of processing power using GPU.

OpenMosix has a graphical monitoring application called OpenMosixView, but have you noticed how difficult monitoring becomes once the number of nodes grows to around a hundred? Plus, it only shows you the current RAM and CPU utilization of the nodes. What about disk usage? And what if you want to see what the CPU utilization was over the last hour or day?

These things are very difficult to monitor in large grids or clusters. To make things worse, let's say you have multiple grids, one based on Condor and another on Globus, plus a cluster using Oscar or ROCKS with MPI support, and you want to monitor all of them from one place. What will you do?

Take the case of a cluster or a grid with hundreds or thousands of nodes spread over a wide geographical area. Managing them all from one place can be really difficult, so this is one area that is picking up on the grid technology front. The most common and popular tool for this purpose is Ganglia, which we covered in detail in our June 2006 issue. It is used by most of the big names in grid technologies, such as NASA, Cray, Sun, Boeing, the US Air Force, and Microsoft.
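
To make the monitoring requirements above concrete (history rather than just current values, disk as well as CPU, and several grids viewed from one place), here is a minimal Python sketch. It does not use or imitate Ganglia's interfaces: the MetricStore class, its methods, and the grid and node names are all hypothetical.

```python
import time
from collections import defaultdict, deque


class MetricStore:
    """Keeps a bounded history of samples per (grid, node, metric)."""

    def __init__(self, max_samples=1000):
        self._history = defaultdict(lambda: deque(maxlen=max_samples))

    def record(self, grid, node, metric, value, timestamp=None):
        """Store one sample; the current time is used if no timestamp is given."""
        ts = time.time() if timestamp is None else timestamp
        self._history[(grid, node, metric)].append((ts, value))

    def average(self, grid, node, metric, since):
        """Mean of the samples taken at or after `since`, or None if there are none."""
        samples = self._history.get((grid, node, metric), ())
        values = [v for ts, v in samples if ts >= since]
        return sum(values) / len(values) if values else None

    def grid_summary(self, metric, since):
        """Per-grid mean of one metric across all nodes, for a one-place overview."""
        per_grid = defaultdict(list)
        for (grid, _node, name), samples in self._history.items():
            if name == metric:
                per_grid[grid].extend(v for ts, v in samples if ts >= since)
        return {g: sum(vs) / len(vs) for g, vs in per_grid.items() if vs}


store = MetricStore()
store.record("condor", "n1", "cpu", 40.0, timestamp=100)
store.record("condor", "n1", "cpu", 60.0, timestamp=200)
store.record("globus", "g1", "cpu", 90.0, timestamp=150)
store.record("condor", "n1", "disk", 75.0, timestamp=200)
print(store.average("condor", "n1", "cpu", since=0))  # prints 50.0
print(store.grid_summary("cpu", since=120))  # prints {'condor': 60.0, 'globus': 90.0}
```

Keeping timestamped samples is what makes questions such as 'what was the CPU utilization over the last hour?' answerable, and keying each sample by grid lets installations built on different middleware share one view.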

Glossary

Cluster Interconnect: A very high-speed connection allowing computers in a cluster to interconnect.

Enterprise Grid Alliance: A vendor-neutral, open, and independent organization that works as a consortium focusing on the obstacles enterprises face in grid implementations and promoting open and interoperable solutions to those problems.

Enterprise Grid: A collection of networked components, including systems and applications such as CRM and ERP, usually managed by a distinct business entity that provides a set of services and assigns resources to these services to accomplish business goals.

N1: Sun's architecture for the next-generation data center, which makes the entire data center work as one single, unified system. It reduces management effort and costs and increases data-center resource utilization, infrastructure responsiveness, and agility.

Utility Computing: A 'pay-as-you-go' model of computing, analogous to electricity usage. Instead of paying for computing resources to handle peak load all the time, you pay only for the computing you use.

Utility Data Center: An infrastructure solution proposed by HP that allows virtualization of computing resources for the data center. The Utility Data Center includes servers, storage, and networking products that are integrated and deployed by intelligent management software that allows them to be shared and dynamically re-provisioned to accommodate changing workloads.
