Clusters (was CLUG meeting 23 May 2002)

Ian McCulloch ianmcc at lorentz.leidenuniv.nl
Thu May 23 06:51:48 EST 2002


On Wed, 22 May 2002, Richard Cottrill wrote:

> > I don't know specifically what you refer to with the comparison to a
> > beowulf, I wouldn't be at all surprised if there is a grid-ified
> > version.  Do you have any links?
> 
> I did a quick google for a brief beowulf intro to reassure myself that I'm
> not talking complete crap and I found http://www.beowulf.org/intro.html
> 
> I think it's a cluster of dedicated machines; most likely using commodity
> hardware. I think it also implies Linux as the OS. The article above is
> careful to point out that each PID on the cluster is unique - allowing
> easy/efficient IPC between nodes.
> 
> From your description of a grid it sounds much like mosix. From skimming the
> Globus intro it looks like one significant difference is that mosix doesn't
> (yet) support adding nodes on-the-fly (so far as I can tell) and it doesn't
> support heterogeneous architectures (yet). It does look like 'grid' is just
> a way of describing a complete cluster computing environment of a certain
> variety. If I were a cynic I'd say it smells like a bit of marketing jargon
> (aka buzzword) for cluster.

As seems unfortunately ubiquitous nowadays, there is certainly a lot of 
marketing jargon on globus.org.  But the idea behind the grid is much 
bigger than clusters (especially beowulf clusters); it really aims to 
solve the problem of how to process really huge datasets coming mostly 
from bioinformatics (gene analysis etc), but also from particle 
accelerators and so on.  We are (or soon will be) talking about terabytes 
of data per day here.  This needs to be distributed to computing labs on 
various continents for processing, and then people need to be able to 
access the results.  The idea is that every step of the processing 
pipeline (including, in some cases, the experiment itself) is a part of 
the 'grid', with one set of protocols, allowing homogeneous, 
single-sign-on access to everything.  Vaguely similar to what M$ hopes to 
achieve with .NET web services (but minus the world domination bit ;-), 
combined with groupware scaled up to gigabyte data sets.  Real
interoperability, which does not mean running a virtual machine on your 
supercomputer!  Software for individual clusters is only a small part of 
this, and not even the main focus.

This is very vague, as I'm not at all familiar with the details.  I only know 
a broad outline of the ideas behind the grid from a few talks at an 
HPC conference I was at last year, and even then I'm far from convinced I 
grasped the real idea.  Maybe there is someone at ANUSF/APAC who knows 
more?  A possible CLUG talk?

Cheers,
Ian




