[clug] Sun Grid Engine (was Re: Two (perhaps simple) shell questions.)

Andrew Janke a.janke at gmail.com
Wed Aug 26 19:26:08 MDT 2009

> So... dragging this to another topic, I presume that is the Sun Grid Engine.


> Generally speaking, how do you find working with the Sun Grid Engine?  Our
> target machines would be Debian Linux, either etch or lenny, at this point.

Works and works well for batch and parallel processing. There is a bit
of a black art to setting up queues and cluster resources to make
things work well for your particular requirements but the default FIFO
works well to get you going.

> Does it handle data placement or transport well?  The documentation seems
> light on the subject, but I have not looked /that/ long into it.

It can (typically via authenticated SSH keys and the like). In our
case I have always just run a local gigabit subnet (generally on a
separate interface) with NFS so that all machines look the same. This
is done using automount: all machines mount each other and the master
fileservers, and data is spread around in an attempt to limit NFS load
on the main server. A typical SATA->PCIe RAID system can handle about
100 cluster node instances with NFS mounts before things start to go
wonky. All processing is generally like this:

   1. Copy data to be worked on to /tmp
   2. Do stuff.
   3. Copy back.
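
As a rough sketch of that three-step pattern (Python purely for
illustration; the run_staged name, the "job-" prefix and the process
callback are hypothetical, not anything SGE provides), a job wrapper
could look something like this:

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(src: Path, dest: Path, process) -> Path:
    """Copy src to node-local scratch, process it there, copy the result back.

    `process` is a hypothetical callback taking (input_path, output_path).
    """
    # A fresh scratch directory under the local temp dir (usually /tmp),
    # removed automatically when the block exits.
    with tempfile.TemporaryDirectory(prefix="job-") as scratch:
        local_in = Path(scratch) / src.name
        local_out = Path(scratch) / ("out-" + src.name)
        shutil.copy2(src, local_in)      # 1. copy data to local scratch
        process(local_in, local_out)     # 2. do stuff, touching only local disk
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(local_out, dest)    # 3. copy the result back
    return dest
```

Here src and dest would normally live on the NFS mounts, so the shared
server only sees one read and one write per job rather than every
intermediate access.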

It tends to be a bad idea to run everything over NFS as the server
will choke when 100 jobs fire up at once each copying 200MB of data.
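
A back-of-the-envelope check on those numbers, assuming the server
sits behind a single 1 Gbit/s link and ignoring protocol and disk
overhead (both are assumptions, and the function name is made up for
the example):

```python
def min_transfer_seconds(jobs: int, mb_per_job: float, link_mbit_per_s: float) -> float:
    """Lower bound on the time to push jobs * mb_per_job through one shared link."""
    link_mb_per_s = link_mbit_per_s / 8  # megabits/s -> megabytes/s
    return jobs * mb_per_job / link_mb_per_s


print(min_transfer_seconds(100, 200, 1000))  # 160.0
```

So even in the best case the initial burst alone keeps the link
saturated for well over two minutes, before any real work starts.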

> Also, it looks like it runs, kind-of, on Windows, as long as we have Interix
> in place.  Since we need Windows, and to drive Microsoft Office via OLE
> Automation, does that work with SGE?

Sort of...   Meaning you want to do batch processing on Windows
machines on Office things via OLE? Surely there is an M$ widget to do
this? In the past I have tried many iterations to use up the spare CPU
cycles on Windows machines in the department but this isn't quite the
same thing.  More detail required!

> Finally, do you have anything to say on the topic of SGE vs Condor for this
> sort of work?

Well, for what we do (automated medical image analysis), SGE is the
better choice.

Andrew Janke
(a.janke at gmail.com || http://a.janke.googlepages.com/)
Canberra->Australia    +61 (402) 700 883
