[Beowulf] sun grid engine on Scyld beowulf cluster
dag at sonsorol.org
Thu Feb 17 14:12:07 PST 2005
I know Grid Engine well but not Scyld, so forgive my ignorance if I say
something stupid. Given the level of expertise on this list, I'm quite
certain I'm about to make a fool of myself :)
If Scyld is presenting you with a single system image (i.e. a single Linux
server that can farm out tasks to all those nodes) then you would
install SGE in the same way that you would install it on a big SMP box:
1. Install the SGE qmaster and scheduler on the master node
2. Install the execution host on the master node as well
You will only have 1 execd per queue, but each queue can be configured
with N "job slots", which control how many jobs can run at the same
time on the same machine.
Try setting the number of job slots within your single SGE queue to the
number of nodes in your cluster. This is similar to what you would do on
a big SMP machine -- a small number of queues, each supporting a decent
number of job slots.
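
A rough sketch of that slot setting, in Python, in case it helps. It only
builds the qconf command line and prints it rather than running it. The
queue name "all.q" and the node count of 16 are assumptions for
illustration; the "qconf -mattr queue slots" form is the SGE 6.x syntax,
so older releases may need the queue edited by hand with "qconf -mq"
instead.

```python
def slots_command(num_nodes, queue="all.q"):
    """Build the qconf call that sets a queue's slot count to num_nodes.

    "all.q" is an assumed queue name; substitute your own.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # qconf -mattr queue slots <N> <queue> changes one queue attribute.
    return ["qconf", "-mattr", "queue", "slots", str(num_nodes), queue]


if __name__ == "__main__":
    # 16 is an example node count, not taken from any real cluster.
    print(" ".join(slots_command(16)))
```

Run the printed command on the master node as an SGE admin user once the
queue name and node count match your setup.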
Then submit a bunch of jobs and see if SGE causes the master node to
fall over under load. If not then Scyld is doing its thing behind the
scenes to migrate stuff around to the other nodes.
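
The test described above could be scripted roughly like this. It is a
sketch under assumptions: the job names, the job count and the "sleep 60"
payload are placeholders (swap in something CPU-bound to generate real
load), and the "-b y" binary-submission flag needs SGE 6.x. By default it
only prints the qsub commands; pass dry_run=False to actually submit.

```python
import os
import subprocess


def burn_in_commands(count, payload=("sleep", "60")):
    """Build one qsub command per test job; the payload is a placeholder."""
    return [
        ["qsub", "-b", "y", "-N", "loadtest_%d" % i] + list(payload)
        for i in range(1, count + 1)
    ]


def submit_all(commands, dry_run=True):
    """Print each command, or hand it to qsub when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    submit_all(burn_in_commands(32))  # 32 jobs is an arbitrary example
    # 1-, 5- and 15-minute load averages of the machine running this script
    print("master load:", os.getloadavg())
```

If the load average on the master climbs in step with the number of
running jobs, the work is probably staying on the master rather than
being moved out to the nodes.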
> I am in the process of installing SGE on a Scyld beowulf cluster. As
> most people are aware, the Scyld cluster runs a complete OS (linux) only
> on the master node and the compute nodes are simply for executing.
> During the SGE install, it requires adding the compute nodes as execute
> hosts. I do not understand how to do this given the current setup of a
> scyld cluster since you can't "login" to the nodes to execute the
> install script. The script does exist on an NFS shared directory
> (cluster wide). Has anybody else run into this problem?