[Beowulf] Please help to setup Beowulf

Reuti reuti at staff.uni-marburg.de
Fri Feb 20 12:58:26 PST 2009


Am 20.02.2009 um 15:37 schrieb Bogdan Costescu:

> On Fri, 20 Feb 2009, Glen Beane wrote:
>
>> I looked into SGE a long time ago, but I found the MPI support  
>> terrible when compared to TORQUE/PBS Pro
>
> Indeed and AFAIK is still in a similar state today. There was talk  
> for a long time on the SGE devel list for a TM API to be added, but  
> it seems like this is not considered a high priority feature.

This is simply because they have a replacement called qrsh for the
usual rsh/ssh calls (as you know, but maybe others on the list don't).
Although in former times it just used a special version of rsh, it
was in the end under full control of SGE. In such a setup, the
traditional rsh/ssh can be disabled completely inside the cluster
(or ssh limited to admin staff only).

Nowadays it's replaced by a builtin startup method which is more  
scalable.

Having both a TM API and a tightly integrated rsh/ssh replacement
would of course be best. Linda (Gaussian's parallel library) can only
be started via rsh/ssh. I see sites that run a "cleaner" script in
their Torque-operated cluster for exactly this reason, to get rid of
the leftovers of such jobs, as Torque can't know what was started by
rsh/ssh on the nodes.
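
To illustrate what such a "cleaner" might decide on each node, here is
a minimal sketch in Python. It is not taken from any real site's
script: the function name, the input format and the UID threshold of
1000 are all assumptions made for the example. It only selects
candidate processes; how the process list and the list of users with
running jobs are obtained (e.g. from ps and the batch system) is left
out.

```python
def find_stray_pids(processes, users_with_jobs, min_uid=1000):
    """Return PIDs owned by ordinary users who have no job on this node.

    processes       -- iterable of (pid, uid, username) tuples (hypothetical format)
    users_with_jobs -- set of usernames the batch system reports as running here
    min_uid         -- UIDs below this are treated as system accounts (assumption)
    """
    stray = []
    for pid, uid, user in processes:
        if uid < min_uid:
            # Never touch daemons and other system accounts.
            continue
        if user not in users_with_jobs:
            # Ordinary user without a scheduled job on this node:
            # most likely a leftover started via rsh/ssh.
            stray.append(pid)
    return stray


if __name__ == "__main__":
    procs = [(1, 0, "root"), (4211, 1001, "alice"), (4300, 1002, "bob")]
    print(find_stray_pids(procs, {"alice"}))
```

Note that this heuristic cannot tell apart a leftover from a legitimate
job when the same user has both on one node, which is exactly the
limitation a tight integration avoids.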

> I've not only looked but actually used SGE for about 1 year (IIRC,  
> about 5 years ago) during which I had to spend time fixing the  
> interactions with LAM/MPI and many of the parallel applications  
> that were used on that cluster - and finally gave up.

Its successor Open MPI calls qrsh directly when it discovers that
it's running under SGE. It just checks some environment variables.
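
As a rough sketch of that kind of check (in Python, not Open MPI's
actual C code): the variable list below is my assumption of what gets
tested, and the function names are made up for the example. The
"-inherit" option of qrsh is the one used to start a task inside an
already running SGE job.

```python
import os

# Variables SGE sets inside a parallel job. Treating exactly these four
# as the detection criterion is an assumption for this sketch.
SGE_VARS = ("SGE_ROOT", "ARC", "PE_HOSTFILE", "JOB_ID")


def running_under_sge(env=None):
    """Return True if all SGE job variables are present and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in SGE_VARS)


def launcher_command(host, command, env=None):
    """Build the remote start command: qrsh under SGE, plain ssh otherwise."""
    if running_under_sge(env):
        # "qrsh -inherit" starts the task under control of the running job.
        return ["qrsh", "-inherit", host] + list(command)
    return ["ssh", host] + list(command)


if __name__ == "__main__":
    print(launcher_command("node01", ["hostname"]))
```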

-- Reuti


> On the plus side, during the time that SGE was used, I have never  
> seen a process left behind from a job and the queueing system  
> itself seemed very stable - something that I could not say for the  
> OpenPBS/Torque that I've also tested at that time.
>
> -- 
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
> E-mail: bogdan.costescu at iwr.uni-heidelberg.de
