Scyld, local access to nodes, and master node as compute node

Sean Dilda agrajag at scyld.com
Thu May 24 11:37:28 PDT 2001


On Thu, 24 May 2001, Brian C Merrell wrote:

> On Thu, 24 May 2001, Sean Dilda wrote:
> 
> > Is there any reason the program itself can't run itself in the special
> > way they want?  Anything you can do with rlogin or rsh can be done with
> > bpsh, except for an interactive shell.  However, this can be mimicked
> > through bpsh.  If you can give me some idea of what they are wanting to
> > do, I might be able to help you find a way to do it without requiring an
> > interactive shell.  Scyld clusters are designed to run background jobs
> > on all of the slave nodes, not to run login services for users on the
> > slave nodes.
> >
> 
> Hmmm.  I guess this warrants some background info.
> 
> The cluster is not a new cluster.  It was previously built by someone else
> who is now gone.  The cluster master node crashed, taking the system and
> most of their data with it.  I am now trying to rebuild the cluster.  The
> cluster previously used RH6.1 stock and followed more of a NOW model than
> a beowulf model, although all the hardware was dedicated to the cluster,
> not on people's desks.  I'm now trying to use Scyld's distro to bring the
> cluster back up.  I'm pretty happy with it, and managed to get the master
> node up with a SCSI software RAID array, and a few test nodes up with boot
> floppies.  Seems fine to me.  BUT....
> 
> There are three reasons that they want to be able to rlogin to the
> machines:  1) first, there are a number of people with independent
> projects who use the cluster.  They are used to being able to simply login
> to the master, rlogin to a node, and start their projects on one or more
> nodes, so that they take up only a chunk of the cluster.  2) Also, at
> least one researcher was previously able to and wants to be able to
> continue to login to separate nodes and run slightly different (and
> sometimes non-parallelizable) programs on his data.  3) ALSO, they have
> code that they would rather not change.

Ok, I understand now.  All of these things can be handled with bpsh.
Do you think these people will be happy with doing something like 'rsh
<node> <command>' instead of rsh'ing in to get a shell and then run the
command?  If so, you could probably get away with just symlinking
/usr/bin/rsh to /usr/bin/bpsh.
> 
> > It is possible to use BProc with a full install on every slave node,
> > however this reduces a lot of the easy administration features we've
> > been trying to put into our distro.
> >
> 
> I just set this up, and realize what you mean.  I had to statically define
> IP addresses, users, etc.  At first it wasn't a pain, but I realized after
> the first two that doing all 24 would be.  Even though it is now possible
> to rlogin to different nodes, it wasn't what I was hoping for. I imagine
> it will be particularly unpleasant when software upgrades need to be
> performed. :(

This is one of the advantages of our software.  It is set up in such a
way that you don't have to do so much work to keep the slave nodes up to
date.

> 
> I'm still hoping to find some happy medium, but I'm going to present these
> options to the group and see what they think.  The problem is that they
> are mathematicians and physicists, not computer people.  They really don't
> want to have to change, even though it seems to be the same.
> 
> Also one thing I'm still trying to find a solution to: how can the nodes
> address each other?  Previously they used a hosts file that had listings
> for L001-L024 (and they would like to keep it that way).  I guess with the
> floppy method they don't have to, because the BProc software maps node
> numbers to IP addresses,

Perhaps you could write some sort of rsh replacement script that turns
the L001-L024 names into the BProc node numbers, then call bpsh.  Would
that be a happy medium?
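
As a rough illustration of that idea, here is a minimal sketch of such a
wrapper in Python.  It rests on assumptions that are not stated anywhere
in this thread: that L001 corresponds to BProc node 0 and L024 to node
23, that bpsh is on the PATH, and that users call it as
'<script> <host> <command...>' without rsh options such as -l or -n.
The names host_to_node, build_command, NODE_COUNT and NODE_OFFSET are
made up for the example.

```python
#!/usr/bin/env python3
"""Hypothetical rsh stand-in: map L001-L024 host names to BProc node
numbers, then hand the command to bpsh."""
import os
import re
import sys

NODE_COUNT = 24   # assumption: 24 slave nodes, named L001..L024
NODE_OFFSET = 1   # assumption: L001 is BProc node 0


def host_to_node(host):
    """Translate a name such as 'L003' into a BProc node number."""
    match = re.fullmatch(r"[Ll](\d{3})", host)
    if match is None:
        raise ValueError("unrecognized host name: %r" % host)
    node = int(match.group(1)) - NODE_OFFSET
    if not 0 <= node < NODE_COUNT:
        raise ValueError("host name out of range: %r" % host)
    return node


def build_command(args):
    """Turn ['L003', 'uname', '-a'] into ['bpsh', '2', 'uname', '-a']."""
    if len(args) < 2:
        raise ValueError("usage: <host> <command> [args...]")
    return ["bpsh", str(host_to_node(args[0]))] + list(args[1:])


# Only act when arguments were given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    try:
        command = build_command(sys.argv[1:])
    except ValueError as err:
        sys.exit("rsh wrapper: %s" % err)
    # Replace this process with bpsh so its exit status is passed through.
    os.execvp(command[0], command)
```

Saved under some name of your choosing and called as
'<script> L003 uname -a', this would end up running 'bpsh 2 uname -a'.
If your node numbering differs, NODE_OFFSET is the only thing that
should need to change.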