Scyld + myrinet mpich-gm?

Sat Feb 3 21:15:58 PST 2001

I've gotten myself involved in bringing a small cluster up and
into production.  I'm learning as I go, with the help of the
archives of this mailing list.  Unfortunately the searchable
archives at Supercomputer.org seem to be offline (I get an internal
server error) and out of date (the last messages seem to be from
around May 2000).

The current setup is one master with 100base-T to the world and gigabit
fiber to a switch (16 10/100 ports plus 2 gigabit ports), and 12 diskless
slaves with 10/100 and myrinet interfaces.  The Scyld release of last
Monday is up and running, and I can bpsh to my heart's content.

I'm stuck at the point of trying to deploy MPI.  Scyld supplies mpi-beowulf
which does not appear to me to use bproc, and /usr/bin/mpirun and mpprun
which do.  I've built the mpich-gm from Myricom, but their mpirun command
does not grok bpsh, and expects either rsh or ssh daemons on each slave.

I've tried a number of approaches that start out looking like they might
work, but have gotten stuck after a few hours down each cowpath.

Here is a list of some of the snags (I've lost track of some others):

bpsh is not a full-blown shell: it doesn't deal well with redirection or
with changing directory before running a command, and in particular it
can't be swapped in for rsh or ssh when configuring mpich (i.e. -rsh=bpsh).
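
A minimal sketch of the kind of shim that might sit between an
rsh-expecting mpirun and bpsh.  Everything here is an assumption rather
than anything Scyld or Myricom ships: the function name rsh_to_bpsh, the
node_of table mapping host names to bproc node numbers, and the guess
that mpirun passes arguments rsh-style (a host name, optional -n and
-l user, then the command).  It only rewrites the argument list; it does
nothing about the redirection or change-of-directory problems above.

```python
def rsh_to_bpsh(argv, node_of):
    """Translate rsh-style arguments into a bpsh argument list.

    argv    -- arguments as rsh would receive them (without "rsh" itself)
    node_of -- hypothetical table mapping host names to node numbers
    """
    args = list(argv)
    host = None
    while args:
        arg = args.pop(0)
        if arg == "-l":
            # rsh's "-l user": drop the user name, bpsh has no use for it here.
            if args:
                args.pop(0)
        elif arg == "-n":
            # rsh's "no stdin" flag is simply dropped in this sketch.
            continue
        elif host is None:
            host = arg
        else:
            # First word after the host starts the remote command.
            args.insert(0, arg)
            break
    if host is None:
        raise ValueError("no host name in rsh-style arguments")
    if host not in node_of:
        raise KeyError("unknown host: %s" % host)
    return ["bpsh", str(node_of[host])] + args


if __name__ == "__main__":
    # Made-up host names, for illustration only.
    table = {"node0": 0, "node3": 3}
    print(rsh_to_bpsh(["node3", "-n", "/bin/hostname"], table))
```

A wrapper script built around this would hand the resulting list to
os.execvp; whether mpich-gm's mpirun is satisfied with that is untested.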

The master node is outside the myrinet, and I haven't a clue how to get
it to cooperate with the slaves over ethernet yet have the slaves
use myrinet as much as possible.

I tried hacking on the first test in mpich-1.2..4/examples/test
(pt2pt/third) that you get when you do make testing or runtests -check.
Tried to get it to use /usr/bin/mpirun.  Had to get rid of -mvhome and
-mvback args first, then tried to use bpsh to start up the mpirun on
one node, hoping it could use GM to start up on the other slaves.
After creating the directory in /var where it could create shm_beostat,
I now get truckloads of errors:
shmblk_open: Couldn't open shared memory file: /shm_beostat
shmblk_open failed.

I suppose these might be from the other nodes, which expect everyone to
be sharing /var, but I'm leery of NFS-mounting all of the master's /var
on each slave.

I tried applying the Scyld patches against the 1.2.0 mpich sources to
the 1.2..4 sources from Myricom, but most of them went into the mpid/ch_p4
directory, which is not built when --with-device=ch_gm is specified.

Then I thought I'd look into the mpprun sources, but I couldn't get
them to build even before I started hacking on them... decided to look
elsewhere for a while.

Tried getting sshd2 up and running on a slave node.  So far it insists
on asking for my password and won't accept it at all.

Has anyone got a working cluster anything like the one we're building?
What did you have to do differently to make the various packages and
drivers play nice with each other?  Where did I go wrong?

Thanks,

	-- ddj

	Dave Johnson
	ddj at cascv.brown.edu
	Brown University TCASCV