PVM Problems

Robert G. Brown rgb at phy.duke.edu
Mon Apr 30 15:45:20 PDT 2001


On Mon, 30 Apr 2001 JParker at coinstar.com wrote:

> G'Day !
>
> I am running PVM (I believe it is the 3.4 version) on a Debian 2.2 system
> using Bash for my shell.  I configured it as per the man pages and the MIT
> reference book, except I defined PVM_ROOT and PVM_ARCH in .bashrc on every
> node (including the head server) and not the .cshrc .  Should I put in my
> .bash_profile on the head node ?  I also have my .rhosts correct, as
> verified by running rsh and ssh.
>
> The symptoms are that when I run the pvm shell, it starts correctly on the
> server, as verified by running administrative commands, but when I try to
> add a node, it returns a "terminated" and puts me back at my bash shell
> prompt.  I can then rsh or ssh over to the node and verify the pvmd is
> running on the node via the ps command.  When I try to use the pvm shell
> on that node, it displays a notice that the pvmd is already running and
> hangs till I kill the process.  Any clues or suggestions ?

The latest version of pvm has a debug facility that should give you a
verbose trace of its efforts starting client pvmd's IF they fail.  It
also has a much more tedious debug mode (pvm -d1) that gives you a trace
of its activities on the pvm level.  You need to be running pvm 3.4.3 --
nothing less will do to get the automatic console debugger.  See:

http://www.beowulf.org/pipermail/beowulf/2000-April/009148.html

(for example) for a partial discussion.

As for your particular problems:

Remember that bash and csh environments are DIFFERENT in remote shells
(although they are at least partly set by the pvm command itself, which
is just a shell script in /usr/bin/pvm in the current PVM rpm).  If, for
example, you were setting PVM_ROOT in /etc/profile expecting it to be
inherited by all rsh processes, it won't be (read the "INVOCATION"
section of man bash).  It should be set correctly if you set the
variable in ~/.bashrc, which bash reads for interactive non-login
shells and also when it detects that it was started by rshd or sshd
(which covers most remote shells).
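
A minimal sketch of the ~/.bashrc approach, plus a way to see what a
remote shell really inherits.  The /usr/share/pvm3 path is only an
assumption about where PVM is installed, and check_remote_pvm_root is a
hypothetical helper name, not anything that ships with PVM.  If your
~/.bashrc bails out early for non-interactive shells, the assignment
has to come before that point.

```shell
# Candidate lines for ~/.bashrc on every node.
# Assumption: PVM is installed in /usr/share/pvm3; adjust to your install.
PVM_ROOT=/usr/share/pvm3
export PVM_ROOT

# Hypothetical helper: report the PVM_ROOT that a remote, non-interactive
# shell on node $1 sees.  Uses $PVM_RSH if it is set, otherwise rsh.
check_remote_pvm_root() {
    node=$1
    remote_shell=${PVM_RSH:-rsh}
    # Single quotes keep $PVM_ROOT from expanding locally, so the value
    # printed is the one the remote shell has.
    value=$("$remote_shell" "$node" 'echo "$PVM_ROOT"')
    if [ -n "$value" ]; then
        echo "$node: PVM_ROOT=$value"
    else
        echo "$node: PVM_ROOT is not set for remote shells"
        return 1
    fi
}
```

Running "check_remote_pvm_root somenode" from the head node (somenode
being a placeholder) should print the path if the remote environment is
sane, and complain if it is empty.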

One reason that I like using ssh for PVM is that the original ssh
included an automatic scan of /etc/environment to set any environment
variables for all users doing an ssh.  OpenSSH looks like it might have
changed that to ~/.ssh/environment, which is less useful: it means
modifying or installing a file in each user's home directory.  No big
deal, but it is a per-user step.

> Another question is that the docs say there should be a lock file in /tmp,
> but what I find is a directory containing a log file (and of course no
> indication why I can not add the node).  Is this the normal setup ?  BTW,
> I can restart pvm on the server and launch another instance of the
> daemon on the node with similar results as stated above.

Yes, there should be a lock file.  It typically is something like
"/tmp/pvmd.##" where ## is your uid.  It isn't unheard of for pvm to get
"confused" and fail to remove the lockfile on termination, in which case
one needs a script (or proficiency with a foreach loop) that runs a
command over ssh on every node to clear the bogus lockfile.  Debug mode
will tell you if lockfiles are your problem.
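
Something like the following would do for the cleanup script, as a
sketch only.  It assumes the lock file really is /tmp/pvmd.<uid>, that
your uid is the same on every node, and that any leftover pvmd on the
node has already been killed.  clear_pvm_locks is a made-up name.

```shell
# Hypothetical helper: remove a leftover PVM lock file on each named node.
# Assumptions: the lock file is /tmp/pvmd.<uid>, your uid is identical on
# all nodes, and no pvmd you still want is running on them.
clear_pvm_locks() {
    remote_shell=${PVM_RSH:-rsh}      # honor PVM_RSH, fall back to rsh
    lockfile=/tmp/pvmd.$(id -u)       # uid is expanded on the local host
    for node in "$@"; do
        "$remote_shell" "$node" "rm -f $lockfile"
    done
}
```

It would be called with the node names as arguments, e.g.
"clear_pvm_locks node1 node2" (placeholder names).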

> Final question (for now), how do I set PVM up to use ssh instead of rsh ?

An excellent one.  Note:

rgb@ganesh|T:1658>printenv | grep PVM
PVM_ROOT=/usr/share/pvm3
PVM_RSH=/usr/bin/ssh
XPVM_ROOT=/usr/share/pvm3/xpvm

As of 3.4.3(?) or thereabouts, they added a PVM_RSH environment variable
to allow you to specify your rsh of choice without a recompile.  A lot
of this is documented in /usr/share/pvm3/Readme, which should be there
whether you installed from a prebuilt RPM (or, I presume, DEB) or built
it yourself from the sources.  There is a bit of help on
troubleshooting there, for example.
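
For bash users, setting PVM_RSH could look roughly like this in
~/.bashrc.  The candidate ssh paths are assumptions about a typical
install, and first_executable is a hypothetical helper; if no ssh is
found the variable is simply left alone.

```shell
# Hypothetical helper: print the first argument that is an executable file.
first_executable() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumption: ssh lives in one of these places.  If neither exists,
# PVM_RSH is not touched.
if ssh_path=$(first_executable /usr/bin/ssh /usr/local/bin/ssh); then
    PVM_RSH=$ssh_path
    export PVM_RSH
fi
```

Afterwards "printenv | grep PVM" in a fresh shell should list PVM_RSH
alongside PVM_ROOT.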

SO, my overall advice would be:

  upgrade to 3.4.3 (or the latest one, whatever it might be).  There are
definitely .rpms, probably .debs, for this version.  Or it isn't too
hard to build.

  this will give you PVM_RSH and a console debugger

  the PVM_RSH variable (plus reading the Readme and/or checking out the
various online documentation sources) will help lead you through using
ssh instead of rsh -- the Readme will offer advice on how to verify that
ssh (or rsh) is correctly installed.

When you get here, if you still have trouble, ask again and I'll see if
I can help you over the next step.

   rgb

>
> cheers,
> Jim Parker
>
> Sailboat racing is not a matter of life and death ....  It is far more
> important than that !!!

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






