[Beowulf] quick and dirty method for starting job on another node?

Robert G. Brown rgb at phy.duke.edu
Tue May 3 09:12:00 PDT 2005

On Tue, 3 May 2005, Jim Lux wrote:

> I'm looking for a quick and dirty way for a process on node N to start a
> process on Node M, without having MPI, etc. running.  Needs to be a single
> line for use in, e.g. a script. SSH is available on the target nodes, but
> not a whole lot else. The nodes are using "busybox" for all the shell kinds
> of things.
> Performance isn't critical.

You mean other than

  ssh -x targetnode /pathto/taskname args...


(As in, doesn't your question contain its own answer?)

I enclose "taskmaster", a perl script I featured in one of my first CWM
columns.  It has two nifty features that may be of use to you.  One is
the "runtask" subroutine, which encapsulates the above for use in perl.

The other is the taskmaster main routine itself, which basically loops
over hosts and spawns independent threads (perl now supports real
threads) each of which contains an instance of runtask.

This isn't TOTALLY robust, but is pretty close and you could make it as
robust as you like.

So one of these things ought to work for you -- ssh directly, runtask
as a perl routine if the scripts you're referring to happen to be in perl,
or a hack of taskmaster itself if you want the threads and greater
robustness.  The threads are useful because of a change in ssh that
makes it relatively difficult to disconnect an ssh session with a
backgrounded task from its controlling tty to get back your shell.
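
A shell-only variant of the same idea, for scripts that are not in perl,
might look like the sketch below.  It assumes a POSIX sh (busybox ash
should qualify) and a hostfile with one hostname per line, each line
newline-terminated.  RSH and run_on_hosts are hypothetical names
introduced here for illustration; they are not part of taskmaster.  Each
ssh runs in the background and wait blocks until the last host finishes,
much as the thread join does in taskmaster.

```shell
#!/bin/sh
# Hypothetical override: set RSH=echo to dry-run without touching the network.
RSH="${RSH:-ssh -x}"

# run_on_hosts HOSTFILE COMMAND [ARGS...]
# Start COMMAND on every host in HOSTFILE, one background ssh per host.
run_on_hosts() {
    hostfile=$1
    shift
    while read -r host; do
        # Skip blank lines in the hostfile.
        [ -n "$host" ] || continue
        # stdin comes from /dev/null so ssh cannot swallow the host list.
        $RSH "$host" "$@" < /dev/null &
    done < "$hostfile"
    # Block until every background ssh has exited.
    wait
}
```

Called as "run_on_hosts hostfile /pathto/taskname args...", this prints
whatever the remote tasks write, interleaved in the order they finish.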

Hope this helps.



Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu

-------------- next part --------------
# $Id: taskmaster,v 1.7 2003/11/05 20:56:15 rgb Exp $

 use Config;
 use threads;

 # Set the path to the task from your ssh home directory
 my $taskpath = "Src/cworld/12_03/task/task";

 $Config{useithreads} or die "Upgrade to perl >= 5.8.0, compiled with threads";

 # Get required arguments (4, plus an optional 5th) from command line
 $verbose = 1;
 $ARGC = @ARGV;   # perl has no built-in $ARGC, so count the arguments here
 if($ARGC < 4 || $ARGC > 5){
   Usage("Incorrect number or type of arguments");
 }
 $hostfile = $ARGV[0];
 $nhosts = $ARGV[1];
 $nrands = $ARGV[2];
 $delay = $ARGV[3];
 if($ARGC == 5) { $verbose = 0; }

 # Get list of host names, one per line, stripping the trailing newline
 open(FD,$hostfile) || die "$0: can't open $hostfile";
 $i = 0;
 while(<FD>) {
   chomp;
   $hosts[$i] = $_;
   $i++;
 }
 close(FD);

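 # Split up nrands precisely and lazily (outside timer).
 # This balances our "load": hand out one unit of work at a time,
 # round-robin, so $nw[$i] ends up as the count for host $i.
 $nr = 0;
 $i = 0;
 while($nr < $nrands){
   $i %= $nhosts;
   $nw[$i]++;
   $nr++;
   $i++;
 }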

 # Start timer and spawn remote host task threads.
 $tstart = time;

 if($verbose){ print "\nSpawning host threads\n\n"; }
 for($i = 0;$i < $nhosts;$i++){
   $seed = $i + 1;
   $hostthread[$i] = threads->new(\&runtask,$taskpath,$hosts[$i],$seed,$nw[$i],$delay);
   if($verbose){
     print "Host $hosts[$i] thread running.\n";
   }
 }
 if($verbose){ print "\n"; }

 # Accumulate returns from each host task thread in @rands.
 # This will block until the last host completes.
 @rands = ();
 foreach $hostt (@hostthread){
   @rands = (@rands,split /\n/,$hostt->join);
 }
 $tstop = time;

 # Print out results and timing.  Don't time the printout.
 if($verbose){
   for($i = 0;$i < $nrands;$i++){
     print "rand[$i] = $rands[$i]\n";
   }
   print "\n";
 }
 $ttime = $tstop - $tstart;
 if($verbose){
   printf("%8s %8s %8s %8s\n","nhosts","nrands","delay","time");
 }
 printf(" %5d  %8d %8d %8d\n",$nhosts,$nrands,$delay,$ttime);


sub runtask {

 # Run the task on one remote host via ssh and return whatever it prints.
 my $taskpath = shift;
 my $host = shift;
 my $seed = shift;
 my $nwork = shift;
 my $delay = shift;
 my $task = "/usr/bin/ssh -x $host $taskpath $seed $nwork $delay";
 my $rand = `$task`;
 return $rand;

}


sub Usage {

 my $message = shift;
 if($message) {print STDERR "Error: $message\n";}
 print STDERR <<EOF;

taskmaster hostfile nhosts nrands delay [q]

 hostfile is a file that contains hostnames, one per line

 nhosts is the number of these hosts you wish to use

 nrands is the number of random numbers you wish to generate
 in parallel.

 delay is the number of seconds the worker task will sleep (simulating
 work) between each random number it generates.

 q(uiet) suppresses verbosity for trapped output.

EOF
 exit;

}
