[Beowulf] Monkey Business

Luc Vereecken Luc.Vereecken at chem.kuleuven.ac.be
Wed Mar 15 04:29:05 PST 2006


At 08:24 PM 3/14/2006, Robert G. Brown wrote:

..snip...
>Using PVM it is fairly easy to write programs that CAN be tuned by hand or
>dynamically to a heterogeneous environment.  In fact, PVM can actually
>run a single program on multiple architectures -- one of my first
>exposures to it was one of Vaidy's presentations in which he showed
>scaling results for a computation that was being run in parallel across
>a Cray, a cluster of Sun workstations, a cluster of DEC workstations
>(this WAS 1992 and DEC still existed:-), and a cluster of I think HPs or
>AIX boxes, cannot remember.  One computation, four or five distinct
>binaries, ethernet for all IPCs.  Tres cool.
>
>Even today, I'm not at all certain that a version of MPI exists that
>>>can<< do this.  Sure, with Linux nearly ubiquitous in production
>cluster environments there is less incentive than there was a decade
>plus ago, but even now PVM "could" be used to run a single computation
>across e.g. i386 and x64 architectures, using native binaries on both
>(not i386 compatibility binaries and libraries on both).  PVM also gives
>you fairly straightforward control over just how the job distributes
>itself, permitting you (with some effort) to invoke multiple instances
>of a job per node, respawn a worker task on a crashed node, and so on.

..snap...

I used to do this type of thing with MPICH: AIX machines (two versions,
two types of POWER chips, some single-processor, others 2-way SMP), Solaris
(with 2-way SMP UltraSPARC CPUs), Linux machines (several
distributions) on a variety of Intel and AMD hardware, and once,
during a test, a DEC machine (I only used it once to test this and can't
even remember the details; it was a 4-way SMP machine, and I might even
have the DEC part wrong), all running distributed programs across the
network. Nice mix of big-endian and little-endian :-) The jobs started in a
parallel queue on the SP2 cluster of the computation center, then spread
out to our own machines here at chemistry, to some interactive machines
on the SP2 where I was allowed to run jobs (it was not _explicitly_
forbidden to have them started by a script or a job inside the
cluster, so...), and to some machines made temporarily available for these
tests/runs at other research groups. It worked very nicely, with no
problems in the communications, and since the algorithm I used was
self-balancing it scaled pretty well too. I gather MPICH2 no longer
supports this type of heterogeneity nowadays?
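
To illustrate why a self-balancing scheme copes well with a mix of fast
and slow machines, here is a minimal sketch in Python. It is a
single-process simulation, not the MPICH code described above: the
function name self_balance, the task costs, and the relative worker
speeds are all made up for the example. The assumption is that whichever
worker becomes idle first simply takes the next task from a shared queue.

```python
import heapq


def self_balance(task_costs, worker_speeds):
    """Simulate workers of different speeds pulling tasks from one queue.

    Hypothetical helper for illustration only. Returns a tuple
    (tasks_done_per_worker, finish_time).
    """
    # Each heap entry is (time at which the worker is free, worker index).
    free_at = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free_at)
    done = [0] * len(worker_speeds)
    finish = 0.0
    for cost in task_costs:
        # The worker that becomes idle first takes the next task.
        t, w = heapq.heappop(free_at)
        t += cost / worker_speeds[w]
        done[w] += 1
        finish = max(finish, t)
        heapq.heappush(free_at, (t, w))
    return done, finish


# Assumed setup: 14 equal tasks, three machines with relative speeds 1, 2, 4.
counts, makespan = self_balance([1.0] * 14, [1.0, 2.0, 4.0])
print(counts, makespan)
```

In this setup the faster workers end up with proportionally more tasks
without any per-machine tuning, whereas a static equal split would leave
the run waiting on the slowest machine.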

Luc Vereecken


