[Beowulf] MPICH on heterogeneous (i386 + x86_64) cluster
Egan Ford egan at sense.net
Mon Jan 24 10:52:34 PST 2005
I have tried it (i.e. i686 + x86_64) and it did not work, though I did not
spend a lot of time trying to figure out why. I know that the method itself
is sound; it works great with hybrid ia64 and x86_64 clusters.
Below is a .pbs script to automate running xhpl across multiple
architectures. Each xhpl binary must have a .$(uname -m) suffix. This was
done with Myrinet. The resulting pgfile will look like this (node14 really
has 2 procs, but since mpirun is started from node14, it already has one
processor assigned to rank 0, so the pgfile only needs to describe the rest
of the processors):
node14 1 /home/egan/bench/hpl/bin/xhpl.x86_64
node10 2 /home/egan/bench/hpl/bin/xhpl.ia64
node13 2 /home/egan/bench/hpl/bin/xhpl.x86_64
node9 2 /home/egan/bench/hpl/bin/xhpl.ia64
Script:
#PBS -l nodes=4:compute:ppn=2,walltime=10:00:00
#PBS -N xhpl
# prog name
PROG=xhpl.$(uname -m)
PROGARGS=""
NODES=$PBS_NODEFILE
# How many proc do I have?
NP=$(wc -l $NODES | awk '{print $1}')
# create pgfile with rank 0 node with one less
# process because it gets one by default
ME=$(hostname -s)
N=$(egrep "^$ME\$" $NODES | wc -l | awk '{print $1}')
N=$(($N - 1))
if [ "$N" = "0" ]
then
    >pgfile
else
    echo "$ME $N $PWD/$PROG" >pgfile
fi
# add other nodes to pgfile
for i in $(cat $NODES | egrep -v "^$ME\$" | sort | uniq)
do
    N=$(egrep "^$i\$" $NODES | wc -l | awk '{print $1}')
    ARCH=$(ssh $i uname -m)
    echo "$i $N $PWD/xhpl.$ARCH"
done >>pgfile
# MPICH path
# mpirun is a script, no worries
MPICH=/usr/local/mpich/1.2.6..13/gm/x86_64/smp/pgi64/ssh/bin
PATH=$MPICH:$PATH
export LD_LIBRARY_PATH=/usr/local/goto/lib
set -x
if [ "$PBS_ENVIRONMENT" = "PBS_INTERACTIVE" ]
then
    mpirun.ch_gm \
        -v \
        -pg pgfile \
        --gm-kill 5 \
        --gm-no-shmem \
        LD_LIBRARY_PATH=/usr/local/goto/lib \
        $PROG $PROGARGS
else
    # cd into the directory where I typed qsub
    cd $PBS_O_WORKDIR
    cat $PBS_NODEFILE >hpl.$PBS_JOBID
    mpirun.ch_gm \
        -pg pgfile \
        --gm-kill 5 \
        --gm-no-shmem \
        LD_LIBRARY_PATH=/usr/local/goto/lib \
        $PROG $PROGARGS >>hpl.$PBS_JOBID
fi
exit 0
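
Since every pgfile entry points at an arch-suffixed binary, it may be worth
checking that those binaries exist before submitting. The following is only
a minimal sketch and not part of the script above: check_binaries is a
hypothetical helper name, and the list of architectures is assumed to be
known in advance rather than gathered over ssh.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): verify that an
# executable PROG.ARCH exists in DIR for every architecture listed.
# Usage: check_binaries DIR PROG ARCH...
check_binaries() {
    dir=$1
    prog=$2
    shift 2
    missing=0
    for arch in "$@"
    do
        if [ ! -x "$dir/$prog.$arch" ]
        then
            # Report each missing binary, but keep checking the rest.
            echo "missing: $dir/$prog.$arch" >&2
            missing=1
        fi
    done
    return $missing
}
```

For the ia64 + x86_64 case it could be called as
check_binaries "$PWD" xhpl x86_64 ia64 || exit 1
before the pgfile is written.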
> -----Original Message-----
> From: beowulf-bounces at beowulf.org
> [mailto:beowulf-bounces at beowulf.org] On Behalf Of Sean Dilda
> Sent: Friday, January 21, 2005 7:42 AM
> To: cflau at clc.cuhk.edu.hk
> Cc: beowulf at beowulf.org
> Subject: Re: [Beowulf] MPICH on heterogeneous (i386 + x86_64) cluster
>
>
> John Lau wrote:
> > Hi,
> >
> > Has anyone tried running MPI programs with MPICH on a heterogeneous
> > cluster with both i386 and x86_64 machines? Can I use an i386 binary on
> > the i386 machines while using an x86_64 binary on the x86_64 machines
> > for the same MPI program? I thought they could communicate, but it
> > seems that I was wrong because I got errors in testing.
> >
> > Has anyone tried that before?
>
> I've not tried it, but I can think of a few good reasons why you'd want
> to avoid it. Let's say you want to send some data that's stored in a
> long from the x86_64 box to the x86 box. Well, on the x86_64 box, a
> long takes up 8 bytes. But on the x86 box, it only takes 4 bytes. So,
> chances are some Bad Stuff(tm) is going to happen if you try to span an
> MPI program across architectures like that.
>
> On the other hand, the x86_64 box will run x86 code without a problem.
> So I suggest running x86 binaries (and mpich libraries) on all of the
> boxes. While I haven't tested it myself, I can't think of any reason
> why that wouldn't work.
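
The size mismatch for long described in the quoted message can be checked
per node before mixing binaries. This is only a rough sketch:
same_word_size is a hypothetical helper name, and it assumes that
getconf LONG_BIT is available on every node and reports the size of long,
in bits, for the installed userland.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every word size given is identical.
# Feed it the output of "getconf LONG_BIT" collected from each node.
same_word_size() {
    first=$1
    for bits in "$@"
    do
        [ "$bits" = "$first" ] || return 1
    done
    return 0
}

# Word size of the local node: typically 32 on i386 and 64 on x86_64.
getconf LONG_BIT
```

Collecting the value from each node (for example with
ssh $i getconf LONG_BIT, as the script does for uname -m) and passing the
results to same_word_size would flag a mixed 32/64-bit node list before
mpirun is started.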
