mpi-prog porting from lam -> scyld beowulf mpi difficulties
Peter Beerli beerli at genetics.washington.edu
Thu Nov 29 13:19:33 PST 2001
Jim,
the buffer in broadcast_data_master gets allocated to the size needed
inside pack_databuffer() [which returns the allocated size of the buffer]
before the buffer is broadcast.
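
A minimal sketch of that pattern, not the actual migrate-n code: the caller
starts with a one-byte buffer and passes its address, and the packing routine
grows it with realloc() and returns the packed size. The names demo_data,
append_bytes and pack_demo_buffer are hypothetical stand-ins for data_fmt and
pack_databuffer, and the two-field layout is an assumption made only for
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real data_fmt; illustration only. */
typedef struct {
    long nrecords;
    const char *name;
} demo_data;

/* Append len bytes from src to *buffer, growing it with realloc().
   Returns the new used size, or -1 if the allocation fails
   (in which case *buffer is left unchanged and still valid). */
static long append_bytes(char **buffer, long used, const void *src, long len)
{
    char *grown;
    assert(used >= 0 && len > 0);
    grown = realloc(*buffer, (size_t)(used + len));
    if (grown == NULL)
        return -1;
    memcpy(grown + used, src, (size_t)len);
    *buffer = grown; /* the caller's pointer now refers to the larger block */
    return used + len;
}

/* Pack d into *buffer; returns the number of bytes packed, or -1 on failure. */
long pack_demo_buffer(char **buffer, const demo_data *d)
{
    long namelen = (long)strlen(d->name) + 1; /* include the terminating NUL */
    long used = append_bytes(buffer, 0, &d->nrecords, (long)sizeof d->nrecords);
    if (used < 0)
        return -1;
    return append_bytes(buffer, used, d->name, namelen);
}
```

Because the routine receives &buffer, the pointer held by the caller is
updated to the resized block, so the size it returns matches what is handed
to the broadcast afterwards.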
Peter
On Thu, 29 Nov 2001, James Long wrote:
> The buffer allocation is only one byte in broadcast_data_master.
> Looks like you should make it big enough for all your data and
> options before you broadcast it, as there is no telling what might
> stomp that memory after you pack it and before it gets sent.
>
> Jim
>
> At 5:03 PM -0800 11/28/01, Peter Beerli wrote:
> >Hi,
> >I have a program developed using MPI-1 under LAM.
> >It runs fine on several LAM-MPI clusters with different architecture.
> >A user wants to run it on a Scyld-beowulf cluster and there it fails.
> >I did a few tests myself and it seems
> >that the program stalls if run on more than 3 nodes, but seems to work for
> >2-3 nodes. The program has a master-slave architecture where the master
> >is mostly doing nothing. There are some reports sent to stdout from any node
> >(but this seems to work in beompi the same way as in LAM).
> >There are several things unclear to me
> >because I have no clue about the beompi system, beowulf and scyld in
> >particular.
> >
> >(1) if I run "top" why do I see 6 processes running when I start
> > with mpirun -np 3 migrate-n ?
> >
> >(2) The data-phase stalls on the slave nodes.
> > The master node is reading the data from a file and then broadcasts
> > a large char buffer to the slaves. Is this wrong? Is there a better way
> > to do that [I do not know how big the data is and it is a complex mix
> > of strings numbers etc.]
> >
> >void
> >broadcast_data_master (data_fmt * data, option_fmt * options)
> >{
> > long bufsize;
> > char *buffer;
> > buffer = (char *) calloc (1, sizeof (char));
> > bufsize = pack_databuffer (&buffer, data, options);
> > MPI_Bcast (&bufsize, 1, MPI_LONG, MASTER, comm_world);
> > MPI_Bcast (buffer, bufsize, MPI_CHAR, MASTER, comm_world);
> > free (buffer);
> >}
> >
> >void
> >broadcast_data_worker (data_fmt * data, option_fmt * options)
> >{
> > long bufsize;
> > char *buffer;
> > MPI_Bcast (&bufsize, 1, MPI_LONG, MASTER, comm_world);
> > buffer = (char *) calloc (bufsize, sizeof (char));
> > MPI_Bcast (buffer, bufsize, MPI_CHAR, MASTER, comm_world);
> > unpack_databuffer (buffer, data, options);
> > free (buffer);
> >}
> >
> > the master and the first node seem to read the data fine
> > but the others either don't and wait or silently die.
> >
> >(3) what is the easiest way to debug this? With LAM I just attached gdb to the
> > pids on the different nodes, but here the nodes are transparent to me
> > [but as I said I have never used a beowulf cluster before].
> >
> >
> >Can you give pointers or hints?
> >
> >thanks
> >Peter
> >--
> >Peter Beerli, Genome Sciences, Box #357730, University of Washington,
> >Seattle WA 98195-7730 USA, Ph:2065438751, Fax:2065430754
> >http://evolution.genetics.washington.edu/PBhtmls/beerli.html
> >
--
Peter Beerli, Genome Sciences, Box #357730, University of Washington,
Seattle WA 98195-7730 USA, Ph:2065438751, Fax:2065430754
http://evolution.genetics.washington.edu/PBhtmls/beerli.html
