[Beowulf] Parallel Programming Question

Mark Hahn hahn at mcmaster.ca
Wed Jun 24 08:44:03 PDT 2009


> In an mpi parallel code which of the following two is a better way:
>
> 1)      Read the input data from input data files only by the master process
> and then broadcast it to the other processes.
>
> 2)      All the processes read the input data directly from input data files
> (no need of broadcast from the master process). Is it possible?

Option 2 is certainly possible; whether it's any advantage depends heavily
on your filesystem, the size of the data, etc.  I'd expect 2 to be faster
only if your file setup is peculiar - for instance, if you can expect all
nodes to have the input files cached already.  Otherwise, with a FS
like NFS, 2 will lose, since an MPI broadcast is almost certainly more
time-efficient than N nodes all fetching the file separately.
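
To put rough numbers on that, here is a back-of-envelope sketch in
Python.  It is only a toy model, not a measurement: it assumes a single
file server whose link is shared evenly by all readers, and a
binomial-tree broadcast that takes ceil(log2(N)) rounds.  Real MPI
libraries pick their own broadcast algorithm.  The function names and
the bandwidth figures are made up for illustration.

```python
import math

def time_all_read(n_nodes, size_bytes, server_bw):
    """Option 2: every node fetches the whole file from one server.

    Assumption: the server's link (server_bw, bytes/s) is the bottleneck,
    so N copies of the file have to pass through it.
    """
    return n_nodes * size_bytes / server_bw

def time_read_then_bcast(n_nodes, size_bytes, server_bw, net_bw):
    """Option 1: the master reads once, then broadcasts to the others.

    Assumption: a binomial-tree broadcast, i.e. ceil(log2(N)) rounds,
    each moving the full buffer over a link of net_bw bytes/s.
    """
    rounds = math.ceil(math.log2(n_nodes)) if n_nodes > 1 else 0
    return size_bytes / server_bw + rounds * size_bytes / net_bw

if __name__ == "__main__":
    # Hypothetical cluster: 64 nodes, 1 GB input, ~100 MB/s links everywhere.
    n, size, bw = 64, 1e9, 100e6
    print("all nodes read :", time_all_read(n, size, bw), "s")
    print("read + bcast   :", time_read_then_bcast(n, size, bw, bw), "s")
```

With those made-up numbers the model gives 640 s for everyone reading
versus 70 s for read-then-broadcast.  If the files are already cached on
every node, the first figure collapses and the comparison flips.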

But you should ask whether the data involved is large, and whether
each rank actually needs all of it.  If each rank needs only a different
subset of the data, then reading separately could easily be faster.
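
As a sketch of the "different subset per rank" case: assuming the input
is a flat binary file of fixed-size records and each rank wants one
contiguous block of them, every rank can work out its own byte range and
read only that.  The helper names and the record layout are
hypothetical; in a real MPI program rank and nranks would come from the
communicator (MPI_Comm_rank / MPI_Comm_size), and MPI-IO is the other
obvious way to do this.

```python
import os

def block_range(rank, nranks, n_records):
    """Return (start, count) for this rank's contiguous block of records.

    The first (n_records % nranks) ranks get one extra record, so the
    blocks cover every record exactly once.
    """
    base, extra = divmod(n_records, nranks)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return start, count

def read_my_records(path, rank, nranks, record_bytes):
    """Read only this rank's slice of a file of fixed-size records."""
    n_records = os.path.getsize(path) // record_bytes
    start, count = block_range(rank, nranks, n_records)
    with open(path, "rb") as f:
        f.seek(start * record_bytes)       # skip the other ranks' data
        return f.read(count * record_bytes)
```

Each rank then pulls roughly 1/N of the file, so the total traffic to
the file server is one copy of the data rather than N, and no broadcast
is needed at all.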


