[Beowulf] multi-threading vs. MPI
Daniel Pfenniger Daniel.Pfenniger at obs.unige.ch
Fri Dec 14 08:07:22 PST 2007
Greg Lindahl wrote:
> On Wed, Dec 12, 2007 at 05:28:37PM -0500, Robert G. Brown wrote:
>
>> That's the debatable point I understand, but is it being asserted that
>> it is NEVER going to be sensible to use OpenMP in favor of MPI or just
>> that it is most LIKELY going to be smarter to use one or the other?
>
> The second. And that many people have wasted time when they make a
> code do both.
>
> -- greg

My experience tells me: it depends.

What I have seen in clusters of SMP nodes is that one may well first
develop a pure MPI code that scales well when running 1 process per node.
At this stage the processes enjoy maximum network capacity, RAM space and
disk, but many CPUs stay idle. The options to make use of these CPUs are:

1) Run several processes per node, keeping the MPI code unchanged.
   Depending on the code and cluster characteristics, scaling may drop,
   however, because the processes on a node share its network capacity,
   RAM space, and disk.

2) Keep 1 process per node but use OpenMP within each local process.
   Depending on the type of code, this may provide better speed-up
   than 1). At the least, it should improve performance with respect
   to running 1 process per node with the other CPUs idle.

In summary, my recommendation would be to parallelize as much as possible
at the high level with MPI only. But if network, RAM or disk become
bottlenecks when running several processes per node, parallelize the code
within each node with OpenMP. Such nested parallelism can be easily ported
to different SMP node clusters with different characteristics.

Notice that at the level of each CPU, compilers and microcode already
achieve a lower level of nested parallelism. The same holds in networks
and in hard drives. Throughout computing history, nesting parallelism over
increasingly many levels has proven to be the way to proceed as codes
become increasingly complex.

Dan
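To make the difference between options 1) and 2) concrete, here is a minimal Python sketch of the two-level work decomposition. It is an illustration only, not code from the message: the names `block_range`, `rank_sum` and `nested_sum` are hypothetical, the outer loop merely stands in for MPI ranks (which in a real code would be separate processes joined by a reduction), and a thread pool stands in for OpenMP threads. Python threads do not speed up CPU-bound work, so the sketch shows the structure of the decomposition, not its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def block_range(n, parts, index):
    """Half-open range [start, stop) of block `index` when n items
    are split into `parts` contiguous, near-equal blocks."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def rank_sum(data, nranks, rank, nthreads):
    """Work of one hypothetical rank: take its block of the data,
    then split that block again over `nthreads` threads."""
    lo, hi = block_range(len(data), nranks, rank)

    def thread_sum(t):
        a, b = block_range(hi - lo, nthreads, t)
        return sum(data[lo + a:lo + b])

    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return sum(pool.map(thread_sum, range(nthreads)))


def nested_sum(data, nranks, nthreads):
    # The loop stands in for ranks running concurrently on the cluster;
    # the outer sum stands in for a reduction across ranks.
    return sum(rank_sum(data, nranks, r, nthreads) for r in range(nranks))


data = list(range(1000))

# Assumed example cluster: 2 nodes with 4 CPUs each.
# Option 1: 4 single-threaded processes per node, 8 ranks in total.
flat = nested_sum(data, nranks=8, nthreads=1)
# Option 2: 1 process per node, 4 threads inside each process.
hybrid = nested_sum(data, nranks=2, nthreads=4)
```

Both layouts cover the same 8 CPUs and produce the same result. What changes is how many processes per node compete for the network, RAM and disk, which is the trade-off described above.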
