[Beowulf] multi-threading vs. MPI
James.P.Lux at jpl.nasa.gov
Tue Dec 11 10:30:25 PST 2007
At 05:17 AM 12/11/2007, Douglas Eadline wrote:
>This is indeed the issue. Where to invest time?
>My opinion, and it is only my opinion, is the following.
>Please share your own.
>Threaded approaches do not scale across clusters. The memory
>architecture of multi-core is making nodes look more like
>small clusters i.e. memory is becoming more localized.
>As Don Becker mentioned in a recent post, efforts to program
>distributed memory like it were shared memory often end
>up looking like stylized message passing systems.
>One other thing about messages. The problem of
>trying to optimize the compute to communication issue is
>easier than trying to optimize the compute to locality issue.
>Therefore, if I were to start a new parallel project of some sort
>or parallelize an existing code, I would use MPI. Although
>OpenMP might get me up and running quicker, I would feel more
>comfortable with a problem cast in MPI.
>I'm interested in others' opinions on this because I think it
>is an important issue for the general programming audience
>and not just us cluster geeks. The difference is we have had
>a lot more time and experience with this stuff.
Another huge advantage of going to a message passing paradigm is that
it forces you to explicitly deal with the time synchronization (or
lack thereof) among processes in that an underlying assumption is
that passing the message takes non-zero time. Therefore, in any
message passing system, there's not necessarily any concept of
"absolute time" among all processes. (You have to pass time
messages, just like any other message.)
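To make the "pass time messages" point concrete, here is a minimal sketch of the standard two-way timestamp exchange used by NTP-style protocols. The function and argument names are my own, for illustration only. Note the built-in assumption: the outbound and return delays are taken to be equal, and any asymmetry shows up directly as an error in the offset estimate.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Estimate a remote clock's offset from one request/reply exchange.

    t1: request sent      (local clock)
    t2: request received  (remote clock)
    t3: reply sent        (remote clock)
    t4: reply received    (local clock)

    Assumes the one-way delay is the same in both directions.
    """
    # Time spent on the wire, excluding the remote processing time.
    round_trip = (t4 - t1) - (t3 - t2)
    # Remote clock minus local clock, exact only for symmetric paths.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    return offset, round_trip


# Hypothetical numbers: the remote clock runs 5 s ahead, the remote
# side takes 1 s to reply, and each one-way trip takes 2 s.
symmetric = estimate_offset_and_delay(100.0, 107.0, 108.0, 105.0)

# Same setup, but 3 s outbound and 1 s on the return path.
asymmetric = estimate_offset_and_delay(100.0, 108.0, 109.0, 105.0)
```

In the symmetric case the estimate recovers the true 5 s offset. In the asymmetric case it is off by half the difference between the two path delays, which is exactly why variable, non-deterministic delays are a problem.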
As the propagation delay (light time) among processors gets to be a
significant fraction of the message duration (the time it takes to put
the whole message on the wire), this is a bigger and bigger deal.
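As a rough illustration of that ratio, the following sketch compares the propagation delay with the time needed to transmit the message. The helper name and its parameters are hypothetical, and it assumes signals travel at the vacuum speed of light, which is optimistic for copper or fiber.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; real links are slower


def delay_to_duration_ratio(distance_m, message_bits, bit_rate_bps,
                            propagation_speed=SPEED_OF_LIGHT_M_PER_S):
    """Return propagation delay divided by message transmission time."""
    propagation_delay = distance_m / propagation_speed  # seconds in flight
    message_duration = message_bits / bit_rate_bps      # seconds on the wire
    return propagation_delay / message_duration


# Hypothetical case: a 1000-bit message at 10 Gb/s over a 30 m cable.
ratio = delay_to_duration_ratio(30.0, 1000, 10e9)
```

A ratio near or above 1 means the sender has finished transmitting before the first bit arrives, so the two ends cannot share any useful notion of "now".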
For myself, this is an issue because I work with systems that are
distributed over huge distances (where light time is seconds or
minutes and it varies), but it also applies on a finer grain where
you have delays in the communications paths in the
microseconds/milliseconds scale, especially if they are variable and
non-deterministic. (NTP, for instance, assumes that the delays are
deterministic in the long term sense, even if there's a lot of short