[Beowulf] Really efficient MPIs??
Michael H. Frese
Michael.Frese at NumerEx.com
Wed Nov 28 07:06:29 PST 2007
>At 10:31 PM 11/27/2007, you wrote:
>>Clusters with multicore nodes are quite common today,
>>and the cores within a node share memory.
>>Which implementations of MPI (commercial or free) make
>>automatic and efficient use of shared memory for message passing
>>within a node? (That is, which MPI libraries automatically communicate
>>over shared memory instead of the interconnect on the same node?)
>The latest MPICH2 from Argonne (maybe version 1.0.6), compiled for
>the ch3:nemesis shared memory device, has very low latency -- as low
>as 0.06 microseconds -- and very high bandwidth. It beats LAM in
>Argonne's tests. Here are details:
>We are getting higher latencies than that on various hardware, so
Oops, sorry. Early morning typing-while-sleeping.
The latencies claimed by Argonne for core-to-core
on-board communication with MPICH2 compiled using the ch3:nemesis
device are 0.3-0.5 microseconds, not 0.06. There's also no claim
about what happens when you use it for mixed on-board and off-board comms.
Our recent dual-core 64-bit AMD boards get 0.6 microsecond latency
core-to-core, while our older 32-bit ones get 1.6 microseconds. All of
these numbers were measured with NetPIPE.
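
For anyone who wants to see how a ping-pong latency figure like these is obtained, here is a minimal sketch. It is not NetPIPE and it does not use MPI: it bounces a small message between two local processes over a Unix socket pair (Python standard library only, Unix-only because of os.fork) and reports half the average round-trip time. The names recv_exact and pingpong_latency_us are made up for this illustration. Expect numbers far higher than the MPICH2 shared-memory figures above, since this path goes through the kernel and the Python interpreter.

```python
import os
import socket
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the socket")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def pingpong_latency_us(repeats=10000, msg_size=1):
    """Estimate one-way latency in microseconds between two local processes.

    Hypothetical helper: the parent sends msg_size bytes, the child echoes
    them back, and the one-way latency is taken as half the mean round trip.
    msg_size should be at least 1.
    """
    parent_sock, child_sock = socket.socketpair()
    payload = b"x" * msg_size

    pid = os.fork()
    if pid == 0:
        # Child process: echo every message straight back, then exit
        # without running any of the parent's cleanup code.
        parent_sock.close()
        try:
            for _ in range(repeats):
                child_sock.sendall(recv_exact(child_sock, msg_size))
        finally:
            os._exit(0)

    # Parent process: time the full set of round trips.
    child_sock.close()
    start = time.perf_counter()
    for _ in range(repeats):
        parent_sock.sendall(payload)
        recv_exact(parent_sock, msg_size)
    elapsed = time.perf_counter() - start

    parent_sock.close()
    os.waitpid(pid, 0)

    # Each repeat is one round trip, i.e. two one-way messages.
    return elapsed / (2 * repeats) * 1e6


if __name__ == "__main__":
    print(f"{pingpong_latency_us():.2f} microseconds one-way (estimated)")
```

Averaging over many repeats matters because a single small-message round trip is shorter than the timer's useful resolution. Halving the round trip assumes the two directions cost the same.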