[Beowulf] many cores and ib
patrick at myri.com
Tue May 6 01:46:16 PDT 2008
Gilad Shainer wrote:
> It is the same benchmark that QLogic were and are using for MPI message
> rate, and I guess you know that better than me, don't you? ... I want
> to make sure that when one does a comparison, he/she will be using the
> same benchmark/output to compare.
It is not the benchmark, it's the MPI implementation. The benchmark in
itself is stupid, because it sends a gazillion messages to a single
node. The MPI implementation is dishonest, because it says "eh, you are
trying to send a gazillion messages to a single node, let me pack them
into a single message on the wire for you", completely changing what the
benchmark is trying to measure.
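
To make the effect concrete, here is a toy model of coalescing, written as a sketch under stated assumptions rather than as anything taken from MVAPICH. The function `wire_packets` and the `max_coalesce` limit are hypothetical names. The model simply packs consecutive sends to the same destination into one wire packet.

```python
# Toy model only: this is not MVAPICH code, and all names are hypothetical.

def wire_packets(destinations, max_coalesce):
    """Count packets put on the wire for a stream of small sends.

    Consecutive sends to the same destination are packed together,
    up to max_coalesce messages per packet. max_coalesce=1 means
    no coalescing: one packet per message.
    """
    packets = 0
    run_dest = None   # destination of the packet currently being filled
    run_len = 0       # number of messages already packed into it
    for dest in destinations:
        if dest == run_dest and run_len < max_coalesce:
            run_len += 1          # rides along in the current packet
        else:
            packets += 1          # start a new packet
            run_dest = dest
            run_len = 1
    return packets


n = 1_000_000
one_target = [1] * n                   # the benchmark's pattern: one peer
spread = [i % 64 for i in range(n)]    # round-robin over 64 peers

print(wire_packets(one_target, 1))     # 1000000
print(wire_packets(one_target, 32))    # 31250
print(wire_packets(spread, 32))        # 1000000
```

In this model the benchmark still counts a million messages in every case, but only the single-target pattern sees fewer packets on the wire. A traffic pattern that spreads messages over many peers gains nothing.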
You are a marketing guy, you just repeat the numbers without
understanding what they mean. Message coalescing in MVAPICH does nothing
but make the message rate micro-benchmark irrelevant; it was designed
that way, and only for that purpose. With message coalescing,
*everybody* can send 20 million messages per second, as long as they have
over 1 GB/s of bandwidth.
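
The arithmetic behind that figure can be sketched as follows. This is a back-of-the-envelope bound, not a measurement. The 50 bytes per message (payload plus whatever header survives coalescing) is an assumed value chosen for illustration, and `max_message_rate` is a hypothetical helper.

```python
# Back-of-the-envelope only. bytes_per_message is a hypothetical figure,
# not a measured one.

def max_message_rate(bandwidth_bytes_per_s, bytes_per_message):
    """Upper bound on messages/s when bandwidth is the only limit."""
    return bandwidth_bytes_per_s / bytes_per_message


rate = max_message_rate(1e9, 50)   # 1 GB/s link, assumed 50 bytes per message
print(f"{rate / 1e6:.0f} million messages/s")   # 20 million messages/s
```

Once messages are packed back to back, the rate is bounded by bandwidth divided by bytes per message, so the number says little about per-message overhead in the NIC or the MPI library.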
This is like the header caching "optimization": change the MPI tag for
each Send in your ping-pong benchmark, and see your latency go up. That is
because, with an unchanged tag, the MPI implementation is smart enough to
say "eh, you are sending the same message envelope over and over, let me
compact the MPI header for you". It does not help anything but a
micro-benchmark.
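
A toy model of header caching shows why changing the tag defeats it. This is a sketch under stated assumptions, not MVAPICH code: the header sizes `FULL_HEADER` and `CACHED_HEADER` are made-up values, and an envelope is reduced to a hypothetical (source rank, tag) pair.

```python
# Toy model only: this is not MVAPICH code. Field sizes are hypothetical.

FULL_HEADER = 24     # assumed bytes for a full envelope (rank, tag, context, ...)
CACHED_HEADER = 4    # assumed bytes for "same envelope as last time"


def header_bytes(envelopes):
    """Total header bytes sent when a repeated envelope is compacted."""
    total = 0
    last = None
    for env in envelopes:
        total += CACHED_HEADER if env == last else FULL_HEADER
        last = env
    return total


iters = 1000
same_tag = [(0, 7)] * iters                 # (source rank, tag) never changes
new_tag = [(0, i) for i in range(iters)]    # tag changes on every send

print(header_bytes(same_tag))   # 4020  (one full header, then 999 cached)
print(header_bytes(new_tag))    # 24000 (full header every time)
```

A standard ping-pong loop looks like `same_tag`. An application that uses tags to distinguish messages looks more like `new_tag` and pays for the full header on every send.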
I can imagine the next optimization from here: if you happen to send
messages full of zeros in your ping-pong, MVAPICH will "compress" them
for you. And somewhere, someone will claim a gazillion bytes per second...