[Beowulf] New HPCC results, and an MX question
lindahl at pathscale.com
Tue Jul 19 19:27:01 PDT 2005
On Tue, Jul 19, 2005 at 10:12:22PM -0400, Patrick Geoffray wrote:
> > interconnects, you'll note that many of them get a much worse random
> > ring latency than ordinary ping-pong.
> Nope. It's worse because:
> * they use much larger clusters: when the size of the cluster increases,
> the number of hops increases, thus the worst case latency increases. 16
> nodes is a tiny cluster with just one hop worst case.
> * they use older hardware: 2.6 GHz Opterons are not very old.
> * they use older drivers: because customers have other things to do than
> running benchmark on carefully crafted environment with carefully
> optimized driver/lib.
I am referring to a comparison of the HPCC "random ring latency" to
the HPCC "average ping-pong" on the same hardware, with the same
driver, at the same cluster size. I was not referring to the absolute
numbers, which of course are dependent on cluster size, host cpu
clock, and driver version.
> By the way, could you point me to the raw performance data on the
> pathscale web pages ?
As I said, it is in the process of being published, and I attached
the relevant info to my posting.
> >published the raw data, but they did publish graphs. The claimed
> >0-byte latency is 2.6 usec, with no explanation of what benchmark was
> >used. The graph at:
> From the page: "Performance data is presented for the Pallas MPI
> Benchmark Suite, Version 2.2". It's in bold, but maybe we should write
> in red, blinking...
I was referring to the 2.6 usec claim at:
That page makes no reference to Pallas. The page you're referring to is
which doesn't include a 2.6 usec claim, but does say that it's Pallas.
> Anyway, the cluster I ran Pallas on had a 0-byte MPI latency of 2.9 us.
> Why ? Because it's a production cluster, deployed over a year ago, with
> 1.4 GHz Opteron CPUs (compare that with your 2.6 GHz).
Thank you for the number. Does your latency change significantly with
faster cpus? Ours does (from 1.50 usec at 2.0 GHz to 1.32 usec at 2.6
GHz), but my impression was that your number ought to be relatively
insensitive to the host cpu speed.
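
To put the two figures above in proportion, here is a rough sketch of the
arithmetic. It uses only the two quoted data points (1.50 usec at 2.0 GHz,
1.32 usec at 2.6 GHz). The split into a clock-independent part and a part that
scales with 1/clock is an assumed model for illustration, not something
measured, and the function names are hypothetical.

```python
# Data points quoted in the message: (host CPU clock in GHz, 0-byte latency in usec).
SLOW = (2.0, 1.50)
FAST = (2.6, 1.32)


def relative_changes(slow, fast):
    """Return (fractional clock increase, fractional latency decrease)."""
    (f1, t1), (f2, t2) = slow, fast
    return (f2 - f1) / f1, (t1 - t2) / t1


def fit_fixed_plus_inverse_clock(slow, fast):
    """Fit latency = fixed + k / clock through two points.

    ASSUMPTION: this two-term model is illustrative only. With two points
    the fit is exact by construction and says nothing about goodness of fit.
    """
    (f1, t1), (f2, t2) = slow, fast
    k = (t1 - t2) / (1.0 / f1 - 1.0 / f2)  # usec * GHz
    fixed = t1 - k / f1                    # usec, clock-independent part
    return fixed, k


clock_up, latency_down = relative_changes(SLOW, FAST)
fixed, k = fit_fixed_plus_inverse_clock(SLOW, FAST)

print(f"clock increase:   {clock_up:.0%}")
print(f"latency decrease: {latency_down:.0%}")
print(f"assumed fixed part:      {fixed:.2f} usec")
print(f"assumed clock-scaled k:  {k:.2f} usec*GHz")
```

Under that assumed model, a 30% faster clock buys a 12% lower latency, and
roughly half of the 1.50 usec figure would be attributable to host-side work
that scales with the clock. A latency that is insensitive to host cpu speed
would show a k close to zero under the same fit.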