[Beowulf] Performance characterising a HPC application
greg.lindahl at qlogic.com
Mon Mar 26 08:21:28 PDT 2007
On Fri, Mar 23, 2007 at 11:53:14AM -0700, Gilad Shainer wrote:
> and not only on pure latency or pure bandwidth. Qlogic till recently (*)
> had the lowest latency number but when it comes to application, the CPU
> overhead is too high.
QLogic's overhead is lower than Mellanox's; how low do you want it to be?
Please see http://www.pathscale.com/pdf/overhead_and_app_perf.pdf
This shows the MPI overhead per byte transferred, as measured by Doug
at Sandia in 2005. How can ours be lower? The InfiniBand APIs are
unnecessarily complicated, and a poor match to MPI compared to
everyone else's APIs: MX, Tports, and InfiniPath's PSM.
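For readers unfamiliar with how "overhead" is quantified here: the usual approach is to measure how much application work the host CPU can still do while a transfer is in flight, relative to an idle network. Below is a minimal sketch of that availability metric in Python; the function names are mine, and a real measurement would post MPI nonblocking sends/receives rather than simulate a work loop.

```python
# Sketch of the host-overhead / availability methodology (hypothetical
# helper names; real measurements overlap a work loop with MPI_Isend /
# MPI_Irecv and poll for completion).
import time

def work_iterations(duration_s):
    """Count fixed-cost work-loop iterations that fit in duration_s."""
    count = 0
    end = time.perf_counter() + duration_s
    while time.perf_counter() < end:
        sum(range(100))  # one fixed unit of "application work"
        count += 1
    return count

def availability(baseline_iters, overlapped_iters):
    """Fraction of the host CPU left to the application while a
    transfer is in flight: 1.0 means the NIC fully offloads the
    transfer, 0.0 means the CPU is consumed by communication."""
    return overlapped_iters / baseline_iters

# Example: if 7200 iterations complete during a transfer window where
# 8000 complete with the network idle, availability is 0.9, i.e. the
# per-transfer CPU overhead is 10% of the window.
base = work_iterations(0.05)
```

Lower per-byte overhead shows up directly as higher availability, which is why it matters more to applications than a microbenchmark latency number.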
The next slide shows a graph of the LS-DYNA results recently submitted
to topcrunch.org, showing that InfiniPath SDR beats Mellanox DDR on
the neon_refined_revised problem, both running on 3.0 GHz Woodcrest systems.
I look forward to showing the same advantage vs ConnectX, whenever
it's actually available.
(p.s. ggv has issues with the fonts in this pdf, so try xpdf or (yech)