[Beowulf] Performance characterising a HPC application

Greg Lindahl greg.lindahl at qlogic.com
Mon Mar 26 08:21:28 PDT 2007

On Fri, Mar 23, 2007 at 11:53:14AM -0700, Gilad Shainer wrote:

> and not only on pure latency or pure bandwidth. Qlogic till recently (*)
> had the lowest latency number but when it comes to application, the CPU
> overhead is too high.

QLogic's overhead is lower than Mellanox's. How low do you want it to be?

Please see http://www.pathscale.com/pdf/overhead_and_app_perf.pdf

This shows the MPI overhead per byte transferred, as measured by Doug
at Sandia in 2005. How can ours be lower? The InfiniBand APIs are
unnecessarily complicated, and are a poor match to MPI compared to
everyone else's APIs: MX, Tports, InfiniPath's PSM.

The next slide shows a graph of the LS-Dyna results recently submitted
to topcrunch.org, showing that InfiniPath SDR beats Mellanox DDR on
the neon_refined_revised problem, both running on 3.0 GHz Woodcrest
dual/dual nodes.

I look forward to showing the same advantage vs ConnectX, whenever
it's actually available.

-- greg

(p.s. ggv has issues with the fonts in this pdf, so try xpdf or (yech)
acroread instead.)
