[Beowulf] Lowered latency with multi-rail IB?

Nifty Tom Mitchell niftyompi at niftyegg.com
Thu Mar 26 22:20:18 PDT 2009


On Thu, Mar 26, 2009 at 09:03:30PM -0700, Greg Lindahl wrote:
> On Thu, Mar 26, 2009 at 11:32:23PM -0400, Dow Hurst DPHURST wrote:
> 
> > We've got a couple of weeks max to finalize spec'ing a new cluster.  Has 
> > anyone knowledge of lowering latency for NAMD by implementing a 
> > multi-rail IB solution using MVAPICH or Intel's MPI?
> 
> Multi-rail is likely to increase latency.
> 
> BTW, Intel MPI usually has higher latency than other MPI
> implementations.
> 
> If you look around for benchmarks you'll find that QLogic InfiniPath
> does quite well on NAMD and friends, compared to that other brand of
> InfiniBand adaptor. For example, at
> 
> http://www.ks.uiuc.edu/Research/namd/performance.html
> 
> the lowest line == best performance is InfiniPath. Those results
> aren't the most recent, but I'd bet that the current generation of
> adaptors has the same situation.

What this implies is that NAMD is not purely
bandwidth limited.  Rather, it is limited by
other "quickness" issues, chiefly latency.  For the most
part, multi-rail is a bandwidth enhancement play...
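
A rough way to see the point above is the usual "fixed latency plus
size over bandwidth" cost model. The sketch below is only an
illustration: the function name, the per-message muxing overhead and
all of the numbers are placeholders I made up, not measurements of any
adaptor or MPI.

```python
def transfer_time_us(msg_bytes, latency_us, bandwidth_mb_s,
                     rails=1, rail_overhead_us=0.0):
    """Estimate one-way message time in microseconds.

    Simple model: fixed latency + optional multi-rail overhead
    + message size / aggregate bandwidth.
    1 MB/s (10^6 bytes/s) equals 1 byte/us, so bytes / (MB/s) is already in us.
    """
    aggregate_mb_s = bandwidth_mb_s * rails
    return latency_us + rail_overhead_us + msg_bytes / aggregate_mb_s


# Placeholder figures, NOT measured values for any real hardware.
LATENCY_US = 2.0          # assumed small-message latency
BANDWIDTH_MB_S = 1500.0   # assumed per-rail bandwidth
MUX_OVERHEAD_US = 0.3     # assumed cost of deciding which rail to use

for size in (256, 4096, 1 << 20):
    one = transfer_time_us(size, LATENCY_US, BANDWIDTH_MB_S)
    two = transfer_time_us(size, LATENCY_US, BANDWIDTH_MB_S,
                           rails=2, rail_overhead_us=MUX_OVERHEAD_US)
    print(f"{size:>8} bytes: 1 rail {one:8.2f} us, 2 rails {two:8.2f} us")
```

With these made-up inputs the second rail only pays off once messages
are large enough for the bandwidth term to dominate; for small
messages the fixed costs win and the extra rail is a net loss.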

With multi-rail, do double check the system bus (PCI-e)
bandwidth.   If multi-rail is used, determine how the
data is muxed between rails and what impact that
decision code path has on latency and/or bandwidth.
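
The bus check is simple arithmetic: rails that share a slot cannot
deliver more than the slot can carry. A minimal sketch, with a
hypothetical helper name and placeholder numbers (look up the real
figures for your adaptors and motherboard):

```python
def usable_bandwidth_mb_s(rail_mb_s, rails_per_slot, slot_mb_s, slots=1):
    """Aggregate bandwidth after capping each slot at its bus limit.

    rail_mb_s      -- bandwidth of one rail (port)
    rails_per_slot -- ports on the adaptor sitting in one slot
    slot_mb_s      -- usable bus bandwidth of that slot
    slots          -- number of identical adaptor/slot pairs
    Memory and chipset limits are ignored in this model.
    """
    per_slot = min(rail_mb_s * rails_per_slot, slot_mb_s)
    return per_slot * slots


# Placeholder figures, NOT specifications of any real part.
RAIL_MB_S = 2000.0
SLOT_MB_S = 3200.0

dual_port = usable_bandwidth_mb_s(RAIL_MB_S, rails_per_slot=2, slot_mb_s=SLOT_MB_S)
two_cards = usable_bandwidth_mb_s(RAIL_MB_S, rails_per_slot=1, slot_mb_s=SLOT_MB_S, slots=2)
print(f"dual-port card in one slot: {dual_port:.0f} MB/s")
print(f"two single-port cards:      {two_cards:.0f} MB/s")
```

In this made-up case the dual-port card is capped by its slot, while
two cards in separate slots are not. Real systems add their own
limits, so this is a sanity check and not a substitute for a benchmark.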

If multi-rail is to go very fast, MPI needs to
manage each rail/LID in ways that are productive for the
application.  I doubt that this "productive way" has
a simple, general, one-size-fits-all answer.

NAMD is clearly a "got to benchmark it" application!
Both the data link hardware and the MPI library integration with that
hardware are important...  

See the last table at the URL Greg pointed to: the NAMD version is also important!
It is possible that NAMD.next will become more bandwidth limited
than it is today, and then the notion of the best interconnect/platform
will change.


> -- Greg
> (yeah, I used to work for QLogic.)
Me too.

Later,
mitch



-- 
	T o m  M i t c h e l l 
	Found me a new hat, now what?




