[Beowulf] Correct networking solution for 16-core nodes


Greg Lindahl greg.lindahl at qlogic.com
Thu Aug 3 12:53:40 PDT 2006


On Thu, Aug 03, 2006 at 11:19:44AM +0200, Joachim Worringen wrote:

> From the numbers published by Pathscale, it seems that the simple MPI 
> latency of Infinipath is about the same whether you go via PCIe or HTX. 
> The application performance might be different, though.

No, our published number is 1.29 usec for HTX and 1.6-2.0 usec for PCI
Express. It's the message rate that's about the same.

BTW there are more HTX motherboards appearing: the 3 IBM rack-mount
Opteron servers announced this Tuesday all have HTX slots:

http://www-03.ibm.com/systems/x/announcements.html

In most HTX motherboards, a riser is used to bring out either HTX or
PCI Express, so you don't have to sacrifice anything. That's why IBM
can put HTX in _all_ of their boxes even if most won't need it,
because it doesn't take anything away except a little board space. The
existing SuperMicro boards work like this, too.

Vincent wrote:

> Only quadrics is clear about its switch latency (probably
> competitors have a worse one). It's 50 us for 1 card.

We have clearly stated that the Mellanox switch is around 200 nsec per
hop. Myricom's number is also well known.

Mark Hahn wrote:

> I intuit (totally without rigor!) that fatter nodes do increase bandwidth
> needs, but don't necessarily change the latency picture.

Fatter nodes mean more cpus are simultaneously trying to send out
messages, so yes, there is an effect, but it's not quite latency: it's
that message rate thing that I keep on talking about.

http://www.pathscale.com/performance/InfiniPath/mpi_multibw/mpi_multibw.html

Poor scaling as nodes get faster is the dirty little secret of our
community; our standard microbenchmarks don't explore this, but
today's typical nodes have 4 or more cores.
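
To make the message-rate point concrete, here is a rough back-of-the-envelope
sketch. It is not from the benchmark linked above: the function names are
hypothetical, the 8 million messages/sec adapter rate is a made-up figure,
and it assumes the adapter's message rate is split evenly among all cores
sending at once.

```python
def per_core_message_rate(adapter_msgs_per_sec: float, cores: int) -> float:
    """Messages/sec available to each core when all cores send at once.

    Assumes (simplistically) that the adapter's aggregate message rate
    is shared evenly among the sending cores.
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return adapter_msgs_per_sec / cores


def min_gap_usec(adapter_msgs_per_sec: float, cores: int) -> float:
    """Smallest average gap, in microseconds, between one core's messages."""
    return 1e6 / per_core_message_rate(adapter_msgs_per_sec, cores)


if __name__ == "__main__":
    adapter_rate = 8e6  # hypothetical adapter: 8 million messages/sec
    for cores in (1, 4, 16):
        rate = per_core_message_rate(adapter_rate, cores)
        gap = min_gap_usec(adapter_rate, cores)
        print(f"{cores:2d} cores: {rate:12.0f} msgs/sec per core, "
              f"{gap:.3f} usec between messages")
```

Under these assumptions the one-way latency of a single message never
changes, yet each core on a fatter node gets a shrinking share of the
message rate, which a single-pair ping-pong test does not show.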

-- greg



