[Beowulf] Q: IB message rate & large core counts (per node)?

Patrick Geoffray patrick at myri.com
Tue Feb 23 14:10:29 PST 2010


Brian,

On 2/19/2010 1:25 PM, Brian Dobbins wrote:
> the IB cards.  With a 4-socket node having between 32 and 48 cores, lots
> of computing can get done fast, possibly stressing the network.
>
>    I know Qlogic has made a big deal about the InfiniPath adapter's
> extremely good message rate in the past... is this still an important
> issue?  How do the latest Mellanox adapters compare?

I have been quite vocal in the past against the merit of high packet 
rate, but I have learned to appreciate it. There is a set of 
applications that can benefit from it, especially at scale. Actually, 
packet rate is much more important outside of HPC (where application 
throughput is what money buys).

However, I would pay attention to a different problem with many-core 
machines. Each user-space process uses a dedicated set of NIC resources, 
and this can be a problem with 48 cores per node (it affects all 
vendors, even if they swear otherwise). You may want to consider 
multiple NICs, unless you know that only a subset of the cores is 
communicating through the network (hybrid MPI/OpenMP model, for example) 
or that the multiplexing overhead is not a big deal for you.
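
A minimal sketch of the arithmetic behind this, assuming one set of NIC 
resources per communicating process (MPI rank). The function name and the 
thread/NIC counts are hypothetical and chosen only for illustration; they 
do not describe any particular adapter.

```python
import math

def ranks_per_nic(cores_per_node, threads_per_rank=1, nics_per_node=1):
    """Processes sharing each NIC, assuming one process per group of threads.

    Assumption: every rank opens its own set of NIC resources, so this is
    also the number of resource sets each NIC has to serve.
    """
    ranks = cores_per_node // threads_per_rank
    return math.ceil(ranks / nics_per_node)

# Pure MPI on a 48-core node, single NIC: every core is a rank.
print(ranks_per_nic(48))
# Same node with two NICs: the load per NIC is halved.
print(ranks_per_nic(48, nics_per_node=2))
# Hybrid MPI/OpenMP with 12 threads per rank: only 4 ranks use the NIC.
print(ranks_per_nic(48, threads_per_rank=12))
```

Both adding NICs and moving to a hybrid model reduce the number of 
processes that each NIC has to multiplex.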

>    On a similar note, does a dual-port card provide an increase in
> on-card processing, or 'just' another link?  (The increased bandwidth is
> certainly nice, even in a flat switched network, I'm sure!)

You need PCIe Gen2 x16 to saturate a 32 Gb/s QDR link. There is no such 
NIC on the market AFAIK (only Gen1 x16 or Gen2 x8). But even then, you 
won't have any PCIe bandwidth left to drive a second port on the same 
NIC. There may be other rationales for a second port, but bandwidth is 
not one of them.
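
The numbers above can be checked with a rough calculation. The sketch 
below uses the standard signaling rates (2.5 GT/s per lane for PCIe Gen1, 
5 GT/s for Gen2, 10 Gb/s per lane for 4x QDR) with 8b/10b encoding on 
both. The 0.8 factor for PCIe packet overhead is an assumption for 
illustration only; real efficiency depends on payload size and chipset.

```python
def pcie_data_rate_gbps(gt_per_s, lanes):
    """PCIe Gen1/Gen2 data rate in Gb/s after 8b/10b encoding."""
    return gt_per_s * lanes * 8 / 10

# 4x QDR InfiniBand: 4 lanes at 10 Gb/s signaling, 8b/10b encoded.
QDR_DATA_GBPS = 4 * 10 * 8 / 10

# Assumed fraction of PCIe bandwidth left after packet (TLP) overhead.
ASSUMED_PCIE_EFFICIENCY = 0.8

slots = {
    "Gen1 x16": (2.5, 16),
    "Gen2 x8": (5.0, 8),
    "Gen2 x16": (5.0, 16),
}

for name, (rate, lanes) in slots.items():
    raw = pcie_data_rate_gbps(rate, lanes)
    usable = raw * ASSUMED_PCIE_EFFICIENCY
    print(f"{name}: {raw:.1f} Gb/s encoded, ~{usable:.1f} Gb/s usable, "
          f"one QDR port saturated: {usable >= QDR_DATA_GBPS}, "
          f"two ports saturated: {usable >= 2 * QDR_DATA_GBPS}")
```

Under these assumptions Gen1 x16 and Gen2 x8 both match the 32 Gb/s link 
only before packet overhead, and even Gen2 x16 cannot feed two ports at 
full rate.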

Patrick