[Beowulf] Correct networking solution for 16-core nodes
Greg Lindahl greg.lindahl at qlogic.com
Thu Aug 3 12:53:40 PDT 2006
On Thu, Aug 03, 2006 at 11:19:44AM +0200, Joachim Worringen wrote:

> From the numbers published by Pathscale, it seems that the simple MPI
> latency of Infinipath is about the same whether you go via PCIe or HTX.
> The application performance might be different, though.

No, our published number is 1.29 usec for HTX and 1.6-2.0 usec for PCI Express. It's the message rate that's about the same.

BTW there are more HTX motherboards appearing: the 3 IBM rack-mount Opteron servers announced this Tuesday all have HTX slots:

http://www-03.ibm.com/systems/x/announcements.html

In most HTX motherboards, a riser is used to bring out either HTX or PCI Express, so you don't have to sacrifice anything. That's why IBM can put HTX in _all_ of their boxes even if most won't need it: it doesn't take anything away except a little board space. The existing SuperMicro boards work like this, too.

Vincent wrote:

> Only quadrics is clear about its switch latency (probably
> competitors have a worse one). It's 50 us for 1 card.

We have clearly stated that the Mellanox switch is around 200 nsec per hop. Myricom's number is also well known.

Mark Hahn wrote:

> I intuit (totally without rigor!) that fatter nodes do increase bandwidth
> needs, but don't necessarily change the latency picture.

Fatter nodes mean more cpus are simultaneously trying to send out messages, so yes, there is an effect, but it's not quite latency: it's that message rate thing that I keep on talking about.

http://www.pathscale.com/performance/InfiniPath/mpi_multibw/mpi_multibw.html

Poor scaling as nodes get faster is the dirty little secret of our community; our standard microbenchmarks don't explore this, even though today's typical nodes have 4 or more cores.

-- greg
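To make the message-rate point concrete, here is a rough sketch of why fatter nodes can scale poorly even when single-message latency stays the same. It is a toy model, not a benchmark: the function name and both rate figures are hypothetical placeholders, not published numbers for any adapter, and it assumes all cores share one adapter whose aggregate message rate is capped.

```python
def delivered_rate_per_core(cores, per_core_demand, nic_rate_cap):
    """Messages/sec each core actually gets when `cores` share one adapter.

    Toy model: the adapter delivers the total demand up to a fixed
    aggregate cap, and the delivered rate is split evenly across cores.
    """
    total_demand = cores * per_core_demand
    delivered_total = min(total_demand, nic_rate_cap)
    return delivered_total / cores


# Hypothetical figures chosen only for illustration.
PER_CORE_DEMAND = 1.0e6   # messages/sec one core would like to send
NIC_RATE_CAP = 4.0e6      # messages/sec the adapter can sustain in total

for cores in (1, 2, 4, 8, 16):
    rate = delivered_rate_per_core(cores, PER_CORE_DEMAND, NIC_RATE_CAP)
    print(f"{cores:2d} cores: {rate / 1e6:.2f} M msgs/s per core")
```

In this model the per-core rate holds steady until the cores together hit the adapter's cap, then falls off as 1/cores. A single-core ping-pong latency test never shows that, which is why a multi-core message-rate test is needed to see it.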
