[Beowulf] Network considerations for new generation cheap beowulf cluster

Jess Cannata jac67 at georgetown.edu
Mon May 21 07:11:36 PDT 2007


Mark Hahn wrote:
>> I agree that all of the options (Infiniband, Myrinet, and 10 Gb 
>> Ethernet) are too expensive.
>
> I'm curious what kinds of costs you're seeing (per-port) for each of 
> these.
>
By too expensive, I mean much more expensive than Gig-E which is "free" 
on the NIC side and quite cheap on the switch side.

>> I have been looking into the low latency 10 Gb Ethernet cards from 
>> NetEffect, which use the iWARP specifications to provide low latency. I
>
> why do you think iWARP is necessary to provide low latency?
I'll admit that I don't have a great understanding of how the NetEffect 
cards work. I do know that they use the iWARP protocol (Remote Direct 
Memory Access, etc.) to reduce latency, but that isn't the only 
thing they are using.
>
>> haven't done any testing, yet, but the numbers that they are 
>> releasing show them competitive with Infiniband/Myrinet as the number 
>> of processes increases.
>
> do you mean as you increase number of processes on a single node (that 
> is,
> sharing a single interconnect port), or number of processes in the 
> whole job or cluster? 
What I should have said is that the NetEffect card is competitive as the 
number of network connections to a process on a single node increases. 
Unfortunately, the HPCWire article did not include these numbers. I saw 
them in a presentation given by NetEffect.

>
>> Plus, I expect 10 Gb switches to rapidly drop in price. I believe 
>> that the
>
> I hope for that as well, but am not sanguine.  expensive optic 
> transceivers preclude commoditization of small (~20 pt) switches, and 
> I've heard people say bad things about the practicality of 
> mass-produced/cheap 10G-baseT.
> (mainly complaining about complexity and power requirements.)
I heard the same things said about Gig-E. I'm confident a solution will 
be found, whether through better manufacturing, better design, or new 
technology.
>
>
>> and post some numbers. Here is a link to some of the numbers that 
>> NetEffect is publishing:
>>
>> http://www.hpcwire.com/hpc/716435.html
>
> no usable latency numbers there.  if you squint, it looks like they're
> claiming latency of around 7 us, which is _not_ competitive with even 
> myri 2G (nor recent IB nor myri 10G.)
The numbers that I saw are not on HPCWire. I didn't realize that when I 
sent the link. I recommend that people check out the NetEffect cards and 
similar interconnects (low latency 10 Gb Ethernet with iWARP) to see if 
the claims that they make are valid. I'm not sure that they are, but I 
am interested to find out. AMD's developer site has a new cluster with 
both Infiniband and NetEffect's low latency 10 Gb cards installed so you 
will be able to do direct comparisons between low latency 10 Gb Ethernet 
and Infiniband. It is called "Smith." You can find it at 
https://devcenter.amd.com/about/systems.php. I haven't tested it yet, 
but I plan to. I'd be interested in hearing about others' experiences.
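For anyone who wants to sanity-check latency claims like these before trusting vendor slides, the usual technique is a ping-pong microbenchmark: time many small round trips and report half the average round-trip time as the one-way latency. A minimal sketch in Python over loopback TCP (this measures the kernel socket stack on one machine, not RDMA hardware, but the methodology is the same one used against a real interconnect):

```python
import socket
import threading
import time

def pingpong_latency(trials=2000, size=1):
    """Estimate one-way message latency via a TCP ping-pong over loopback.

    Sends `trials` messages of `size` bytes to an echo thread and times
    the round trips; returns half the average RTT in microseconds.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def echo():
        conn, _ = server.accept()
        # Disable Nagle so tiny messages go out immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(trials):
            conn.sendall(conn.recv(size))
        conn.close()

    t = threading.Thread(target=echo)
    t.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(trials):
        client.sendall(msg)
        client.recv(size)
    elapsed = time.perf_counter() - start

    client.close()
    t.join()
    server.close()
    # One-way latency is approximated as half the average round-trip time.
    return (elapsed / trials / 2) * 1e6

if __name__ == "__main__":
    print("estimated one-way latency: %.1f us" % pingpong_latency())
```

Run against two real nodes (replace the loopback address with a remote host) this gives a fair apples-to-apples number to put next to a vendor's 7 us claim.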

>
>>> the cheapest cable i see is 1 meter and $70
>
> nothing wrong with $70 cables - you need to quote the whole per-port 
> price,
> including nic, cable and switch port.  it looks to me as if Myri 10G 
> is around $1500/port; I've never had a good read on IB prices 
> (deconvolved from vendor/discount pricing issues.)
>
>>> Cheapest card i see is $715
>
> nothing wrong with $715, even if the all-in per-port price is $1500 - 
> it just means you won't be using $1000 desktop-spec nodes.  that's OK,
> since if you're worried about ~3 us latency and 1GB bandwidth, you
> should also be using multiple cores, ECC memory, and probably a few 
> GB/node,
> and therefore can easily amortize $715/node.
>
>>> So the node price starts at $765, which is already way way more than 
>>> the total price of 1 node.
>
> only if you're looking at extremely low-end nodes.  for such nodes,
> the only viable option is zero-cost Gb nics, of course, and 
> mass-market switches (ie, not high-end chassis switches, etc).
>
I thought--though I may be mistaken--that this was the point of the 
original post: what is/will be the new low-cost network solution? It 
doesn't seem to be Infiniband or Myri-10G, since their prices don't seem 
to be dropping much.
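Mark's point above about quoting the whole per-port price (NIC + cable + a share of the switch) is worth making concrete. A quick sketch, using the $715 NIC and $70 cable figures from the thread; the switch prices here are made-up placeholders chosen only to land near his ~$1500/port estimate:

```python
def per_port_cost(nic, cable, switch_price, switch_ports):
    """All-in interconnect cost per node: NIC + cable + share of the switch."""
    return nic + cable + switch_price / switch_ports

# Hypothetical 10G setup: $715 NIC, $70 cable, 24-port switch
# (the $17,000 switch price is an illustrative assumption).
myri10g = per_port_cost(nic=715, cable=70, switch_price=17000, switch_ports=24)

# Gig-E: "free" on-board NIC, cheap cable, commodity 24-port switch
# (again, placeholder prices).
gige = per_port_cost(nic=0, cable=5, switch_price=500, switch_ports=24)

print("10G per port:   $%.0f" % myri10g)  # roughly $1500/port
print("Gig-E per port: $%.0f" % gige)
```

The point of the exercise is that the NIC price alone understates the gap: the switch-port share roughly doubles the 10G figure, while for Gig-E every term is near zero.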
> 5 years ago, the low-end approach was 100bT; now it's 1000bT.  the 
> prime target for that approach (serial or EP) has simply gotten broader;
> I don't see this as anything to complain about.  for "real" parallel,
> you have to pay for the network you need.  there as well, you now get 
> more for your money, no complaints.  complaining that you can't get 1 
> us, 1GBps interconnect for $50/port is just silliness.
>
I disagree on this last point. Why can't low-latency interconnects 
become the standard? It is not just HPC applications that are demanding 
low-latency networks.

Jess


