[Beowulf] new release of GAMMA: x86-64 + PRO/1000

Mark Hahn hahn at physics.mcmaster.ca
Sat Feb 4 10:51:29 PST 2006


> GAMMA only supports the Intel PRO/1000 Gigabit Ethernet NIC (e1000 driver).

well, that's the sticking point, isn't it?  is there any way that GAMMA
could be converted to use the netpoll interface?  for instance, look at 
drivers/net/netconsole.c, which is, admittedly, much less ambitious 
than supporting MPI.
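
to make the question concrete, here's roughly what the netpoll interface
looks like from a module's point of view - a minimal sketch modeled on
netconsole and the ~2.6.15 API, with made-up addresses/ports, no real
error handling, and none of the rx_hook machinery an MPI transport would
actually need:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/netpoll.h>

/* netconsole takes a string like this as a module parameter; format is
 * local-port@local-ip/dev,remote-port@remote-ip/remote-mac
 * (empty MAC => broadcast).  these values are just placeholders. */
static char config[] = "6665@10.0.0.1/eth0,6666@10.0.0.2/";

static struct netpoll np = {
        .name       = "netpoll-demo",
        .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
};

static int __init demo_init(void)
{
        int err;

        err = netpoll_parse_options(&np, config); /* fill in dev/ip/ports */
        if (err)
                return err;
        err = netpoll_setup(&np);                 /* claim the device, as netconsole does */
        if (err)
                return err;
        netpoll_send_udp(&np, "hello", 5);        /* TX without the qdisc/softirq path */
        return 0;
}

static void __exit demo_exit(void)
{
        netpoll_cleanup(&np);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");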

> Latency at MPI level is below 12 usec, switch included (6.5 usec back-to-back).

it would be interesting to know the latency of various GigE switches - 
I believe quite a number of them now advertise 1-2 us latency.

> Does it still make sense to have a low-latency communication library for
> Gigabit Ethernet?

I certainly think so, since IMO not much has changed in some sectors of 
the cluster-config space.  AFAICT, per-port prices for IB (incl cable+switch)
have not come down anywhere near GigE prices - or even GigE prices from
~3 years ago, when it was still slightly early-adopter territory.

my main question is: what's the right design?  I've browsed the GAMMA 
patches a couple of times, and they seem very invasive and NIC-specific.
is there really no way to avoid that?  for instance, where does the 
latency benefit actually come from - avoiding the softirq and/or stack
overhead, the use of a dedicated trap, or copy-avoidance?

further, would Van Jacobson's "channels" concept help out here?
http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf

channels are a "get the kernel out of the way" approach, which I think 
makes a huge amount of sense.  in a way, InfiniPath (certainly the most 
interesting thing to happen to clusters in years!) is a related effort,
since it specifically avoids the baroqueness of IB kernel drivers.

the slides above basically describe a way for a user-level TCP library
to register hooks (presumably the usual <IP:port>) for near-zero-overhead
delivery of packets (plus some kind of outgoing queue as well).  the 
results are quite striking - their test load consumed 77% CPU before
and 14% after, as well as improving latency by ~40%.
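
for flavor, the core data structure in the slides is essentially a
cache-aware, lock-free single-producer/single-consumer ring; the
following is my paraphrase of that idea (names, sizes and padding are
mine, and real code would need memory barriers between filling a slot
and publishing the new head):

#define CHAN_Q_SIZE 128                 /* power of two, so wrap is a mask */

/*
 * one producer (the driver/interrupt side), one consumer (the user-level
 * protocol code), so no locks are needed; head and tail are padded apart
 * so the two sides don't bounce a cache line between CPUs (assuming
 * 64-byte lines).
 */
struct channel {
        unsigned int head;              /* written only by the producer */
        char         pad1[60];
        unsigned int tail;              /* written only by the consumer */
        char         pad2[60];
        void        *q[CHAN_Q_SIZE];
};

/* producer side: returns 0 if the ring is full */
static int chan_enqueue(struct channel *c, void *item)
{
        unsigned int next = (c->head + 1) & (CHAN_Q_SIZE - 1);

        if (next == c->tail)
                return 0;
        c->q[c->head] = item;
        c->head = next;                 /* publish (barrier needed here in real code) */
        return 1;
}

/* consumer side: returns NULL if the ring is empty */
static void *chan_dequeue(struct channel *c)
{
        void *item;

        if (c->tail == c->head)
                return NULL;
        item = c->q[c->tail];
        c->tail = (c->tail + 1) & (CHAN_Q_SIZE - 1);
        return item;
}

the point being that once the driver classifies packets into a channel
like this, the consumer can sit in user space (or in the MPI library)
and the kernel's per-packet work shrinks to almost nothing - which is
exactly the kind of thing GAMMA is after.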

yes, it's true that if you spend the money, you can get much better 
performance with less effort (for instance, Quadrics is just about 
the ultimate throw-money-at-it solution, with InfiniPath similar in
performance but much more cost-effective).

but gigabit is just so damn cheap!  toss two 48-port switches at 72 $1500
dual-Opteron servers in an FNN config and bang, you've got something useful,
and you don't have to confine yourself to SETI@home levels of coupling.

regards, mark hahn.



