[Beowulf] new release of GAMMA: x86-64 + PRO/1000
Mark Hahn hahn at physics.mcmaster.ca
Sat Feb 4 10:51:29 PST 2006
> GAMMA only supports the Intel PRO/1000 Gigabit Ethernet NIC (e1000 driver).

well, that's the sticking point, isn't it? is there any way that GAMMA could be converted to use the netpoll interface? for instance, look at drivers/net/netconsole.c which is, admittedly, much less ambitious than supporting MPI.

> Latency at MPI level is below 12 usec, switch included (6.5 usec back-to-back).

it would be interesting to know the latency of various GE switches - I believe quite a number of them now brag 1-2 us latency.

> Does it still make sense to have a low-latency communication library for
> Gigabit Ethernet?

I certainly think so, since IMO not much has changed in some sectors of the cluster-config space. AFAICT, per-port prices for IB (incl cable+switch) have not come down anywhere near GB, or even GB prices from ~3 years ago, when it was still slightly early-adopter.

my main question is: what's the right design? I've browsed the gamma patches a couple of times, and they seem very invasive and nic-specific. is there really no way to avoid this? for instance, where does the latency benefit come from - avoiding the softint and/or stack overhead, or the use of a dedicated trap, or copy-avoidance?

further, would Van Jacobson's "channels" concept help out here?

http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf

channels are a "get the kernel out of the way" approach, which I think makes huge amounts of sense. in a way, InfiniPath (certainly the most interesting thing to happen to clusters in years!) is a related effort, since it specifically avoids the baroqueness of IB kernel drivers. the slides above basically provide a way for a user-level TCP library to register hooks (presumably the usual <IP:port>) for near-zero-overhead delivery of packets (and some kind of outgoing queue as well). the results are quite profound - their test load consumed 77% CPU before, and 14% after, as well as improving latency by ~40%.
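to make the "register a hook per <IP:port>" idea concrete, here is a toy model in Python. it is only a sketch under loud assumptions: it is not code from the slides, from GAMMA, or from any kernel, and the names Packet, Channel and Demux are made up for illustration. it shows a fixed-size single-producer/single-consumer ring per registered endpoint, with everything else falling back to a stand-in for the normal stack.

```python
from collections import namedtuple

# Hypothetical packet record; a real channel would carry buffer descriptors.
Packet = namedtuple("Packet", ["dst_ip", "dst_port", "payload"])


class Channel:
    """Toy single-producer/single-consumer ring (one slot is kept empty)."""

    def __init__(self, size=8):
        self.slots = [None] * size
        self.size = size
        self.head = 0  # next slot to write; updated only by the producer
        self.tail = 0  # next slot to read; updated only by the consumer

    def put(self, item):
        nxt = (self.head + 1) % self.size
        if nxt == self.tail:  # ring is full, caller must fall back
            return False
        self.slots[self.head] = item
        self.head = nxt
        return True

    def get(self):
        if self.tail == self.head:  # ring is empty
            return None
        item = self.slots[self.tail]
        self.tail = (self.tail + 1) % self.size
        return item


class Demux:
    """Toy receive path: registered endpoints get a channel, the rest go slow."""

    def __init__(self):
        self.channels = {}   # (ip, port) -> Channel
        self.slow_path = []  # stand-in for the ordinary protocol stack

    def register(self, ip, port, size=8):
        chan = Channel(size)
        self.channels[(ip, port)] = chan
        return chan

    def deliver(self, pkt):
        chan = self.channels.get((pkt.dst_ip, pkt.dst_port))
        if chan is None or not chan.put(pkt):
            self.slow_path.append(pkt)  # unregistered endpoint or full ring
            return False
        return True
```

a consumer would call register() once and then poll get() on the returned channel. the point of the shape is that the producer only ever writes head and the consumer only ever writes tail, so the fast path needs no lock; whether that maps onto what GAMMA does inside the e1000 driver is an open question, not something this sketch answers.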
yes, it's true that if you spend the money, you can get much better performance with less effort (for instance, quadrics is just about the ultimate throw-money solution, with InfiniPath similar in performance but much more cost-effective.) but gigabit is just so damn cheap! toss two 48pt switches at 72 $1500 dual-opt servers in a FNN config and bang, you've got something useful, and you don't have to confine yourself to seti@home-levels of coupling.

regards, mark hahn.
