[Beowulf] Questions regarding interconnects
Mark Hahn hahn at physics.mcmaster.ca
Thu Mar 24 21:47:45 PST 2005
- Previous message: [Beowulf] Questions regarding interconnects
- Next message: [Beowulf] Questions regarding interconnects
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
> > What do you see as the key differentiating factors in the quality of an
> > MPI implementation? This far I have come up with the following:
> >
> > -Completeness of the implementation

this really depends on the "maturity" of the application. I know of one application which has covered a lot of ground, including cray-shm, openmp and mpi-2 (with heavy use of post-mpi-1 features.) it cares about completeness, but a new app written from scratch doesn't.

> > -Latency/bandwidth

it would be hard to argue that these don't matter. as Greg points out, zero-byte latency and infinite-byte bandwidth don't necessarily predict the performance that real-app-sized packets will see. then again, if a more accurate prediction were desired, just fit three lines to the s-curve. that's the appeal of quoting half-bandwidth packet size anyway, isn't it?

> > -Asynchronous communication

this appeals because people recognize that it *could* provide higher performance. it seems like most implementations are fairly disappointing in how they implement asynchrony, but that's not a reason to ignore asynchrony present in your program.

> > -Smart collective communication

this would appeal more widely if there was hardware support that gave a real speedup (as in Quadrics) rather than shifting code from app-space to library-space. in other words, people care less about convenience functions. libmpi.a may do a wonderful O(nlogn) bcast, but it would be a lot sexier if the interconnect provided hardware acceleration.

> Likewise, people want asynchronous communication because they imagine
> that it will give them better performance.

I think there's more to it than that. any programmer notices when there are dependencies and when there is slack. if there was a smart MPI/interconnect coprocessor, taking advantage of the slack would turn asynchrony into better performance - basically latency hiding.
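
To put rough numbers on the latency/bandwidth and latency-hiding points above, here is a minimal sketch. It assumes the simplest possible cost model, time(n) = latency + n / peak bandwidth, which is cruder than the three-line fit mentioned earlier; real interconnects show protocol switchovers that this ignores. The function names and the 5 us / 1 GB/s figures are made up for illustration, not measurements of any interconnect.

```python
# Assumed linear cost model for one point-to-point message:
#   time(n) = latency + n / peak_bandwidth
# All names and numbers below are illustrative assumptions.

def message_time(nbytes, latency_s, peak_bw):
    """Seconds to move nbytes under the linear model."""
    return latency_s + nbytes / peak_bw

def effective_bandwidth(nbytes, latency_s, peak_bw):
    """Bytes/second actually achieved for a message of nbytes."""
    return nbytes / message_time(nbytes, latency_s, peak_bw)

def half_bandwidth_size(latency_s, peak_bw):
    """Message size at which effective bandwidth is half of peak.

    Solving n / (latency + n / peak) = peak / 2 gives n = latency * peak.
    """
    return latency_s * peak_bw

def step_time(compute_s, comm_s, overlap):
    """Time for one step with compute and communication.

    With perfect overlap (ideal latency hiding) the step costs the larger
    of the two; with no overlap it costs their sum.
    """
    return max(compute_s, comm_s) if overlap else compute_s + comm_s

if __name__ == "__main__":
    latency = 5e-6   # assumed 5 microseconds
    peak = 1e9       # assumed 1 GB/s
    n_half = half_bandwidth_size(latency, peak)
    print("half-bandwidth message size (bytes):", n_half)
    for n in (64, 1024, int(n_half), 1 << 20):
        bw = effective_bandwidth(n, latency, peak)
        print(n, "bytes ->", round(bw / 1e6, 1), "MB/s")

    comm = message_time(1 << 20, latency, peak)
    compute = 2e-3   # assumed 2 ms of independent work (the "slack")
    print("no overlap:  ", step_time(compute, comm, overlap=False))
    print("full overlap:", step_time(compute, comm, overlap=True))
```

Under this model the half-bandwidth size is just latency times peak bandwidth, which is why it works as a one-number summary. The overlap case is the best case; how close an implementation gets depends on how much of the transfer proceeds without the host.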
> > When do you estimate that commodity Gigabit NICs with integrated RDMA
> > support will arrive to the market? (or will they?)
>
> They arrived a while ago, didn't seem to make much of a splash. I don't
> personally think much of offload.

TOE folk don't seem to understand the concept of fast-paths. sure, RDMA is attractive, but does that mean the whole TCP stack (plus some new extra RDMA gunk) needs to go onto the nic? suppose you had a nic which could generate packets in response to very specific filters on incoming packets. in other words, "reflex" responses to the expected state transitions, avoiding host involvement if the pattern is as expected.

of course, it's also true that TCP has very little justification in a cluster setting, so what's TOE for? trying to run really giant webservers on a single K6-2? most internet-related TCP services can be quite readily clusterized in the first place, so scaling is not a problem.

one could easily argue that network state machines have shown far less innovation and paradigm shift than graphics accelerators. and look at the awesome amount of offload in your video card - it could easily have more transistors and flops than your host cpu. as far as I can tell, this argument only fails because the mass market is not anywhere close to being net-bottlenecked, and because it's harder to throw hardware at networking. it's easy to be limited by graphics (turn up the resolution, framerate, quality, AA, etc), and it's easy to throw another dozen pixel pipelines at the problem.

imagine if you had an interconnect coprocessor with 220M transistors and 30 GB/s private memory bandwidth sitting on 16x PCI-E. the only thing I can think of to use that horsepower for would be a distributed directory-based shared-memory scheme that implemented FP collectives...
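
The "reflex" nic idea above (canned replies to narrowly filtered incoming packets, with the host handling everything else) can be sketched as a toy software model. Everything here is hypothetical: `ReflexNic`, `arm`, `receive` and the RTS/CTS packet kinds are invented for illustration and do not correspond to any real NIC or driver API.

```python
# Toy model of a "reflex" fast path. Hypothetical names throughout;
# this is not a real NIC interface.

class ReflexNic:
    def __init__(self, slow_path):
        # (flow id, expected packet kind) -> reply to emit without the host
        self.rules = {}
        # host-side handler for anything that does not match a filter
        self.slow_path = slow_path
        self.fast_hits = 0
        self.host_hits = 0

    def arm(self, flow, expected_kind, reply):
        """Host pre-posts a reply for the next expected state transition."""
        self.rules[(flow, expected_kind)] = reply

    def receive(self, flow, kind):
        """Handle one incoming packet.

        If it matches an armed filter, answer from the nic (one-shot: the
        rule is consumed). Otherwise fall back to the host slow path.
        """
        reply = self.rules.pop((flow, kind), None)
        if reply is not None:
            self.fast_hits += 1
            return reply
        self.host_hits += 1
        return self.slow_path(flow, kind)


if __name__ == "__main__":
    nic = ReflexNic(slow_path=lambda flow, kind: ("host", flow, kind))
    nic.arm(flow=7, expected_kind="RTS", reply="CTS")
    print(nic.receive(7, "RTS"))   # matches the armed filter
    print(nic.receive(7, "DATA"))  # unexpected here, so the host handles it
    print(nic.fast_hits, nic.host_hits)
```

The point of the sketch is only the split: the expected transition never reaches the host, and anything unexpected takes the ordinary path, so the nic does not need a full protocol stack.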
