[Beowulf] Re: Re: Home beowulf - NIC latencies
Patrick Geoffray patrick at myri.com
Mon Feb 14 23:48:47 PST 2005
- Previous message: [Beowulf] Re: Re: Home beowulf - NIC latencies
- Next message: [Beowulf] Re: Re: Home beowulf - NIC latencies
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Rob,

Rob Ross wrote:
> The last two Scalable OS workshops (the only two I've had a chance to
> attend), there was a contingent of people that are certain that MPI isn't
> going to last too much longer as a programming model for very large

Were they advocating shared memory paradigms, one-sided operations, something more "natural" to program with? I heard that before :-)

> systems. The issue, as they see it, is that MPI simply imposes too much
> latency on communication, and because we (as MPI implementors) cannot
> decrease that latency fast enough to keep up with processor improvements,
> MPI will soon become too expensive to be of use on these systems.

This is just wrong. How much of the latency in a high-speed interconnect is due to MPI? Very, very little. The core of it is in the hardware (IO bus, NICs, crossbars and wires). Doing pure RDMA in hardware is easy for the chip designers, but it's hell for irregular applications when you actually don't know where to remotely read or write.

> Also, there is additional overhead in the Isend()/Wait() pair over the
> simple Send() (two function calls rather than one, allocation of a Request
> structure at the least) that means that a naive attempt at overlapping
> communication and computation will result in a slower application. So
> that doesn't surprise me at all.

What is the cost of one function call and an allocation in a slab? At several GHz, 50 ns? And most of the time, blocking calls are implemented on top of non-blocking routines, so the CPU overhead is the same.

> I think that the theme from this thread should be that "it's a good thing
> that we have more than one MPI implementation, because they all do
> different things best."

I would say having more than one MPI implementation is a bad thing as long as you cannot easily replace one with another. Let's define a standard MPI header and a standard API for spawning and such, and then having more than one implementation will actually be manageable.
That would also remove the need for swiss-army-knife MPI implementations that want to support all interconnects with the same binary. These implementations are, IMHO, a bad thing as they work at the lowest common denominator and are in essence inefficient for all devices.

While we are at it, here is my wish list for the next MPI specs:

a) only non-blocking calls. If there are no blocking calls, nobody will use them.
b) non-blocking calls for collectives too, there is no excuse. Yes, even an asynchronous barrier.
c) ban of the MPI_ANY_SOURCE wildcard: a world of optimization goes away with this convenience.
d) throw away the user-defined datatypes, or at least restrict them to regular strides.
e) get rid of one-sided communications: if someone is serious about it, they use something like ARMCI or UPC or even low-level vendor interfaces.

Rob, you are politically connected, could you make it happen, please? :-)

Patrick
-- 
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
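The remark that blocking calls are usually layered on non-blocking routines can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyEndpoint`, `Request`, `isend`, `wait`, `send` and `progress` are hypothetical names invented for illustration, everything runs in one process, and none of it is code from an actual MPI implementation.

```python
from collections import deque


class Request:
    """Handle for one in-flight operation (hypothetical, not an MPI type)."""

    def __init__(self):
        self.done = False


class ToyEndpoint:
    """In-process stand-in for a NIC queue; nothing here touches a network."""

    def __init__(self):
        self.pending = deque()  # messages posted but not yet delivered
        self.delivered = []     # payloads that have "arrived"
        self.calls = 0          # internal entry points executed

    def isend(self, payload):
        """Non-blocking send: post the message and return a request."""
        self.calls += 1
        req = Request()
        self.pending.append((payload, req))
        return req

    def progress(self):
        """Deliver one posted message, if any, and complete its request."""
        if self.pending:
            payload, req = self.pending.popleft()
            self.delivered.append(payload)
            req.done = True

    def wait(self, req):
        """Drive progress until the given request has completed."""
        self.calls += 1
        while not req.done:
            self.progress()

    def send(self, payload):
        """Blocking send, built directly on the non-blocking path."""
        self.wait(self.isend(payload))


# Same message, once through the blocking call, once through the pair.
blocking = ToyEndpoint()
blocking.send("hello")

split = ToyEndpoint()
request = split.isend("hello")
split.wait(request)
```

In this model both paths run the same two internal routines and deliver the same payload; the only difference for the caller is one extra function call and holding the request, which is the cost the message estimates at around 50 ns.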
