[Beowulf] Questions regarding interconnects
Patrick Geoffray patrick at myri.com
Fri Mar 25 13:02:24 PST 2005
Hi Vincent,

Vincent Diepeveen wrote:
> I feel very important to look at is 'shmem' capabilities.

> In order for B to receive, it has to have either a special thread
> that regularly polls. If you have a thread that polls say each 10
> milliseconds, then what's the use of using a highend network
> card (other than its DMA capabilities)?

You are in a situation where you don't have to wait for the message to
arrive; you can move on and check 10 ms later. In this case, you don't
care about network speed.

> However, it's very expensive to poll.

No, it's not. Not in the OS-bypass world.

> On the other hand using the 'shmem', what happens is that A ships a
> nonblocking write to B of just a few bytes. The network card in B simply
> writes it in the RAM.
>
> Now and then the searching process at B only has to poll its own main
> memory to see whether it has a '1'. So sometimes you lose a TLB thrashing
> call to it, but other times it comes from L2 cache.

It's still polling. With message passing, you actually poll a queue in
the MPI lib instead of a specific location in the user application. That
helps when you are looking for several messages from several sources
(in your model, you have to poll several locations).

> So for short messages which are latency sensitive that 'shmem' of quadrics
> is just far superior.

You are getting confused with words. "SHMEM" is a legacy shared memory
interface that was used on Cray machines like the T3D. It's not a
standard per se, it's a software interface. The implementations usually
rest on top of remote memory operations (PUT/GET).

It always strikes me when people put "one-sided" and "latency sensitive"
in the same sentence. "One-sided" means that you don't want to involve
the remote side in the communication, and "latency sensitive" means the
other side is waiting for the communication. In your example, you will
be checking every X ms whether someone has written in your memory. In
this case, why do you care about latency?

> Do other cards implement something similar?

You can do PUT on most high speed networks; this is a pretty basic
functionality. The SHMEM interface may not be used because it makes
sense only for former Cray customers, but look for portable RMA
implementations like ARMCI for example.

> As far as I know they do not.

Do more research.

> The overhead of the MPI implementation layer *receiving* bytes is just so
> so huge. A card's theoretic one-way pingpong latency is just irrelevant to
> that, because that one way pingpong programs at all cards is eating 100%
> system time, effectively losing a full cpu.

You are mistaken about the MPI receive overhead. You are also mistaken
in your belief that one-sided operations are a silver bullet. RMA
operations may be more appropriate to an application design, but they
share many constraints with message passing: you have to poll to know
when an operation is done, and you have to tell the other side where to
write (equivalent to posting a recv). They also have drawbacks, like
usually not scaling in space (each sender should write to a different
location).

Patrick

--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
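
The contrast drawn in the message above, between polling per-sender memory locations and polling a single library queue, can be illustrated with a small single-process sketch. This is not how any real interconnect or MPI library is implemented: the class and method names (PutStyleReceiver, QueueStyleReceiver, remote_put, deliver) are hypothetical, and the "network" is just a direct method call. It only shows that both styles still poll, and that the PUT style needs one slot per sender.

```python
from collections import deque


class PutStyleReceiver:
    """One-sided style (hypothetical): each sender owns a slot in receiver memory."""

    def __init__(self, num_senders):
        # Space grows with the number of senders: one location per sender.
        self.slots = [None] * num_senders

    def remote_put(self, sender, value):
        # Stands in for the NIC writing directly into the receiver's RAM.
        self.slots[sender] = value

    def poll(self):
        # The receiver still polls, and it must scan every location.
        arrived = []
        for sender, value in enumerate(self.slots):
            if value is not None:
                arrived.append((sender, value))
                self.slots[sender] = None
        return arrived


class QueueStyleReceiver:
    """Message-passing style (hypothetical): all senders feed one queue."""

    def __init__(self):
        self.queue = deque()

    def deliver(self, sender, value):
        # Stands in for the messaging library enqueuing an incoming message.
        self.queue.append((sender, value))

    def poll(self):
        # One location to poll, regardless of how many senders exist.
        arrived = list(self.queue)
        self.queue.clear()
        return arrived


if __name__ == "__main__":
    put_rx = PutStyleReceiver(num_senders=4)
    put_rx.remote_put(2, 1)
    print(put_rx.poll())

    queue_rx = QueueStyleReceiver()
    queue_rx.deliver(2, 1)
    queue_rx.deliver(0, 1)
    print(queue_rx.poll())
```

In this sketch the PUT-style receiver allocates storage proportional to the number of senders and walks all of it on every poll, while the queue-style receiver checks one structure; in both cases nothing arrives "for free" without a poll.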
