[Beowulf] Re: Re: Home beowulf - NIC latencies

Rob Ross rross at mcs.anl.gov
Fri Feb 11 18:47:22 PST 2005


Hi Isaac,

On Fri, 11 Feb 2005, Isaac Dooley wrote:

> >>Using MPI_Isend() allows programs to not waste CPU cycles waiting on the
> >>completion of a message transaction.
> >
> >No, it allows the programmer to express that they want to send a message 
> >but not wait for it to complete right now.  The API doesn't specify the 
> >semantics of CPU utilization.  It cannot, because the API doesn't have 
> >knowledge of the hardware that will be used in the implementation.
> >
> That is partially true.  The context for my comment was your assumption 
> that everyone uses MPI_Send(). These people, as I stated before, do not 
> care about what the CPU does during their blocking calls.

I think that it is completely true.  I made no assumption about everyone 
using MPI_Send(); I'm a latecomer to the conversation.

I was not trying to say anything about what people making the calls care
about; I was trying to clarify what the standard does and does not say.  
However, I agree with you that someone calling MPI_Send() is unlikely to 
be worried about CPU utilization during the call.

> I was trying to point out that programs utilizing non-blocking I/O may 
> have work that will be adversely impacted by CPU utilization for 
> messaging. These are the people who care about CPU utilization for 
> messaging. I hope this answers your prior question, at least partially.

I agree that people using MPI_Isend() and related non-blocking operations 
are sometimes doing so because they would like to perform some 
computation while the communication progresses.  People also use these 
calls to initiate a collection of point-to-point operations before 
waiting, so that multiple communications may proceed in parallel.  The 
implementation has no way of really knowing which of these is the case.
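
Here's a rough sketch of the two patterns I mean, purely as an
illustration (the neighbor exchange, the buffer layout, and the
do_local_work() routine are all made up for the example, not anyone's
actual code):

#include <mpi.h>

static void do_local_work(void)
{
    /* stand-in for computation that does not touch the message buffers */
}

/* sendbuf and recvbuf each hold 2*n doubles: [0..n) is for the left
 * neighbor, [n..2n) is for the right neighbor */
void exchange(double *sendbuf, double *recvbuf, int n,
              int left, int right, MPI_Comm comm)
{
    MPI_Request reqs[4];

    /* Case 1: start several point-to-point operations up front so the
     * messages can move in parallel rather than one after another. */
    MPI_Irecv(recvbuf,     n, MPI_DOUBLE, left,  0, comm, &reqs[0]);
    MPI_Irecv(recvbuf + n, n, MPI_DOUBLE, right, 1, comm, &reqs[1]);
    MPI_Isend(sendbuf,     n, MPI_DOUBLE, left,  1, comm, &reqs[2]);
    MPI_Isend(sendbuf + n, n, MPI_DOUBLE, right, 0, comm, &reqs[3]);

    /* Case 2: do useful work between initiating the communication and
     * waiting for it, in the hope that the two overlap. */
    do_local_work();

    MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
}

From the implementation's point of view the two cases look identical: a
set of requests followed by a wait.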

Greg just pointed out that for small messages most implementations will do
the exact same thing as in the MPI_Send() case anyway.  For large messages
I suppose that something different could be done.  In our implementation
(MPICH2), to my knowledge we do not differentiate.
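
One rough way to see the small/large difference from the application
side is something like the sketch below.  This is just an illustration
of mine; whether and where an eager threshold exists is entirely
implementation-specific, and the standard promises neither behavior.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, flag = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int count = (argc > 1) ? atoi(argv[1]) : 16;   /* message size in bytes */
    char *buf = calloc(count, 1);

    if (rank == 0) {
        MPI_Isend(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        printf("Isend of %d bytes complete before the receive is posted? %s\n",
               count, flag ? "yes" : "no");
    }

    /* rank 1 does not post its receive until everyone passes this barrier,
     * so the MPI_Test above really does run before the match */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0)
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    else if (rank == 1)
        MPI_Recv(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    free(buf);
    MPI_Finalize();
    return 0;
}

On many implementations the small send reports completion right away
because the data was buffered eagerly, while a sufficiently large one
does not complete until the receive is matched.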

You should understand that the way MPI implementations are measured is by 
their message-passing performance (latency and bandwidth), not by CPU 
utilization, so there is pressure to push the former as much as possible 
at the expense of the latter.
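
For what it's worth, the sort of measurement I mean is a bare ping-pong
along these lines (my own sketch, not any particular benchmark suite);
note that nothing in it rewards low CPU utilization:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 1000;
    char byte = 0;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            /* send a 1-byte message and wait for the echo */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            /* echo it back */
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("half round-trip latency: %g us\n",
               (t1 - t0) / (2.0 * iters) * 1e6);

    MPI_Finalize();
    return 0;
}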

> Perhaps your applications demand low latency with no concern for the CPU 
> during the time spent blocking. That is fine. But some applications 
> benefit from overlapping computation and communication, and the cycles 
> not wasted by the CPU on communication can be used productively.

I wouldn't categorize the cycles spent on communication as "wasted"; it's 
not like we code in extraneous math just to keep the CPU pegged :).

Regards,

Rob
---
Rob Ross, Mathematics and Computer Science Division, Argonne National Lab