[Beowulf] Performance characterising a HPC application


Scott Atchley atchley at myri.com
Fri Mar 30 18:31:29 PDT 2007


On Mar 26, 2007, at 1:04 PM, Gilad Shainer wrote:

> When Mellanox refers to transport offload, it means full transport
> offload - for all transport semantics. InfiniBand, as you probably
> know, provides RDMA AND Send/Receive semantics, and in both cases
> you can do Zero-copy operations.
>
> This full flexibility provides the programmer with the ability to choose
> the best semantics for his use. Some programmers choose Send/Receive and
> some RDMA. It all depends on their application.
> From your response, I see that Qlogic does not provide this kind
> of flexibility.

Gilad,

I have seen you make that point many times. This may be a silly
question, but are latency and throughput equivalent for both APIs for
large and small messages?

I ask because I wrote the ports of Lustre and PVFS2 for MX and I  
spent a lot of time looking at their existing IB code. I see them use  
Send/Recv for small and/or unexpected messages. Both use IB write for  
large payloads.

Although both use IB write (one-sided, no?) for the large payload,
both require one or two small Send/Recv messages to serve as RTS and
CTS before they can initiate the one-sided transfer. In effect,
they have to write their own Send/Recv (two-sided) semantics on top of
IB's RDMA.
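
The handshake described above can be sketched as a small Python
simulation. This is not the Lustre or PVFS2 code and it calls no real
InfiniBand or MX API: the Endpoint class, EAGER_LIMIT, the message
tags, and the trailing DONE notice are all hypothetical names and
assumptions, chosen only to show how two-sided semantics get layered
on a one-sided write.

```python
from collections import deque
from dataclasses import dataclass, field

EAGER_LIMIT = 1024  # hypothetical cutoff between "small" and "large" messages


@dataclass
class Endpoint:
    """Toy stand-in for a network endpoint (assumption, not a real API)."""
    name: str
    inbox: deque = field(default_factory=deque)  # two-sided Send/Recv queue
    memory: dict = field(default_factory=dict)   # registered buffers by handle

    def send(self, peer, msg):
        # Two-sided: the peer must later call recv() to consume this.
        peer.inbox.append(msg)

    def recv(self):
        return self.inbox.popleft()

    def rdma_write(self, peer, handle, payload):
        # One-sided: data lands in the peer's buffer with no peer action.
        peer.memory[handle][:len(payload)] = payload


def transfer(sender, receiver, payload):
    """Move payload from sender to receiver; return (data, trace of steps)."""
    if len(payload) <= EAGER_LIMIT:
        # Small message: plain Send/Recv, no handshake needed.
        sender.send(receiver, ("EAGER", payload))
        _, data = receiver.recv()
        return bytes(data), ["EAGER"]

    trace = []
    # RTS: sender announces a large message and its size.
    sender.send(receiver, ("RTS", len(payload)))
    _, size = receiver.recv()
    trace.append("RTS")

    # CTS: receiver exposes a buffer and tells the sender where to write.
    handle = "buf0"
    receiver.memory[handle] = bytearray(size)
    receiver.send(sender, ("CTS", handle))
    _, handle = sender.recv()
    trace.append("CTS")

    # One-sided write of the large payload.
    sender.rdma_write(receiver, handle, payload)
    trace.append("RDMA_WRITE")

    # Assumption: a final small message tells the receiver the data is in place.
    sender.send(receiver, ("DONE", handle))
    _, handle = receiver.recv()
    trace.append("DONE")
    return bytes(receiver.memory.pop(handle)), trace


a, b = Endpoint("a"), Endpoint("b")
small_data, small_trace = transfer(a, b, b"x" * 16)
large_data, large_trace = transfer(a, b, b"y" * 4096)
```

In this sketch the large transfer needs small two-sided control
messages around the single one-sided write, which is the extra
bookkeeping the applications end up carrying themselves.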

If Send/Recv performance is on par with RDMA on IB, why not use that  
API for large messages? Why re-write Send/Recv every time they use  
RDMA? The code to implement PVFS2 on MX is over 30% smaller than the  
IB code because I did not have to re-write Send/Recv.

Scott


