[Beowulf] InfiniBand channel bundling?
prentice.bisbal at rutgers.edu
Thu Oct 30 07:16:42 PDT 2014
On 10/29/2014 06:43 PM, Jörg Saßmannshausen wrote:
> Hi all,
> thanks again for the wealth of information.
> Now, given that I am not interested in transporting files over the IB network
> but I am doing parallel calculations, I would have thought that the latency
> here is more important than the speed?
> Thus, if FDR has a higher latency than QDR, does that mean my performance is
> decreasing when I am running a calculation between nodes?
It depends on the size of the messages sent back and forth during the
calculations, and the frequency of communication: is there
communication every time step, every x time steps, etc.? Latency affects
the communication time for all messages; it's just more noticeable for
small messages, since it represents a larger percentage of the total
communication time.
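As a rough illustration, a simple latency-plus-bandwidth model (message time = latency + size / bandwidth) shows how the latency share of a message shrinks as the message grows. The function names and the link parameters below (1 microsecond latency, 5 GB/s bandwidth) are placeholders chosen for illustration, not measurements of QDR or FDR.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def latency_fraction(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the total message time that is due to latency."""
    return latency_s / message_time(size_bytes, latency_s, bandwidth_bytes_per_s)


# Placeholder link parameters (assumptions, not measured values).
LATENCY_S = 1e-6       # 1 microsecond
BANDWIDTH_BPS = 5e9    # 5 GB/s

if __name__ == "__main__":
    # From a few bytes (barrier-like) up to tens of megabytes (bulk data).
    for size in (8, 1024, 1024**2, 64 * 1024**2):
        frac = latency_fraction(size, LATENCY_S, BANDWIDTH_BPS)
        print(f"{size:>10d} bytes: latency is {100 * frac:.2f}% of message time")
```

Under these assumed numbers, latency accounts for nearly all of the time for an 8-byte message and a negligible share for a multi-megabyte one.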
For example, if you're doing some kind of particle physics code, where
each node gets a volume of space, then at the end of each time step each
node needs to share the updated information about the particles along
its borders with its neighbor nodes on the corresponding borders.
This is known as a 'halo exchange'. How much data a halo exchange
requires depends on the problem and how finely it's decomposed across
the compute nodes, but I'm sure it can be enough data that the higher
bandwidth of FDR is beneficial.
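To make the dependence on decomposition concrete, here is a small sketch that estimates the halo data per node when a cubic grid is split evenly among nodes. It assumes a regular grid with periodic boundaries (so every node has six neighbors), a one-cell-thick halo on each face, and 8 bytes per cell. A particle code would count border particles rather than cells, so the names and numbers here are hypothetical.

```python
def halo_bytes_per_node(global_cells_per_side, nodes_per_side,
                        bytes_per_cell=8, halo_width=1):
    """Bytes one node sends per halo exchange in a 3-D block decomposition.

    Assumes a cubic grid split into nodes_per_side**3 equal cubic subdomains,
    each sending a halo_width-thick layer across all six faces.
    """
    if global_cells_per_side % nodes_per_side != 0:
        raise ValueError("grid must divide evenly among nodes")
    local = global_cells_per_side // nodes_per_side
    face_cells = local * local
    return 6 * face_cells * halo_width * bytes_per_cell


def surface_to_volume(global_cells_per_side, nodes_per_side):
    """Ratio of face cells (communication) to interior cells (computation)."""
    local = global_cells_per_side // nodes_per_side
    return 6 * local * local / local**3


if __name__ == "__main__":
    GLOBAL = 512  # hypothetical grid: 512 cells per side
    for n in (2, 4, 8):
        size = halo_bytes_per_node(GLOBAL, n)
        ratio = surface_to_volume(GLOBAL, n)
        print(f"{n**3:4d} nodes: {size:>9d} bytes per exchange, "
              f"surface/volume = {ratio:.4f}")
```

In this model a finer decomposition shrinks the data each node sends per exchange, while the ratio of communication to computation goes up.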
If your application has a lot of barriers, but little data exchange
between nodes, latency would be more important, since barrier
messages are very small.
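One way to see where the trade-off flips is to compute the message size at which a lower-latency, lower-bandwidth link and a higher-latency, higher-bandwidth link take the same time. This is a sketch using the same latency-plus-bandwidth model; the two sets of link parameters are made-up placeholders, not QDR or FDR figures.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def crossover_size(lat_a, bw_a, lat_b, bw_b):
    """Message size (bytes) at which links A and B take the same time.

    Link A is assumed to have lower latency and lower bandwidth than link B.
    Below the returned size A is faster; above it B is faster.
    """
    if not (lat_a < lat_b and bw_a < bw_b):
        raise ValueError("expected A to have lower latency and lower bandwidth")
    # Solve lat_a + s / bw_a == lat_b + s / bw_b for s.
    return (lat_b - lat_a) / (1.0 / bw_a - 1.0 / bw_b)


# Hypothetical links (assumed values for illustration only).
LINK_A = (1.0e-6, 4e9)   # 1.0 us latency, 4 GB/s
LINK_B = (1.2e-6, 6e9)   # 1.2 us latency, 6 GB/s

if __name__ == "__main__":
    s = crossover_size(*LINK_A, *LINK_B)
    print(f"crossover at about {s:.0f} bytes")
    for size in (8, 1024**2):
        t_a = message_time(size, *LINK_A)
        t_b = message_time(size, *LINK_B)
        winner = "A" if t_a < t_b else "B"
        print(f"{size:>8d} bytes: link {winner} is faster")
```

With these placeholder values, barrier-sized messages favor the lower-latency link and large halo-style messages favor the higher-bandwidth one.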
I'm not a big fan of the cliché response 'it depends', but it's a cliché
because it does apply to many questions on this list. Whether FDR is hurting
the performance of your apps really depends on the specifics of your
applications.
> For those of you who are into Chemistry code: I am using VASP, cp2k, quantum
> espresso and cpmd mainly. All of that is plane-wave code.
I'm not familiar enough with the nitty-gritty of any of these codes to
comment on their behavior.
> All the best from a wet London
> On Wednesday 29 October 2014 Prentice Bisbal wrote:
>> On 10/28/2014 04:43 PM, Mark Hahn wrote:
>>> On Tue, 28 Oct 2014, John Hearns wrote:
>>>> Here is a very good post from Glenn Lockwood regarding FDR versus
>>>> dual-rail QDR:
>>> indeed, very nice. though also quite surprising - is it known that
>>> FDR is so terrible for latency? seems astonishing to me.
>> Yes, it was known to me. I had already known that FDR was worse than QDR
>> for latency, but I don't remember my source. I don't know if I'd
>> characterize it as "so terrible", though.
Manager of Information Technology
Rutgers Discovery Informatics Institute (RDI2)