[Beowulf] Performance characterising a HPC application

Mark Hahn hahn at mcmaster.ca
Mon Mar 26 16:08:52 PDT 2007

>> obviously, there are
>> many applications which have absolutely no use for bandwidth
>> greater than even plain old gigabit.
>> equally obvious, there are others which are sensitive to
>> small-packet latency, which is not affected by DDR or dual-rail.
> Yes, there are application that don't utilize the interconnect and

perhaps we're having a language problem here.  we're not talking about
apps which don't use IC (at all).

> will show the same results on 1GigE and 100GigE. I was referring to the
> application that do utilize the interconnect, and those are the ones
> that I test.

of all apps which are nontrivially parallel (that is, which do use IC),
the question is how many are responsive to very high bandwidth IC.
of my organization's 1500 users, across almost all disciplines, not many
complain about our mid-end IC (elan3 or myri 2g, both around 250 MB/s).
some do report noticeable speedup moving from myri 2g to elan4, but
as far as we can tell, that's mainly latency (~7 us for GM vs 1.3 us or so).

my main point is that there has to be some kind of 80/20 rule here,
whether it's 90/10 or 70/30.  I'm sure there are people who desperately
want dual-rail DDR IB, and perhaps their code is even sane (I'd like to 
be shown, rather than just take someone's word on it.)  but some large 
fraction of people do very nicely on plain-old-gigabit (nontrivial n-body
astro codes); some smallish subset respond nicely to myri2G/elan3, which
is ~6x better latency and ~3x better bandwidth.  of those, an even smaller
subset respond to elan4 (5x better latency, 4x better bandwidth).
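
a back-of-envelope way to see why each subset shrinks: model one message as
latency plus size over bandwidth. this is only a sketch under stated
assumptions. the 7 us, 1.3 us and 250 MB/s figures are the ones quoted above;
the gigabit numbers and the elan4 bandwidth are back-derived from the ~6x/~3x
and 4x ratios (assumptions, not measurements), and the function names are
made up for illustration.

```python
# Latency + size/bandwidth model of a single point-to-point message.
# (latency in seconds, bandwidth in bytes/second)
# myri2g latency/bandwidth and elan4 latency come from the post;
# the other values are ASSUMED, back-derived from the quoted ratios.
INTERCONNECTS = {
    "gigabit": (42e-6, 83e6),    # assumed: ~6x the latency, ~1/3 the bandwidth of myri2g
    "myri2g":  (7e-6, 250e6),    # ~7 us (GM), ~250 MB/s
    "elan4":   (1.3e-6, 1000e6), # ~1.3 us; bandwidth assumed 4x myri2g
}


def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Estimated time to deliver one message of size_bytes."""
    return latency_s + size_bytes / bandwidth_bps


def crossover_size(latency_s, bandwidth_bps):
    """Message size at which latency and wire time contribute equally."""
    return latency_s * bandwidth_bps


def speedup(size_bytes, old, new):
    """Ratio of per-message time on interconnect `old` vs `new`."""
    t_old = transfer_time(size_bytes, *INTERCONNECTS[old])
    t_new = transfer_time(size_bytes, *INTERCONNECTS[new])
    return t_old / t_new


if __name__ == "__main__":
    for name, (lat, bw) in INTERCONNECTS.items():
        print(f"{name:8s} crossover ~ {crossover_size(lat, bw):8.0f} bytes")
    for size in (64, 4096, 1 << 20):
        print(f"{size:8d} bytes: myri2g -> elan4 speedup "
              f"{speedup(size, 'myri2g', 'elan4'):.2f}x")
```

under these assumed numbers, a 64-byte message sees nearly the full latency
ratio going from myri 2g to elan4, while a 1 MB message sees only the ~4x
bandwidth ratio. messages well below the crossover size (about 1750 bytes for
myri 2g) are latency-bound, so extra bandwidth does little for them.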

obviously, if _your_ code is in the subset of a subset of a subset
which is unleashed by 20 Gb/s of bandwidth, all this is irrelevant.
I'm just curious to hear about this category of BW-monsters.

> Anyway, the I/O is not just bandwidth, but also provides latency, CPU
> overhead
> and other important characteristics, and all of them need to be
> considered.

duh.  and cable weight, length, bend radius, alien interference, licensing
costs, manageability, failure management, etc.
