[Beowulf] bandwidth: who needs it?
ashley at quadrics.com
Wed Oct 20 06:44:49 PDT 2004
On Sat, 2004-10-16 at 22:36, Mark Hahn wrote:
> do you have applications that are pushing the limits of MPI bandwidth?
> for instance, code that actually comes close to using the 8-900 MB/s
> that current high-end interconnect provides?
> we have a fairly wide variety of codes inside SHARCnet, but I haven't
> found anyone who is even complaining about our last-generation fabric
> (quadrics elan3, around 250 MB/s). is it just that we don't have the
> right researchers? I've heard people mutter about earthquake researchers
> being able to pin a 800 MB/s network, and claims that big FFT folk can
> do so as well. by contrast, many people claim to notice improvements
> in latency from old/mundane (6-7 us) to new/good (<2 us).
> I'd be interested in hearing about applications you know of which are
> very sensitive to having large bandwidth (say, .8 GB/s today).
It's not so much that you don't have the right researchers, it's the
type of projects they are researching or at least the way they are
attacking the problem.
Latency is every bit as critical as bandwidth and in many cases more
so. Latency at scale is also critical: multi-hop networks dictate the
need to use nearest-neighbour algorithms and therefore have trouble
scaling to large CPU counts. It's also harder for newcomers and
non-technical people to conceptualise latency and especially scalable
From code optimisation that I've done in the past I've also found that
bandwidth is easier to hide via pipelining than latency and therefore is
less critical to wall clock time.
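
A rough way to see how latency and bandwidth trade off is the usual
first-order cost model, time = latency + size / bandwidth. The sketch
below is only an illustration under that assumption; the function names
and the 2 us / 900 MB/s figures are hypothetical placeholders picked
from the range discussed in this thread, not measurements.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """First-order cost model: time = latency + size / bandwidth.

    Assumes 1 MB = 10**6 bytes, so 1 MB/s is exactly 1 byte per microsecond.
    """
    return latency_us + size_bytes / bandwidth_mb_s


def half_performance_size(latency_us, bandwidth_mb_s):
    """Message size (bytes) at which the latency term equals the
    bandwidth term. Below this size, latency dominates the transfer."""
    return latency_us * bandwidth_mb_s


if __name__ == "__main__":
    latency_us = 2.0        # assumed "new/good" latency
    bandwidth_mb_s = 900.0  # assumed high-end link bandwidth
    n_half = half_performance_size(latency_us, bandwidth_mb_s)
    print(f"latency dominates below about {n_half:.0f} bytes")
    for size in (64, 1024, 65536, 1048576):
        total = transfer_time_us(size, latency_us, bandwidth_mb_s)
        share = latency_us / total
        print(f"{size:>8} bytes: {total:9.2f} us, latency share {share:.0%}")
```

In this model, a code that mostly exchanges messages well below the
half-performance size gains little from extra bandwidth, whereas large
transfers can be overlapped with computation.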
Also don't forget that SMP boxes are getting wider: think in terms of
MB/s/CPU and today's 900 MB/s network bandwidth suddenly doesn't sound
like that much. The good news here, however, is that the large SMPs tend to
have multiple PCI-X busses so can use multiple networks effectively.
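
The per-CPU arithmetic can be sketched as follows. This assumes every
CPU in the node communicates at once and that bandwidth scales ideally
across multiple networks ("rails"); both are simplifications, and the
function name and node sizes are hypothetical.

```python
def bandwidth_per_cpu(link_mb_s, cpus_per_node, rails=1):
    """Network bandwidth available to each CPU in MB/s, assuming all CPUs
    share the links equally and rails aggregate perfectly (an idealisation)."""
    return link_mb_s * rails / cpus_per_node


if __name__ == "__main__":
    # Hypothetical node widths, each with a 900 MB/s link per rail.
    for cpus, rails in ((2, 1), (4, 1), (8, 1), (8, 2)):
        share = bandwidth_per_cpu(900, cpus, rails)
        print(f"{cpus} CPUs, {rails} rail(s): {share:.1f} MB/s per CPU")
```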