[Beowulf] bandwidth: who needs it?

Richard Walsh rbw at ahpcrc.org
Thu Oct 21 13:59:52 PDT 2004


Greg Lindahl wrote:

>> do you have applications that are pushing the limits of MPI bandwidth?
>> for instance, code that actually comes close to using the 8-900 MB/s
>> that current high-end interconnect provides?
>
>Bandwidth is important not only for huge messages that hit 900 MB/s,
>but also for medium sized messages. A naive formula for how long it
>takes to send a message is:
>
>T_size = T_0 + size / max_bandwidth
>
>For example, for a 4k message with T_0 = 5 usec and either 400 MB/s or
>800 MB/s,
>
>T_4k_400M = 5 + 4k/400M = 5 + 10 = 15 usec
>T_4k_800M = 5 + 4k/800M = 5 +  5 = 10 usec
>
>A big difference. But you're only getting 266 MB/s and 400 MB/s
>bandwidth, respectively.
>
>Of course performance is usually a bit less than this naive model. But
>the effect is real, becoming unimportant for packets smaller than ~ 2k
>in this example. The size at which this effect becomes unimportant
>depends on T_0 and the bandwidth.

The above also makes a point about a mid-range regime of message sizes 
whose transfer times are affected ~equally by bandwidth and latency
changes.  Halving the latency in the 4K/800M case above is equivalent
to doubling the bandwidth for a message of this size: 

  T_4k_800M  = 2.5 + 4k/800M  = 2.5 +  5.0 =  7.5 usec  (latency halved)
  T_4k_800M  = 5.0 + 4k/800M  = 5.0 +  5.0 = 10.0 usec  (baseline)
  T_4k_1600M = 5.0 + 4k/1600M = 5.0 +  2.5 =  7.5 usec  (bandwidth doubled)
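
For anyone who wants to plug in their own numbers, here is a quick
Python sketch of the naive model (just a toy of mine, not anyone's
benchmark code); it treats "4k" as 4000 bytes and leans on the handy
fact that 1 MB/s is 1 byte per usec:

  # Naive transfer-time model: T = T_0 + size / bandwidth
  def transfer_time(size_bytes, t0_usec, bw_mb_per_s):
      # 1 MB/s == 1 byte/usec, so bytes / (MB/s) comes out in usec
      return t0_usec + size_bytes / bw_mb_per_s

  size = 4000  # the "4k" message above, taken as 4000 bytes
  print(transfer_time(size, 2.5,  800.0))   # latency halved    -> 7.5 usec
  print(transfer_time(size, 5.0,  800.0))   # baseline          -> 10.0 usec
  print(transfer_time(size, 5.0, 1600.0))   # bandwidth doubled -> 7.5 usec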

For a given interconnect with a known latency and bandwidth there is
a "characteristic" message size whose transfer time is equally sensitive
to perturbations in bandwidth and latency (the latency and bandwidth
pieces of the transfer time are equal).  So, for an "Elan-4-like"
interconnect (2.0 usec latency, 800 MB/s bandwidth) the characteristic
message length would be 1.6k:

  T_1.6k_800M  = 1.0 + 1.6k/800M  = 1.0 + 2.0 = 3.0 usec  (latency halved)
  T_1.6k_800M  = 2.0 + 1.6k/800M  = 2.0 + 2.0 = 4.0 usec  (baseline)
  T_1.6k_1600M = 2.0 + 1.6k/1600M = 2.0 + 1.0 = 3.0 usec  (bandwidth doubled)

Message sizes in the vicinity of the characteristic length will
respond approximately equally to improvements in either factor.
Much larger messages will be more sensitive to bandwidth improvements
in an interconnect upgrade, while much smaller messages will be more
sensitive to latency improvements.
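
And a few more lines (again just a sketch, using the assumed 2.0 usec /
800 MB/s "Elan-4-like" numbers) to compute the characteristic size and
show which term dominates on either side of it:

  t0 = 2.0    # usec, assumed Elan-4-like latency
  bw = 800.0  # MB/s, i.e. bytes per usec

  # The latency and bandwidth terms are equal when T_0 = size / B,
  # so the characteristic size is T_0 * B.
  print("characteristic size = %.0f bytes" % (t0 * bw))  # 1600

  for size in (200, 1600, 16000, 160000):
      print("%6d bytes: latency term %4.1f usec, bandwidth term %6.1f usec"
            % (size, t0, size / bw))
  # well below 1.6k the fixed 2.0 usec dominates (latency-bound);
  # well above it the size/B term dominates (bandwidth-bound)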

One might argue that bandwidth actually matters more, because message
sizes (along with problem sizes) can in theory grow indefinitely (drop
in some more memory and double your array sizes) while they can only
be made so small -- a position supported by the rate of storage
growth, but undermined by slower bandwidth growth and increasing
processor counts.

I think I will keep my bandwidth though ... and take any off of the 
hands of those who ... don't need it ... ;-) ...

rbw

#---------------------------------------------------
# Richard Walsh
# Project Manager, Cluster Computing, Computational
#                  Chemistry and Finance
# netASPx, Inc.
# 1200 Washington Ave. So.
# Minneapolis, MN 55415
# VOX:    612-337-3467
# FAX:    612-337-3400
# EMAIL:  rbw at networkcs.com, richard.walsh at netaspx.com
#         rbw at ahpcrc.org
#
#---------------------------------------------------
# "What you can do, or dream you can, begin it;
#  Boldness has genius, power, and magic in it."
#                                  -Goethe
#---------------------------------------------------
# "Without mystery, there can be no authority."
#                                  -Charles DeGaulle
#---------------------------------------------------
# "Why waste time learning when ignornace is
#  instantaneous?"                 -Thomas Hobbes
#---------------------------------------------------



