[Beowulf] Good 1 Gbit switches - which ones?

Bill Broadley bill at cse.ucdavis.edu
Mon May 24 17:12:20 PDT 2004


On Fri, May 21, 2004 at 12:49:47PM -0700, Konstantin Kudin wrote:
>  Hi there,
> 
>  Can anyone offer an insight with respect to 1 Gbit switches for a
> Beowulf cluster? There are all these reports that a lot of inexpensive
> switches on the market tend to choke under heavy internal traffic. Can
> anyone suggest an affordable switch with good internal bandwidth, which
> was tested under heavy load, and actually worked well?

I've written a small benchmark that sends messages containing a
configurable number of MPI_INTs between a variable number of node pairs.

With a 32-node dual-Opteron cluster and a Nortel Baystack 470 48-port
switch:

# of                      Between
MPI_INTs     Hops         pairs of    Wallclock Latency            Bandwidth
==============================================================================
size=     1, 131072 hops,  8 nodes in  7.04 sec ( 53.7 us/hop)     73 KB/sec
size=     1, 131072 hops, 16 nodes in  7.46 sec ( 56.9 us/hop)     69 KB/sec
size=     1, 131072 hops, 24 nodes in  7.51 sec ( 57.3 us/hop)     68 KB/sec
size=     1, 131072 hops, 32 nodes in  8.44 sec ( 64.4 us/hop)     61 KB/sec
(8 -> 32 nodes: latency up about 20%, bandwidth down about 16%)

size=    10, 131072 hops,  8 nodes in  7.15 sec ( 54.5 us/hop)    716 KB/sec
size=    10, 131072 hops, 16 nodes in  7.39 sec ( 56.4 us/hop)    693 KB/sec
size=    10, 131072 hops, 24 nodes in  7.59 sec ( 57.9 us/hop)    674 KB/sec
size=    10, 131072 hops, 32 nodes in  8.06 sec ( 61.5 us/hop)    635 KB/sec
(8 -> 32 nodes: latency up about 13%, bandwidth down about 11%)

size=  1000, 16384 hops,  8 nodes in  1.93 sec (117.8 us/hop)  33163 KB/sec
size=  1000, 16384 hops, 16 nodes in  1.96 sec (119.6 us/hop)  32652 KB/sec
size=  1000, 16384 hops, 24 nodes in  1.98 sec (120.6 us/hop)  32400 KB/sec
size=  1000, 16384 hops, 32 nodes in  2.20 sec (134.1 us/hop)  29129 KB/sec
(8 -> 32 nodes: latency up about 14%, bandwidth down about 12%)

size= 10000, 16384 hops,  8 nodes in  9.71 sec (592.5 us/hop)  65930 KB/sec
size= 10000, 16384 hops, 16 nodes in  9.92 sec (605.2 us/hop)  64543 KB/sec
size= 10000, 16384 hops, 24 nodes in 10.13 sec (618.4 us/hop)  63164 KB/sec
size= 10000, 16384 hops, 32 nodes in 17.47 sec (1066.4 us/hop)  36629 KB/sec
(8 -> 32 nodes: latency up about 80%, bandwidth down about 44%)

size=100000, 16384 hops,  8 nodes in 100.00 sec (6103.5 us/hop)  64000 KB/sec
size=100000, 16384 hops, 16 nodes in 104.72 sec (6391.3 us/hop)  61118 KB/sec
size=100000, 16384 hops, 24 nodes in 103.68 sec (6328.0 us/hop)  61730 KB/sec
size=100000, 16384 hops, 32 nodes in 134.14 sec (8187.3 us/hop)  47711 KB/sec
(8 -> 32 nodes: latency up about 34%, bandwidth down about 25%)
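
For anyone wanting to sanity-check the columns above, here is a rough
sketch of how the latency and bandwidth figures appear to follow from
the message size, hop count, and wallclock time. It assumes a 4-byte
MPI_INT and that "KB" means 1024 bytes; neither is stated in the output,
but both reproduce the printed numbers. The function names are made up
for illustration and are not taken from the benchmark source.

```python
# Assumptions (not stated in the post): MPI_INT is 4 bytes, KB = 1024 bytes.
MPI_INT_BYTES = 4
KB = 1024


def per_hop_latency_us(wallclock_sec, hops):
    """Average time per hop, in microseconds."""
    return wallclock_sec / hops * 1e6


def bandwidth_kb_per_sec(size_ints, hops, wallclock_sec):
    """Payload moved per second, in KB/sec (payload only, no headers)."""
    return size_ints * MPI_INT_BYTES * hops / wallclock_sec / KB


# Row: size=100000, 16384 hops, 8 nodes in 100.00 sec
print(round(per_hop_latency_us(100.00, 16384), 1))          # 6103.5
print(round(bandwidth_kb_per_sec(100000, 16384, 100.00)))   # 64000
```

Rows with shorter wallclock times can differ slightly in the last digits
because the printed wallclock is rounded to two decimals.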

In every case I see a substantial drop-off by the time I keep 32 ports
busy, and I suspect the drop-off at 48 would be even worse.
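
The size of the drop-off depends on whether it is measured as a rise in
per-hop latency or as a fall in bandwidth, so it may help to compute
both. A small sketch, using the 8-node and 32-node rows from the table
above (the helper name pct_change is just for illustration):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# size: (latency us/hop at 8 nodes, at 32 nodes, KB/sec at 8 nodes, at 32 nodes)
rows = {
    1: (53.7, 64.4, 73, 61),
    10: (54.5, 61.5, 716, 635),
    1000: (117.8, 134.1, 33163, 29129),
    10000: (592.5, 1066.4, 65930, 36629),
    100000: (6103.5, 8187.3, 64000, 47711),
}

for size, (lat8, lat32, bw8, bw32) in rows.items():
    print(f"size={size:>6}: latency {pct_change(lat8, lat32):+.0f}%, "
          f"bandwidth {pct_change(bw8, bw32):+.0f}%")
```

The latency rise is always the larger of the two figures, since going
from t to 1.8t is an 80% increase in time but only a 44% loss in
throughput.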

Does this seem like a reasonable way to benchmark switches?  Does anyone
have suggested improvements or better tools?  If people think this would
be valuable, I could clean up the source and provide a central location
for storing benchmark results.

-- 
Bill Broadley
Computational Science and Engineering
UC Davis
