[Beowulf] Good 1 Gbit switches - which ones?

Andrew Latham lathama at yahoo.com
Tue May 25 10:05:19 PDT 2004


Nice work

I think we could really use a "Mike's Hardware" of the HPC world. I use the list
to figure this stuff out, but I think an unbiased hardware review and
comparison website centered on HPC products would be great. I will help out
on that.

--- Bill Broadley <bill at cse.ucdavis.edu> wrote:
> On Fri, May 21, 2004 at 12:49:47PM -0700, Konstantin Kudin wrote:
> >  Hi there,
> > 
> >  Can anyone offer an insight with respect to 1 Gbit switches for a
> > Beowulf cluster? There are all these reports that a lot of inexpensive
> > switches on the market tend to choke under heavy internal traffic. Can
> > anyone suggest an affordable switch with good internal bandwidth, which
> > was tested under heavy load, and actually worked well?
> 
> I've written a small benchmark which allows testing various numbers of
> MPI_INTs in a message between a variable number of pairs of nodes.
> 
> With a 32 node dual opteron cluster and a Nortel Baystack 470 48 port
> switch:
> 
>     # of
>     MPI_INT  Between pairs of          Wallclock  Latency          Bandwidth
> ==============================================================================
> size=     1, 131072 hops,  8 nodes in  7.04 sec ( 53.7 us/hop)     73 KB/sec
> size=     1, 131072 hops, 16 nodes in  7.46 sec ( 56.9 us/hop)     69 KB/sec
> size=     1, 131072 hops, 24 nodes in  7.51 sec ( 57.3 us/hop)     68 KB/sec
> size=     1, 131072 hops, 32 nodes in  8.44 sec ( 64.4 us/hop)     61 KB/sec
> (19% or so drop)
> 
> size=    10, 131072 hops,  8 nodes in  7.15 sec ( 54.5 us/hop)    716 KB/sec
> size=    10, 131072 hops, 16 nodes in  7.39 sec ( 56.4 us/hop)    693 KB/sec
> size=    10, 131072 hops, 24 nodes in  7.59 sec ( 57.9 us/hop)    674 KB/sec
> size=    10, 131072 hops, 32 nodes in  8.06 sec ( 61.5 us/hop)    635 KB/sec
> (13% or so drop)
> 
> size=  1000, 16384 hops,  8 nodes in  1.93 sec (117.8 us/hop)  33163 KB/sec
> size=  1000, 16384 hops, 16 nodes in  1.96 sec (119.6 us/hop)  32652 KB/sec
> size=  1000, 16384 hops, 24 nodes in  1.98 sec (120.6 us/hop)  32400 KB/sec
> size=  1000, 16384 hops, 32 nodes in  2.20 sec (134.1 us/hop)  29129 KB/sec
> (13% or so drop)
> 
> size= 10000, 16384 hops,  8 nodes in  9.71 sec (592.5 us/hop)  65930 KB/sec
> size= 10000, 16384 hops, 16 nodes in  9.92 sec (605.2 us/hop)  64543 KB/sec
> size= 10000, 16384 hops, 24 nodes in 10.13 sec (618.4 us/hop)  63164 KB/sec
> size= 10000, 16384 hops, 32 nodes in 17.47 sec (1066.4 us/hop)  36629 KB/sec
> (80% or so drop)
> 
> size=100000, 16384 hops,  8 nodes in 100.00 sec (6103.5 us/hop)  64000 KB/sec
> size=100000, 16384 hops, 16 nodes in 104.72 sec (6391.3 us/hop)  61118 KB/sec
> size=100000, 16384 hops, 24 nodes in 103.68 sec (6328.0 us/hop)  61730 KB/sec
> size=100000, 16384 hops, 32 nodes in 134.14 sec (8187.3 us/hop)  47711 KB/sec
> (34% or so drop)
> 
> In all cases I'm seeing a substantial drop-off by the time I keep 32 ports
> busy; I suspect the drop-off at 48 would be even worse.
> 
> Does this seem like a reasonable way to benchmark switches?  Anyone
> have suggested improvements or better tools?  If people think this would
> be valuable I could clean up the source and provide a central location
> for storing benchmark results.
> 
> -- 
> Bill Broadley
> Computational Science and Engineering
> UC Davis
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
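
For anyone who wants to check or extend the quoted table, here is a rough sketch of how the latency and bandwidth columns can be re-derived from the wallclock time and hop count. It is not Bill's benchmark source. It assumes a 4-byte MPI_INT and KB = 1024 bytes, which is what the quoted numbers appear to be consistent with, and the function names are made up for illustration. The "% drop" figures in the post seem to track the increase in per-hop latency from 8 to 32 nodes rather than the decrease in bandwidth.

```python
# Assumptions (not stated in the quoted post): MPI_INT is 4 bytes and
# "KB" means 1024 bytes. Function names are hypothetical.
MPI_INT_BYTES = 4
KB = 1024


def per_hop_latency_us(wallclock_s, hops):
    """Average time per hop in microseconds."""
    return wallclock_s / hops * 1e6


def bandwidth_kb_per_s(size_ints, hops, wallclock_s):
    """Payload moved per second, in KB (1024 bytes)."""
    return size_ints * MPI_INT_BYTES * hops / wallclock_s / KB


def percent_increase(base, loaded):
    """Relative increase from the base value to the loaded value."""
    return (loaded - base) / base * 100.0


# Two rows copied from the quoted table:
# (size in MPI_INTs, hops, nodes, wallclock seconds)
rows = [
    (100000, 16384, 8, 100.00),
    (100000, 16384, 32, 134.14),
]

for size, hops, nodes, secs in rows:
    lat = per_hop_latency_us(secs, hops)
    bw = bandwidth_kb_per_s(size, hops, secs)
    print(f"size={size}, {nodes} nodes: {lat:.1f} us/hop, {bw:.0f} KB/sec")

# Latency change between the 8-node and 32-node runs of the same size.
lat_8 = per_hop_latency_us(rows[0][3], rows[0][1])
lat_32 = per_hop_latency_us(rows[1][3], rows[1][1])
print(f"latency increase 8 -> 32 nodes: {percent_increase(lat_8, lat_32):.0f}%")
```

The same three functions can be run over the other message sizes to compare switches on equal terms.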

=====
*----------------------------------------------------------*
Andrew Latham AKA: LATHAMA (lay-th-ham-eh) - LATHAMA.COM
LATHAMA at LATHAMA.COM - LATHAMA at YAHOO.COM
If yahoo.com is down we have bigger problems than my email!
*----------------------------------------------------------*


