[Beowulf] which 24 port unmanaged GigE switch?

Michael Di Domenico mdidomenico4 at gmail.com
Mon Apr 5 13:27:32 PDT 2010


A couple of small 10-node clusters we set up used to routinely drop
off the network, and the switch would have to be hard reset before it
would return.  Granted, we didn't do any deep analysis (we just
replaced them with Cisco), and it could be attributed to some bad
switches, but I've also seen the same thing at home with some 1 GbE
switches I bought.

Over the years I've used Netgear enterprise and home products.  They
are wonderful under light use, up to about 80-85% of maximum
throughput, but once you hit the 90%+ range they seem to start to
degrade, either through packet loss or overheating.

We still buy them for our management network; they're cheaper than HP,
and we only need them for kickstarts, SNMP, etc.

As Joe said, it's just our opinion; your mileage may vary.

On Mon, Apr 5, 2010 at 3:40 PM, David Mathog <mathog at caltech.edu> wrote:
> Michael Di Domenico
>> I would have to agree.  I have Netgears in my lab now and for light
>> use they seem to be okay, but once you run a communications-heavy MPI
>> job over them they seem to fall down
>
> Please define "fall down".
>
> One test I have applied to a switch (only 100baseT) to see if it could
> handle "full traffic" was running the script below on all nodes:
>
> #!/bin/bash
> # Find the next node in the chain from the topology information;
> # "none" means this node is the end of the chain and only receives.
> TINFO=`topology_info`
> NEXT=`echo $TINFO | extract -mt -cols [3]`
> if [ "$NEXT" != "none" ]
> then
>   # Time the transfer of ~4 GB of zeroes to the next node and
>   # record the elapsed time for this hop in a per-host file.
>   TIME=`accudate -t0`
>   dd if=/dev/zero bs=4096 count=1000000 | rsh "$NEXT" 'cat - >/dev/null'
>   accudate -ds $TIME >/tmp/elapsed_${HOSTNAME}.txt
> fi
>
> Here topology_info defines a linear chain through all the nodes, and
> what ends up in the elapsed_HOSTNAME.txt files is the transmission time
> from each node to the next.  extract and accudate are my own tools; the
> former is like "cut" and the latter is just used here to calculate an
> elapsed time.
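>
> If you would rather not use my binaries, a rough equivalent can be put
> together from standard tools.  This is just an untested sketch: it
> assumes the nodes are named node01, node02, ... consecutively (adjust
> for your own naming scheme), uses ssh instead of rsh, and uses date for
> the timing instead of accudate:
>
> #!/bin/bash
> # Untested sketch of the same test using only standard tools.
> # Assumes hostnames of the form nodeNN, numbered consecutively;
> # set LASTNODE to the highest node number in your cluster.
> LASTNODE=20
> NUM=$(hostname -s | sed 's/^node//')
> if [ "$NUM" -lt "$LASTNODE" ]
> then
>     NEXT=$(printf "node%02d" $((10#$NUM + 1)))
>     START=$(date +%s)
>     dd if=/dev/zero bs=4096 count=1000000 | ssh "$NEXT" 'cat - >/dev/null'
>     END=$(date +%s)
>     # whole seconds only, but plenty of resolution for a ~350 s run
>     echo $((END - START)) >/tmp/elapsed_$(hostname -s).txt
> fi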
>
> This is slightly apples and oranges because in the two node (reference)
> test the target node is only accepting packets, whereas when they are
> all running it is also sending packets, and those compete with the ack's
> going back to the first node.  The D-Link switch held up quite well, I
> thought.  One pair of nodes tested this way completed in 350 seconds
> (+/-), whereas it and the others took 370-380 seconds when they were all
> running at once (20 compute nodes, first only sends, last only
> receives).  That is, 11.7 MB/sec for the pair, 10.8 MB/sec for all
> pairs.  For GigE it should come out at 117 and 108 (or so), if the
> switch can keep up.
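>
> (For reference, the arithmetic behind those figures: dd moves 4096
> bytes x 1,000,000 blocks, roughly 4096 MB.  4096 MB / 350 s is about
> 11.7 MB/sec and 4096 MB / 380 s is about 10.8 MB/sec.  GigE should run
> roughly ten times faster than 100baseT, hence the 117 and 108 estimates.)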
>
> I'm curious what the Netgears and HP do in a test like this.  If anybody
> would like to try this, all the pieces for this simple test (if you can
> run binaries for a 32-bit x86 environment) are here:
>
>  http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/testswitch.tar.gz
>
> (For other platforms obtain source for accudate and extract from here
>
> http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/drm_tools.tar.gz
> )
>
> Start the jobs simultaneously on all nodes using whichever queue system
> you have installed.  Be sure to run it once first with a small count to
> force anything coming over NFS into cache before doing the big test.
> (Or one could run netpipe on each pair of nodes, or anything else that
> loads the network.)
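>
> Lacking a queue system, something along these lines would start the
> test everywhere at roughly the same time.  Again, just a sketch: it
> assumes passwordless ssh, nodes named node01..node20, and that the test
> script is installed as /usr/local/bin/testswitch.sh on every node
> (adjust names and paths for your site):
>
> #!/bin/bash
> # Launch the per-node test on all nodes in parallel over ssh and
> # wait for every one of them to finish.
> for n in $(seq -f "node%02g" 1 20)
> do
>     ssh "$n" /usr/local/bin/testswitch.sh &
> done
> wait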
>
> Regards,
>
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
>



