[Beowulf] posting bonnie++ stats from our cluster: any comments about my I/O performance stats?

Brian McNally bmcnally at u.washington.edu
Fri Sep 25 17:02:56 PDT 2009


> One note about bonding/trunking: check it closely to see that it is 
> working the way you expect. We have a cluster with 14 racks of 20 nodes 
> each, with a 24-port switch at the top of each rack. Each of these 
> switches has four ports trunked together back to the core switch. All 
> nodes have two GbE ports but only eth0 was being used. It turns out 
> that all eth0 MAC addresses in this cluster are even. The hashing 
> algorithm on these switches (HP) only uses the last two bits of the 
> MAC address, for a total of four paths. Since all MACs were even it 
> went from four choices to two, so we were only getting half the 
> bandwidth.

I'd second testing to make sure bonding/trunking is working before you 
base other performance numbers on it. You may also want to consider 
different bonding modes if you have trouble balancing outbound traffic. 
See:

/usr/share/doc/kernel-doc-<ver>/Documentation/networking/bonding.txt
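
As one concrete example, the bonding mode and transmit hash policy can 
be set through module options (a sketch for older modprobe.conf-style 
setups; 802.3ad mode requires LACP configured on the switch side, and 
whether layer3+4 hashing actually helps depends on the switch's own 
hash, as the story above shows):

```
# /etc/modprobe.conf (sketch): LACP (802.3ad) bonding with a layer3+4
# transmit hash, so outbound flows are spread by IP address and port
# rather than by MAC address alone.
alias bond0 bonding
options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4
```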

Just getting bonding working in an optimal way can take some time. Use 
the port counters on your switches in conjunction with counters on your 
hosts to make sure traffic is going where you'd expect it to.
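
On the host side, the per-slave byte counters in sysfs are a quick 
sanity check (a sketch, assuming Linux; SLAVES defaults to lo so it 
runs as-is, and eth0/eth1 are placeholder names). Run it before and 
after a transfer; if one slave's tx_bytes barely moves, the hash isn't 
spreading your flows:

```shell
#!/bin/sh
# Sketch: print per-interface byte counters from Linux sysfs.
# Substitute your bond's slaves, e.g. SLAVES="eth0 eth1" (placeholders).
SLAVES="${SLAVES:-lo}"
for iface in $SLAVES; do
    rx=$(cat "/sys/class/net/$iface/statistics/rx_bytes")
    tx=$(cat "/sys/class/net/$iface/statistics/tx_bytes")
    echo "$iface rx=$rx tx=$tx"
done
```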

> Once the server has the performance you want, I'd use netcat from a 
> number of clients at the same time to see if your network is doing what 
> you want. Use netcat and bypass any disks (writing to /dev/null on the 
> server and reading from /dev/zero on the client, and vice versa) in 
> order to test that bonding is working. You should be able to fill up the 
> network pipes with aggregate tests from multiple nodes using netcat.
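
Concretely, the disk-bypass test described above might look like this 
sketch (flag spellings differ between netcat variants, and "fileserver" 
and port 9999 are placeholders):

```shell
# On the server: discard everything received on TCP port 9999.
# (Traditional netcat syntax; BSD netcat drops the -p: "nc -l 9999".)
nc -l -p 9999 > /dev/null &

# On each client: stream 1 GiB of zeros at the server; dd reports the
# achieved throughput when it finishes. Run this from several nodes at
# once to see whether the aggregate fills the trunked links.
dd if=/dev/zero bs=1M count=1024 | nc fileserver 9999
```

Reversing the roles (server sends from /dev/zero, clients write to 
/dev/null) tests the other direction.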

You may also consider using iperf for network testing. I used to do raw 
network tests like this but discovered that iperf is often easier to set 
up and use.
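
A minimal iperf run for the same kind of aggregate test might look like 
this (classic iperf 2 syntax; "fileserver" is a placeholder):

```shell
# On the server: run iperf in server mode (listens on TCP port 5001).
iperf -s

# On each client, started simultaneously from several nodes:
# -t 30 runs a 30-second test, -P 4 opens four parallel TCP streams.
iperf -c fileserver -t 30 -P 4
```

Summing the per-client results tells you whether the trunk is actually 
delivering its full aggregate bandwidth.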

--
Brian McNally


