Scyld performance problems

Donald Becker becker at scyld.com
Mon Aug 5 10:48:03 PDT 2002


On Mon, 5 Aug 2002, Tom A Krewson wrote:

> I am having a performance problem with my Scyld Beowulf.

What Scyld version?

Scyld should have an MPI job start-up time that is 10-15X faster than a
reference generic MPICH using 'rsh', DNS, and local executables.  After
start-up, runtime should be the same as MPICH (or GM-MPI for Myrinet).

Most performance problems have been traced to Ethernet issues.
Specifically,
   your switch should support multicast without dropping packets
   your switch should be set to autonegotiation, not forced full duplex
     (forcing only one end typically leaves the other end at half duplex)
   you may need to update your driver set, especially for some 3c905 NICs.

> performance is even less at .09 Gflops. I have tried 3 different switches,
> and the cheapest (I assume a cut through switch) gives the best
> performance. 

This hints at a network problem.  What does 'cat /proc/net/dev' report
about errors?
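
To make the error check concrete, here is a minimal sketch (not a Scyld
tool) that parses the text of /proc/net/dev and lists the interfaces with
nonzero error counters.  It assumes the usual 16-column Linux layout (8
receive fields followed by 8 transmit fields); the function names and the
sample counters are made up for illustration.

```python
# Column order assumed from the /proc/net/dev header lines on Linux.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo",
             "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo",
             "colls", "carrier", "compressed"]

# Counters that point at link trouble when they are nonzero.
ERROR_KEYS = ["rx_errs", "rx_drop", "rx_fifo", "rx_frame",
              "tx_errs", "tx_drop", "tx_fifo", "tx_colls", "tx_carrier"]

# Illustrative sample only; these numbers are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:98765432  654321   12    0    0    12          0       345 87654321  543210    0    0    0   789       3          0
"""


def parse_net_dev(text):
    """Return {interface: {counter_name: value}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        # Split on the colon: some kernels print "eth0:1234" with no space.
        name, _, rest = line.partition(":")
        values = [int(v) for v in rest.split()]
        if len(values) < 16:
            continue                        # not the layout assumed here
        counters = {"rx_" + k: v for k, v in zip(RX_FIELDS, values[:8])}
        counters.update(
            {"tx_" + k: v for k, v in zip(TX_FIELDS, values[8:16])})
        stats[name.strip()] = counters
    return stats


def nonzero_errors(stats):
    """Keep only the interfaces and counters that show errors."""
    report = {}
    for name, counters in stats.items():
        bad = {k: counters[k] for k in ERROR_KEYS if counters[k] != 0}
        if bad:
            report[name] = bad
    return report
```

On a node you would pass in open("/proc/net/dev").read() instead of
SAMPLE.  Collisions are normal on a half-duplex link, but collisions or
frame errors on a link that is supposed to be full duplex usually point
to a duplex mismatch.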

Note that for smaller clusters, inexpensive unmanaged switches perform
better than expensive managed switches that try to interpret multicast
traffic.

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993



