[Beowulf] Tracing down 250ms open/chdir calls

David Mathog mathog at caltech.edu
Tue Feb 17 10:20:05 PST 2009


Carsten Aulbert <carsten.aulbert at aei.mpg.de> wrote:

These are all a bit unlikely to be the source of your problem, but they
are worth taking a few seconds to check anyway, since when the results
are not as expected, system performance generally ends up in the dumper.

1.  cat /proc/cpuinfo
Some of my systems will fall back to a slower CPU speed if they crash
under certain circumstances.  This makes them a lot slower, but the only
indication other than speed is that the MHz number in cpuinfo changes. 
To recover from this on these systems one must go into the BIOS and
manually reset the CPU speed.

2.  Check CPU power management and verify that, if it is on, it isn't
locked into the lowest power state.

3.  ethtool eth0  #or eth1, as appropriate for your system
Verify that the NIC parameters are all as expected.

4.  ifconfig eth0
Look for nonzero counts of errors, dropped, overruns, etc.
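
The four checks above could be bundled into something like the following
rough sketch. It is not a tested tool: the function names cpu_mhz and
run_checks, the IFACE variable, and the cpufreq sysfs paths are assumptions
made for illustration, and the "cpu MHz" field is only present in
/proc/cpuinfo on some architectures (x86, for one).

```sh
#!/bin/sh
# Sketch only: interface name and sysfs paths are assumptions, adjust as needed.
IFACE="${IFACE:-eth0}"

# Print the distinct "cpu MHz" values from a cpuinfo-style file
# (defaults to /proc/cpuinfo). A lower than expected number suggests
# the CPU has fallen back to a slower speed.
cpu_mhz() {
    awk -F': *' '/^cpu MHz/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}

# Run all four checks and print the raw results for inspection.
run_checks() {
    echo "== 1. CPU MHz =="
    cpu_mhz

    echo "== 2. CPU power management (cpu0 only, if cpufreq is present) =="
    for f in /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor \
             /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        fi
    done

    echo "== 3. NIC parameters =="
    ethtool "$IFACE"

    echo "== 4. NIC error counters =="
    ifconfig "$IFACE" | grep -E -i 'errors|dropped|overruns'
}
```

Save it to a file, source it, and call run_checks (as root if ethtool
complains), for example "IFACE=eth1 run_checks" for a second interface.
Nothing here decides what is "expected"; it only collects the numbers in
one place so they can be compared against a node that behaves well.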


Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


