[Beowulf] no lm_sensors, slow system, was: Remote console management
David Mathog mathog at mendel.bio.caltech.edu
Mon Sep 26 08:15:09 PDT 2005
- Previous message: [Beowulf] Cluster Managers Mailing List?
- Next message: [Beowulf] dual-core benefits?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Stuart Midgley <sdm900 at gmail.com> wrote:
>
> Unfortunately, lm_sensors does not work with our motherboards,
<SNIP>
> We pull them out of the
> cluster and run hardware diagnostics and discover that a fan or
> something has died and that the cpu is running hot... and has
> consequently slowed down... resulting in longer run times for user
> jobs...

I just spent a day trying to figure out why upgrading some ASUS A7V266E based workstations from Mandrake 10.0 to 10.2 (aka 2005LE) caused them to run 5X slower. It turned out that:

A. lm_sensors had a change between 2.6.x kernel versions that eliminated the need for a /2 in its config file, resulting in a CPU temp reading of 105C.

B. The /2 actually takes place inside the monitor chip, so the monitor chip "thinks" that the system is at 105C.

C. The BIOS had an option for controlling overheating detected by the monitor chip that could be set to either "throttle" or "shutdown", and it was set for the former.

D. When the CPU was throttled, /proc still reported the CPU MHz at full speed, even though throttling effectively reduced the MHz by 5x.

Which is a long-winded way of saying that you should check your BIOS and see if you have the equivalent of "throttle" set. If so, for cluster work you'd be better served by "shutdown". It's a lot less mysterious when a node just shuts down (indicating right up front that a hardware failure is present) than when things start running really, really slowly for no apparent reason.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
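Since point D means the MHz figure in /proc can't be trusted to reveal throttling, one rough way to spot an affected node is to time a fixed chunk of CPU work and compare it with a time recorded on a known-good node. The following is only a sketch under that assumption: the function names, the 2.0 threshold, and the baseline value are all hypothetical, and the busy loop is a crude stand-in for a real benchmark.

```python
import time


def parse_cpu_mhz(cpuinfo_text):
    """Return every 'cpu MHz' value found in /proc/cpuinfo-style text."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "cpu mhz":
            values.append(float(value))
    return values


def time_busy_loop(iterations=2_000_000):
    """Time a fixed amount of integer work and return elapsed seconds."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start


def looks_throttled(measured_seconds, baseline_seconds, threshold=2.0):
    """True if the node ran at least `threshold` times slower than baseline.

    The threshold is an arbitrary illustrative choice, not a tuned value.
    """
    return measured_seconds / baseline_seconds >= threshold


if __name__ == "__main__":
    # Hypothetical: set this to the loop time measured on a healthy node.
    BASELINE_SECONDS = None

    try:
        with open("/proc/cpuinfo") as f:
            print("reported cpu MHz:", parse_cpu_mhz(f.read()))
    except OSError:
        print("could not read /proc/cpuinfo")

    elapsed = time_busy_loop()
    print("busy loop took %.3f s" % elapsed)
    if BASELINE_SECONDS is not None:
        print("throttled?", looks_throttled(elapsed, BASELINE_SECONDS))
```

Run once on a healthy node to get a baseline, then on the suspect node. A node that reports full MHz but takes several times longer on the loop is a candidate for the kind of throttling described above.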
