Diagnostic tools
Donald Becker becker at scyld.com
Mon Oct 21 08:42:53 PDT 2002
On Mon, 21 Oct 2002 alvin at Maggie.Linux-Consulting.com wrote:

> On Mon, 21 Oct 2002, Manel Soria wrote:
>
> > We are looking for a diagnostic tool that (ideally) would
> > allow us to determine what component/s of a node fail. It should
> > test the processor, RAM, disk and network cards under heavy load
> > but in repeatable conditions.
>
> testing those items individually is a lot of work ...
>
> test process/procedure is more important than the actual test ??
>
> - many different cpu/disk/memory/nic tests
>   http://www.Linux-1U.net/Diags/

The only Linux hardware tests you list are a CPU test (cpuburn) and many
entries for memtest86. You missed several Linux "SMART"-based disk
diagnostics tools and the NIC diagnostics at
   http://www.scyld.com/diag/index.html

> > -Monitor the CPU temperature.
>
> use i2c-2.6.5 and lm_sensors to read the health monitors on the
> motherboard
>
> also get a regular digital thermometer from your local hw store
> for sanity checking

Good advice, since lm_sensors can only guess what type of thermal sensor
is on the motherboard. When the guessed calibration is off, it is
usually way off, but you cannot count on that.

-- 
Donald Becker                           becker at scyld.com
Scyld Computing Corporation             http://www.scyld.com
410 Severn Ave. Suite 210               Scyld Beowulf cluster system
Annapolis MD 21403                      410-990-9993
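
The thermometer sanity check suggested above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the thread: it reads the sysfs hwmon files (/sys/class/hwmon/hwmon*/temp*_input, in millidegrees Celsius) that current kernels expose, not the i2c-2.6.5 interface mentioned in the message; the function names are hypothetical; and the 5 degree tolerance is an arbitrary placeholder, not a recommended value.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {sensor name: degrees C} for each temp*_input file under root.

    Assumes the sysfs hwmon layout, where each file holds an integer
    number of millidegrees Celsius. Unreadable files are skipped.
    """
    temps = {}
    for path in sorted(Path(root).glob("hwmon*/temp*_input")):
        try:
            millideg = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # sensor missing or returned garbage; ignore it
        temps[f"{path.parent.name}/{path.name}"] = millideg / 1000.0
    return temps


def looks_miscalibrated(reported_c, reference_c, tolerance_c=5.0):
    """True if the driver's reading differs from the external thermometer
    reading by more than tolerance_c degrees (hypothetical threshold)."""
    return abs(reported_c - reference_c) > tolerance_c


if __name__ == "__main__":
    # Print whatever the kernel reports; compare by hand, or with
    # looks_miscalibrated(), against a thermometer reading taken nearby.
    for name, temp_c in read_hwmon_temps().items():
        print(f"{name}: {temp_c:.1f} C")
```

A small disagreement passes this check, so it only catches the "way off" case described above; as noted, a guessed calibration that is slightly off would still need a closer comparison.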
