Diagnostic tools


alvin at Maggie.Linux-Consulting.com
Mon Oct 21 04:38:46 PDT 2002


hi ya manel

On Mon, 21 Oct 2002, Manel Soria wrote:

> Hi,
> 
> We are looking for a diagnostic tool that (ideally) would
> allow us to determine what component/s of a node fail. It should
> test the processor, RAM, disk and network cards under heavy load
> but in repeatable conditions.

testing those items individually is a lot of work ...

the test process/procedure is more important than the actual test ??

- many different cpu/disk/memory/nic tests
	http://www.Linux-1U.net/Diags/
	( not quite finished yet... )

- many ways to tweak the system to maximize its performance
	http://www.Linux-1U.net/Tuning/
	( way-incomplete but .. maybe it's useful to ya ?? )
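
As a rough illustration of "heavy load but in repeatable conditions": the sketch below generates the same seeded data pattern on every run and compares checksums, so a mismatch between runs on one node points at flaky hardware rather than at the test. The function names, the default seed, and the buffer size are made up for this example, and it only exercises the CPU and a little RAM, so it is no substitute for a dedicated memory or disk tester.

```python
import hashlib
import random


def memory_pattern_checksum(seed, size_bytes=1 << 20, passes=3):
    """Build a seeded pseudo-random buffer several times and hash the result.

    The same seed always yields the same data, so the digest should be
    identical on every run of a healthy node.
    """
    rng = random.Random(seed)  # private generator, unaffected by other code
    digest = hashlib.sha256()
    for _ in range(passes):
        # One large draw is much faster than generating byte by byte.
        buf = rng.getrandbits(8 * size_bytes).to_bytes(size_bytes, "little")
        digest.update(buf)
    return digest.hexdigest()


def run_repeatable_test(seed=2002, runs=3, size_bytes=1 << 20):
    """Return True if every run produces the same checksum as the first."""
    reference = memory_pattern_checksum(seed, size_bytes)
    for _ in range(runs - 1):
        if memory_pattern_checksum(seed, size_bytes) != reference:
            return False  # same input, different output: suspect the hardware
    return True


if __name__ == "__main__":
    print("PASS" if run_repeatable_test() else "FAIL")
```

Recording the reference checksum from a known-good node would also let you compare new nodes against it, assuming they run the same Python version.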

> Other desirable features would be:
> -Run from a floppy, without OS in the disk, in order to allow
>  good quality control of the new nodes.

running from floppy is a wee bit tricky: you have to squeeze a kernel
into 1.44MB ( 1.77MB ) that can boot and get the node on the network
	- newer mb might be simpler/easier for network booting too
	( diskless network booting is easier than a floppy boot )

- use a 4MB compact flash ... and booting as a diskless node
  becomes trivial

> -Monitor the CPU temperature.

use i2c-2.6.5 and lm_sensors to read the health monitors on the
motherboard

also get a regular digital thermometer from your local hw store
for sanity checking
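
A minimal sketch of how the readings could be collected for logging, assuming the `sensors` command from lm_sensors is installed and prints lines like `temp1: +45.0°C`. The exact labels and layout depend on the chip and the lm_sensors version, so the regular expression, the function names, and the 70 C limit here are only illustrative.

```python
import re
import subprocess

# Matches lines such as "temp1:   +45.0°C  (high = +80.0°C)".
# Fan and voltage lines do not end their first value in "C", so they are skipped.
TEMP_LINE = re.compile(
    r"^(?P<label>[^:]+):\s*\+?(?P<value>-?\d+(?:\.\d+)?)\s*°?\s*C\b"
)


def parse_temperatures(text):
    """Return {label: degrees C} for every temperature line in the text."""
    temps = {}
    for line in text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            temps[match.group("label").strip()] = float(match.group("value"))
    return temps


def over_limit(temps, limit_c=70.0):
    """Return the sorted labels whose reading exceeds limit_c."""
    return sorted(label for label, value in temps.items() if value > limit_c)


def read_sensors():
    """Run the lm_sensors `sensors` command and return its text output."""
    result = subprocess.run(
        ["sensors"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    readings = parse_temperatures(read_sensors())
    print(readings, "too hot:", over_limit(readings))
```

Comparing one of these readings against the hardware-store thermometer now and then is a cheap check that the sensor is reporting something sane.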

have fun
alvin

> 
> We would appreciate suggestions and comments about this topic.
> 
> Thanks for your help.



