[Beowulf] Stress / torture test cluster hardware

John Hearns john.hearns at streamline-computing.com
Sun Oct 8 01:09:11 PDT 2006


Nico Mittenzwey wrote:
> Dear Beowulf mailing list members,
> 
> we are building a new Beowulf cluster at the moment. New hardware is
> arriving every day. Now we want to make certain that this hardware has
> no errors. Therefore we want to stress test them.
> Do you know of any papers, articles, proceedings, tools... concerning
> this topic beside the ones below?
All of the links you suggest look good.

Other things to consider for a stress test are:

Unpack a clean Linux kernel tree. Do a kernel compile. Tar up the 
resulting tree. Repeat, and compare the two resulting tar files.
A Linux kernel compile is a surprisingly good way of stressing a system.
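
The comparison step could be sketched roughly as below. This is only an
illustration, not a tested burn-in tool. It assumes the two builds were left
in separate directories, and it hashes each file in the trees rather than
comparing the tar files themselves, since that shows which files differ.
The names tree_digests and compare_trees are made up for this example.
Some build outputs may embed timestamps, so a few differing files are not
necessarily a hardware fault.

```python
import hashlib
import os


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large objects do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def compare_trees(first, second):
    """Return sorted relative paths that differ or exist in only one tree."""
    a = tree_digests(first)
    b = tree_digests(second)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling compare_trees("build1", "build2") (both directory names are
hypothetical) returns an empty list when every file matches. Any path it
does return is worth inspecting by hand.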

On a completed cluster, run HPL (High-Performance Linpack) on all nodes 
for an extended period and let the cluster heat up.


