Maximum room temperature

Josip Loncaric josip at icase.edu
Tue Apr 23 09:33:05 PDT 2002


Manel Soria wrote:
> 
> I'm wondering what is the maximum reasonable ambient
> temperature to have in a cluster room. In our room
> with 72 nodes we have about 29-30 oC (84-86 oF).
> Is this too high ? Can this be the cause of hardware
> failures ?

Yes, it can.  We start to lose hardware (disks, etc.) whenever the
room temperature climbs to 85 deg. F (about 30 deg. C).  Our computer
room AC is set to maintain about 70 deg. F (21 deg. C), and we turn on
spare AC units if the room reaches 75 deg. F (about 24 deg. C).  By
80 deg. F (27 deg. C), we start shutting down machines.
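
The escalation steps above can be written down as a small lookup. This
is only a sketch: the thresholds are the room temperatures quoted in
this message, while the function names (f_to_c, room_action) and the
returned strings are made up for illustration and are not part of any
tool used at ICASE.

```python
# Room-temperature thresholds in deg. F, taken from the message above.
SPARE_AC_F = 75.0   # turn on spare AC units
SHUTDOWN_F = 80.0   # start shutting down machines


def f_to_c(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def room_action(deg_f):
    """Return the (hypothetical) action label for a room temperature in deg. F."""
    if deg_f >= SHUTDOWN_F:
        return "shut down machines"
    if deg_f >= SPARE_AC_F:
        return "turn on spare AC"
    return "normal"


if __name__ == "__main__":
    for reading in (70.0, 75.0, 80.0):
        print("%.0f F (%.1f C): %s" % (reading, f_to_c(reading), room_action(reading)))
```

The checks are ordered from the most severe threshold down, so a
reading that crosses both limits maps to the stronger action.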

BTW, hardware temperature monitoring measures temperatures inside the
boxes, which are higher than the room temperature.  CPU temperatures
vary a lot and can easily reach 55 deg. C under load; motherboard
temperatures are more stable (typically about 29-30 deg. C).  We also
wrote some scripts that run periodically and can e-mail root or even
trigger an automatic cluster shutdown when the average motherboard
temperature exceeds reasonable limits (e.g. 35-40 deg. C).
Unfortunately, dual CPU machines do not power off (Red Hat's Linux
kernel 2.4.9-31smp considers "poweroff" unsafe on SMP machines), but
at least they produce less heat when halted.
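
A check of the kind described above might look roughly like the
following.  This is not the actual ICASE script, only a sketch of the
decision step.  It assumes the motherboard readings have already been
collected into a dictionary of node name to temperature in deg. C, and
it assumes 35 deg. C as the e-mail limit and 40 deg. C as the shutdown
limit, which is one possible reading of the "35-40 deg. C" range.
Reading the sensors, sending the mail and halting the nodes are left
out, and all names here are hypothetical.

```python
from statistics import mean

# Assumed limits on the average motherboard temperature, in deg. C.
WARN_C = 35.0      # at or above this, e-mail root
SHUTDOWN_C = 40.0  # at or above this, trigger cluster shutdown


def check_cluster(board_temps_c):
    """Return (average, action) for a dict of node name -> motherboard temp.

    The action is "ok", "mail" or "shutdown"; the caller decides how to
    carry it out.
    """
    if not board_temps_c:
        raise ValueError("no temperature readings supplied")
    avg = mean(board_temps_c.values())
    if avg >= SHUTDOWN_C:
        return avg, "shutdown"
    if avg >= WARN_C:
        return avg, "mail"
    return avg, "ok"


if __name__ == "__main__":
    # Made-up readings for three nodes, for illustration only.
    readings = {"node01": 29.0, "node02": 30.0, "node03": 31.0}
    average, action = check_cluster(readings)
    print("average %.1f deg. C -> %s" % (average, action))
```

Run from cron every few minutes, a check like this acts on the
cluster-wide average, so one hot node would not shut everything down
by itself.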

Sincerely,
Josip


-- 
Dr. Josip Loncaric, Research Fellow               mailto:josip at icase.edu
ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134


