[Beowulf] Approach For Diagnosing Heat Related Failure?
Dmitry Zaletnev dzaletnev at yandex.ru
Tue Jul 21 16:04:28 PDT 2009
- Previous message: [Beowulf] Approach For Diagnosing Heat Related Failure?
- Next message: [Beowulf] Resolved - Approach For Diagnosing Heat Related Failure?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Jon,

> I have a rack full of identical compute
> nodes. One of them has become heat sensitive.
>
> When it's in the warm computer room it crashes.
> I can't even run memtest from the CentOS DVD
> for 2 seconds. However, when this node is
> in my much cooler office everything works
> fine. All the other nodes are working fine
> in the computer room.

I had a similar problem when one of the plastic clips that hold the base ring of the CPU cooler broke, leaving the cooler mounted by only the remaining three clips. When I started to save a virtual machine that was compiling OpenFOAM from source, Ubuntu shut down because of overheating.

>
> I'm not convinced the problem is actually
> the memory. Other than opening the node
> to spray cooling liquid when it's in the warm
> room, what approach would you use to figure out which
> component(s) is(are) failing?
>
> Cordially,
> --
> Jon Forrest
> Research Computing Support
> College of Chemistry
> 173 Tan Hall
> University of California Berkeley
> Berkeley, CA
> 94720-1460
> 510-643-1032
> jlforrest at berkeley.edu
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

Sincerely,
Dmitry

Yandex.Mail. Look for spam somewhere else http://mail.yandex.ru/nospam/sign
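
A badly seated cooler like the one described above usually shows up as one sensor running much hotter than on the healthy nodes, so one non-invasive check is to log temperatures on the suspect node and on a known-good neighbour and compare them. The following is only a rough sketch, not something from the original thread. It assumes a Linux kernel that exposes thermal zones under /sys/class/thermal (many boards, and older distributions in particular, do not, in which case the `sensors` command from lm_sensors is the usual alternative). The function names `read_thermal_zones` and `hottest` are made up for illustration.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone_name:type": degrees_C} for each readable thermal zone.

    Assumption: each thermal_zone* directory holds a `temp` file in
    millidegrees Celsius and a `type` file naming the sensor, as in the
    Linux sysfs thermal interface. Unreadable or malformed zones are skipped.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
            label = (zone / "type").read_text().strip()
        except (OSError, ValueError):
            continue  # sensor missing, unreadable, or not a number
        readings[f"{zone.name}:{label}"] = millideg / 1000.0
    return readings


def hottest(readings):
    """Return the (name, degrees_C) pair with the highest reading, or None."""
    if not readings:
        return None
    return max(readings.items(), key=lambda item: item[1])
```

Calling `read_thermal_zones()` every few seconds while the node is under load, once in the warm room and once in the cool office, and doing the same on a healthy node, should show whether the CPU zone climbs far faster on the failing machine. If it does, the heatsink mounting is worth inspecting before suspecting the memory.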
