intermittent crashing of programs
Kris Thielemans kris.thielemans at csc.mrc.ac.uk
Thu Feb 21 06:52:41 PST 2002
- Previous message: [Fwd: CLUSTER 2002 Call For Papers (due April 12)]
- Next message: intermittent crashing of programs
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
(2nd resubmit after subscribing with a different email address...)

Hi,

we have a cluster of 4 dual Pentium III 600 MHz systems, running SuSE Linux 7.1. On one of the PCs, our programs occasionally crash with a segmentation fault. This also happens with an ordinary serial program with all its IO to local disks. (It does use NIS to get user info though, so I cannot easily test it without network). The crash NEVER occurs on any of the other systems.

At the time of the crash, I get the following message in /var/log/messages

-----------------------------------------------------------------
Feb 21 14:22:58 pp4 kernel: Uhhuh. NMI received. Dazed and confused, but trying to continue
Feb 21 14:22:58 pp4 kernel: You probably have a hardware problem with your RAM chips
-----------------------------------------------------------------

So, we ran memtest86-2.5 for 4 days continuously. No error was reported.

Any suggestions on how we figure out what the problem is (aside from replacing all memory chips)? Is it necessarily RAM, or could it be e.g. the hard disk controller or so?

Thanks,

Kris Thielemans
(kris.thielemans <at> ic.ac.uk)
Imaging Research Solutions Ltd
Cyclotron Building
Hammersmith Hospital
Du Cane Road
London W12 ONN, United Kingdom

web site address: http://www.irsl.org/~kris
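
To see how often the NMI message quoted above shows up, and whether it lines up with the times of the crashes, the log can be scanned with a few lines of Python. This is only a rough sketch: it assumes the classic syslog layout shown in the excerpt (month, day, time, host, then "kernel:"), and the function names find_nmi_events and count_by_host are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt in the message:
#   Feb 21 14:22:58 pp4 kernel: Uhhuh. NMI received. Dazed and confused, ...
NMI_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+.*NMI received"
)


def find_nmi_events(lines):
    """Return (timestamp, host) pairs for every NMI report in the given lines."""
    events = []
    for line in lines:
        match = NMI_LINE.match(line)
        if match:
            events.append((match.group("stamp"), match.group("host")))
    return events


def count_by_host(events):
    """Count NMI reports per host name (hypothetical helper)."""
    return Counter(host for _, host in events)
```

It could be used by passing an open log file, for example `find_nmi_events(open("/var/log/messages", errors="replace"))`, and comparing the returned timestamps with the times at which the programs crashed. It only reports what the kernel logged; it does not say which component raised the NMI.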
More information about the Beowulf mailing list
