[Beowulf] Not quite Walmart, or, living without ECC?
ajt at rri.sari.ac.uk
Tue Nov 27 13:56:34 PST 2007
David Mathog wrote:
> Tony Travis wrote:
>> Memtest86+ is fine for 'burn-in' tests, but it does not do a realistic
>> memory stress test under the conditions that normal applications run.
> Wow, deja vu. I just remembered we had almost exactly this same
> discussion 2 years ago, in fact I apparently sent you my hacked up
> version of memtester which has delays in it between the write and read
> cycles, to allow it to catch bit fade (due to radiation or whatever).
Yes, I remember ;-)
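For anyone who has not seen the idea, the write/delay/read cycle described above can be sketched roughly as follows. This is only an illustration in Python, not David's patch and not memtester's actual code; the function names, patterns and default sizes are all made up for the example.

```python
import time


def delayed_pattern_test(size_bytes, pattern, delay_seconds):
    """Fill a buffer with one byte pattern, wait, then list mismatching offsets.

    Hypothetical helper for illustration; not part of memtester.
    """
    if not 0 <= pattern <= 0xFF:
        raise ValueError("pattern must be a single byte value")

    # Write phase: every byte in the buffer gets the same pattern.
    buf = bytearray([pattern]) * size_bytes

    # Idle gap between write and read, so a bit that fades has time to do so.
    time.sleep(delay_seconds)

    # Read phase: the fast comparison covers the common (clean) case.
    expected = bytes([pattern]) * size_bytes
    if buf == expected:
        return []

    # Only walk the buffer byte by byte when something actually differs.
    return [i for i, b in enumerate(buf) if b != pattern]


def run_patterns(size_bytes=1 << 20, delay_seconds=0.0,
                 patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Run the delayed test once per pattern; map pattern -> bad offsets."""
    return {p: delayed_pattern_test(size_bytes, p, delay_seconds)
            for p in patterns}


if __name__ == "__main__":
    # Assumed values: 1 MiB buffer and a 2 second gap per pattern.
    for pattern, bad in run_patterns(delay_seconds=2.0).items():
        print(f"pattern 0x{pattern:02X}: {len(bad)} mismatching bytes")
```

A user-space sketch like this only shows the shape of the test. The buffer may well be served from CPU cache rather than DRAM, and nothing pins it to physical memory, which is why a real tool such as memtester tries to lock its pages and works on a much larger region.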
> One thing I still don't get though, if memtester is catching memory
> errors which only appear when _other parts of the system are active_
> does replacing the "bad" memory actually cure these problems? That is,
> if memtest86+ runs cleanly and memtester finds problems, is it really
> the memory which is the issue?
Yes, replacing the faulty memory does fix the problem in the majority of
cases. However, I've had to replace a couple of faulty CPUs. I do think
memtester is a much more realistic stress test, but you can't use it to
test memory exhaustively like you can with memtest86+, so you still need
to do both tests. I also run memtester randomly as a confidence building
Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk
Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687