[Beowulf] Not quite Walmart, or, living without ECC?

David Mathog mathog at caltech.edu
Tue Nov 27 09:54:13 PST 2007


Tony Travis wrote:
> Memtest86+ is fine for 'burn-in' tests, but it does not do a realistic 
> memory stress test under the conditions that normal applications run. 

Wow, deja vu.  I just remembered that we had almost exactly this same
discussion 2 years ago.  In fact, I apparently sent you my hacked-up
version of memtester, which adds delays between the write and read
cycles so that it can catch bit fade (due to radiation or whatever).
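
For anyone who has not seen the idea, here is a minimal sketch of a
write / wait / read-back test.  This is not the patched memtester code;
the function name fade_test, its parameters, and the use of sleep() for
the delay are all assumptions made just for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of memtester): fill buf with a pattern,
 * sit idle for delay_s seconds, then count the words that no longer
 * hold the pattern.  Returns the number of mismatching words.
 */
size_t fade_test(volatile uint64_t *buf, size_t nwords,
                 uint64_t pattern, unsigned delay_s)
{
    size_t i, bad = 0;

    assert(buf != NULL);

    /* Write pass: store the pattern in every word. */
    for (i = 0; i < nwords; i++)
        buf[i] = pattern;

    /* Idle period between write and read, so a slowly fading bit
     * has time to flip before it is checked. */
    sleep(delay_s);

    /* Read-back pass: count words that changed while idle. */
    for (i = 0; i < nwords; i++)
        if (buf[i] != pattern)
            bad++;

    return bad;
}
```

A caller would allocate a large buffer and call something like
fade_test(buf, nwords, 0xAAAAAAAAAAAAAAAAULL, 60).  Note that a small
buffer will probably be read back from CPU cache rather than from DRAM,
so a real test needs a region much larger than the caches, ideally
locked in RAM (memtester uses mlock for that).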

One thing I still don't get, though: if memtester is catching memory
errors which only appear when _other parts of the system are active_,
does replacing the "bad" memory actually cure these problems?  That is,
if memtest86+ runs cleanly and memtester finds problems, is it really
the memory which is the issue?

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


