<br>I heard that the major source of memory corruption in servers is the memory bus.<br>And this becomes worse as you add memory sticks.<br>With 8 memory sticks that have 8 chips on each side, you have 128 chips.<br>So the main purpose of ECC is correcting bus errors.
<br><br><br><br><div><span class="gmail_quote">2007/11/26, David Mathog &lt;<a href="mailto:email@example.com">firstname.lastname@example.org</a>&gt;:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I ran a little test over the Thanksgiving holiday to see how common<br>random errors in non-ECC memory are. I used the memtest86+ bit fade test<br>mode, which writes all 1s, waits 90 minutes, checks the result, then<br>does the same thing for all 0s. This was the best test I could
<br>find for detecting the occasional gamma-ray-type data loss event. The<br>result: no errors logged in 5 solid days of testing. So this class of<br>error (the type ECC would detect and probably fix) apparently occurs<br>
on these machines at a rate of less than 1 per 840 Gigabyte-hours.<br>Possibly the effective exposure is only half that (420 Gigabyte-hours) if data can only be lost<br>on a 1 -> 0 transition, or vice versa. This assumes the bit fade test<br>works, which cannot be independently verified from these results.
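<br><br>The arithmetic behind the 840 Gigabyte-hour figure can be sketched as below. This is only an illustration, not part of the original test: the 7 GB total is not stated in the post and is inferred here from 840 Gigabyte-hours spread over 5 days, and the name <code>exposure_gb_hours</code> is hypothetical.<br>

```python
def exposure_gb_hours(memory_gb: float, hours: float) -> float:
    """Exposure = memory under test (GB) multiplied by test duration (hours)."""
    return memory_gb * hours


DAYS_TESTED = 5            # stated in the post
REPORTED_EXPOSURE = 840.0  # Gigabyte-hours, stated in the post

hours_tested = DAYS_TESTED * 24                       # 120 hours
# Assumption: total memory is inferred from the two figures above,
# it is not given explicitly in the post.
implied_memory_gb = REPORTED_EXPOSURE / hours_tested  # 7.0 GB

full_exposure = exposure_gb_hours(implied_memory_gb, hours_tested)
# If a bit can only be lost in one direction, only one of the two
# test phases (all 1s or all 0s) can detect it, so the exposure halves.
half_exposure = full_exposure / 2

print(f"hours tested:        {hours_tested}")
print(f"implied memory (GB): {implied_memory_gb}")
print(f"full exposure:       {full_exposure} GB-hours")
print(f"one-direction case:  {half_exposure} GB-hours")
```

With zero errors observed, these exposures only bound the error rate from above; they do not estimate it.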
<br><br>On the web there are references to an IBM study which found 1 bit<br>error/256 MB/month, which would have been (0.25 * 30 * 24) =<br>1 per 180 Gigabyte-hours. If IBM's numbers held for my hardware<br>I should have seen 4 or 5 errors in total. My machines are in a basement
<br>in a concrete building; perhaps that provided some shielding relative to<br>what IBM used for their test conditions.<br><br>The memory was Corsair Twinx1024-3200C2. When first installed, all<br>of this memory had run for 24 hours with no errors in normal
<br>memtest86+ testing.<br><br>Regards,<br><br>David Mathog<br><a href="mailto:email@example.com">firstname.lastname@example.org</a><br>Manager, Sequence Analysis Facility, Biology Division, Caltech<br>_______________________________________________
<br>Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a><br>To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf">http://www.beowulf.org/mailman/listinfo/beowulf