[Beowulf] Seeing ECC errors since upgraded from Opteron 246 to 275
Greg Lindahl lindahl at pbm.com
Sat Aug 23 16:51:42 PDT 2008
On Wed, Aug 06, 2008 at 02:56:51PM -0500, Jason Clinton wrote:
> We have a tool on our website called "breakin" that is Linux 2.6.25.9
> patched with K8 and K10f Opteron EDAC reporting facilities. It can
> usually find and identify failed RAM in fifteen minutes (two hours at
> most). The EDAC patches to the kernel aren't that great about naming
> the correct memory rank, though.
>
> Make sure you have multibit (sometimes says 4-bit) ECC enabled in your BIOS.
>
> http://www.advancedclustering.com/software/breakin.html

I just gave this a try, and it seems to be a very nicely packaged
utility. Thanks for making it available. I've used some similar stuff
before, but this is really easy.

-- greg
