[Beowulf] ECC exerciser/exorciser?
Greg Lindahl lindahl at pbm.com
Mon Jan 26 13:53:10 PST 2009
On Mon, Jan 26, 2009 at 10:30:50AM -0500, Mark Hahn wrote:

> - first, how would you go about setting a threshold for how high is an
> acceptable CE count? we by default are using the mce module, which by
> default polls at 1Hz. my thinking is that if we get overflow events
> (the multiple error bit is set), then it's too fast.

The number should be about zero of these events, if you're near sea level. Almost all of my 100s of 32 gbyte systems show no MCEs. At significant altitude (5000+ feet), I don't know the current number for this generation of memory, but it's probably << 1/week/system.

I'm curious about the comments that indicate that the "burnin" CD's HPL isn't as good as running HPL yourself. Very odd.

And if you're going to use stream or other programs for testing, do keep in mind that loading down all the cores seems to be very important for causing problems.

-- greg
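The rule of thumb in this exchange (overflow events mean errors are arriving too fast, zero corrected errors is the normal case, and anything approaching 1/week/system is suspect) can be written down as a small triage function. This is only a sketch under stated assumptions: `NodeSample`, its field names, the "ok"/"watch"/"bad" labels, and the `max_per_week` default are all hypothetical and not from the thread, and the counts are assumed to have been collected already from whatever the site uses to log MCEs.

```python
from dataclasses import dataclass

SECONDS_PER_WEEK = 7 * 24 * 3600


@dataclass
class NodeSample:
    """Hypothetical per-node summary; field names are illustrative only."""
    hostname: str
    corrected_errors: int   # corrected-error (CE) events seen in the window
    overflow_seen: bool     # True if any event had the multiple-error bit set
    window_seconds: float   # length of the observation window


def ce_per_week(sample: NodeSample) -> float:
    """Scale the observed CE count to a per-week rate for this node."""
    if sample.window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return sample.corrected_errors * SECONDS_PER_WEEK / sample.window_seconds


def classify(sample: NodeSample, max_per_week: float = 1.0) -> str:
    """Return "ok", "watch", or "bad" (labels and threshold are assumptions)."""
    # Overflow means errors arrived faster than the poller could record them.
    if sample.overflow_seen:
        return "bad"
    # Zero corrected errors is the expected state near sea level.
    if sample.corrected_errors == 0:
        return "ok"
    # A nonzero count well under the assumed weekly limit is worth watching.
    if ce_per_week(sample) < max_per_week:
        return "watch"
    return "bad"


if __name__ == "__main__":
    week = SECONDS_PER_WEEK
    for s in (
        NodeSample("n001", 0, False, week),
        NodeSample("n002", 1, False, 4 * week),
        NodeSample("n003", 5, False, week),
        NodeSample("n004", 1, True, week),
    ):
        print(s.hostname, classify(s))
```

The default of 1.0 per week is a loose upper bound; since the expected rate is "<< 1/week/system" even at altitude, a site might reasonably set `max_per_week` much lower, or treat any nonzero count on a sea-level machine as "bad".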
