[Beowulf] Re: ECC Memory and Job Failures (Huw Lynes)
Prentice Bisbal prentice at ias.edu
Fri Apr 24 05:55:55 PDT 2009
Gerry Creager wrote:
> David Mathog wrote:
>> Huw Lynes <lynesh at cardiff.ac.uk> wrote:
>>
>>> http://blog.revolution-computing.com/2009/04/blame-it-on-cosmic-rays.html
>>>
>>> Apparently someone ran a large cluster job with both ECC and none-ECC
>>> RAM. They consistently got the wrong answer when foregoing ECC.
>>
>> There were not very many details given. I would not rule out the
>> possibility that the nonECC memory was slightly faulty, and that the
>> observed errors had nothing to do with gamma rays at all. A better test
>> would have been to use the same ECC memory for both tests, and to turn
>> ECC memory correction on and off in the BIOS.
>
> Where's Jim Lux. I'm sure he's an opinion on this, too...
>

Opinion? I think he could write a book on this topic! Last time this
issue came up, he included links to several papers on this topic
published by Boeing.

As you go up in the atmosphere, the [prevalence|probability|concentration]
of cosmic rays goes up significantly. Boeing has done a lot of research
on this topic, since it can affect the operation of their
[products|weapons].

--
Prentice
