[Beowulf] Barcelona hardware error: how to detect
Greg Lindahl lindahl at pbm.com
Thu Jun 5 11:30:20 PDT 2008
On Thu, Jun 05, 2008 at 10:09:58PM +0400, Mikhail Kuzminsky wrote:

> This was interesting for me also, because I
> have no information how this hardware problem may be affected in the
> "real life".

I have 4 chips with the bug, in 2 servers. I see about 1 lockup per month with my workload, which doesn't include any VMs. (VMs are reputed to trigger the bug quickly.)

I found a webpage with the details, and indeed this is what I see:

| The system may experience a machine check event reporting an L3
| protocol error has occurred. In this case, the MC4 status register
| (MSR 0000_0410) will be equal to B2000000_000B0C0F or
| BA000000_000B0C0F. The MC4 address register (MSR 0000_0412) will be
| equal to 26h.'

-- greg
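For anyone who wants to check a captured register pair against the signature quoted above, here is a minimal Python sketch. It is an illustration, not something from the original message: the names `matches_l3_protocol_error` and `read_msr` are hypothetical, and `read_msr` assumes a Linux system with the `msr` kernel module loaded and root access, which exposes MSRs as `/dev/cpu/<n>/msr`. The kernel's machine check handler may log and clear these registers before you get to read them, so the machine check log may be the more reliable source for the raw values.

```python
import os
import struct

# Signature values taken from the quoted webpage text.
L3_ERROR_STATUS_VALUES = {0xB2000000000B0C0F, 0xBA000000000B0C0F}
L3_ERROR_ADDR_VALUE = 0x26


def matches_l3_protocol_error(mc4_status, mc4_addr):
    """Return True if the MC4 status/address pair matches the quoted signature."""
    return mc4_status in L3_ERROR_STATUS_VALUES and mc4_addr == L3_ERROR_ADDR_VALUE


def read_msr(cpu, msr):
    """Read one 64-bit MSR via the Linux msr driver.

    Assumption: /dev/cpu/<cpu>/msr exists (modprobe msr) and we are root.
    The file offset passed to pread selects the MSR number.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        data = os.pread(fd, 8, msr)
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]
```

A check on CPU 0 would then look like `matches_l3_protocol_error(read_msr(0, status_msr), read_msr(0, 0x412))`. The register number for `status_msr` is left to the caller on purpose: the quoted text gives 0000_0410, while the usual x86 machine-check bank layout puts the bank 4 status register at 0x411, so it is worth confirming against the processor documentation before trusting either.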
