[Beowulf] Odd AMD quad core SuperMicro power off issues

Steve Cousins cousins at umit.maine.edu
Mon Jul 6 14:46:06 PDT 2009



> ----- "Chris Samuel" <csamuel at vpac.org> wrote:
>
> The compute nodes are using SuperMicro H8DM8-2 based
> with 32GB of ECC RAM.

Hi Chris,

I had MCE crashes on a Supermicro system (quad Xeon quad-core 2.4 GHz) 
that were driving me nuts for quite a while. It would take a couple of 
months to crash, which doesn't sound bad, but it was a real pain. I bought 
the machine from ASL and they worked with Supermicro to fix a microcode 
issue.

The reason I mention this is that, at least in this case, the BIOS version 
was the same before and after I ran the update.

Here is part of a message I got from ASL:

> Note that you will be updating the BIOS from version 1.0b to 1.0b. In 
> Supermicro wisdom, they released several updates using the same revision 
> number.

After updating my 1.0b BIOS to the new 1.0b the machine has been running 
solid since Christmas.

So, if you have two machines, one that crashes and one that doesn't, check 
the BIOS dates even if the BIOS versions match.
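A rough sketch of that check, assuming Linux nodes that expose DMI data under /sys/class/dmi/id (the bios_version and bios_date files). The function names are made up for illustration, and how you collect the values from each node (ssh, pdsh, whatever you use) is left out:

```python
from collections import defaultdict
from pathlib import Path

# Standard Linux sysfs location for DMI data (assumed to be present on the nodes).
DMI_DIR = Path("/sys/class/dmi/id")


def read_local_bios(dmi_dir=DMI_DIR):
    """Return (version, date) strings for the machine this runs on."""
    version = (dmi_dir / "bios_version").read_text().strip()
    date = (dmi_dir / "bios_date").read_text().strip()
    return version, date


def same_version_different_date(bios_by_node):
    """Find BIOS versions that show up with more than one release date.

    bios_by_node maps a node name to a (version, date) tuple.
    Returns {version: sorted list of dates} for the suspicious versions only.
    """
    dates = defaultdict(set)
    for version, date in bios_by_node.values():
        dates[version].add(date)
    return {v: sorted(d) for v, d in dates.items() if len(d) > 1}
```

Run read_local_bios() on each node, gather the results into a dict keyed by hostname, and pass that to same_version_different_date(). Any version it returns is one where nodes report the same revision string but different build dates, which is the case worth a closer look.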

I hope this helps.

Steve


