[Beowulf] Odd AMD quad core SuperMicro power off issues

Gerry Creager gerry.creager at tamu.edu
Mon Jul 6 22:13:49 PDT 2009


I learned recently that, regardless of the versioning, it's VERY 
important with SuperMicro to check the BIOS date.

Much more important than I'd thought.  As in, that's the real release 
info on the BIOS.
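
For anyone who wants to check this across a cluster, here is a rough sketch, 
not a tested tool. It assumes Linux nodes that expose the DMI strings under 
/sys/class/dmi/id (bios_version and bios_date). The function names, node 
names and dates below are made up for illustration, and how you gather the 
values from each node (pdsh, ssh, whatever) is left out.

```python
from collections import defaultdict
from pathlib import Path

# Linux sysfs location of the DMI/SMBIOS strings (assumed to be present).
DMI_DIR = Path("/sys/class/dmi/id")


def read_local_bios(dmi_dir=DMI_DIR):
    """Return (version, date) of the BIOS on the machine this runs on."""
    version = (Path(dmi_dir) / "bios_version").read_text().strip()
    date = (Path(dmi_dir) / "bios_date").read_text().strip()
    return version, date


def find_date_mismatches(nodes):
    """Given {node: (version, date)}, return {version: set_of_dates} for
    every BIOS version string that shows up with more than one date."""
    dates_by_version = defaultdict(set)
    for version, date in nodes.values():
        dates_by_version[version].add(date)
    return {v: d for v, d in dates_by_version.items() if len(d) > 1}


# Hypothetical inventory: node names and dates are invented for the example.
inventory = {
    "node01": ("1.0b", "08/13/2008"),
    "node02": ("1.0b", "12/05/2008"),
    "node03": ("1.1", "03/02/2009"),
}
mismatches = find_date_mismatches(inventory)
```

Any version that comes back in mismatches is one where the version string 
alone is not telling you which BIOS build a node is really running.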

gerry

Steve Cousins wrote:
> 
> 
>> ----- "Chris Samuel" <csamuel at vpac.org> wrote:
>>
>> The compute nodes are using SuperMicro H8DM8-2 based
>> with 32GB of ECC RAM.
> 
> Hi Chris,
> 
> I had MCE crashes on a Supermicro system (quad Xeon quad-core 2.4 GHz) 
> that was driving me nuts for quite a while. It would take a couple of 
> months to crash which doesn't sound bad but it was a real pain. I bought 
> the machine from ASL and they worked with Supermicro to fix a microcode 
> issue.
> 
> The reason I mention this is that at least in this case, the BIOS 
> version was the same before I ran the update and after.
> 
> Here is part of a message I got from ASL:
> 
>> Note that you will be updating the BIOS from version 1.0b to 1.0b. In 
>> Supermicro wisdom, they released several updates using the same 
>> revision number.
> 
> After updating my 1.0b BIOS to the new 1.0b the machine has been running 
> solid since Christmas.
> 
> So, if you have two machines, one that crashes and one that doesn't, 
> check the dates of the BIOSes even if the BIOS versions match.
> 
> I hope this helps.
> 
> Steve
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
