[Beowulf] ECC settings for Opteron 175 + Serverworks HT1000 chipset
    Bruce Allen 
    ballen at gravity.phys.uwm.edu
       
    Wed Jan 25 17:41:02 PST 2006
    
    
  
Dear Beowulf list,
Our new cluster nodes (Supermicro H8SSL-i motherboard) have Opteron 175 
CPUs, unregistered ECC memory DIMMs, and a ServerWorks HT1000 chipset. 
The BIOS offers a number of ECC configuration options, and I would like 
advice about how to set them.  We're running a recent Linux kernel.
My goals are (1) to have logging in syslog that helps identify if a 
particular memory stick is suffering from a lot of ECC errors and (2) to 
ensure that memory errors are corrected to the maximum extent possible 
without too large an impact on system performance.
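To make goal (1) concrete, here is a rough sketch of the kind of per-row error report I have in mind. It assumes a Linux EDAC driver is loaded for this memory controller and exposes corrected-error counters as /sys/devices/system/edac/mc/mc*/csrow*/ce_count; whether that holds for this board is exactly what I am unsure about. The function names and the threshold are hypothetical, and mapping a csrow to a physical memory stick is board-specific and not handled here.

```python
import glob
import os


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {(mc, csrow): corrected_error_count} from an EDAC-style sysfs tree.

    Assumption: counters live at <edac_root>/mc*/csrow*/ce_count.
    """
    counts = {}
    pattern = os.path.join(edac_root, "mc*", "csrow*", "ce_count")
    for path in sorted(glob.glob(pattern)):
        csrow_dir = os.path.dirname(path)
        mc = os.path.basename(os.path.dirname(csrow_dir))
        csrow = os.path.basename(csrow_dir)
        try:
            with open(path) as f:
                counts[(mc, csrow)] = int(f.read().strip())
        except (OSError, ValueError):
            # Skip counters that are unreadable or not a plain integer.
            continue
    return counts


def noisy_rows(counts, threshold=10):
    """Return rows with at least `threshold` corrected errors, worst first.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    flagged = [(key, n) for key, n in counts.items() if n >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (mc, csrow), n in noisy_rows(read_ce_counts(), threshold=1):
        print(f"{mc}/{csrow}: {n} corrected errors")
```

Run periodically (from cron, say), the output could be passed to logger(1) so that it ends up in syslog.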
The BIOS ECC Configuration options are:
   ECC enable (we'll use 'enabled')
   MCA DRAM ECC logging (enable/disable)
   ECC Chip Kill (enable/disable)
   DRAM Scrub Redirect (enable/disable)
   DRAM BG Scrub (disable/time in NSEC)
   L2 Cache BG Scrub (disable/time in NSEC)
   Data Cache BG Scrub (disable/time in NSEC)
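For a sense of scale on the scrub times, here is a small back-of-the-envelope sketch. It assumes the "time in NSEC" value is the interval between successive scrubs of one 64-byte block, which is my reading of how the Opteron scrubber works and not something the BIOS screen states. The memory size and interval used below are hypothetical values, not our actual configuration.

```python
def full_scrub_seconds(mem_bytes, interval_ns, block_bytes=64):
    """Estimate the time for one background-scrub pass over all of memory.

    Assumption: one block of `block_bytes` is scrubbed every `interval_ns`.
    """
    blocks = mem_bytes / block_bytes
    return blocks * interval_ns * 1e-9


if __name__ == "__main__":
    # Hypothetical example: 2 GiB of DRAM with a 40 ns scrub interval.
    seconds = full_scrub_seconds(2 * 2**30, 40)
    print(f"one full pass takes about {seconds:.2f} s")
```

Under that assumption, a longer interval means less memory bandwidth spent on scrubbing but a longer wait before a latent single-bit error is found and corrected.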
I would appreciate advice about:
   -- how to configure these settings
   -- pointers to relevant AMD/Serverworks documentation
   -- relevant Linux kernel options/modules
   -- anything else relevant/related
Cheers,
 	Bruce
    
    