[Beowulf] Strange hardware? problems
cwest at astro.umass.edu
Fri Apr 27 11:26:26 PDT 2007
Have you looked for BIOS updates for the motherboards? The motherboard
BIOS page lists lots of fixes.
It might be worth checking whether the two sets of machines are running
the same BIOS version. My guess is the S2882-D is running a newer BIOS, unless you
Also, have you tried installing one set of the (failing) 244s into the
(good) S2882-D motherboard and running the computation?
I'm assuming they are compatible, but you might want to check first.
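For the BIOS comparison suggested above, here is a minimal sketch of one way to collect the version on each node. It assumes a Linux kernel that exposes DMI data under /sys/class/dmi/id (older kernels may not, in which case dmidecode run as root is the usual fallback). The helper names read_bios_info and differing_fields are made up for illustration.

```python
from pathlib import Path

# Assumption: the kernel exposes DMI strings here; older kernels may not.
DMI_DIR = Path("/sys/class/dmi/id")
FIELDS = ("bios_vendor", "bios_version", "bios_date")


def read_bios_info(dmi_dir=DMI_DIR):
    """Return the BIOS fields found under dmi_dir; unreadable fields map to None."""
    info = {}
    for name in FIELDS:
        try:
            info[name] = (Path(dmi_dir) / name).read_text().strip()
        except OSError:
            info[name] = None
    return info


def differing_fields(a, b):
    """Return the sorted names of BIOS fields that differ between two machines."""
    return sorted(name for name in FIELDS if a.get(name) != b.get(name))


if __name__ == "__main__":
    # Run on each node and compare the printed values by eye,
    # or gather the dicts centrally and pass them to differing_fields().
    for name, value in read_bios_info().items():
        print(f"{name}: {value}")
```

Running this on one 244 machine and one 275 machine and comparing the output should show quickly whether the boards are on different BIOS releases.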
> We've got two pairs of identical machines:
> - 2 Tyan S2882 dual processor Opteron 244 stepping 10
> - 2 Tyan S2882-D dual processor dual core Opteron 275 stepping 2
> We have two (relatively complicated) numerical models (RAMS and a
> homegrown one) that will blow up in random locations on the 244
> machines but run fine on the 275 machines.
> By "blow up", it appears the calculations get corrupted in some way:
> the numbers become unphysical in RAMS and the simulation exits. With
> the other model we get segfaults.