[Beowulf] Node Drop-Off
twm at tcg-hsv.com
Sun Nov 12 13:13:53 PST 2006
Hello All -
I have a compute node that has started dropping off. By "dropping off"
I mean that the node, while running a job, loses all connectivity and
the machine stops responding. I have reviewed the logs and can find no
reason for the node to stop functioning. Note that this behavior did
not occur until after a processor upgrade, a BIOS upgrade, and an OS
upgrade. I went into the BIOS and made a few changes that seemed to
lengthen the time between failures, although the failures still occur
mostly at random. If I leave the node idle, it will run for days.
Has anyone ever seen such behavior?