[Beowulf] New member, upgrading our existing Beowulf cluster
jlb17 at duke.edu
Thu Dec 3 11:35:45 PST 2009
On Thu, 3 Dec 2009 at 2:29pm, Mark Hahn wrote:
>>> if a single node goes down, you need to take down all the
>>> nodes in the chassis before you can remove the dead node. Not very
>> Eh? What's so hard about marking the other nodes as unusable in your
>> batch system, and waiting for them to become free?
> depends on your max job length. but yeah, idling three nodes for a week
> is not going to be noticeable in anything but a quite small cluster...
But doesn't the engineer in you just bristle at the (admittedly, rather
slight) inefficiency? Call me OCD (you wouldn't be the first), but it
just bugs me.
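For the record, the drain-and-wait approach Mark describes is a one-liner in most batch systems. A sketch with Torque/PBS (node names are hypothetical; Slurm's `scontrol update NodeName=... State=DRAIN` is the equivalent there):

```shell
# Hypothetical sibling nodes sharing the chassis with the dead node.
# Mark them offline so the scheduler starts no new jobs on them:
pbsnodes -o node02 node03 node04

# Running jobs drain naturally; once 'pbsnodes -l' shows them idle
# and offline, the chassis can be pulled and the dead node replaced.

# Afterwards, clear the offline flag to return them to service:
pbsnodes -c node02 node03 node04
```

The wait is bounded by the cluster's maximum job walltime, which is why the cost is roughly "three nodes idle for up to one max-length job."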
QB3 Shared Cluster Sysadmin