[Beowulf] New member, upgrading our existing Beowulf cluster

Mark Hahn hahn at mcmaster.ca
Thu Dec 3 11:29:10 PST 2009


>> if a single node goes down, you need to take down all the
>> nodes in the chassis before you can remove the dead node. Not very
>> practical.
>
> Eh? What's so hard about marking the other nodes as unusable in your
> batch system, and waiting for them to become free?

depends on your max job length.  but yeah, idling three nodes for a week
is not going to be noticeable in anything but a quite small cluster...


