[Beowulf] New member, upgrading our existing Beowulf cluster

Gerald Creager gerry.creager at tamu.edu
Thu Dec 3 13:01:15 PST 2009


Prentice Bisbal wrote:
> Greg Lindahl wrote:
>> On Thu, Dec 03, 2009 at 10:40:12AM -0500, Prentice Bisbal wrote:
>>
>>> if a single node goes down, you need to take down all the
>>> nodes in the chassis before you can remove the dead node. Not very
>>> practical.
>> Eh? What's so hard about marking the other nodes as unusable in your
>> batch system, and waiting for them to become free?
>>
>> -- greg
>>
> 
> I didn't say it was hard - just impractical. ;)
> 
> I thought the same thing when HP told me the nodes weren't
> hot-swappable. But then when I learned the SuperMicros were hot
> swappable, I figured if SuperMicro can do it, why not HP?
> 
> I'm sure you'll agree that taking just one node down instead of 4 is
> more convenient, and is less likely to draw the ire of your
> number-crunchers.


Because of our ill-advised choice of a specific APC rack, whenever I've
got to remove one of the SuperMicros from the 2U Twin chassis, I have to
power 'em all off. I then have to ease the chassis out so I've enough
room to get a node out. I learned to "just do it" after a two-month whine-in.

gerry


