[Beowulf] Should I go for diskless or not?

Greg Lindahl lindahl at pbm.com
Thu May 14 15:26:33 PDT 2009


On Thu, May 14, 2009 at 02:51:24PM -0700, Bill Broadley wrote:

> * When I tried it the kernel wasn't really guaranteed to work with remote
>   swap; there was a network block layer that looked rather immature and
>   claimed it would avoid the "I need to allocate a buffer so I can talk to
>   the network so I can swap and have more memory" problem.  Swapping over
>   the network might well require a custom kernel compile.

I don't think anyone has gotten network swapping to really work -- by
definition, you often page heavily when you are low on free memory,
and there are many places where memory is allocated while running the
code that swaps over the network, plus in the network stack itself. Those patches
were trying to close all of the holes, but...

Swapping to a disk is much, much easier, but I've had 2 of my desktops
corrupt their RAID-1 system disks while paging heavily and then
running out of memory. Again, the RAID layer provides an opportunity
to screw up.

> Doug:  Diskless provisioning is usually easier to manage.
> 
> Hmm, not sure I buy that one. Pretty much any decent cluster distribution should:
> * allow you to add a compute node without much more than plugging it in
>   and telling it to PXE boot (diskless or diskful)
> * allow you to push a configuration cluster-wide
> * allow you to reinstall/reboot all nodes.

You forgot
  * Check that the nodes have the right files

Trust but verify, and all that.
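
One way to do that kind of check, as a minimal sketch: hash a list of
files on a known-good image and compare each node's copy against those
hashes. This assumes the node's root is reachable as a local path (for
example an NFS root or a mounted image). The function names and the
manifest layout are hypothetical, not taken from any cluster
distribution.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, relative_paths):
    """Map each relative path under the known-good root to its digest."""
    return {rel: sha256_of(os.path.join(root, rel)) for rel in relative_paths}


def verify_node(root, manifest):
    """Return (path, problem) pairs for files that are missing or differ.

    root is the node's filesystem root (hypothetical: however you reach it),
    manifest is the dict produced by build_manifest on the golden image.
    """
    problems = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            problems.append((rel, "missing"))
        elif sha256_of(full) != expected:
            problems.append((rel, "mismatch"))
    return problems
```

An empty list from verify_node means every file in the manifest is
present and matches; anything else names the files to look at. It only
checks the files you listed, so stray extra files on a node go
unnoticed.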

-- greg



