[Beowulf] Should I go for diskless or not?
ashley at pittman.co.uk
Fri May 15 10:12:28 PDT 2009
On Fri, 2009-05-15 at 06:43 -0400, Lawrence Stewart wrote:
> I'll echo the remarks about swapping, there is a large patch set for
> swapping over IP, and we don't run that. In fact right now we run
> without swap space, and vm_overcommit_ratio set to "90". This is
> generous enough that we're not having problems, even on large
> with running out of memory. Everyone seems to agree that having some
> swap space is good for stability, so we do plan to add swap at some
> point. We've got a new network block device that can swap over the
> interconnect (without any allocations) at about 2 GB/s which is
> good enough to make DSM interesting. If you have local disks, using
> them for swap will work fine.
Another problem which nobody's mentioned yet is where are you going to
swap to? Sure, each node might have 2 GB/s of network bandwidth to play
with, but no frontend is going to cope with more than a handful of nodes
swapping at once. It might be viable for a network of diskless
workstations, but for a cluster, forget it.
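
To put rough numbers on that, here is a back-of-envelope sketch. The 2 GB/s per-node figure is from the message above; the 4 GB/s frontend capacity and the 64-node count are purely hypothetical values picked for illustration, and the function names are mine.

```python
def max_swapping_nodes(frontend_bw_gbs: float, per_node_bw_gbs: float) -> int:
    """How many nodes can swap at full rate before the frontend saturates."""
    if per_node_bw_gbs <= 0:
        raise ValueError("per-node bandwidth must be positive")
    return int(frontend_bw_gbs // per_node_bw_gbs)


def per_node_share(frontend_bw_gbs: float, swapping_nodes: int) -> float:
    """Bandwidth each node gets if the frontend is shared evenly."""
    if swapping_nodes <= 0:
        raise ValueError("need at least one swapping node")
    return frontend_bw_gbs / swapping_nodes


# Hypothetical frontend that can absorb 4 GB/s in total.
FRONTEND_BW = 4.0   # GB/s, assumed
NODE_BW = 2.0       # GB/s, figure quoted in the thread

full_rate_nodes = max_swapping_nodes(FRONTEND_BW, NODE_BW)
share_at_64 = per_node_share(FRONTEND_BW, 64)
print(f"nodes at full rate: {full_rate_nodes}")
print(f"GB/s per node with 64 swapping: {share_at_64}")
```

Under those assumptions only a couple of nodes get their full rate, and with the whole cluster swapping each node sees a small fraction of what its own link could carry.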
The only way that network swapping can make sense in a cluster is if you
know the application doesn't fit in memory and can allocate some extra
nodes to host the swapped memory, preferably swapping over the network
to RAM on a remote machine. However, this doubles the nodes required to
run your job and makes scheduling it alongside normal jobs impossible.
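
As a rough illustration of where the doubling comes from, the sketch below counts the extra nodes needed to hold the overflow in remote RAM. It assumes every node has the same amount of RAM, that a swap-host node can give all of it to swap (ignoring OS overhead), and that the working set per compute node is known up front; the function name and the example sizes are hypothetical.

```python
import math


def nodes_for_job(compute_nodes: int,
                  working_set_gb_per_node: float,
                  ram_gb_per_node: float) -> int:
    """Total nodes needed: compute nodes plus nodes hosting swapped memory."""
    # Memory per compute node that does not fit locally.
    overflow_gb = max(0.0, working_set_gb_per_node - ram_gb_per_node)
    total_overflow_gb = compute_nodes * overflow_gb
    # Extra nodes whose RAM is used only as remote swap.
    swap_hosts = math.ceil(total_overflow_gb / ram_gb_per_node)
    return compute_nodes + swap_hosts


# Hypothetical job: 16 compute nodes, 16 GB RAM each,
# each process set needing 32 GB (twice what fits locally).
total_nodes = nodes_for_job(16, 32.0, 16.0)
print(total_nodes)
```

With a working set twice the size of local RAM, the job needs as many swap hosts as compute nodes, which is the doubling described above; a smaller overflow needs proportionally fewer extra nodes.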