[Beowulf] size of swap partition

Mark Hahn hahn at mcmaster.ca
Mon Jun 9 21:58:12 PDT 2008


> We have the potential to have to swap whole jobs out of memory on a complete 
> node.

that was our intent as well.  among other things, this scheme enables
running the cluster "split-personality" - mostly shorter/smaller even
interactive jobs during the day, with big/long jobs running at night.
unfortunately, you need a smart scheduler to do this, and ours is dumb.

>> believe, it is 2 or more GB per core; we have 16 GB per dual-socket 
>> quad-core Opteron node). What is typical modern swap size today?

are you willing to use a node which is actually occupying 16 GB of swap?

it is possible to tune how the kernel responds to memory crunches - 
for instance, you can always avoid OOM with the vm.overcommit_memory=2
sysctl (you'll need to tune vm.overcommit_ratio and the amount of swap
to get the desired limits.)  in this mode, the kernel tracks how much VM
it actually needs (worst-case, reflected in Committed_AS in /proc/meminfo)
and compares that to a commit limit that reflects ram and swap.
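
as a rough illustration of that bookkeeping, here is a minimal Python sketch.
it assumes the usual approximation that the commit limit is swap plus
overcommit_ratio percent of ram (huge pages are ignored here), and the
function names are made up for this example, not part of any kernel tool.

```python
def parse_meminfo(text):
    """Return {field: value} parsed from /proc/meminfo-style text (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate commit limit: swap plus overcommit_ratio percent of RAM."""
    return swap_kb + ram_kb * overcommit_ratio // 100


def commit_headroom_kb(meminfo):
    """kB that can still be committed; negative means over the limit."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]
```

on a live node you would feed it open("/proc/meminfo").read().  with 16 GB of
ram, 16 GB of swap and overcommit_ratio=50, the approximation gives a commit
limit of 24 GB.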

if you don't use overcommit_memory=2, you are basically borrowing VM
space in hopes of not needing it.  that can still be reasonable, considering
how often processes have a lot of shared VM, and how many processes 
allocate but never touch lots of pages.  but you have to ask yourself:
would I like a system that was actually _using_ 16 GB of swap?  if you
have 16x disks, perhaps, but 16G will suck if you only have 1 disk.
at least for overcommit_memory != 2, I don't see the point of configuring
a lot of swap, since the only time you'd use it is if you were thrashing.
sort of a "quality of life" argument.

>> But what are the recommendations of modern praxis ?

it depends a lot on the size variance of your jobs, as well as 
their real/virtual ratio.  the kernel only enforces RLIMIT_AS
(vsz in ps), assuming a 2.6 kernel - I forget whether 2.4 did 
RLIMIT_RSS or not.
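
to show what an RLIMIT_AS cap looks like from a job launcher, here is a small
Python sketch using the standard resource and subprocess modules.  the helper
name run_with_as_limit is hypothetical, and it assumes a Linux node where the
caller's hard limit is at least as large as the cap being requested.

```python
import resource
import subprocess
import sys


def run_with_as_limit(cmd, limit_bytes):
    """Run cmd with RLIMIT_AS capped at limit_bytes; return its exit status."""

    def apply_limit():
        # Runs in the child after fork() and before exec(), so only the
        # launched job is limited, not the launcher itself.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit).returncode
```

a job that tries to map more address space than the cap sees its allocation
fail (ENOMEM from mmap/brk), regardless of how much of that memory it would
actually have touched.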

if you use overcommit_memory=2, your desired max VM size determines 
the amount of swap.  otherwise, go with something modest - memory size
or so.  but given that the smallest reasonable single disk these days
is probably about 320GB, it's hard to justify being _too_ tight.
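
the sizing rule for overcommit_memory=2 can be turned around into a quick
calculation.  this is only a sketch under the same approximation (commit
limit = swap + overcommit_ratio percent of ram); swap_needed_gb is a
hypothetical helper, and 50 is the kernel's default overcommit_ratio.

```python
def swap_needed_gb(target_commit_gb, ram_gb, overcommit_ratio=50):
    """Swap (GB) so that swap + ratio% of RAM reaches target_commit_gb."""
    from_ram = ram_gb * overcommit_ratio / 100
    # No swap is needed if the RAM share alone already covers the target.
    return max(0.0, target_commit_gb - from_ram)
```

for a 16 GB node that should be able to commit 32 GB of VM, this gives 24 GB
of swap at the default ratio, or 16 GB if overcommit_ratio is raised to 100.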


