[Beowulf] User resource limits
prentice at ias.edu
Fri Jun 13 05:38:13 PDT 2008
Mark Hahn wrote:
>>> Unfortunately the kernel implementation of mmap() doesn't check
>>> the maximum memory size (RLIMIT_RSS) or maximum data size (RLIMIT_DATA)
>>> limits which were being set, but only the maximum virtual RAM size
>>> (RLIMIT_AS) - this is documented in the setrlimit(2) man page.
> I think it's a perfectly reasonable choice. RSS enforcement means
> accounting and checks on what would otherwise be fast paths.
> besides, I think it also lacks transparency, since a process's RSS is
> affected by random other system events, other users, etc.
> using a memory limit that is triggered on actual allocation events
> (mmap, brk) makes a lot of sense to me, and that means virtual size,
> exactly what RLIMIT_AS does...
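The behaviour described above (the limit is checked when an allocation is requested, not when pages become resident) can be observed from user space. The following is only a sketch, assuming a Linux system with /proc mounted; the helper names current_vsz_bytes and mmap_under_limit are made up for this illustration. It temporarily lowers the soft RLIMIT_AS to a little above the process's current virtual size and then attempts anonymous mappings of two sizes.

```python
import mmap
import resource

MIB = 1024 * 1024


def current_vsz_bytes():
    """Return this process's virtual size from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        pages = int(f.read().split()[0])  # first field: total program size, in pages
    return pages * resource.getpagesize()


def mmap_under_limit(limit_bytes, request_bytes):
    """Lower the soft RLIMIT_AS, try one anonymous mmap, then restore the limit.

    Returns True if the mapping succeeded, False if the kernel refused it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)  # the soft limit may not exceed the hard one
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    try:
        region = mmap.mmap(-1, request_bytes,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
        region.close()
        return True
    except (OSError, MemoryError):
        # mmap() fails with ENOMEM when the mapping would push vsz past RLIMIT_AS
        return False
    finally:
        # An unprivileged process can raise its soft limit back up to the hard limit
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


if __name__ == "__main__":
    limit = current_vsz_bytes() + 128 * MIB  # assumed headroom for this demo
    print("16 MiB mapping allowed:", mmap_under_limit(limit, 16 * MIB))
    print("1 GiB mapping allowed:", mmap_under_limit(limit, 1024 * MIB))
```

The larger request is refused at the mmap() call itself, before any page is touched, which is the "triggered on actual allocation events" property. The 128 MiB headroom and the request sizes are arbitrary values chosen for the demonstration.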
>> limits.conf parlance) I would have to limit AS < RAM to keep a user from
>> using all RAM. Since AS includes virtual memory, and VM = RAM + swap,
>> wouldn't I be limiting users a little more than I'd hoped?
> I don't follow that. why would you want to keep a user from using all
> ram (which assumes the ram is otherwise free/unused/wasted)?
Because these are multi-user systems that are not managed by a queuing
system, and users are running large jobs on them. Once every couple of
days, we have to hard-reboot one of them because it becomes unresponsive
when it runs out of memory (OOM messages in the logs confirm this). I
think I explained this in more detail in my original e-mail.
> the only real trick with RLIMIT_AS and vm.overcommit_memory=2 is that it's hard
> to predict the vsz of processes. normally, vsz is modestly larger than rss,
> but sysv shm, mmapped libraries perturb this, as well as the dubious practice
> (more common in fortran I think) of allocating max-sized arrays even if
> you only ever use a small part.
> my experience so far is that setting RLIMIT_AS to around ram size is
> reasonable. we have had good luck with swap=ram (or a little more),
> vm.overcommit_memory=2 and vm.overcommit_ratio=100. the overcommit
> settings alone do a poor job - you also need RLIMIT_AS.
vm.overcommit? Never heard of that before. I'm going to google that now.
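For reference, the arithmetic behind the settings quoted above can be sketched as follows. With vm.overcommit_memory=2 the kernel refuses allocations once the system-wide committed address space would exceed a commit limit of swap plus overcommit_ratio percent of RAM (huge pages excluded). The function names, the 16 GiB node size and the "@users" group below are hypothetical and chosen only for illustration; the limits.conf "as" item takes its value in KB.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio, hugetlb_kib=0):
    """System-wide commit limit in KiB when vm.overcommit_memory=2.

    CommitLimit = (RAM - huge pages) * overcommit_ratio / 100 + swap
    """
    return (ram_kib - hugetlb_kib) * overcommit_ratio // 100 + swap_kib


def limits_conf_line(domain, as_limit_kib, limit_type="hard"):
    """Format a limits.conf entry capping address space (RLIMIT_AS), value in KB."""
    return f"{domain} {limit_type} as {as_limit_kib}"


if __name__ == "__main__":
    ram_kib = 16 * 1024 * 1024   # hypothetical node with 16 GiB of RAM
    swap_kib = ram_kib           # swap = ram, as suggested in the quoted message

    # overcommit_ratio=100: the whole system may commit RAM + swap in total
    print("CommitLimit (KiB):", commit_limit_kib(ram_kib, swap_kib, 100))

    # per-user address-space cap of roughly RAM size; "@users" is a made-up group
    print(limits_conf_line("@users", ram_kib))
```

With swap equal to RAM and a ratio of 100, the system-wide limit works out to twice the RAM size, while each process stays capped near RAM size by RLIMIT_AS, so a single runaway job hits its own limit before the machine as a whole runs out.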