[Beowulf] User resource limits

Prentice Bisbal prentice at ias.edu
Fri Jun 13 05:38:13 PDT 2008

Mark Hahn wrote:
>>> Unfortunately the kernel implementation of mmap() doesn't check
>>> the maximum memory size (RLIMIT_RSS) or maximum data size (RLIMIT_DATA)
>>> limits which were being set, but only the maximum virtual RAM size
>>> (RLIMIT_AS) - this is documented in the setrlimit(2) man page.
>>> :-(
> I think it's a perfectly reasonable choice.  RSS enforcement means
> accounting and checks on what would otherwise be fast paths.
> besides, I think it also lacks transparency, since a process's RSS is
> affected by random other system events, other users, etc.
> using a memory limit that is triggered on actual allocation events
> (mmap, brk) makes a lot of sense to me, and that means virtual size,
> exactly what RLIMIT_AS does...
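
For anyone following along, here is a minimal sketch of the "checked at allocation time" behaviour, assuming Linux and Python 3's standard resource module. The 512 MiB cap, the 1 GiB request, and the run_capped_child name are arbitrary choices for illustration, not anything from this thread. The limit is set in a child process so the parent shell is unaffected.

```python
import subprocess
import sys

# Code run in a child process: cap the address space (RLIMIT_AS), then
# ask for more virtual memory than the cap allows.
CHILD = r"""
import resource
limit = 512 * 1024 * 1024                 # 512 MiB cap (illustrative)
resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
try:
    buf = bytearray(1024 * 1024 * 1024)   # 1 GiB request, above the cap
except MemoryError:
    print("denied")                       # the allocation itself failed
else:
    print("allowed")
"""

def run_capped_child():
    """Run the capped child and return what it printed."""
    out = subprocess.run([sys.executable, "-c", CHILD],
                         capture_output=True, text=True)
    return out.stdout.strip()

if __name__ == "__main__":
    print(run_capped_child())
```

On a Linux box I would expect the request to be refused at allocation time, before any of the memory is touched, which is the point being made above about virtual size versus RSS.
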
>> limits.conf parlance) I would have to limit AS < RAM to keep a user from
>> using all RAM. Since AS includes virtual memory, and VM = RAM + swap,
>> wouldn't I be limiting users a little more than I'd hoped?
> I don't follow that.  why would you want to keep a user from using all
> ram (which assumes the ram is otherwise free/unused/wasted)?

Because these are multi-user systems that are not managed by a queuing
system, and users are running large jobs on them. Once every couple of
days, we have to hard-reboot one of them because it becomes unresponsive
when it runs out of memory (OOM messages in the logs verify this). I
think I explained this in more detail in my original e-mail.

> the only real trick with RLIMIT_AS and vm.overcommit=2 is that it's hard
> to predict the vsz of processes.  normally, vsz is modestly larger than rss,
> but sysv shm, mmapped libraries perturb this, as well as the dubious practice
> (more common in fortran I think) of allocating max-sized arrays even if
> you only ever use a small part.
> my experience so far is that setting RLIMIT_AS to around ram size is
> reasonable.  we have had good luck with swap=ram (or a little more),
> vm.overcommit_memory=2 and vm.overcommit_ratio=100.  the overcommit
> settings alone do a poor job - you also need RLIMIT_AS.

vm.overcommit?  Never heard of that before. I'm going to Google that now.
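
For anyone else who hasn't met these settings: a rough sketch of the arithmetic, assuming the strict-overcommit rule as I understand it from the kernel documentation (with vm.overcommit_memory=2, the commit limit is swap plus overcommit_ratio percent of RAM, ignoring huge pages). The function name and the 16 GiB figures are made up for illustration.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio):
    """Approximate commit limit under vm.overcommit_memory=2.

    Assumes limit = swap + RAM * ratio / 100 and ignores huge pages.
    The live values are in /proc/sys/vm/overcommit_memory and
    /proc/sys/vm/overcommit_ratio.
    """
    return swap_kib + ram_kib * overcommit_ratio // 100

GIB = 1024 * 1024  # KiB per GiB

# Settings suggested above: swap = RAM, ratio = 100.
suggested = commit_limit_kib(16 * GIB, 16 * GIB, 100)

# Same hardware with a ratio of 50.
half_ratio = commit_limit_kib(16 * GIB, 16 * GIB, 50)

if __name__ == "__main__":
    print(suggested // GIB, "GiB with ratio=100")
    print(half_ratio // GIB, "GiB with ratio=50")
```

If that reading is right, swap=RAM with ratio=100 lets the whole system commit about twice RAM in total, so a per-user RLIMIT_AS of roughly RAM size still leaves headroom for everyone else.
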

