[Beowulf] Setting memory limits on a compute node
Guy Coates gmpc at sanger.ac.uk
Wed Jun 9 14:28:57 PDT 2004
> fundamentals of why this is happening. My question, is there any way to
> limit a user's total addressable space that his processes can use so that
> it doesn't kill the node?
>
> What is the best to approach this kind of issue? We have come up with a
> few solutions but each one has its drawbacks.

You can set per-user memory limits, which on most modern Linux distributions are set via PAM in /etc/security/limits.conf.

If you are using a batch queueing system then you can set a per-job limit, and the queueing system will zap the job before it kills the node. This is more flexible than global per-user limits, as you can set different limits on different machines or queues.

There is also the Linux kernel out-of-memory (OOM) killer. This is present in most 2.4 kernels, but is not active by default in versions later than 2.4.23.

The OOM killer has a set of heuristics to guess which processes to kill when the machine runs out of memory. Unfortunately it is rather difficult to guess which processes to kill in practice, and the OOM killer has a habit of zapping essential system processes.

The OOM killer only kicks in after all of physical memory + swap has been allocated; this means your process has to fill up swap space before it gets zapped. If you have a reasonably sized swap partition, the node may be catatonic for some time before it recovers.

Having said that, for our particular workload, the OOM killer seems to do the right thing. We keep it around as a last-gasp backup for when the queueing system doesn't manage to zap jobs in time.

Cheers,

Guy Coates

--
Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
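
For illustration only: the per-user limits set through limits.conf are ordinary kernel resource limits, so the same address-space cap can also be applied to a single job at launch time. The following is a minimal sketch in Python, assuming a Linux node and Python 3; the helper name run_with_memory_limit and the 1 GiB figure are hypothetical, not something from the original message or from any particular queueing system.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(cmd, limit_bytes):
    """Run cmd with its address space capped at limit_bytes (hypothetical helper)."""

    def set_limit():
        # Runs in the child just before exec. RLIMIT_AS caps the total
        # virtual address space, the same quantity as the "as" item
        # in limits.conf.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        new = limit_bytes
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot raise its hard limit.
            new = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new, new))

    return subprocess.run(cmd, preexec_fn=set_limit)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # A job that tries to allocate 2 GiB under a 1 GiB cap: the allocation
    # fails inside the job instead of pushing the node into swap.
    job = [sys.executable, "-c", "x = bytearray(2 * 1024**3)"]
    result = run_with_memory_limit(job, one_gib)
    print("job exit status:", result.returncode)
```

With a cap like this the offending process gets an allocation failure as soon as it crosses the limit, rather than having to fill swap before the OOM killer steps in. Note that the limit applies per process, not to the sum of a user's processes.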
