[Beowulf] Memory limit enforcement
Kilian CAVALOTTI kilian at stanford.edu
Wed Oct 10 12:24:12 PDT 2007
- Previous message: [Beowulf] Memory limit enforcement
- Next message: [Beowulf] best linux distribution
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wednesday 10 October 2007 12:23:14 am Tim Cutts wrote:
> We then have a default memory limit on the queues which
> is really very low indeed (1.9 GB, typically, because we have 2 GB
> RAM per core on our nodes). If the user wants more memory, they have
> to set a new higher limit themselves.

I'm also relying on LSF's LSB_MEMLIMIT_ENFORCE option to take care of
memory-greedy jobs.

Before that, I tried to modify the default VM overcommit behavior on
individual nodes, playing with the vm.overcommit_memory and
vm.overcommit_ratio sysctl values. By setting overcommit_memory=2 and an
appropriate overcommit_ratio, you can basically prevent any swapping.
The result is that processes' malloc()s going beyond the limits are
denied. This is cool from the sysadmin standpoint, since the greedy
applications are killed before bringing the machine to its knees. But
it may just as well happen that an application trying to use the last
few available MBs gets killed, while another one has already allocated
several GBs, which is not especially fair.

And on top of that, most scientific applications are not very careful
about checking errors. So our users were beginning to complain that
their applications were crashing for no apparent reason when they
reached the overcommit limits. Which made me realize that this solution
was probably not that optimal.

So LSF's per-job memory limit enforcement did the trick for us: an esub
script checks that users can't request funny limits, and jobs using
more than requested get killed.

That's good for serial jobs. But parallel (read MPI) jobs are a
different can of worms. Say you have 2 dual-CPU nodes, with 4GB each. A
user can submit a job using 4 CPUs and 6GB of memory without any
problem as long as those 6GB are equally balanced between the two
nodes. But since LSF's conception of the memory limit is *per job*, it
means that, for this specific job, we need to set -M6000000 if we want
it to run. And this limit won't prevent a process of this job from
using more than 4GB on the first node, making that node unusable...

So anyway, no solution is perfect. I guess that what the Linux kernel
really misses are memory quotas. Per user. Exactly like disk quotas.
That would be *really* neat and solve a whole range of problems.

> When they do that, we have
> supplied LSF with an esub script which then checks that the user has
> supplied both the new memory, and a suitable resource selection and
> reservation option. If they have not, the job is rejected. So for
> example, if the user asks for a 6 GB memory limit, the esub will
> check that they have requested a machine with at least 6GB of free
> memory, and then reserve that memory with the scheduler. For
> example:
>
> -M6000000 -R"select[mem>6000] rusage[mem=6000]"

I'm not 100% certain here, but I would have assumed that it would be
the scheduler's job to select a host with enough resources to run the
job. So from my understanding, specifying -R"rusage[mem=6000]" would be
sufficient to select a machine with 6GB available. But I may have
missed some LSF subtleties. :)

Cheers,
-- 
Kilian
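
The per-job versus per-node problem described in the message can be made concrete with a small sketch. This is only an illustration of the arithmetic, not LSF code: the function name check_job and its parameters are hypothetical, and all sizes are assumed to be in MB (two nodes with 4000 MB each and a 6000 MB per-job limit, as in the example from the message).

```python
def check_job(per_node_usage_mb, job_limit_mb, node_ram_mb):
    """Compare a per-job limit with what each node can actually hold.

    Hypothetical helper for illustration only; it does not call LSF.
    Returns (within_job_limit, overloaded_nodes), where overloaded_nodes
    lists the indexes of nodes whose usage exceeds their physical RAM.
    """
    total_mb = sum(per_node_usage_mb)
    within_job_limit = total_mb <= job_limit_mb
    overloaded_nodes = [
        index
        for index, usage_mb in enumerate(per_node_usage_mb)
        if usage_mb > node_ram_mb
    ]
    return within_job_limit, overloaded_nodes


# Balanced job: 3000 MB on each node, fine on both counts.
balanced = check_job([3000, 3000], job_limit_mb=6000, node_ram_mb=4000)

# Unbalanced job: still under the 6000 MB per-job limit,
# but the first node is asked for more than its 4000 MB of RAM.
unbalanced = check_job([4500, 1000], job_limit_mb=6000, node_ram_mb=4000)

print(balanced)
print(unbalanced)
```

The unbalanced case passes the per-job check while overloading the first node, which is the situation a limit that only looks at the job's total usage cannot catch.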
