Linux cpusets and HPC (was Re: [Beowulf] Can one Infiniband net support MPI and a parallel filesystem?)
Paul Jackson pj at sgi.com
Thu Aug 14 01:56:02 PDT 2008
Chris wrote:
> The main purpose we're using them for is a quick and
> easy way to catch users who don't know better doing
> things like running an OpenMP code as a single CPU job
> and overloading a node (and causing chaos for other
> users) when it discovers 8 cores.

Let me see if I understand this. Is the following right:

Without the cpuset constraint, such a 'bad' job could tell the
cluster management software (PBS or Torque or ...) it needed
just one CPU, which could end up putting it on a cluster node
with say eight CPUs, along with some other jobs that expect to
use the other seven CPUs. But then OpenMP code in that 'bad'
job could notice it had eight CPUs, think to itself 'wow - cool',
and proceed to hog all eight CPUs, messing up those other jobs.

With the cpuset constraint, that 'bad' job -will- only be able
to use that one CPU, and if OpenMP or other code in that job
can't deal reasonably with that circumstance, well, tough, the
owner of that job should fix something. But at least the other
jobs that were hoping to use the other seven CPUs won't be
bothered much by this.

Did I say that right?

> http://www.supercluster.org/pipermail/torquedev/2007-November/000748.html
> http://www.supercluster.org/pipermail/torquedev/2008-January/000842.html
> http://www.clusterresources.com/wiki/doku.php?id=torque:3.5_linux_cpuset_support

Thanks for the links!

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj at sgi.com> 1.940.382.4214
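
For readers who want to see the difference described above from inside a job, here is a minimal sketch in Python (not something from this thread). It assumes a Linux node, where the set returned by os.sched_getaffinity() reflects the cpuset the job was placed in, while os.cpu_count() still reports every CPU on the node. The helper names usable_cpus and pick_thread_count are hypothetical, chosen only for illustration.

```python
import os


def usable_cpus():
    """Number of CPUs this process is actually allowed to run on.

    On Linux the affinity mask is limited by the enclosing cpuset,
    so a job confined to one CPU sees 1 here even on an 8-CPU node.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))  # 0 means "this process"
    # Assumption: on platforms without affinity support, fall back
    # to the total CPU count (or 1 if it cannot be determined).
    return os.cpu_count() or 1


def pick_thread_count(requested=None):
    """Clamp a requested thread count to the CPUs we may use."""
    allowed = usable_cpus()
    if requested is None:
        return allowed
    return max(1, min(requested, allowed))


if __name__ == "__main__":
    print("CPUs on the node:  ", os.cpu_count())
    print("CPUs we may run on:", usable_cpus())
    print("Threads to start:  ", pick_thread_count())
```

A job that sizes its thread pool this way behaves the same with or without the cpuset. A job that sizes it from the node's total CPU count is the 'bad' case: without a cpuset it spills onto the other jobs' CPUs, and inside a one-CPU cpuset it merely oversubscribes its own CPU.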
