[Beowulf] scheduler policy design
tjrc at sanger.ac.uk
Thu Apr 19 08:05:14 PDT 2007
On 19 Apr 2007, at 3:20 pm, Toon Knapen wrote:
> Tim Cutts wrote:
>> Optimising for throughput, at least with an embarrassingly
>> parallel workload of serial jobs like we have here, is trivial; a
>> single first-come-first-served queue is optimal, as long as the
>> code is well written, and doesn't block too much on shared
>> resources like file servers or databases.
> But what if you have a bi-cpu bi-core machine to which you assign 4
> slots? Now one slot is being used by a process which performs heavy
> IO. Suppose another process is launched that also performs heavy IO.
> In that case the latter process should wait until the first one is
> done, to avoid reducing the efficiency of the system. Generally,
> however, clusters take only time and memory requirements into account.
I think that varies. LSF records the current I/O of a node as one of
its load indices, so you can request a node which is doing less than
a certain amount of I/O. I imagine the same is true of SGE, but I
> Additionally, in the case above, for optimising the efficiency of
> the node, I might prefer to launch just 1 process which uses 4
> threads to perform multi-threaded (BLAS) calculations.
That could certainly be requested with LSF:
bsub -n 4 -R"select[io < 10] span[hosts=1]" my_four_thread_job
selects a host currently performing less than 10 KB per second of I/O,
and requests four job slots on a single node.
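
To make the effect of that resource string concrete, here is a minimal Python sketch of the kind of filtering it asks the scheduler to do. It is not LSF code and does not reflect how LSF is implemented; the Host class, the pick_host function and the sample node data are all hypothetical, and picking the least I/O-loaded candidate is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    io_kb_per_s: float  # current disk I/O rate, playing the role of an "io" load index
    free_slots: int     # job slots not currently occupied


def pick_host(hosts, slots_needed, max_io_kb_per_s):
    """Return one host that satisfies both constraints, or None.

    Mirrors the two parts of the request: an I/O ceiling (the select part)
    and all slots on one machine (the span part). Choosing the candidate
    with the lowest I/O rate is an assumption of this sketch.
    """
    candidates = [
        h for h in hosts
        if h.io_kb_per_s < max_io_kb_per_s and h.free_slots >= slots_needed
    ]
    return min(candidates, key=lambda h: h.io_kb_per_s, default=None)


# Hypothetical cluster state, invented for the example.
hosts = [
    Host("node01", io_kb_per_s=250.0, free_slots=4),  # busy with I/O
    Host("node02", io_kb_per_s=3.0, free_slots=2),    # quiet, but too few slots
    Host("node03", io_kb_per_s=7.5, free_slots=4),    # meets both constraints
]

chosen = pick_host(hosts, slots_needed=4, max_io_kb_per_s=10.0)
print(chosen.name if chosen else "no eligible host; job stays pending")
```

With the sample data above, only node03 is both below the I/O ceiling and has four free slots, so the four-thread job would land there; if no host qualified, the job would simply wait in the queue.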