[Beowulf] scheduler policy design

Toon Knapen toon.knapen at fft.be
Tue Apr 24 05:30:31 PDT 2007

Tim Cutts wrote:

>> but what if you have a bi-cpu bi-core machine to which you assign 4 
>> slots. Now one slot is being used by a process which performs heavy 
>> IO. Suppose another process is launched that performs heavy IO. In 
>> that case the latter process should wait until the first one is done 
>> to avoid reducing the efficiency of the system. Generally, however, 
>> clusters take only time and memory requirements into account.
> I think that varies.  LSF records the current I/O of a node as one of 
> its load indices, so you can request a node which is doing less than a 
> certain amount of I/O.  I imagine the same is true of SGE, but I 
> wouldn't know.

Indeed, with SGE you can take this into account as well. However, if 
someone submits 4 jobs, the jobs do not start generating heavy I/O right 
away, so the scheduler may conclude that all 4 jobs can easily coexist 
on the same node. After a few minutes, though, all 4 jobs start eating 
disk bandwidth and slow the node down horribly. What would you suggest 
to solve this?
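
To make the failure mode concrete, here is a minimal Python sketch of a 
scheduler that admits a job whenever the node's currently measured I/O 
is below a threshold. It is only a toy model, not how SGE or LSF work 
internally: the threshold, disk bandwidth, warm-up time and per-job I/O 
rate, as well as the function names, are all made-up assumptions for 
illustration.

```python
# Toy model (all numbers and names are hypothetical): the scheduler only
# looks at the I/O load measured at the moment a job is dispatched.

IO_THRESHOLD = 50    # MB/s, assumed admission limit on the I/O load index
DISK_CAPACITY = 100  # MB/s, assumed disk bandwidth of the node
WARMUP = 5           # minutes before a job enters its heavy-I/O phase
JOB_IO = 60          # MB/s each job demands once it has warmed up


def measured_io(start_times, now):
    """Total I/O demand (MB/s) at time `now` of the jobs started at `start_times`."""
    return sum(JOB_IO for t in start_times if now - t >= WARMUP)


def dispatch(submit_times, slots=4):
    """Admit each job at its submit time if a slot is free and the
    I/O measured at that instant is below the threshold."""
    running = []
    for t in submit_times:
        if len(running) < slots and measured_io(running, t) < IO_THRESHOLD:
            running.append(t)
    return running


running = dispatch([0, 0, 0, 0])        # four jobs submitted together
demand = measured_io(running, now=10)   # demand once all jobs have warmed up
print(len(running), demand, demand > DISK_CAPACITY)  # prints: 4 240 True
```

All four jobs are admitted because none of them has started its I/O 
phase when the load index is sampled, and the node ends up with far 
more I/O demand than the disk can serve. Submitting the jobs a few 
minutes apart (e.g. dispatch([0, 6])) lets the load index do its job, 
but that cannot be relied upon in practice.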



