[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Mark Hahn hahn at mcmaster.ca
Thu Aug 14 09:44:07 PDT 2008


>> It appears we've averaged almost 77% utilisation
>> since the beginning of 2004 (when our current usage
>> system records begin).
>> 
> Thank you very much for the data point!
>
> I've insisted here that above 70% utilization is very good,
> given the random nature of demand and of jobs in queues in academia, etc.

that sounds very strange to me.  do you really mean that 
30% of your cpu time is idle?  I wonder whether there could be a big
difference in methodology.  for instance, if you're using an MPI library
(probably based on tcp) that doesn't spin-wait but blocks, as it would for
disk IO, say 20% of the time, then you might consider this to be 80%
utilization.  an MPI that spin-waits might show 100% with the same
perf/throughput.

70% utilization is terrible if you really mean "fraction of allocatable cpu
time occupied by jobs".  that is utilization measured at the job scheduler
level, not at the kernel scheduler level.
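
to make the two definitions concrete, here is a minimal sketch in Python.
everything in it is made up for illustration: the function names, the job
records, and the busy_fraction field (the share of allocated time a job's
processes actually spend on the cpu) are assumptions, not the output of any
real scheduler or accounting tool.

```python
def scheduler_utilization(jobs, total_cores, period_s):
    """Fraction of allocatable core-seconds occupied by jobs."""
    allocated = sum(j["cores"] * j["wall_s"] for j in jobs)
    return allocated / (total_cores * period_s)


def kernel_utilization(jobs, total_cores, period_s):
    """Fraction of core-seconds during which the cpu was actually busy."""
    busy = sum(j["cores"] * j["wall_s"] * j["busy_fraction"] for j in jobs)
    return busy / (total_cores * period_s)


# Hypothetical one-hour window on a 128-core cluster, fully allocated.
TOTAL_CORES = 128
PERIOD_S = 3600
jobs = [
    # MPI library that blocks while waiting: cpu looks idle 20% of the time.
    {"cores": 64, "wall_s": 3600, "busy_fraction": 0.8},
    # MPI library that spin-waits: cpu looks busy the whole time.
    {"cores": 64, "wall_s": 3600, "busy_fraction": 1.0},
]

sched = scheduler_utilization(jobs, TOTAL_CORES, PERIOD_S)
kern = kernel_utilization(jobs, TOTAL_CORES, PERIOD_S)
print(f"job scheduler level:    {sched:.0%}")
print(f"kernel scheduler level: {kern:.0%}")
```

in this toy case every core is allocated for the whole hour, so the job
scheduler figure is full utilization, while the kernel-level figure comes
out lower only because one job blocks instead of spinning.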

> However, some folks would want more than 90% efficiency to get happy.

I would be embarrassed to have less than 90%.  perhaps 70% would make sense
for a cluster dedicated to a small or narrowly-defined group.  I find that 
a sufficient userbase means you _always_ have something to run, whatever 
size/resource is available.


