[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Alan Louis Scheinine alscheinine at tuffmail.us
Thu Aug 14 20:32:48 PDT 2008


This thread has moved to the question of utilization,
discussed by Mark Hahn, Gus Correa and Håkon Bugge.
In my previous job most people developed code, though test runs
could run for days and use as many as 64 cores.  It was
convenient for most people to have immediate access thanks to
the excess computing capacity, whereas some people in top
management wanted maximum utilization.

I was at a parallel computing workshop where other people
described the contrast between their needs and the goals of
their computer centers.  The computer centers wanted maximum
utilization, whereas the spare capacity of the various clusters
in the labs was especially useful for the researchers.  They
could bring to bear the computational power of their informally
administered clusters for special tasks such as when a huge
block of data needed to be analyzed in near real time to see
if an experiment of limited duration was going well.

When most work involves code development, waiting for jobs in
a batch queue means that the human resources are not being
used efficiently.  Of course, maximum utilization of computer
resources is necessary for production code; I just want to
emphasize the wide range of needs.

I would like to add that maximum utilization and fast turnaround
seem to me to be contradictory goals, based on the following
reasoning.  Consider packing a truck with boxes, where the height
of each box represents the number of cores and the width
represents the execution time (leaving aside the third spatial
dimension).  To most
efficiently solve the packing problem we would like to have
all boxes visible on the loading dock before we start packing.
On the other hand, if boxes arrive a few at a time and we must
put the boxes into the truck as they arrive (low queue wait time)
then the packing will not be efficient.  Moreover, as a very
rough estimate, the size of the box defines the scale of the
problem.  Specifically, if the average running time is 4 hours,
then to have efficient "packing" the time spent waiting in a
queue must be on the order of at least 4 and more likely 8 hours
in order to have enough requests visible to be able to find
an efficient solution to the scheduling problem.
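
The packing argument above can be illustrated with a toy model.
The sketch below is only an illustration under stated assumptions,
not a description of any real scheduler: the 8-core machine, the
four 4-hour jobs, and the function names are all hypothetical, and
"pack as they arrive" is modeled as starting jobs strictly in
arrival order with no backfilling.  "All boxes visible" is modeled
as a brute-force search over every ordering, which is only feasible
for a handful of jobs.

```python
import heapq
from itertools import permutations


def makespan_in_order(jobs, total_cores):
    """Start jobs strictly in the given order; return the finish time.

    Each job is a (cores, hours) pair. A job starts as soon as enough
    cores are free, but never before the job ahead of it has started.
    """
    running = []          # min-heap of (end_time, cores) for started jobs
    free = total_cores
    now = 0.0             # start time of the most recently started job
    finish = 0.0
    for cores, hours in jobs:
        if cores > total_cores:
            raise ValueError("job needs more cores than the machine has")
        # Advance time until enough cores have been released.
        while free < cores:
            end_time, released = heapq.heappop(running)
            now = max(now, end_time)
            free += released
        heapq.heappush(running, (now + hours, cores))
        free -= cores
        finish = max(finish, now + hours)
    return finish


def utilization(jobs, total_cores, makespan):
    """Fraction of the available core-hours that did useful work."""
    work = sum(cores * hours for cores, hours in jobs)
    return work / (total_cores * makespan)


def best_order(jobs, total_cores):
    """Brute force: try every ordering, keep the shortest schedule."""
    return min(permutations(jobs),
               key=lambda order: makespan_in_order(order, total_cores))


# Hypothetical workload: (cores, hours) in order of arrival.
TOTAL_CORES = 8
arrival_order = [(6, 4.0), (6, 4.0), (2, 4.0), (2, 4.0)]

online = makespan_in_order(arrival_order, TOTAL_CORES)
planned = makespan_in_order(best_order(arrival_order, TOTAL_CORES),
                            TOTAL_CORES)

print("as they arrive:", online, "hours,",
      round(utilization(arrival_order, TOTAL_CORES, online), 2))
print("all visible:   ", planned, "hours,",
      round(utilization(arrival_order, TOTAL_CORES, planned), 2))
```

For this particular workload, starting jobs in arrival order takes
12 hours (about 67% utilization), while the best ordering pairs each
6-core job with a 2-core job and finishes in 8 hours (100%).  The
price of the better packing is that the later jobs must already be
sitting in the queue when the first one starts.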

Best regards,
Alan

-- 

  Alan Scheinine
  5010 Mancuso Lane, Apt. 621
  Baton Rouge, LA 70809

  Email: alscheinine at tuffmail.us
  Office phone: 225 578 0294
  Mobile phone USA:   225 288 4176  [+1 225 288 4176]


