Help needed please !!!

Robert G. Brown rgb at phy.duke.edu
Thu Jan 4 13:56:52 PST 2001


On Thu, 4 Jan 2001, Krishna Prasad wrote:

> Hello anyone,
>
>           I am doing my B.Tech final-year project on clustering.
> I have gone through various projects given on the Beowulf site and still I
> couldn't make out the basic idea of how to split a job into a parallelly
> executable form.  I plan to do my project on Linux.
>            What I plan to do is to split a process involving a lot of
> iterations into a parallelly executable form and execute it on the different
> nodes of the cluster.  Please could someone guide me, in simple terms, as to
> how to split the process, or where to find information about it?  Can it be
> completed within 2 months, as that is my limit?  Which language is suitable
> for this?  What program or job can be used as input for this?
>
>           Also, tell me how I could show the time taken for execution when
> a process is executed serially and in a cluster.
>           What we have is 3 nodes with a LAN connection in our college lab,
> each having a Pentium processor with Linux loaded on it.
>
> Please reply soon as my time is limited.

A tough problem, as your questions indicate that you have a lot to learn.

"Parallelly executable form" isn't automatic -- it depends on the
specific task and generally cannot be done by any language or toolset
(although there are a very few exceptions, maybe for certain common
library operations).

So your first job is simple.  Write your program to run on just ONE
processor in the language of your choice, as long as it is C or Fortran.
I personally avoid Fortran like the plague on humanity that it is, but I
don't want to bias you ;-).  Note that your project requires that you write
a serial version of the code anyway so this time is definitely not
wasted.

Second, you have to identify the parallelizable components of your task.
Hopefully it has some; however, NOT EVERY TASK DOES!  A word processor,
for example, isn't too parallelizable.  Neither is a file management
tool or anything else involving a serial bottleneck as the rate-limiting
factor.  Usually parallelizable tasks involve:

  a) A lot of work.  Why bother parallelizing small tasks?

  b) MAJOR chunks of the work that can be split into pieces that can be
done at the same time (in parallel) with or without communications
required between the distinct pieces and/or a controlling master
process.

  c) In a kind universe, those communications will be few, infrequent,
and regular.  In a perfect universe, they'll only occur when the subtask
starts up and when it finishes and will take hardly any time at all
compared to the amount of time spent on the task.  In a very bad
universe every little step of the parallel subtasks will require
communications with all the other parallel subtasks and those
communications will take a long time compared to the amount of work
done.

In order for your project to have a prayer of actually running FASTER in
parallel (which is by no means a given) on the hardware you describe,
you had better hope that your task lives in a fairly kind universe.
"Embarrassingly parallel" would be ideal, "coarse grain parallel" would
be acceptable.  The first means that you can arrange the code so that it
is run as N independent tasks that run for a long time with a master
that just farms out independent tasks (like the DES or SETI project).
The second means that there is SOME communication required between the
subtasks but it is "small" compared to the computation time of the
subtasks.  A good universe, as I said.

If you cannot arrange your code so that the task is at worst coarse
grain parallel, stop, as your project will probably run slower in
parallel than as a single serial task.

Next, figure out what FRACTION of your overall task we're talking about
here.  To be specific, if your overall task takes five minutes to
complete, and one minute of that is spent doing work that can be done in
coarse grained parallel, and you achieve perfect speedup on that one
minute of work across three systems, your job goes from completing in
five minutes to completing in four minutes and twenty seconds.  Faster,
true, but not very impressive and probably not worth the work.  On the
other hand, if it takes five minutes to complete and four and a half of
those minutes are doing perfectly parallelizable work, you might run the
job in as little as two minutes on three hosts.  Not too bad.

This is known as "Amdahl's Law" (when you finish working out the math).
If you haven't ever heard of it, then it is time to go to the library
and do some work before you even think of starting the project as it is
at the heart of parallel programming.

Finally, presuming that your task can be organized so that it spends
most of its time doing coarse grained parallel work (so you can see
decent parallel speedup when you are done on only three nodes), you pick
a parallel programming library, usually PVM or MPI, and learn how to
manage parallel tasks and interprocessor communications using the
library's function calls.  At this point it would be very useful both to
get a book on the library of your choice and to look over examples in
the distribution directories and in the books to see how to proceed.
Port your single-threaded task to parallel form.

Timing is simple -- wrap the core routine in which all the parallel
work is done in a simple timer call.  There are examples of simple
timing code in e.g. the cpu-rate code on the brahma website.

If you work hard on all of this you can learn a huge amount about
parallel programming with your project.  The above isn't intended to
discourage, rather to provide a roadmap for the learning process.

To learn more about all of this you might want to look over several of
the resources currently provided on the brahma website:

  www.phy.duke.edu/brahma

There is a draft online copy of a book on beowulfery that explains,
among other things, Amdahl's law and parallel scaling.  There are
several online "talks" on beowulf-type parallel computing and parallel
task design.  There are some benchmarking tools with timing routines you
can snitch.  One day I hope to have some simple example parallel codes
up to help out novices as well.  In the mean time,

 Good Luck!

      rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






