turbolinux.com (product: enfusion)

Erik Paulson epaulson at students.wisc.edu
Thu Aug 10 09:31:27 PDT 2000


On Thu, 10 Aug 2000, Indraneel Majumdar wrote:
> Date: Thu, 10 Aug 2000 09:33:11 -0700 (PDT)
> To: Erik Paulson <epaulson at students.wisc.edu>
> From: Indraneel Majumdar <indraneel at www.cdfd.org.in>
> Subject: Re: turbolinux.com (product: enfusion)
> 
> Hi,
> 
> The reason I asked was that TurboLinux's enfusion is supposed to do that.
>

I don't think that parametric processing is a term that has really caught
on yet.
We call it high-throughput computing....
 
>
> They say you don't need to rewrite code. So if it is embarrassingly
> parallel how is it different from pvm? (I'm a bio student with no maths
> background)

PVM is a message-passing library. Embarrassingly parallel is a class of
problems. You can use PVM to solve embarrassingly parallel problems,
but you can oftentimes get away with a simpler solution.

(Incidentally, around here we're supposed to use "naturally parallel" instead
of "embarrassingly parallel" - I guess the funding sources aren't always so
hip on anything with "embarrassingly" in the description)

> Is it that the same program is run with different initial
> parameters on different processors and then you guess an approximate
> result?

No, you run it multiple times with different parameters looking at all the
answers. 

For example, one of our users does engine simulations. He's got a
program that simulates one stroke of his engine, and it takes about an
hour to run. What he needs to know is what mixture of fuel and air gives
the best results, so he's going to experimentally try them all (I think
he's going to try like 5,000 different combinations or something).
enfusion (or in our case Condor) will run them all, and give him the answer
in roughly 5000/<number_of_processors_available> hours. This all comes with
no changes to his code, so it basically scales right up with the number of
machines available (something I'm sure a number of people on this list
wish their problems did :)
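
To make that arithmetic concrete, here is a rough sketch in Python. The
function name is made up for illustration, and it assumes every run takes
the same time (one hour, as in the engine example) with no scheduling
overhead, which a real pool of machines won't quite give you.

```python
import math

def sweep_wall_hours(n_runs, n_processors, hours_per_run=1.0):
    """Estimate wall-clock hours for n_runs independent jobs.

    Assumption: all runs take hours_per_run and there is no overhead.
    """
    if n_processors < 1:
        raise ValueError("need at least one processor")
    # Jobs go out in "waves" of n_processors; the last wave may be partly idle.
    waves = math.ceil(n_runs / n_processors)
    return waves * hours_per_run

# 5,000 one-hour runs on pools of different sizes
for procs in (1, 50, 100, 5000):
    print(procs, "processors:", sweep_wall_hours(5000, procs), "hours")
```

Past 5,000 processors the estimate stops improving, since no single run
gets any shorter than its one hour.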

> If so, then how does one ensure that the individual runs are
> shorter than the whole? You'll probably need to choose the initial
> parameters carefully for that, so how is that done? I guess you'll have
> to know the program algorithm to do that.

Yup. I would have no idea how to choose the parameters for the engine run.
(I just turn the key and go in my car)

> and I assume that every program has a different algorithm.
> Are there things like generic algorithm analysers?
>

No, and there are boring-cs-theory reasons why there aren't.
 
> 
> I'm into bioinformatics, and am trying to find out whether things like
> enfusion (which TurboLinux targets at protein modelling) might be created
> inhouse. So I need to know how it works. Can you explain or give me any
> links for detailed answers?
>

You can create a VERY basic system with {r,s}sh and perl in 20 minutes. 
It gets harder as you start adding features :) 
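
As a rough idea of what "VERY basic" means, here is a sketch in Python
rather than perl. It is not how enfusion or Condor work internally; it
just hands each command to whichever host is idle, over ssh. The host
names, the program name and its flags in the usage comment are all
hypothetical, and it assumes passwordless ssh and that the program is
already installed on every node.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

def run_over_ssh(host, command):
    """Run one command on a remote host via ssh and return its stdout."""
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def sweep(hosts, commands, runner=run_over_ssh):
    """Run every command exactly once, at most one job per host at a time."""
    free_hosts = Queue()
    for host in hosts:
        free_hosts.put(host)

    def one_job(command):
        host = free_hosts.get()       # block until some host is idle
        try:
            return runner(host, command)
        finally:
            free_hosts.put(host)      # hand the host back for the next job

    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        # map() returns results in the same order as the commands
        return list(pool.map(one_job, commands))

# Hypothetical usage (node names, program and flags are made up):
#   hosts = ["node01", "node02", "node03"]
#   cmds = [f"./engine_sim --fuel {f} --air {a}"
#           for f in range(50) for a in range(100)]
#   outputs = sweep(hosts, cmds)
```

The features this leaves out (retrying failed jobs, noticing dead
machines, moving input and output files around, sharing machines fairly)
are where it gets harder.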
 
-Erik





