Beowulf vs. MOSIX

Robert G. Brown rgb@phy.duke.edu
Tue, 8 Jun 1999 12:20:45 -0400


On Mon, 7 Jun 1999, Szwedyk, Peter wrote:

> It seems to me that for business applications, MOSIX might be a better way
> to go as a quick and easy way to take advantage of clusters.  With its load
> balancing and transparent process migration, even existing serial
> applications should be able to take advantage of the power of clusters.
> With Beowulf, on the other hand, one must parallelize the code in order to
> see any improvement in performance.
> 
> Is this assessment accurate?  Any comments?

For the appropriate class of problems, this is both true and
intelligent.  MOSIX turns a cluster into a virtual SMP machine and
brings "parallel clustering" to embarrassingly coarse-grained (basically
multiple serial) applications without any need to write parallel code or
the shell wrappers that one previously needed to manage such
applications.  It is going to be a godsend for many, many classes of
problems -- for the first time, the network really >>is<< the computer,
to borrow a really very fine line from Sun.
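
As a rough illustration of what "basically multiple serial" means, here
is a minimal sketch in Python (chosen only for brevity; nothing about
MOSIX requires it).  The job body and the name run_serial_jobs are
hypothetical stand-ins for an existing serial application and its
launcher.  The point is that the launcher contains no cluster-specific
calls at all: it just starts ordinary processes, and on a MOSIX cluster
it would be the kernel, not this script, that decides whether to migrate
them to other nodes.  Run on a single machine it simply uses the local
CPUs.

```python
import subprocess
import sys

# Hypothetical stand-in for an existing, unmodified serial application:
# it prints the sum of squares of the integers below n.
SERIAL_JOB = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def run_serial_jobs(sizes):
    """Start one ordinary OS process per input, then collect each result."""
    # All jobs are started before any result is read, so they run concurrently.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", SERIAL_JOB, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # wait for this job and read its output
        results.append(int(out.strip()))
    return results


if __name__ == "__main__":
    print(run_serial_jobs([10, 100, 1000]))
```

The jobs never talk to one another, which is exactly why this class of
work needs no parallel programming; a problem whose pieces must exchange
data while running is a different matter.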

However, there are still many other classes of problems for which MOSIX
is not the answer.  For some of these, real parallel computation is key.
For others, parallelized access to data is key, and MOSIX doesn't
necessarily eliminate a server bottleneck in the data stream.  MOSIX
will certainly offer instant and cost-beneficial gratification to many,
many organizations seeking to utilize wasted compute resources
transparently, but it is only one piece in a bigger puzzle.

I think that the ultimate compute environment in medium to large
businesses will evolve into something that has one or more "true
beowulf" cores, a large and amorphous cluster (which will include most
desktop workstations) running MOSIX as you describe, a parallelized
filesystem and server construct to provide load-balanced, parallelized
access to a large data warehouse, and tools to facilitate using all of
these various components transparently (with MOSIX being just one of
those tools).  A user might run a set of single-threaded accounting
processes that are distributed by MOSIX but get their data in parallel
with other applications accessing the same data space.  Another
user might run a complex SQL command to build a dataset, with parts of
the command run (transparently) in parallel.  Still another might be
building a presentation that involves complex rendering and
visualization of data landscapes, where the data is accessed in parallel
from the parallelized filesystem, processed and rendered in parallel on
a beowulf core, and displayed on a particular workstation (or even a
collection of distributed workstations), again totally transparently.

     rgb

> 
> ---
> Peter Szwedyk
> Goldman, Sachs & Co.
> Securities Lending Technology
> One New York Plaza, 48th Floor
> New York, NY  10004
> Phone: 212-357-8105 | Fax: 212-428-1405
> 
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@phy.duke.edu