IBM goes grid

Jakob Østergaard jakob at unthought.net
Sat Aug 4 13:20:39 PDT 2001


On Thu, Aug 02, 2001 at 12:14:46PM -0400, Greg Lindahl wrote:
> On Thu, Aug 02, 2001 at 11:17:54AM -0400, Robert G. Brown wrote:
> 
> > It is interesting to think about how one would make a beowulf (in the
> > softer definition of a named cluster, not necessarily constrained to but
> > definitely including the strict/scyld/dedicated sense) a member of a
> > distributed "super" cluster with existing tools and minimal work.  Even
> > doing it inside our own department is going to be challenging.
> > 
> > Seems like a useful project for some grad students in CS departments
> > that concentrate on cluster computing, e.g. Clemson.  Walt?  Anybody?
> 
> Legion, Globus, and Condor all did this ages ago, and others
> too. What do you think that's new that can be done?

Here are a few:

Automatic on-the-fly parallelization so that problems adapt to the current
state of the cluster/grid.

Sand-boxed execution of code so that you can make your resources available
to a broad range of outsiders without worrying too much.

With that come security domains and code/data labelling, so that when you
submit problems that read "secret" data, those computations will not migrate
to "untrusted" sites  -  and problems originating from "strangers" will be
given priority below your "local" jobs.
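
The labelling rule above could be sketched roughly as follows. This is only an
illustration under assumptions: the names Site, Job, TRUST, eligible_sites and
schedule_order are hypothetical and do not come from any real grid toolkit, and
"secret" data is assumed to be allowed only on fully trusted ("local") sites.

```python
from dataclasses import dataclass

# Hypothetical trust levels; a real system would define its own domains.
TRUST = {"untrusted": 0, "partner": 1, "local": 2}


@dataclass(frozen=True)
class Site:
    name: str
    trust: str  # key into TRUST


@dataclass(frozen=True)
class Job:
    name: str
    origin: str      # "local" or "stranger"
    data_label: str  # "public" or "secret"


def eligible_sites(job, sites):
    """Sites the job may migrate to: secret data stays on fully trusted sites."""
    if job.data_label == "secret":
        return [s for s in sites if TRUST[s.trust] == TRUST["local"]]
    return list(sites)


def priority(job):
    """Lower number runs first; local jobs outrank jobs from strangers."""
    return 0 if job.origin == "local" else 1


def schedule_order(jobs):
    """Stable sort, so submission order is kept within each priority class."""
    return sorted(jobs, key=priority)
```

The point of the sketch is that placement and priority are derived from labels
attached to the job and the site, not from anything the submitting user has to
decide per run.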

When running on large clusters (or, the grid) you also want transparent error
recovery (where possible).  A single point of failure is not acceptable for
100,000+ nodes.  It seems that autonomous systems will save us from the single
points of both failure and contention.

Naturally, when running jobs over a network, things like speculative execution
and speculative data movement are interesting too - but the system must decide
when to do this, not the user.  For, as with parallelization, the user does not
know at code-time what may happen at run-time.
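
As a toy illustration of speculative execution, the sketch below runs the same
task on several simulated nodes and takes whichever copy finishes first. It is
an assumption-laden stand-in, not a grid implementation: threads play the role
of nodes, the delays list plays the role of per-node latency, the function name
run_speculatively is hypothetical, and the task is assumed to be free of side
effects so that redundant copies are harmless.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_speculatively(task, arg, delays):
    """Run the same task once per simulated node and return the first result.

    `delays` stands in for per-node latency; in a real system the scheduler,
    not the user, would decide how many copies to launch and where.
    """
    def on_node(delay):
        time.sleep(delay)  # simulated slowness of this node
        return task(arg)

    pool = ThreadPoolExecutor(max_workers=len(delays))
    futures = [pool.submit(on_node, d) for d in delays]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower copies
    return next(iter(done)).result()
```

For example, run_speculatively(lambda x: x * x, 7, [0.2, 0.01]) returns as soon
as the faster simulated node has produced its answer; the slower copy is simply
ignored.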

I will have a report (well, my thesis actually) and a partial implementation
ready in about a month.

-- 
................................................................
:   jakob at unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:



