IBM goes grid

Robert G. Brown rgb at phy.duke.edu
Thu Aug 2 08:17:54 PDT 2001


On Thu, 2 Aug 2001, Eugene Leitl wrote:

>
> Buzzword content: high.
>
> -- Eugen* Leitl <a href="http://www.lrz.de/~ui22204/">leitl</a>
> ______________________________________________________________
> ICBMTO  : N48 10'07'' E011 33'53'' http://www.lrz.de/~ui22204
> 57F9CFD3: ED90 0433 EB74 E4A9 537F CFF5 86E7 629B 57F9 CFD3
>
> http://www.nytimes.com/2001/08/02/technology/02BLUE.html?pagewanted=print
>
> AUG 02, 2001
>
> I.B.M. Making a Commitment to
> Next Phase of the Internet
>
> By STEVE LOHR
>
> I.B.M. is announcing today a new initiative to
> support and exploit a technology known as grid
> computing, which the company and much of the
> computer research community say is the next
> evolutionary step in the development of the Internet.

Grid computing is certainly interesting and an idea I've kicked around
for implementation on e.g. Duke's campus, where I've for years been a
well-known bottom feeder seeking cycles.  It isn't really off topic at
all for the beowulf list, since beowulfs (yes, plural -- Duke will have
something like 16-20 distinct clusters of various sorts by the end of
this year -- physics alone will have at least four, EE three, statistics
has its first (but not its last), chemistry has one or two, and even the
medical center is getting in on the act) are a primary source for
unharvested cycles.

A group might well get a cluster for its own research but only keep it
busy 60% of the time, with the rest dedicated to digesting results,
writing papers, teaching classes, writing proposals, writing new code to
start a new research cycle.  Even assuming a smallish cluster (aggregate
float rate of perhaps 5-10 GFLOPS) there could be roughly 10^17 wasted
(potential) float operations a year on such a cluster.  The overall
campus might produce 10^18 or more idle float operations for harvest by
bottom feeders (embarrassingly parallel Monte Carlo jobs like mine, for
example:-) distributed by a suitable grid program.
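
For anyone who wants to check the arithmetic, here is a minimal
back-of-envelope sketch.  The 5-10 GFLOPS rate, the 60% busy fraction and
the 16-20 cluster count come from the text above; the function name, and the
assumption that every campus cluster looks like this "smallish" one, are
purely illustrative.

```python
# Back-of-envelope estimate of float operations left idle per year.
SECONDS_PER_YEAR = 365 * 24 * 3600


def idle_flop_per_year(gflops, busy_fraction):
    """Float operations a cluster could have done in a year but did not.

    gflops        -- aggregate float rate of the cluster, in GFLOPS
    busy_fraction -- fraction of the year the cluster is actually busy
    """
    return gflops * 1e9 * (1.0 - busy_fraction) * SECONDS_PER_YEAR


# One smallish cluster, busy 60% of the time.
low = idle_flop_per_year(5, 0.60)
high = idle_flop_per_year(10, 0.60)

# Assumption (not from the post): all 16-20 campus clusters are similar.
campus_low = 16 * low
campus_high = 20 * high

print(f"per cluster: {low:.1e} to {high:.1e} float operations/year")
print(f"campus:      {campus_low:.1e} to {campus_high:.1e} float operations/year")
```

This gives roughly 6e16 to 1.3e17 per cluster and 1e18 to 2.5e18 campus-wide,
in line with the 10^17 and 10^18 figures quoted above.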

There are a number of problems to solve in implementing this idea, of
course.

The first one that comes to mind, overwhelmingly, is security.  Duke is a
very security-conscious institution.  This is for good reason -- failures
in general campus network/systems integrity cost an immense amount of
human labor, personal agony, and real money every year. (He says while
gritting his teeth and deleting perhaps the 50th SirCam-generated
message from his mailbox this morning. It has been really interesting to
see the virus sweeping from Korea into the US then over to Italy and
Russia in a perverse sort of way.  It has also been disappointing to see
that my name is in the addressbook of so many obvious Windoze users,
likely from this very list -- hey folks, Linux is IMMUNE to crap like
this except for the annoyance factor;-)

Even if we assume the invention of something like a gridd daemon that
can build a true virtual space sandbox on a system (not impossible,
although a daunting task, really) with suitable task scheduling,
priority, migration, and authentication components (preventing, we'll
assume, any possible crack-one-box-you've-cracked-them-all propagation
of Evil across LAN/domain boundaries and finding a crowd of systems
administrators and users outside your door some morning carrying
pitchforks and torches), one runs significant risks.  Really significant
risks.

Having been blessed with the opportunity to present cluster computing to
a group of high end law enforcement officers (e.g. FBI and various SBI's
plus even some international cops) and listen in turn to what they are
dealing with in cybercrime, I know for sure that child pornography is a
top priority with them.  I also know it is like a nuclear device or a
bag of cocaine -- there is literally no way to handle it that isn't a
felony if you aren't a law enforcement officer.  That is, do NOT pick up
that baggie and carry it to a policeman -- call the police without
handling it.  True, a sane DA won't prosecute you.  But they might.
They certainly could if they wanted to.

If ANYBODY successfully puts e.g. kiddieporn encryption and delivery
systems on the grid, they make instant felons out of all grid members.
Even if the FBI and attorney general (sanely) choose not to prosecute
anybody but the primary perpetrator of the act if/when they catch on,
the possibilities for a truly awe-inspiring civil suit (where all issues
of sanity go out the window in the face of immense lucre) are endless.

And this is just one scenario.  I'm sure that there are dozens of
others.  Then there is my natural skepticism about building a sandbox
sufficiently isolated to prevent chain reaction cracks or virus
propagation.  Imagine the absolute chaos if anybody ever manages to
insert cross-architecture evil code into the binaries distributed by
SETI.  It would bring the entire concept of grid computing to a
shuddering halt, not to mention bringing out the angry mob wielding
instruments intended to cause pain and suffering.

Still, viewing a university domain as being something of a sandbox to
start with (suitable outer-perimeter port blocks, moderate trust and
accountability between departments and university entities), recovering a
few billion-billion floats a year is an appealing and tempting prospect.
Especially if the tool is safe enough to be placed on e.g. general
purpose workstation clusters (Duke has several hundred systems like this
all over campus) which often sit de facto idle for 95% of the time
measured in average CPU consumption.  With reliable scheduling and
priority controls (probably managed by a central resource daemon) this
is a virtual resource worth literally hundreds of thousands of dollars.
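
As a rough illustration of what the per-node side of such scheduling and
priority controls might look like, here is a minimal sketch.  Everything in
it is hypothetical: the function names, the 0.05 load-per-CPU threshold
(picked to echo the 95% idle figure above) and the idea of using the Unix
load average as the idleness signal are assumptions, not a description of
any existing grid tool.  The central resource daemon itself is not shown.

```python
import os


def node_is_harvestable(load_1min, n_cpus, max_load_per_cpu=0.05):
    """Pure policy check: is the node idle enough to accept guest work?

    The threshold is a hypothetical knob; a real scheduler would also
    want to consider console activity, memory pressure, and so on.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return (load_1min / n_cpus) <= max_load_per_cpu


def this_node_is_harvestable(max_load_per_cpu=0.05):
    """Apply the policy to the local machine (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return node_is_harvestable(load_1min, os.cpu_count() or 1,
                               max_load_per_cpu)


def lower_priority_for_guest_work():
    """Raise this process's niceness so the owner's jobs win the CPU.

    os.nice() adds the increment to the current niceness (Unix only)
    and returns the new value.
    """
    return os.nice(19)


# Policy check with made-up load figures, no system calls involved.
print(node_is_harvestable(0.02, 1))   # nearly idle workstation
print(node_is_harvestable(0.80, 2))   # owner is clearly using it
```

A guest job would call lower_priority_for_guest_work() once before starting
its computation and back off whenever this_node_is_harvestable() turns
false.  None of this addresses the sandboxing and authentication problems
discussed above.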

It is interesting to think about how one would make a beowulf (in the
softer definition of a named cluster, not necessarily constrained to but
definitely including the strict/scyld/dedicated sense) a member of a
distributed "super" cluster with existing tools and minimal work.  Even
doing it inside our own department is going to be challenging.

Seems like a useful project for some grad students in CS departments
that concentrate on cluster computing, e.g. Clemson.  Walt?  Anybody?

    rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






