Who runs large capability jobs

Greg Lindahl glindahl at hpti.com
Thu Jul 27 07:33:08 PDT 2000


> I agree there is a very large set of machines out there.
> What I said was that most jobs run in parallel are not on
> systems of that size. Practically no one but the government
> owns a system that size, so by default the rest of us are
> running on smaller systems.

Sure. You said that in reply to my pithy comment, but it actually has
nothing to do with my pithy comment. Fine. Let's start a new, different
discussion.

There are people who run large, and there are many more people who want to
run large. The systems at the top of the Top500 list are examples. The
ASCI guys *frequently* use their entire machines on a single job. Another
example would be the site labeled NAVOCEANO. Alan Wallcraft, a single user,
uses a majority of their compute capacity, and he runs on entire machines
whenever he can, because he's trying to run 1/16 degree and 1/32 degree
global ocean models. These are the people who do it today.

The people who want to do it today are all the folks working on grand
challenges. The NSF, for example, moved from a model where lots of
researchers got little slices of their supercomputer time to a model where
the little guys stay home and big projects get big pieces of the machine.
The NSF Terascale procurement, results to be announced soon, is a 5 TF
(peak) machine, and it's going to be used that way, too. So the general US
researcher is going to have access to that kind of resource. And the top
contenders for the bid? IBM SP2 and Compaq SC, both non-commodity clusters.
And NCSA bid a commodity cluster of some kind.

Commercial folks are headed in that direction. You can decide for yourself
whether an embarrassingly parallel site like the 1,000 cpu genetic
algorithm site is running "a single job", or whether Google's cluster is
running "a single job". But I assure you that the protein folding guys are headed to
1,000+ cpu runs. And they have hundreds of millions of dollars to spend. On
computers.

> The large systems are an exception,
> not the rule.

Right. Whatever. I don't care what the rule is. I do what I do, you do what
you do, we compare notes on this mailing list so we both can learn. Arguing
or discussing what's typical is about as interesting as the periodic
comp.arch flamewar "Who cares about anything but x86, only x86 is
important"... "who cares about x86, microcontrollers ship in 10 times the
volume"... "well, I want to talk about Alpha, because it's best for my
app"...

-- g




