new SGI Origin & Onyx 3k?

Robert G. Brown rgb at phy.duke.edu
Wed Jul 26 21:56:16 PDT 2000


On Wed, 26 Jul 2000, Kragen Sitaker wrote:

> "W Bauske" bloviates:
> > Exactly how often do you think people want to run 1000 nodes on
> > a single problem? When was the last time you did that?
> > 
> > Realistically, the majority of parallel jobs are in range 4-128 
> > processors, depending on what you want to do and how the code scales. 
> > SGI's do fine in that space. So do other non-beowulf systems or
> > no one with sense would buy them.
> 
>    "I think there is a world market for maybe five computers." 
>        - Thomas Watson, Chairman of IBM, 1943
> 
> Presumably people were getting by with tabulating machines and desk
> calculators when he made this remark.
> 
> He probably would have been right.  Except that computers vastly
> increased the market for calculations by decreasing their cost.
> 
> Numerically, the majority of parallel jobs are probably "in the range
> of" 2 processors, and probably always will be.  That doesn't make
> 4-processor machines or 1000-processor machines useless.

I'm not sure that I'd make >>that<< categorical a statement.  Remember,
most single-CPU computers already use 2-4 processors in parallel -- CPU,
video, controllers... a lot of this is just hidden, firmware-based
parallelism.  We are also at a relatively primitive stage of parallel
hardware and software development.  There have been hardware development
efforts described on this list in the past that might drop the cost of
"a processor" to put in a system to $5 or $10, so that the "power" of a
future system is completely derived from multiprocessor parallelism
instead of clock-speed increases, in a beowulfish architecture.  In the
future, upgrading a computer may be snapping another "node" into a slot
on a commodity bus/switch with a master node handling I/O and primary
interface.

I also don't have any idea what "the majority of parallel jobs" will
consist of in the future.  Of course, I'm not even sure what it is
now;-)

For example, let us all not forget that even now there are whole LARGE
categories of code/problems that scale "to infinity and beyond".  Well,
not REALLY infinity, but anything in the embarrassingly parallel (EP)
category (e.g.  SETI, RC5, a heck of a lot of importance-sampling Monte
Carlo, rendering, image processing, etc.) can be scaled as high as you
like, sometimes using a floppy disk and a pair of rollerblades as a COTS
IPC/job management channel.  I'd cheerfully use 1000 systems if I could
afford them (including their care and feeding) to do my Monte Carlo.
I'd just get work done in 1/10 the time it would take me using 100
machines and 1/100 the time it would take me using 10 -- scaling is no
problem even with 10BT networks.  The only thing that limits my scaling
now is the tools I use to manage runs, and I'd cheerfully modify them to
work on 1000 nodes if somebody wants to give me a couple or three $M for
the hardware and the nontrivial infrastructure...;-)
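
To make the arithmetic concrete, here is a toy sketch -- in Python, with
made-up numbers, and emphatically NOT my real run-management tools -- of
why EP work like this scales so trivially: every node grinds through its
own independent chunk of samples with its own seed, nobody talks to
anybody until the final tally, and so the wall-clock time just drops as
1/N.

# Toy EP Monte Carlo: estimate pi by dart-throwing, with the samples
# chopped into one independent, separately seeded chunk per node.
import random

TOTAL_SAMPLES = 10**6        # made-up workload
NODES = 100                  # however many boxes somebody pays for

def node_job(node_id, samples):
    """The work one node would do, with no IPC whatsoever."""
    rng = random.Random(node_id)        # independent stream per node
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:
            hits += 1
    return hits

# On a real cluster each call below lands on a different node; the only
# "communication" is shipping one integer per node back at the very end.
per_node = TOTAL_SAMPLES // NODES
hits = sum(node_job(i, per_node) for i in range(NODES))
print("pi is roughly", 4.0 * hits / (per_node * NODES))

The "scheduler" for a job like this can be rsh in a shell loop, a batch
queue, or the aforementioned floppy disk and rollerblades.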

There are far more 1000-node problems (real or potential) than have been
mentioned on the list.  There is at least one 1000-node GA system (at
Stanford, IIRC), and I could scale my own GAs up that far too and have
tremendous fun if somebody pops for the boxes.  Der Ubercracker would
love to have 1000 nodes or even more, as many encryption problems are
just a matter of how fast you can work through the keyspace and can be
run EP.  Rendering (making movies like Toy Story, Dinosaur, etc.) can
burn 1000 or 10,000 nodes if somebody will pay for the suckers.  More
nodes = faster development time = more money.  Indeed, rendering alone
might one day be THE cycle consumer on a desktop workstation -- in that
future day when our operating systems and GUIs present an entire 2D/3D
rendered view in a virtual space, with tools that make authoring an
animated rendered GUI a straightforward matter.  That "desktop"
rendering might well occur on a whole array of cheap parallel processors
rather than a single expensive one, interconnected with a COTS network
(built into the system) and running a descendant of bproc -- a desktop
beowulf.
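
To go back to the keyspace example for a second, a brute-force search is
EP in an even purer form: chop the keyspace into N disjoint ranges, hand
one range to each node, and let them all grind away with no communication
whatsoever until somebody gets lucky.  A toy sketch (again in Python,
with a laughably small made-up keyspace, and obviously nothing to do with
any real cracking code):

# Toy EP keyspace search: N nodes each sweep a disjoint slice of the
# keyspace, with no IPC until a hit is reported.
KEYSPACE = 2**20             # made-up, laughably small keyspace
NODES = 8
SECRET = 777777              # pretend this is the key we are after

def node_search(lo, hi):
    """Try every candidate key in [lo, hi); return the first hit."""
    for key in range(lo, hi):
        if key == SECRET:    # stand-in for the real decrypt-and-check
            return key
    return None

chunk = KEYSPACE // NODES
for node in range(NODES):    # on a real cluster, one job per node
    found = node_search(node * chunk, (node + 1) * chunk)
    if found is not None:
        print("node", node, "found the key:", found)
        break

More nodes just means more slices swept per unit time, which is exactly
why Der Ubercracker wants a thousand of them.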

So there are (at least) TWO categories of beowulf user that are
interested in very large beowulfish systems.  One is the class that is
using beowulf-type architectures for HPC problems, like those that Greg
has referred to.  These users have LARGE problems with some nontrivial
granularity and require high speed processors and high end IPC channels
to profitably scale out as far as possible.  They buy/build systems that
are inexpensive compared to dedicated boxes (usually) but are still
quite costly compared to the "standard beowulf design" of the
Intel-du-jour PC node plus switched 100BT.  Their design decisions are
dominated by performance and scaling, not cost per se, apart from the
factor of ten or so cost advantage relative to dedicated iron.

The other class is those folks doing relatively low-tech EP work, who
have always been able to scale out to fill their available cluster
resources.  There are likely a lot more of the latter than the former,
although their problems are not necessarily as sexy and don't get as
much grant money to support them because they are "too easy" (forgetting the
fact that they may be easy computationally but certainly aren't
necessarily "done").  These users often get by with systems that are
cheap even as beowulfs go on a cost/node basis -- stripped Celeron nodes
on cheap hubs or switches.  Their design decisions are dominated by
economics -- how can I get the most work done for my inevitably limited
budget?

In between are the rest of the beowulf users, who have a wide range of
possible constraints that limit the size of the beowulf they can use,
usually with respect to their fixed budget but not always.  In a lot of
cases their problems may well scale only to <O(100) nodes, at least
given their budgets.  Still, if you gave those same folks 1000 nodes and
a team of programmers, postdocs, and grad students to help out, I'll bet
a whole lot of them would find some way of running 10 independent EP
threads of O(100)-node calculations.  Or something... the resource
would be used.

To conclude, while many people are indeed limited in the size of the
beowulf they can use by the scaling characteristics of their problem and
hardware, there are ALSO many, many people who are limited primarily by
the size of their pocketbook, at least until their budgets are increased
by orders of magnitude over where they are now and they approach their
problem's real scaling limits.  Like me. 

(I'd pass a hat around to collect just a wee bit from everybody if only
I could:-).

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






