Beowulf in a Box

Douglas Eadline deadline@plogic.com
Mon, 28 Sep 1998 11:17:48 -0400


On Sat, 26 Sep 1998, Kragen wrote:

> On Sat, 26 Sep 1998, Kragen wrote:
> > http://www.pobox.com/~kragen/sa-beowulf/
--snip--
> 
> I'm interested to hear other people's comments.  William Rankin
> commented that such things have never gone anywhere, and I'm curious to
> find out why.

While I encourage such efforts, I have my reservations as to
the economic success of such an approach.  Let me explain a bit.
Please consider me the "devil's advocate" so that I can make your idea
stronger by providing my reservations based on my experience.

In the early days, there were these things called INMOS Transputers.
I am sure there are many research labs still using these.

The Transputer was nice from many standpoints: hardware context
switching, low-latency hardware-based communications, small size,
and good performance.  Although the T series had no memory protection
mechanism, there was no need for an OS on each node.  There was
quite an effort to write software (although INMOS did its best
to prevent this by insisting that we should all use OCCAM).
A C compiler was produced that included a nice
message passing interface.  There was even a virtual
channel library.


So what happened? Well, in my opinion, INMOS could not keep
up with the other CPUs of the time (although when the
Transputer was first introduced it was very fast).
There were many "turn key" systems based on the Transputer.
Failed promises were the first problem of the Transputer,
but I think its ultimate demise (other than as an embedded CPU)
came from the lack of acceptance of "niche hardware" by the
mainstream.  Sure, the embedded system guys love this kind of
stuff, but I believe it is a tough sell to get someone
to use niche hardware for the following reasons:

1) single source hardware (possible overnight obsolescence)
2) support comes from a single vendor and is limited (manpower
   pool is very limited)
3) because of the single source nature an organization must
   make a large investment (of time and money) - this is the biggest
   problem.   

Contrast this with a complete commodity solution, where
a component has more than one source and is guaranteed
to track new technology.  A growing base of software support
is available.  Systems can be recycled, etc.

There is a large amount of comfort knowing that you are not
relying on a single person, company, or product to run your machine.  
Many people in the "high end" market have spent quite a bit of money
and time on systems only to have them become orphans because
of some external factor (a company decides to leave the market).
This has left a bad taste in the mouth of many people.  I
have overheard conversations like "never again will we invest in
some technology that is going to end up a dead-end".

Keep in mind that the actual cost of adopting a new
technology is usually more than the cost of "the new technology"
(i.e., more than just $/MIPS).  I believe that one of the
reasons Beowulf/Clusters are very popular is that they
are "plug and play" replacements (from a software
standpoint) for much more expensive machines, and therefore the cost to
adopt clusters is small.  If you can deliver "plug and play"
to a market segment, then $/MIPS is a good sell.
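
To put the adoption-cost point in concrete terms, here is a minimal
sketch.  Every figure in it is invented purely for illustration, and
the split into hardware, porting, and training costs is an assumed
breakdown, not data from any real system.  It only shows how a machine
that wins on raw $/MIPS can lose once the cost to adopt it is counted.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    hardware_cost: float   # purchase price in dollars (hypothetical)
    mips: float            # delivered performance (hypothetical)
    porting_cost: float    # one-time software porting effort in dollars
    training_cost: float   # cost of educating users and staff in dollars

    def dollars_per_mips(self) -> float:
        """Raw hardware price divided by performance."""
        return self.hardware_cost / self.mips

    def adoption_cost_per_mips(self) -> float:
        """Hardware plus porting plus training, divided by performance."""
        total = self.hardware_cost + self.porting_cost + self.training_cost
        return total / self.mips


# All figures below are made up for illustration only.
niche = Option("niche hardware", 50_000, 10_000, 80_000, 20_000)
commodity = Option("commodity cluster", 100_000, 10_000, 5_000, 0)

for opt in (niche, commodity):
    print(f"{opt.name}: {opt.dollars_per_mips():.2f} $/MIPS raw, "
          f"{opt.adoption_cost_per_mips():.2f} $/MIPS to adopt")
```

With these assumed numbers the niche option is half the raw $/MIPS of
the commodity one, yet costs more per MIPS once porting and training
are included.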

To say Intel is behind ARM helps a bit, but not much.  Intel
killed their own children (i860/960) and closed their
supercomputer shop (except for custom machines).

Like it or not, the "PC" is a known concept; people are more
comfortable with things they know than with better things
they do not know.  Of course this all goes back
to the job of a salesman to "make them comfortable",
which further increases the cost of the "better gizmo"
because now you need to fund an education process.

Finally, my guess is that the amount of work/cost involved to
bring this idea to market is rather large.  It sounds as though
there is still some hardware/software to be developed.  Performance
is uncertain, and there are lots of assumptions.  Is there a business plan?
By the time all this gets worked out, a lot may change.

Now I do consider that if the hardware can provide good
performance with software compatibility (MPI, PVM) then
you would have a good "accelerator" product.  Positioned
on "top of" a commodity platform (which of course it will 
need to be) it may work out well.

BTW, there are many applications that do not require FP.
We have some tools that can take hundreds of CPUs and
make them do amazing things.  Most of these are database/datamining
applications.  The one thing I have found, however, is that
a clean, simple design is best for using lots of CPUs efficiently,
i.e., one where the time it takes for any CPU to talk to any other CPU
is about the same for all CPUs.
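
As a rough sketch of what "about the same for all CPUs" means, the
snippet below compares the spread of pairwise latency in two
hypothetical layouts: a linear chain, where a message hops through
every node in between, and a single flat switch, where every pair is
one hop apart.  The hop models, the function names, and the 10
microsecond per-hop figure are all assumptions made for illustration,
not measurements of any real hardware.

```python
from itertools import combinations


def chain_hops(a: int, b: int) -> int:
    """Hops between nodes a and b when nodes are wired in a line."""
    return abs(a - b)


def switch_hops(a: int, b: int) -> int:
    """Hops between any two distinct nodes on one flat switch."""
    return 1


def latency_spread(n_cpus, hops, per_hop_us=10.0):
    """Return (min, max) pairwise latency in microseconds.

    per_hop_us is an assumed, made-up per-hop cost.
    """
    latencies = [hops(a, b) * per_hop_us
                 for a, b in combinations(range(n_cpus), 2)]
    return min(latencies), max(latencies)


for label, hops in (("chain", chain_hops), ("flat switch", switch_hops)):
    lo, hi = latency_spread(8, hops)
    print(f"{label}: min {lo:.0f} us, max {hi:.0f} us")
```

Under this toy model the chain has a sevenfold gap between its nearest
and farthest pairs of 8 CPUs, while the flat switch has none, which is
the uniform case described above.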


Well, there are some things to consider.  I tried to give
some objective experience that may help you "fine tune" your strategy.
It is not my goal to say "this will not work" (because I have no
idea if it will or not), but rather to "push your idea a little".


Good Luck,


Doug Eadline



-------------------------------------------------------------------
Paralogic, Inc.           |     PEAK     |      Voice:+610.861.6960
115 Research Drive        |   PARALLEL   |        Fax:+610.861.8247
Bethlehem, PA 18017 USA   |  PERFORMANCE |    http://www.plogic.com
-------------------------------------------------------------------