Beowulf in a Box

Simon Thorpe thorpe@cerco.ups-tlse.fr
Sun, 27 Sep 1998 06:00:46 -0400


>On Sat, 26 Sep 1998, Kragen wrote:
>> http://www.pobox.com/~kragen/sa-beowulf/
>
>This is what I was hinting at in various conversations over the last
>few weeks.  It is almost a solution for that guy who wanted an SBC
>Beowulf -- it could probably be easily adapted for that sort of
>environment.

As one of the people who's been pushing for the MultiCPU PCI board idea, I
must say that I also think that these boards could potentially be a good
choice for someone wanting to build a compact (and cost-efficient) Beowulf
system.

>It looks like it's about an order of magnitude better on
>price-performance -- for integer stuff -- than the traditional Beowulf
>approach.  For $14,000, you could get five six-processor boards and a
>PC with five open PCI slots and have a 30-233-MHz-processor Beowulf
>with a gigabit backbone.

Actually, the plans have since evolved, and it will now be possible to put 8
CPU modules on a single PCI board. That means up to 40 CPUs in a single PC
with five boards.

One feature that I think is worth stressing is the total memory bandwidth
of such a system. Since each processor has its own private 66 MHz 32 bit
SDRAM memory bus, each board will have a peak memory bandwidth of
8 x 66 MHz x 4 bytes = about 2.1 Gigabytes/second. As Kragen pointed out,
you can put 5 boards in some PCs, making about 10.5 Gigabytes/second, and if
you put lots of boards in a PCI backplane, the sky's the limit...
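
To make that arithmetic concrete, here is a minimal Python sketch. It
assumes one 4-byte transfer per clock on each CPU's private bus, so the
numbers are theoretical peaks rather than measurements; the function name
and its parameters are purely illustrative.

```python
# Peak (theoretical) memory bandwidth, assuming one 32-bit (4-byte)
# transfer per 66 MHz clock on each CPU's private SDRAM bus.
# The function name and defaults are illustrative only.

def peak_memory_bandwidth_mb(cpus_per_board=8, bus_mhz=66, bus_bytes=4, boards=1):
    """Return aggregate peak memory bandwidth in MB/s (10^6 bytes/s)."""
    return cpus_per_board * bus_mhz * bus_bytes * boards

per_board = peak_memory_bandwidth_mb()
five_boards = peak_memory_bandwidth_mb(boards=5)

print(f"one board:   {per_board / 1000:.2f} GB/s")
print(f"five boards: {five_boards / 1000:.2f} GB/s")
```

The figure scales linearly with the number of boards only because each
CPU's memory bus is private; nothing here accounts for contention on the
shared PCI bus.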

Furthermore, the scope for improving interprocessor communication is great.
While the initial boards will use the host's 33 MHz 32 bit PCI bus for
interboard communications, the design of the board means that the 8
processors on the board will be attached to two separate PCI buses, which
will be able to run independently of the host's PCI bus. That means that
total interprocessor bandwidth for a 5 board system would be 5 x 2 x 132
Mbytes/second = 1320 Mbytes/second, or just over 10 gigabits/second
(although to achieve this, you'd have to distribute the tasks between the
different processors in a very intelligent way, such that the CPUs that
have to communicate the most are always on the same local PCI bus).
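
The same sum as a short Python sketch, assuming each local bus is a
33 MHz 32 bit PCI bus with a nominal peak of 132 Mbytes/second and that all
traffic stays on the local buses (the best case described above). The names
are illustrative, not part of any real API.

```python
# Aggregate peak bandwidth of the on-board local PCI buses.
# Assumes 132 MB/s per bus (33 MHz x 4 bytes) and purely local traffic.
PCI_MB_PER_S = 33 * 4  # nominal peak of a 33 MHz, 32-bit PCI bus

def local_pci_bandwidth(boards=5, buses_per_board=2, bus_mb_per_s=PCI_MB_PER_S):
    """Return (total MB/s, total Gbit/s) across all local PCI buses."""
    total_mb = boards * buses_per_board * bus_mb_per_s
    total_gbit = total_mb * 8 / 1000
    return total_mb, total_gbit

total_mb, total_gbit = local_pci_bandwidth()
print(f"{total_mb} MB/s = {total_gbit:.2f} Gbit/s")
```

Any traffic that has to cross between boards still goes over the host's
single PCI bus, so real throughput depends heavily on how the tasks are
placed.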

In addition, there are indeed plans to produce base boards with 66 MHz or
64 bit PCI bus interfaces (though not yet, I should stress...)

>I'm interested to hear other people's comments.  William Rankin
>commented that such things have never gone anywhere, and I'm curious to
>find out why.

My feeling is that the main reason that previous specialist hardware systems
never really took off is that the programming environment was typically too
complex and idiosyncratic, and keeping up with microprocessor development
was too difficult. But it seems to me that now, with development
environments like Beowulf and Extreme Linux becoming widespread, this sort
of approach could well be considerably more viable.

Course, maybe I'm biased ;-)

Simon