[Beowulf] best architecture / tradeoffs
skeith at deterministicnetworks.com
Fri Aug 26 08:56:02 PDT 2005
Hello everyone. I am Seth Keith, and this is my first posting.
I am new to the distributed computing thing, but despite this I find
myself constructing a Beowulf system. I have put together a few
different types and experimented and read enough to realize it is time
to solicit outside opinions. I really hope to get some good advice. If
you have the time please help me out...
My requirements are easy, I think, since my program is already broken up
into a lot of different programs communicating via STDIN/OUT. I
benchmarked and found my problem is CPU intensive. The overall data
transfer is small, but all the different parts need to be assembled
before the final pass on the data. The final pass cannot be broken up,
but the final pass is fast. So my model is input data -> break up into
N workers -> assemble results -> process -> done.
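The model above can be sketched roughly as follows. This is only an
illustration in Python, not the actual perl code: the worker command, the
split() helper, the final_pass() function and the numeric job data are all
made up for the example. Each worker is a separate process that reads its
share of the input on STDIN and writes results on STDOUT.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one worker program: reads integers on STDIN and
# writes their squares on STDOUT. The real workers are separate perl programs.
WORKER_CMD = [sys.executable, "-c",
              "import sys\nfor line in sys.stdin:\n    print(int(line) ** 2)"]

def split(items, n):
    """Break the input into n roughly equal chunks (hypothetical helper)."""
    return [items[i::n] for i in range(n)]

def run_worker(chunk):
    """Feed one chunk to a worker over STDIN and collect its STDOUT."""
    payload = "".join(f"{x}\n" for x in chunk)
    done = subprocess.run(WORKER_CMD, input=payload,
                          capture_output=True, text=True, check=True)
    return [int(line) for line in done.stdout.split()]

def final_pass(values):
    """The fast, non-parallel last step; here it is just a sum."""
    return sum(values)

def run_pipeline(data, n_workers):
    chunks = split(data, n_workers)
    # One thread per worker process so the workers run at the same time.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(run_worker, chunks))
    # Assemble all partial results before the final pass.
    assembled = [y for part in partials for y in part]
    return final_pass(assembled)

if __name__ == "__main__":
    print(run_pipeline(list(range(1, 11)), 3))  # prints 385
```

Here the workers all run on the local machine. On a cluster, WORKER_CMD
would presumably be replaced by whatever starts the program on a remote
node.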
I need advice on a few of the tradeoffs:
1) diskless vs disk
I am thinking diskless is better. I don't worry about network traffic as
much as power consumption, overall node cost, reliability, and ease of
management. My nodes are all identical, so I figure diskless, right?
Well I am having a few problems...
I still don't understand how to handle swap. One of the clusters I set
up had an NFS mounted root file system and did something with swap on
/dev/loop0, but I don't really understand it: is the swap going onto
the NFS drive, or is it just going back into memory? What is the best
(fastest) way to handle swap on diskless nodes that might sometimes be
processing jobs using more than the physical RAM?
Also, is it really true you need a separate copy of the root nfs drive
for every node? I don't see why this is. I have it working with just one
for all nodes, but am I missing something here?
2) message passing vs roll yer own
I have played with a few different packages, and written a bunch of perl
networking code, and read a bunch and I am still not sure what is
better. Please chime in:
- what is the fastest way to run perl on worker nodes? Remember I
don't need to do anything too fancy, just grab a bunch of workers, send
jobs to them, assemble the results, send the results to another worker,
etc. I don't need to broadcast to all nodes or anything else.
- what is the easiest way to do it? I wrote the whole thing in perl
already, and I was not really impressed with the speed or reliability.
Certainly this was at least partially programmer error, but my question
stands: what is the easiest way to reliably control a cluster of worker
nodes running different perl programs and assemble the data? This
includes load balancing.
- I saw some information on clusters that were linked in the kernel
and acted as a single machine. Is this a working reality? How does the
performance of such a system compare with message passing for dedicated
processing such as my own?
- I was playing with MPICH-2; is this better than LAM? What about
other message passing libraries, what is the best one? Any with direct
hooks into perl?
- how fast are NFS and RSH? If I were to change the code so it works
with an NFS mounted file instead of STDIN/OUT and use RSH to
communicate, how would the speed compare with message passing? With
direct perl networking?
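For reference, the kind of "roll yer own" control meant in the bullets
above (grab workers, send jobs, assemble results, balance load) can be
sketched like this. It is a rough Python illustration under stated
assumptions, not a recommendation: the node names, command_for() and
dispatch() are hypothetical, and every "node" here just runs a local
stand-in process rather than going over rsh. Load balancing falls out of
each node pulling the next job only when it is idle.

```python
import queue
import subprocess
import sys
import threading

# Hypothetical node names; nothing here actually connects to them.
NODES = ["node01", "node02", "node03"]

def command_for(node):
    # Assumption: on a real cluster this might look like
    # ["rsh", node, "perl", "worker.pl"]. Here every "node" runs a local
    # stand-in that upper-cases whatever arrives on STDIN.
    return [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]

def dispatch(jobs, nodes=NODES):
    """Run every job on whichever node is free; return results in job order."""
    pending = queue.Queue()
    for job_id, payload in enumerate(jobs):
        pending.put((job_id, payload))
    results = {}
    lock = threading.Lock()

    def serve(node):
        while True:
            try:
                job_id, payload = pending.get_nowait()
            except queue.Empty:
                return  # no work left for this node
            done = subprocess.run(command_for(node), input=payload,
                                  capture_output=True, text=True)
            if done.returncode != 0:
                pending.put((job_id, payload))  # let another node retry it
                return                          # retire the failing node
            with lock:
                results[job_id] = done.stdout

    threads = [threading.Thread(target=serve, args=(n,)) for n in nodes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # A job requeued after the other nodes went idle ends up here.
    if len(results) != len(jobs):
        raise RuntimeError("some jobs were never completed")
    return [results[i] for i in range(len(jobs))]
```

A fast node simply ends up taking more jobs than a slow one, so no
explicit scheduling is needed for this simple case.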
3) Distribution and kernel
I create my NFS system by copying directories off my RH9 distribution. I
had lots of problems and could never get everything working. I think it
would be loads easier if I could find a standard distribution image
already constructed somewhere out there... I don't really care what
distribution as long as I can run perl.
I keep seeing people advising against the NFS root option and advocating
ram disk images. Opinions here? Where can I get ram disk images? It
would be nice to download a basic, complete ram disk image that boots
with root rsh already working.
Well I guess that is enough for one day. Thank you for taking the time
to read this email. If you have the time please send me your opinions on