[Beowulf] What services do you run on your cluster nodes?

Ashley Pittman apittman at concurrent-thinking.com
Tue Sep 23 02:07:49 PDT 2008


On Mon, 2008-09-22 at 20:54 -0400, Perry E. Metzger wrote:
> 
> By the way, if you really can't afford for things to "go away" for
> 1/250th of a second very often, I have horrible news for you: NO
> COMPUTER WILL WORK FOR YOU.

To a large extent you are actually correct; this is one of the reasons
why building "large" clusters is hard.  If you just plug them together
and install $mpi, the performance will suck for this very reason.

The thing to remember is that "noise" or "jitter" probably doesn't affect
many people; its effects become noticeable as job size (not cluster
size) increases, and the effects are non-linear.  You can probably run a
128 node cluster and not notice it at all; beyond this scale, however,
time spent turning things off is time well spent.  In fact, in my
experience noise becomes significant not just with size but as the square
of size.
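
To make the job-size point concrete, here is a rough sketch of a toy model,
not a measurement.  It assumes a bulk-synchronous job in which every rank must
finish a step before any rank can start the next, and that each rank is
independently interrupted during a step with some small probability.  The
per-rank probability and the step length are made-up illustrative numbers;
only the 1/250 s interruption comes from the quoted message.  This simple
model shows why the whole job stalls more often as rank count grows; it does
not by itself reproduce the square-of-size behaviour described above.

```python
def p_step_delayed(ranks, p_rank):
    """Probability that at least one of `ranks` ranks is interrupted in a step.

    Assumes interruptions on different ranks are independent (an assumption).
    """
    return 1.0 - (1.0 - p_rank) ** ranks


def expected_slowdown(ranks, p_rank, step_s, noise_s):
    """Expected extra time per synchronised step, as a fraction of step_s.

    A step is stretched by noise_s whenever any rank is interrupted,
    because all other ranks wait at the next synchronisation point.
    """
    return p_step_delayed(ranks, p_rank) * noise_s / step_s


if __name__ == "__main__":
    P_RANK = 0.001        # hypothetical chance a rank is interrupted per step
    STEP_S = 0.001        # hypothetical 1 ms of compute between synchronisations
    NOISE_S = 1.0 / 250   # the 1/250 s interruption from the quoted message

    for ranks in (1, 16, 128, 1024, 8192):
        p = p_step_delayed(ranks, P_RANK)
        s = expected_slowdown(ranks, P_RANK, STEP_S, NOISE_S)
        print(f"{ranks:5d} ranks: P(step delayed) = {p:.3f}, "
              f"expected slowdown = {s:.1%}")
```

With these assumed numbers a single rank almost never notices the
interruption, while a large job is delayed on most steps.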

Note that it's not just the "OS" fluff which causes problems, and turning
things off doesn't get you anything like 100% of the way there.  Some
daemons have to run (your job scheduler), so all you can do is tune them
or re-code them to use less CPU.  Some kernel versions are also pretty bad:
one version of Red Hat was effectively unusable on clusters because of
kscand.
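
One rough way to see how much the remaining daemons and kernel threads cost on
a node is to time the same fixed chunk of work many times and compare the
worst sample with the typical one.  The sketch below is only an illustration
under assumptions: the function names and sample sizes are made up, and the
Python interpreter adds noise of its own, so a probe written in C and pinned
to a core would be more faithful.

```python
import time


def sample_quanta(samples=2000, work=2000):
    """Time `samples` repetitions of an identical busy loop of `work` iterations."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        x = 0
        for i in range(work):
            x += i
        durations.append(time.perf_counter() - start)
    return durations


def noise_summary(durations):
    """Return the median and worst duration, and the ratio between them."""
    ordered = sorted(durations)
    median = ordered[len(ordered) // 2]
    worst = ordered[-1]
    ratio = worst / median if median > 0 else float("inf")
    return {"median_s": median, "worst_s": worst, "worst_over_median": ratio}


if __name__ == "__main__":
    summary = noise_summary(sample_quanta())
    print(f"median {summary['median_s'] * 1e6:.1f} us, "
          f"worst {summary['worst_s'] * 1e6:.1f} us, "
          f"worst/median {summary['worst_over_median']:.1f}x")
```

Running it before and after turning a service off gives a crude indication of
whether that service was contributing to the long samples.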

Ashley Pittman.



