[Beowulf] Re: Pretty High Performance Computing

Mark Hahn hahn at mcmaster.ca
Wed Sep 24 15:20:22 PDT 2008


> that, perhaps serendipitously, these service level delays due to nodes
> not being completely optimized for cluster use don't result in a
> significant reduction of computation speed until the size of the
> cluster is about at the point where one would want a full-time admin
> just to run the cluster.

no, not really.  the issue is more like "how close to the edge are you?"
it's the edge-closeness (relative to cluster capabilities) that matters.

that is, if your program has very frequent global synchronization,
you're going to want low jitter.  yes, exponentially more so as the 
size of the job grows, but the importance of the issue also grows 
as your CPU increases in speed, as your interconnect improves, etc.
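
a rough sketch of why job size matters here, under a toy model that is an assumption and not a measurement: each node is independently interrupted with probability p_node during one compute step, and a global barrier makes every node wait for the slowest one. the function name and the numbers below are hypothetical.

```python
def prob_step_delayed(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n_nodes is interrupted in a step.

    With a global barrier the whole step waits for the slowest node,
    so a single interrupted node delays everyone.
    Assumes interruptions are independent across nodes (toy model).
    """
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    p = 0.001  # hypothetical per-node, per-step interruption probability
    for n in (1, 16, 256, 4096):
        print(f"{n:5d} nodes: P(step delayed) = {prob_step_delayed(p, n):.3f}")
```

in this toy model the chance of a delayed step rises roughly in proportion to node count while p_node * n_nodes is small, then saturates near 1. a faster CPU or interconnect shortens the step itself, so the same interruption becomes a larger fraction of each step.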

similarly, if you have an app which is finely cache-tuned,
it'll hurt, possibly a lot, when monitoring/etc takes a bite out of the cache.

> don't worry about these service details too much, just do your work
> knowing that you're maybe losing 2% speed (this number is a total
> guesstimate).

2% might be reasonable if you're doing very non-edge stuff - 
for instance, a lot of embarrassingly parallel or serial-farm workloads
that don't use a lot of memory.  it's not that those workloads are 
less worthy, just that they tolerate a lot more sloppiness.

again, it's the nature of the workload, not just size of the cluster.
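
to make the workload dependence concrete, here is a small simulation sketch. everything in it is hypothetical: the function name, the noise model (each node loses hit_cost extra time in a step with probability p_hit), and the parameter values. it feeds the same noise to a barrier-synchronized job and to a serial farm and compares the fractional slowdown of each.

```python
import random


def simulate_slowdown(n_nodes, n_steps, p_hit, hit_cost, seed=0):
    """Return (sync_slowdown, farm_slowdown) as fractions of ideal runtime.

    Each step costs 1.0 time unit on a quiet node; a node hit by
    background activity (probability p_hit) takes 1.0 + hit_cost.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    ideal = float(n_steps)
    sync_time = 0.0
    node_totals = [0.0] * n_nodes
    for _ in range(n_steps):
        step_times = [
            1.0 + (hit_cost if rng.random() < p_hit else 0.0)
            for _ in range(n_nodes)
        ]
        # synchronized job: every step waits for the slowest node
        sync_time += max(step_times)
        # serial farm: nodes proceed independently, each at its own pace
        for i, t in enumerate(step_times):
            node_totals[i] += t
    # farm throughput is governed by the average node, not the slowest step
    farm_time = sum(node_totals) / n_nodes
    return sync_time / ideal - 1.0, farm_time / ideal - 1.0


if __name__ == "__main__":
    sync, farm = simulate_slowdown(n_nodes=64, n_steps=1000,
                                   p_hit=0.02, hit_cost=1.0)
    print(f"synchronized job slowdown: {sync:.1%}")
    print(f"serial farm slowdown:      {farm:.1%}")
```

with these made-up parameters the farm loses about p_hit * hit_cost, in the neighbourhood of the 2% guesstimate, while the synchronized job loses far more from the identical noise, because nearly every step contains at least one slow node.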


