[Beowulf] What services do you run on your cluster nodes?

Perry E. Metzger perry at piermont.com
Mon Sep 22 17:54:36 PDT 2008

Greg Lindahl <lindahl at pbm.com> writes:
> On Mon, Sep 22, 2008 at 03:04:33PM -0400, Perry E. Metzger wrote:
>> If a machine isn't sending out more than, say, 20,000 email
>> messages an hour, you won't notice the additional load Postfix puts on
>> a modern machine with any reasonable measurement tool.
> Ah. So if I have 3,000 nodes, running an extremely tightly coupled
> app, and each one does postfix once per 30 seconds (that's
> 100/second), and each time one wakes up it causes the entire cluster
> to freeze for 1 quantum (1/250 of a second), how much work does the
> cluster get done?
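
To put numbers on that scenario, here is a rough back-of-the-envelope sketch. It assumes (this is not stated in the thread) that wakeups are evenly spread and never overlap, so each one costs the whole cluster exactly one quantum. The function name and parameters are made up for illustration.

```python
# Rough model of the scenario quoted above.
# Assumption (not from the thread): wakeups never overlap, so every
# wakeup stalls the whole cluster for exactly one scheduler quantum.

def fraction_of_time_stalled(nodes, wake_interval_s, quantum_s):
    """Fraction of wall-clock time the cluster spends frozen."""
    wakeups_per_second = nodes / wake_interval_s
    # Cap at 1.0: the cluster cannot be stalled more than all the time.
    return min(1.0, wakeups_per_second * quantum_s)

QUANTUM_S = 1 / 250  # one quantum, as given in the quoted text

# One wakeup per node every 30 seconds (the quoted scenario).
every_30s = fraction_of_time_stalled(3000, 30, QUANTUM_S)

# One wakeup per node per night (86,400 seconds).
nightly = fraction_of_time_stalled(3000, 86_400, QUANTUM_S)

print(f"every 30 s: stalled {every_30s:.0%}, useful {1 - every_30s:.0%}")
print(f"nightly:    stalled {nightly:.3%}")
```

Under those assumptions the 30-second case loses 40% of the cluster's time, while one message per node per night costs about 0.014%.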

If you're sending out a status email every 30 seconds, something is
very wrong with your setup. Presuming you only send them when there
is a status problem, under normal circumstances you're sending at
most one message a night.

By the way, if you really can't afford for things to "go away" for
1/250th of a second very often, I have horrible news for you: NO

Why is this? Because the System Management Mode code in the BIOS
(which most people don't know about, but it is there) makes your
machine go away for quite some time quite frequently. You're going to
need custom hardware, clearly.


