[Beowulf] open mosix alternative
eagles051387 at gmail.com
Mon Jul 7 22:49:50 PDT 2008
Can't that be used in conjunction with other packages that will power down a
node which fails?
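
As a starting point for hooking such a package in, here is a minimal sketch that picks the lost peer address out of the TIPC messages quoted below. It assumes the kernel wording is exactly as in that kern.log excerpt (other TIPC versions may differ), and the function name lost_peers is made up for illustration. It only detects the loss; what to do with the address (for example, handing it to a power-control tool) is left out, and going by the Kerrighed note quoted below, detection alone would not keep the cluster up.

```python
import re

# Pattern taken from the quoted kern.log lines; this wording is an
# assumption and may not match other TIPC or kernel versions.
LOST_CONTACT = re.compile(r"TIPC: Lost contact with <(\d+\.\d+\.\d+)>")


def lost_peers(lines):
    """Return TIPC addresses reported as lost, in first-seen order."""
    seen = []
    for line in lines:
        match = LOST_CONTACT.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample lines modelled on the quoted log (hostname simplified).
sample = [
    "Jul 2 13:57:03 nodeC kernel: TIPC: Resetting link "
    "<1.1.2:eth1-1.1.3:eth1>, peer not responding",
    "Jul 2 13:57:03 nodeC kernel: TIPC: Lost link "
    "<1.1.2:eth1-1.1.3:eth1> on network plane B",
    "Jul 2 13:57:03 nodeC kernel: TIPC: Lost contact with <1.1.3>",
]

print(lost_peers(sample))
```

In practice the lines would come from reading kern.log rather than a hard-coded list.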
On Fri, Jul 4, 2008 at 10:39 AM, Kenneth Duncan Strouts <
K.D.Strouts at sms.ed.ac.uk> wrote:
> Hi Jon,
> Quoting Tony Travis <ajt at rri.sari.ac.uk>:
>> Although Kerrighed looks very promising, it is also quite fragile in our
>> hands. If one node crashes, you lose the entire cluster. That said, the
>> Kerrighed project is extremely well supported and I believe it will be a
>> good alternative in the near future.
> We found that with Kerrighed, one node crashing sees the whole cluster go
> down. The following is output to kern.log before the cluster dies.
> Jul 2 13:57:03 nodeC at kghed kernel: TIPC: Resetting link <1.1.2:eth1-1.1.3:eth1>, peer not responding
> Jul 2 13:57:03 nodeC at kghed kernel: TIPC: Lost link <1.1.2:eth1-1.1.3:eth1> on network plane B
> Jul 2 13:57:03 nodeC at kghed kernel: TIPC: Lost contact with <1.1.3>
> From the Kerrighed mailing list (Louis Rilling):
> "Indeed, Kerrighed does not tolerate node failures yet. We have no precise
> for this, and giving a date right now would be meaningless. The first step
> us is to support dynamic cluster resizing (IOW live node additions and
> removals), and we've just started working on it. We will work on node
> failures in a second step."
> It seems they are working on this, and on a new framework for configurable
> process scheduling. Probably Kerrighed will provide a good alternative in