[Beowulf] open mosix alternative

Kenneth Strouts kstrouts at fastmail.fm
Tue Jul 8 04:15:31 PDT 2008


No. The cluster freezes when a node crashes, so there's no chance
to start such a package.

Kerrighed doesn't support addition and removal of nodes from the cluster
yet, and there isn't much scope for dealing with node failure until it
does.
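
If it helps anyone spot these events after the fact, here is a minimal
sketch (not something we run in production) that scans kern.log lines
for the TIPC "Lost contact" message quoted further down. The regular
expression is based only on that excerpt, so the exact wording is an
assumption and may differ between TIPC versions; the function name
lost_nodes is made up for illustration.

```python
import re

# Pattern taken from the kern.log excerpt quoted in this thread.
# Assumption: other TIPC versions may word this message differently.
LOST_CONTACT = re.compile(r"TIPC: Lost contact with <(\d+\.\d+\.\d+)>")


def lost_nodes(lines):
    """Return TIPC addresses reported as lost, in order of first appearance."""
    seen = []
    for line in lines:
        match = LOST_CONTACT.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The three log lines from the quoted message, rejoined onto single lines.
sample = [
    "Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Resetting link "
    "<1.1.2:eth1-1.1.3:eth1>, peer not responding",
    "Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Lost link "
    "<1.1.2:eth1-1.1.3:eth1> on network plane B",
    "Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Lost contact with <1.1.3>",
]

print(lost_nodes(sample))  # prints ['1.1.3']
```

This only reports which peer was lost according to a saved log. It does
nothing to keep the cluster alive, for the reason given above.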

On Tue, 8 Jul 2008 07:49:50 +0200, "Jon Aquilina"
<eagles051387 at gmail.com> said:
> Can't that be used in conjunction with other packages that will power
> down a node which fails?
> 
> On Fri, Jul 4, 2008 at 10:39 AM, Kenneth Duncan Strouts <
> K.D.Strouts at sms.ed.ac.uk> wrote:
> 
> > Hi Jon,
> >
> >
> > Quoting Tony Travis <ajt at rri.sari.ac.uk>:
> >> Although Kerrighed looks very promising, it is also quite fragile in our
> >> hands. If one node crashes, you lose the entire cluster. That said, the
> >> Kerrighed project is extremely well supported and I believe it will be a
> >> good alternative in the near future.
> >>
> >
> >
> > We found that with Kerrighed, one node crashing sees the whole cluster go
> > down.  The following is output to kern.log before the cluster dies.
> >
> > Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Resetting link <1.1.2:eth1-1.1.3:eth1>, peer not responding
> > Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Lost link <1.1.2:eth1-1.1.3:eth1> on network plane B
> > Jul  2 13:57:03 nodeC at kghed kernel: TIPC: Lost contact with <1.1.3>
> >
> > From the Kerrighed mailing list (Louis Rilling):
> >
> > "Indeed, Kerrighed does not tolerate node failures yet. We have no precise
> > date for this, and giving a date right now would be meaningless. The first
> > step for us is to support dynamic cluster resizing (IOW live node additions
> > and removals), and we've just started working on it. We will work on node
> > failures in a second step."
> >
> > It seems they are working on this, and on a new framework for configurable
> > process scheduling.  Probably Kerrighed will provide a good alternative in
> > future.
> >
> > Kenneth
> >
> >
> >
> > --
> > The University of Edinburgh is a charitable body, registered in
> > Scotland, with registration number SC005336.
> >
> >
> >
> >
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
> >
> 
> 
> 
> -- 
> Jonathan Aquilina
-- 
  Kenneth Strouts
  kstrouts at fastmail.fm