[Beowulf] openMosix ending

Michael Will mwill at penguincomputing.com
Tue Jul 17 12:21:26 PDT 2007


IMHO you don't need dynamic migration for embarrassingly parallel applications, as they can just be launched
on any available compute node directly and run there to completion. A simple queue system / scheduler
like Torque or similar is enough to make sure that a given node never runs more jobs at the same time
than it has CPUs available, which gives the best throughput. Just throw your 100 parametrized runs into the queue,
and the headnode/scheduler will keep all available nodes busy until all work is done.
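
To make the idea concrete, here is a rough sketch in Python of what such a queue does. It is a toy simulation only, not how Torque is actually implemented: the function name schedule, the node names and the fixed runtimes are all made up for illustration. Each node offers one slot per CPU, and a waiting job starts as soon as any slot is free.

```python
import heapq
from collections import deque


def schedule(jobs, nodes):
    """Toy FIFO scheduler simulation (hypothetical, for illustration only).

    jobs  -- list of (job_id, runtime) tuples, served in order
    nodes -- dict mapping node name to its number of CPUs
    Returns (makespan, assignments), where each assignment is
    (job_id, node, start_time, end_time).
    """
    # One free slot per CPU, so a node never runs more jobs than it has CPUs.
    free_slots = [name for name, cpus in nodes.items() for _ in range(cpus)]
    if not free_slots:
        raise ValueError("no CPUs available on any node")

    pending = deque(jobs)
    running = []  # min-heap of (end_time, sequence_number, node)
    assignments = []
    now = 0.0
    seq = 0  # tie-breaker so the heap never has to compare beyond this field

    while pending or running:
        # Start as many waiting jobs as there are free slots right now.
        while pending and free_slots:
            job_id, runtime = pending.popleft()
            node = free_slots.pop()
            heapq.heappush(running, (now + runtime, seq, node))
            assignments.append((job_id, node, now, now + runtime))
            seq += 1
        # Advance the clock to the next job completion and free its slot.
        end_time, _, node = heapq.heappop(running)
        now = end_time
        free_slots.append(node)

    return now, assignments


# Example: 100 parametrized runs of equal length on two 4-CPU nodes.
example_jobs = [(i, 1.0) for i in range(100)]
example_nodes = {"node01": 4, "node02": 4}
makespan, plan = schedule(example_jobs, example_nodes)
```

With 8 slots and 100 equal-length runs the work finishes in 13 rounds. No job ever moves once it has started, which is the point: for this kind of workload, keeping the queue full does the load balancing that migration would otherwise do.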

The hierarchical approach of classical beowulf works just fine for that.

Michael Will
Sr. Cluster Engineer
Penguin Computing
-----Original Message-----
From: beowulf-bounces at beowulf.org on behalf of Tony Travis
Sent: Tue 7/17/2007 8:03 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] openMosix  ending
 
Robert G. Brown wrote:
> On Mon, 16 Jul 2007, Jeffrey B. Layton wrote:
> 
>> Afternoon all,
>>
>> I don't know how many people this affects, but I thought it was
>> worth posting in case people are using openMosix. The
>> leader of openMosix, Moshe Bar, has announced that the
>> openMosix project is ending.
>>
>> http://sourceforge.net/forum/forum.php?forum_id=715406
>>
>> While I haven't used openMosix, I've seen it used and it is
>> pretty cool to see processes move around nodes.
> 
> Yeah, but it has nearly always had a few tragic flaws.  One was that it
> was always basically a hack of a specific kernel version and image,
> meaning that if you used it you were outside of a working kernel update
> stream.  The second was that it was basically a hack of a specific
> kernel version and image at all, where one really would prefer a tool
> that did the same thing outside of kernel space (like Condor, for
> example).  It survived those flaws, of course -- but it cannot survive
> the advent of virtualization, which will provide new pathways for this
> sort of thing to be done with far greater ease and stability.

Hello, Robert.

I've been using openMosix for a long time, and you're right about the 
kernel 'trap' it puts you into. I recently 'ported' linux-2.4.26-om1 to 
Ubuntu. Although I've succeeded in getting our 92-node Beowulf up and 
running openMosix under Ubuntu 6.06.1 LTS, the end of life announcement 
means I have to start thinking about replacing it.

Do you really think that Condor is an alternative to openMosix?

I don't know much about Condor, but I thought it was a DRM (Distributed 
Resource Manager) like SGE. Is it more than that?

The great thing about openMosix is that most 'ordinary' programs 
migrate. I've thought about using openSSI previously: What's your 
opinion about that for 'embarrassingly' parallel computation?

Best wishes,

	Tony.
-- 
Dr. A.J.Travis,                     |  mailto:ajt at rri.sari.ac.uk
Rowett Research Institute,          |    http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn,          |   phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK.    |     fax:+44 (0)1224 716687
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
