[Beowulf] Reliable Job Queueing and Notification


Bernard Li bernard at vanhpc.org
Wed Oct 17 11:31:26 PDT 2007


Hi Sean:

On 10/16/07, Sean Ward <SeanWard at msn.com> wrote:

> I've started work on a web service which contains several potentially
> long-running processing steps (molecular dynamics), which are perfect to
> farm out to the fairly large (90-node) Beowulf I have access to. The
> primary issue is translating requests from the event-driven web service
> to job queues, and back again upon completion. Specifically, the major
> queuing systems I have immediate access to (Sun Grid Engine and Condor)
> only support e-mail-based notification of job completion. Starting jobs
> isn't an issue, as my service can simply ssh over and execute shell
> scripts as needed to start things up; the problem is reliably being
> informed when the jobs fail or complete, via any programmatic method
> (such as executing a shell script, calling a web service via SOAP/etc.,
> or an asynchronous message library). My other problem, ensuring that
> these web service requests don't starve in-house jobs on the Beowulf, is
> easily handled via the priority levels built into all the various job
> managers, although being able to checkpoint a long-running job would be
> a plus (such as is supported by Condor).
>
> I am currently investigating modifications to either Condor (more
> complex to update, but checkpointing is useful) or Ruby Queue (very easy to
> update for reliable notification) to solve this issue, but wanted to be
> sure I wasn't overlooking any existing solutions for programmatic
> queuing and receiving notifications on jobs in a Beowulf environment...

If you plan to stay with the SGE/Condor route, you should take a look at DRMAA (the Distributed Resource Management Application API):

http://drmaa.org/wiki/

Perhaps you will find something useful there.
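
To make the idea concrete, here is a minimal sketch of the "block on the job in a worker thread, then fire a callback" pattern that a programmatic wait call makes possible. It is not DRMAA code: a local subprocess stands in for the cluster job so the example runs anywhere, and the name run_and_notify is made up for illustration. With a DRMAA binding, the submit and wait calls would presumably take the place of subprocess.run here.

```python
import subprocess
import sys
import threading
from typing import Callable, List


def run_and_notify(argv: List[str], on_done: Callable[[int], None]) -> threading.Thread:
    """Run argv in the background and call on_done(exit_status) when it ends.

    Hypothetical helper: subprocess.run is only a local stand-in for
    "submit a job to the queue and block until it finishes".
    """

    def worker() -> None:
        # Blocks until the process exits, whether it succeeded or failed.
        status = subprocess.run(argv).returncode
        # Notification hook: could post to a web service, write to a DB, etc.
        on_done(status)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Usage: the callback receives the exit status, so failures are reported too.
statuses: List[int] = []
job = run_and_notify([sys.executable, "-c", "raise SystemExit(3)"], statuses.append)
job.join()  # a real service would keep handling requests instead of joining
```

The point of the design is that the web service never polls or parses e-mail: the worker thread blocks on the job, and the callback runs for both success and failure because it is handed the exit status.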

Cheers,

Bernard


