[Beowulf] non-stop computing

Guy Coates guy.coates at gmail.com
Thu Oct 27 08:38:15 PDT 2016


Either BLCR or DMTCP should be able to checkpoint a single-node job (single-
or multi-threaded) straight out of the box; you won't need to recompile any
of your binaries.

DMTCP does not require any kernel modules (BLCR does), so you might find it
easier going if you are on a more recent kernel than BLCR supports. (DMTCP
also seems to do a better job of handling MPI jobs than BLCR does, if you
care about those.)
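
For illustration, here is a minimal sketch of the usual DMTCP workflow
(launch under DMTCP, request a checkpoint, restart from the image files),
written as a small Python helper that only builds the command lines. The
program name ./my_sim, the 3600 second interval and the ckpt directory are
made-up placeholders, and the flags are the ones I would expect from a
standard DMTCP install; check dmtcp_launch --help on your own system before
relying on them.

```python
# Sketch only: builds DMTCP command lines, it does not run them.
# Assumes the standard DMTCP tools (dmtcp_launch, dmtcp_command,
# dmtcp_restart) are on PATH; verify the flags with --help on your install.

def dmtcp_launch_cmd(program, interval_s=None, ckpt_dir=None):
    """Command to start an unmodified binary under DMTCP.

    program    -- argv list of the job, e.g. ["./my_sim", "input.dat"]
    interval_s -- optional automatic checkpoint interval in seconds
    ckpt_dir   -- optional directory for the checkpoint images
    """
    cmd = ["dmtcp_launch"]
    if interval_s is not None:
        cmd += ["--interval", str(interval_s)]
    if ckpt_dir is not None:
        cmd += ["--ckptdir", ckpt_dir]
    return cmd + list(program)


def dmtcp_checkpoint_cmd():
    """Command to ask the running coordinator for a checkpoint now."""
    return ["dmtcp_command", "--checkpoint"]


def dmtcp_restart_cmd(images):
    """Command to restart a job from its checkpoint image files."""
    return ["dmtcp_restart"] + list(images)


# Hypothetical job: ./my_sim is a placeholder, not a real program.
launch = dmtcp_launch_cmd(["./my_sim", "input.dat"],
                          interval_s=3600, ckpt_dir="ckpt")
print(" ".join(launch))
print(" ".join(dmtcp_checkpoint_cmd()))
```

Each returned list can be passed to subprocess.run() as-is. The BLCR
equivalents are cr_run, cr_checkpoint and cr_restart, but those need the
BLCR kernel module loaded first.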


Thanks,

Guy

-- 
Dr. Guy Coates