[Beowulf] Application independent checkpoint/resume?

Christopher Samuel chris at csamuel.org
Mon Mar 4 11:41:59 PST 2019


Hi folks,

Just wondering if folks here have recent experience with 
application-independent checkpoint/resume mechanisms like DMTCP or CRIU?

Especially interested in MPI use cases, and extra bonus points for 
experiences on Cray. :-)

From what I can see, CRIU doesn't seem to support MPI at all, and DMTCP 
only supports it over TCP/IP or (with a supplied plugin) InfiniBand. Are 
those inferences correct?

Any others I've missed?

All the best,
Chris
-- 
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

