[Beowulf] HPC fault tolerance using virtualization
John Hearns hearnsj at googlemail.com
Mon Jun 15 10:59:37 PDT 2009
I was doing a search on ganglia + ipmi (I'm looking at doing such a thing for temperature measurement) when I came across this paper:

http://www.csm.ornl.gov/~engelman/publications/nagarajan07proactive.ppt.pdf
Proactive Fault Tolerance for HPC using Xen virtualization

It's something I've wanted to see working - doing a Xen live migration of a 'dodgy' compute node, and the job just keeps on trucking. Looks as if these guys have it working. Anyone else seen similar?

John Hearns
