[Beowulf] HPC fault tolerance using virtualization

John Hearns hearnsj at googlemail.com
Mon Jun 15 10:59:37 PDT 2009


I was doing a search on ganglia + ipmi (I'm looking at doing such a
thing for temperature measurement) when I came across this paper:

http://www.csm.ornl.gov/~engelman/publications/nagarajan07proactive.ppt.pdf

Proactive Fault Tolerance for HPC using Xen virtualization

It's something I've wanted to see working - doing a Xen live migration
of a 'dodgy' compute node, and the job just keeps on trucking.
Looks as if these guys have it working. Anyone else seen similar?

John Hearns


