[Beowulf] New member, upgrading our existing Beowulf cluster
Prentice Bisbal prentice at ias.edu
Tue Dec 8 10:52:28 PST 2009
Lux, Jim (337C) wrote:
> On 12/8/09 9:22 AM, "james bardin" <jbardin at bu.edu> wrote:
>
>> On Tue, Dec 8, 2009 at 10:50 AM, Prentice Bisbal <prentice at ias.edu> wrote:
>>
>>> You'd hope that. Most of my current cluster users are scientific
>>> researchers in academia, not computer scientists. While some are
>>> extremely computer savvy, others have learned just enough about
>>> programming to do their calculations. Expecting the latter to write code
>>> with checkpointing is unrealistic, and working in academia, I can't
>>> force them to. Which is why taking down 4 nodes instead of just one is
>>> less than ideal.
>>>
>> I find it's still advantageous to push them to learn it. A researcher
>> working with a tight deadline for a grant will often see the light
>> when a hardware failure loses them a month or more of data processing.
>> It really is in their own best interests to learn about their tools.
>
> What about some form of "image checkpoint" like "hibernation"... Should be
> application unaware, just snapshots memory.

That's fine when the problem is on one system and there's only one
system image to worry about checkpointing. Once you start spreading the
job around to multiple systems, things get complicated, especially if
your nodes are heterogeneous w.r.t. hardware.

I fear we're straying off the topic of the original post...

--
Prentice
