[Beowulf] backtraces
Mark Hahn hahn at mcmaster.ca
Mon Jun 11 19:00:02 PDT 2007
> Sorry to start a flame war....

what part do you think was inflamed?

> Make sure that your code generates the exact same answer with debug/backtrace
> enabled and disabled,

part of the point of my very simple backtrace.so is that it has zero
runtime overhead and doesn't require any special compilation.

> then you add user-level checkpointing so that you can

I'm most curious to hear people's experience with checkpointing.
all our more serious, established codes do checkpointing, but it's
extremely foreign to people writing newish codes. and, of course,
it's a lot of extra work. I'm not arguing against checkpointing,
just acknowledging that although we _require_ it, we don't actually
demand "proof-of-checkpointability".

> restart where you want. Then you
> run up until the problem and restart with the last checkpoint.

restarting from checkpoint is fine (the code in question could actually
do it), but still means you have hours of running, presumably under
a debugger.

> Run for a week without checkpointing? Just begging for trouble.

suppose you have 2k users, with ~300 active at any instant, and probably
200 unrelated codes running. while we do require checkpointing
(I usually say "every 6-8 cpu hours"), I suspect that many users never do.
how do you check/validate/encourage/support checkpointing?

part of the reason I got a kick out of this simple backtrace.so
is indeed that it's quite possible to conceive of a checkpoint.so
which uses /proc/$pid/fd and /proc/$pid/maps to do a possibly decent
job of checkpointing at least serial codes non-intrusively.

regards, mark hahn.
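[Editor's sketch] To make the checkpoint.so idea a little more concrete, here is a rough Python illustration of the first step such a tool would need: listing a process's open file descriptors from /proc/$pid/fd and its memory regions from /proc/$pid/maps. It is not the backtrace.so or any checkpoint.so mentioned in the post. The names `Mapping`, `parse_maps_line`, `read_maps`, `read_open_files`, and `writable_private` are hypothetical and exist only for this example. It assumes a Linux /proc filesystem, and it only gathers metadata; a real checkpointer would also have to save memory contents, registers, and file offsets.

```python
import os
from typing import Dict, List, NamedTuple


class Mapping(NamedTuple):
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line: str) -> Mapping:
    # Line format: start-end perms offset dev inode [pathname]
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], path)


def read_maps(pid: str = "self") -> List[Mapping]:
    # Read every memory region of the target process.
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]


def read_open_files(pid: str = "self") -> Dict[int, str]:
    # Each entry in /proc/<pid>/fd is a symlink to the open file.
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir and readlink
    return result


def writable_private(mappings: List[Mapping]) -> List[Mapping]:
    # Writable, private regions (heap, stack, data) hold the state
    # that a checkpoint would need to save.
    return [m for m in mappings
            if m.perms.startswith("rw") and m.perms.endswith("p")]


if __name__ == "__main__":
    for fd, target in sorted(read_open_files().items()):
        print(f"fd {fd} -> {target}")
    for m in writable_private(read_maps()):
        print(f"{m.start:x}-{m.end:x} {m.perms} {m.path}")
```

Run as a script, it prints the interpreter's own descriptors and writable regions; passing another pid as a string to `read_maps` and `read_open_files` inspects that process instead, subject to the usual /proc permission checks.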
