Beowulf Questions
Greg Lindahl lindahl at keyresearch.com
Tue Jan 7 11:42:45 PST 2003
On Tue, Jan 07, 2003 at 12:22:12PM -0600, Randall Jouett wrote:

> With all kidding aside, I can see how (in some applications)
> check-point files are an absolute necessity. My only beef
> with the situation is that a large amount of time is being
> spent doing IO on a "maybe." I do, however, see how they
> can be useful.

Most people don't waste a large amount of time. What they do is compare the average loss of computation due to a failure with the loss of computation due to the extra I/O.

Example: My machine fails on average every 24 hours. It takes me 1 hour to checkpoint. Therefore if I checkpoint every 8 hours, the average loss from a failure is 4 hours (half the interval), and I spend 3 hours doing I/O in each 24-hour period (three checkpoints at 1 hour each).

That's an ASCI-class example; most small clusters only need a few minutes to checkpoint and have a failure every month.

-- greg
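The arithmetic in the example above can be written down as a small sketch. This is only an illustration of the simple model used in the message (average loss is half the checkpoint interval; I/O cost is the number of checkpoints per failure period times the checkpoint time). The function names are hypothetical, and the small-cluster figures (a 5-minute checkpoint, a failure every 720 hours) are assumed stand-ins for "a few minutes" and "every month", not numbers from the post.

```python
import math


def checkpoint_tradeoff(mtbf_hours, checkpoint_hours, interval_hours):
    """Return (average work lost per failure, I/O time per failure period), in hours.

    Simple model from the message: a failure lands, on average, halfway
    through a checkpoint interval, and one checkpoint is written per interval.
    """
    avg_loss = interval_hours / 2.0
    io_time = (mtbf_hours / interval_hours) * checkpoint_hours
    return avg_loss, io_time


def best_interval(mtbf_hours, checkpoint_hours):
    """Interval that minimizes avg_loss + io_time in the model above.

    Minimizing T/2 + M*C/T over T gives T = sqrt(2*M*C).
    """
    return math.sqrt(2.0 * mtbf_hours * checkpoint_hours)


if __name__ == "__main__":
    # The ASCI-class example: fails every 24 h, 1 h to checkpoint, every 8 h.
    loss, io = checkpoint_tradeoff(24, 1, 8)
    print(loss, io)  # 4.0 3.0

    # Assumed small-cluster figures: 5-minute checkpoint, failure every 720 h.
    print(best_interval(720, 5 / 60))
```

Under this model the 8-hour interval in the example is close to, but slightly longer than, the interval that minimizes the combined cost (about 6.9 hours for a 24-hour failure period and a 1-hour checkpoint).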
