updating the Linux kernel
Crutcher Dunnavant dunna001 at bama.ua.edu
Fri Jun 9 19:45:50 PDT 2000
Now, I might be completely missing something here, but shouldn't all *distributed* parallel programs assume that a node may not return? After all, what do you assume about hardware failures?

So, while it may not be a *good* way to do it, in a properly parallelized application, shouldn't you be able to take down any random node other than the job allocation node, AT ANY TIME, and have that job reallocated? (Yeah, you lose the local work, but those tasks should be checkpointed frequently.)

I just don't think that you should EVER be able to lose more than 5-10 minutes' worth of work on a given node, and if you can, you should re-examine your program design.

So just kill the boxes and update them, one at a time.

-Crutcher Dunnavant
"Elegant, Documented, On Time; Choose 2"
Email: dunna001 at bama.ua.edu
Resume: http://resumes.dice.com/crutcher
Home: (256)-232-7883
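
For anyone who wants to see the checkpointing argument above in concrete terms, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the names save_checkpoint, load_checkpoint and run_task are hypothetical, the "work" is just a running sum of squares, and the checkpoint file is assumed to live somewhere the job allocation node (or a replacement node) can read it, such as a shared filesystem. The stop_after argument only simulates a node being killed partway through.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint,
    so a kill in the middle of a write never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


def run_task(path, n_items, checkpoint_every=100, stop_after=None):
    """Sum i*i for i in range(n_items), resuming from the checkpoint at path.

    stop_after (hypothetical, for the demo only) abandons the task after
    that many items, as if the node had been taken down for an update.
    """
    state = load_checkpoint(path)
    processed = 0
    while state["next_item"] < n_items:
        if stop_after is not None and processed >= stop_after:
            return None  # node "killed"; uncheckpointed work is lost
        i = state["next_item"]
        state["total"] += i * i
        state["next_item"] = i + 1
        processed += 1
        if state["next_item"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "task.ckpt")
    run_task(ckpt, 1000, stop_after=250)   # node dies after 250 items
    print(load_checkpoint(ckpt))           # last saved progress
    print(run_task(ckpt, 1000))            # reallocated run finishes the job
```

The work lost when a node goes away is bounded by checkpoint_every, which is the knob that corresponds to the "5-10 minutes" figure above. A real job would checkpoint on elapsed time rather than item count, and the allocation node would decide where the task restarts.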
More information about the Beowulf mailing list
