Why not NT clusters? Need arguments.
Robert G. Brown rgb at phy.duke.edu
Fri Oct 6 13:09:42 PDT 2000
On Fri, 6 Oct 2000, Dan Yocum wrote:

> Jon Tegner wrote:
> >
> > In a discussion of clusters I got the question why not using systems
> > running microsoft NT. I only came up with cost and stability in a
> > sweeping way, and I couldn't present more quantitative arguments. Later,
> >
> And that wasn't enough?

To be more specific, to have a good chance of COMPLETING a long-running
parallel computation on N hosts, it really helps if the probability of
single host failure is considerably less than 1/N over the time required
for completion. This is extremely quantitative. If you estimate that
the mean time between crashes of your NT boxes is ten days, then you
will only rarely complete a computation that runs for a day on 20 boxes:
with independent failures you expect about two crashes somewhere in the
cluster per day. This alone is why you won't see many really big
clusters running NT.

I've heard anecdotally that if an organization has highly skilled NT
people who devote enough time to the project, they can tune and
configure NT well enough to be stable for 30-60 days (or even more) and
build a workable cluster out of it. This often limits to some extent
the applications they'll allow to be run, as some applications are more
destabilizing than others. If this point is raised, you can counter
that:

  a) That extra time and skill costs money. Quite a lot of it -- humans
are often more expensive than hardware, and really skilled NT SEs are no
more common than any other variety, however many "MCSE"s there are
floating around in the world. We all know that one cannot learn to
stabilize a complex operating system in a correspondence course or
community college type environment.

  b) Linux is more stable than the most stable NT platforms you're ever
likely to see, right out of the box. The later 2.2 kernels are simply
rock solid on all but a few very rare hardware combinations. It still
requires a skilled individual to install and administer it, but
stabilizing it isn't rocket science. It also scales very well
administratively, especially using tools like kickstart or some of the
diskless boot mechanisms described on this list.

  c) THEN you can point out the hundreds of dollars per platform you
save on OS software and other software. This is actually not that much,
compared to the human costs, unless you have a lot of platforms -- one
reason a lot of institutions might reasonably give for not making a
switch.

> > I even found that an nt cluster sits on place 207 on the top500 list
> > (see http://www.top500.org/lists/TOP500List.php3?Y=2000&M=06)
> > is that an exception, or are there many of these beasts around?
> >
> Check out how many linux clusters are far above that on the list...
> actually, I'm sure you'll find many more Linux clusters there than NT
> clusters.

The top500 list ranking per se also doesn't address the usability of
that cluster. There have been systems on that list before that were so
unstable they (according to rumor, anyway) could barely get through
benchmarking runs and were used more for computer science (short runs)
than for parallel application production (long runs). I don't know what
fraction of them were NT systems (if any), but the preponderance of
Linux is due to its lower cost and higher stability. People "vote" with
their purchase decisions, and your people would be most unwise to ignore
the wisdom of the masses, especially when the masses who build top500
machines in the first place are among the best and brightest (cluster
computing) computer people in the world.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
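
The completion-probability argument in the message above can be made concrete with a small calculation. What follows is only a rough sketch, not anything from the original post: it assumes hosts crash independently and that time between crashes is exponentially distributed, and the function names are made up for illustration. Under those assumptions, the chance that all N hosts survive a run of length T is exp(-N*T/MTBF), which for 20 hosts, a one-day run and a ten-day MTBF is exp(-2), or roughly 13.5%.

```python
import math


def expected_failures(n_hosts, run_days, mtbf_days):
    """Expected number of host crashes in the whole cluster during one run.

    Assumes each host crashes on average once per mtbf_days (hypothetical model).
    """
    return n_hosts * run_days / mtbf_days


def completion_probability(n_hosts, run_days, mtbf_days):
    """Probability that no host crashes before the run finishes.

    Assumes independent hosts with exponentially distributed time between
    crashes, so each host survives with probability exp(-run_days / mtbf_days).
    """
    return math.exp(-expected_failures(n_hosts, run_days, mtbf_days))


if __name__ == "__main__":
    # 20 hosts, one-day run, for a few illustrative per-host MTBF values.
    for mtbf in (10, 30, 60, 365):
        p = completion_probability(20, 1.0, mtbf)
        print(f"MTBF {mtbf:>3} days: P(complete) = {p:.3f}")
```

The rule of thumb in the message (per-host failure probability well below 1/N over the run) corresponds to keeping expected_failures well below 1, which is where the completion probability starts to approach 1.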
More information about the Beowulf mailing list
