Why not NT clusters? Need arguments.
RSchilling at affiliatedhealth.org
Fri Oct 6 14:19:27 PDT 2000
Nice! This is the type of thing that corporate types need to hear. It is a
difficult task to try and convince many managers/supervisors why they should
steer away from NT. Microsoft, although they have come up short on
enterprise-grade "clusterable" machines, has done a great job of convincing
many execs that NT is "good enough", and attainable. Convincing them
otherwise is what you're most likely up against here.
Great data and anecdotes are what it's going to take. . . .
Lake Stevens, WA
> -----Original Message-----
> From: Robert G. Brown [mailto:rgb at phy.duke.edu]
> Sent: Friday, October 06, 2000 1:10 PM
> To: Dan Yocum
> Cc: Jon Tegner; beowulf at beowulf.org
> Subject: Re: Why not NT clusters? Need arguments.
> On Fri, 6 Oct 2000, Dan Yocum wrote:
> > Jon Tegner wrote:
> > >
> > > In a discussion of clusters I got the question why not using systems
> > > running Microsoft NT. I only came up with cost and stability in a
> > > sweeping way, and I couldn't present more quantitative arguments. Later,
> > And that wasn't enough?
> To be more specific, to have a good chance of COMPLETING a long-running
> parallel computation on N hosts, it really helps if the probability of
> single host failure is considerably less than 1/N over the time required
> for completion. This is extremely quantitative. If you estimate that
> the mean time between crashes of your NT boxes is ten days, then you
> typically will almost NEVER complete a computation that runs for a day
> on 20 boxes. This alone is why you won't see many really big clusters
> running NT.
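The 1/N rule of thumb above can be checked with a few lines of arithmetic.
The sketch below is an illustration, not part of the original message: it
assumes hosts crash independently and that the time between crashes is
exponentially distributed (both are modelling assumptions), and the function
names are made up for this example. Under those assumptions, 20 hosts with a
ten-day mean time between crashes get through a one-day run without any crash
only about 14% of the time.

```python
import math


def per_host_failure_probability(mtbf_days, run_days):
    """Chance that one host crashes at least once during the run.

    Assumes exponentially distributed time between crashes (an assumption
    made for this sketch, not a claim from the thread).
    """
    return 1.0 - math.exp(-run_days / mtbf_days)


def completion_probability(n_hosts, mtbf_days, run_days):
    """Chance that none of n_hosts crashes during the run.

    Assumes hosts fail independently, so the per-host survival
    probabilities simply multiply.
    """
    per_host_survival = 1.0 - per_host_failure_probability(mtbf_days, run_days)
    return per_host_survival ** n_hosts


if __name__ == "__main__":
    n_hosts, run_days = 20, 1
    print(f"1/N threshold for {n_hosts} hosts: {1 / n_hosts:.3f}")
    for mtbf_days in (10, 30, 60, 365):
        p_fail = per_host_failure_probability(mtbf_days, run_days)
        p_done = completion_probability(n_hosts, mtbf_days, run_days)
        print(f"MTBF {mtbf_days:>3} days: per-host failure {p_fail:.3f}, "
              f"run completes {p_done:.1%}")
```

With a ten-day mean time between crashes the per-host failure probability
for a one-day run is well above 1/20, which is the situation the rule warns
about. Stretching the mean time between crashes to 60 days raises the
completion chance to roughly 72% in this model.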
> I've heard anecdotally that if an organization has excellently skilled NT
> people who devote enough time to the project, they can tune and
> configure NT well enough to be stable out at 30-60 days (or even more)
> and build a workable cluster out of it. This often limits to some
> extent the applications they'll allow to be run, as some applications
> are more destabilizing than others. If this point is raised, you can
> counter that:
> a) That extra time and skill costs money. Quite a lot of it -- humans
> are often more expensive than hardware, and really skilled NT SE's are
> no more common than any other variety, however many "MCSE"'s there are
> floating around in the world. We all know that one cannot learn to
> stabilize a complex operating system in a correspondence course or
> community college type environment.
> b) Linux is more stable than the most stable NT platforms you're ever
> likely to see, right out of the box. The later 2.2 kernels are simply rock
> solid on all but a few very rare hardware combinations. It still
> requires a skilled individual to install and administer it, but
> stabilizing it isn't rocket science. It also scales very well
> administratively, especially using tools like kickstart or some of the
> diskless boot mechanisms described on this list.
> c) THEN you can point out the hundreds of dollars per platform you
> save on OS software and other software. This is actually not that much,
> compared to the human costs, unless you have a lot of platforms -- one
> reason a lot of institutions might reasonably give for not making a
> > > I even found that an nt cluster sits on place 207 on the top500 list
> > > (see http://www.top500.org/lists/TOP500List.php3?Y=2000&M=06)
> > > is that an exception, or are there many of these beasts around?
> > Check out how many Linux clusters are far above that on the list...
> > actually, I'm sure you'll find many more Linux clusters there than NT
> > clusters.
> The top500 list ranking per se also doesn't address the usability of
> that cluster. There have been systems on that list before that were so
> unstable they (according to rumor, anyway) could barely get through
> benchmarking runs and were used more for computer science (short runs)
> than for parallel application production (long runs). I don't know what
> fraction of them were NT systems (if any) but the preponderance of Linux
> is due to its lower cost and higher stability. People "vote" with their
> purchase decisions, and your people would be most unwise to ignore the
> wisdom of the masses, especially when the masses who build top500
> machines in the first place are among the best and brightest (cluster
> computing) computer people in the world.
> Robert G. Brown
> Duke University Dept. of
> Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu