[Beowulf] Why I want a microsoft cluster...
Robert G. Brown rgb at phy.duke.edu
Mon Nov 28 08:11:53 PST 2005
- Previous message: Small clusters Re: [Beowulf] Why I want a microsoft cluster...
- Next message: [Beowulf] Why I want a microsoft cluster...
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 23 Nov 2005, Jim Lux wrote:

> At 05:15 PM 11/23/2005, Joe Landman wrote:
>
> >> Jim Lux wrote:
>>> At 01:30 PM 11/23/2005, Joe Landman wrote:
>>
>> [...]
>>
>>> This is particularly pernicious for documents that get viewed with a
>>> projector, and then get zoomed to look at the details.
>>
>> Agreed. This is a good reason to use vector formats in general whenever
>> possible.
> <gigantic snip>
>
> Gosh.. the devil's gonna die.. I did my best to advocate.
>
> Ultimately though, to update the aphorism: "nobody ever got fired for
> recommending MS"
>
> Many similarities between "big blue" a few decades ago and big whatever color
> they are (well, it *was* a sort of forest green back in the 80s) these days,
> not the least of which is marketing strategies.
>
> Enjoy your holidays everyone.. (those of you in the U.S., anyway.. the rest
> of you will have to toil while we engage in the national festival of
> overeating)

Uhhhhh, can't believe I ate all that...

OK, I'm back now, and it looks like Joe did a pretty good job of saying everything I might have said and then some. A very few late addenda to further beat this dead mule:

a) As Joe pointed out, there is no substitute for competence. A number of issues raised in support of Windows clusters were (in essence) "Windows systems managers are incompetent klutzes who would do something silly like run virus checkers, per node, on an NFS mount" or "Windows systems managers would blindly implement Windows-centric security policies on the linux system". While this is possibly true, it isn't really relevant except from the MS marketing point of view and as Joe pointed out would screw up a WinXX cluster as easily as anything else. I think a safer assumption to make is one of presumed competence in whatever shop is considering winsux vs linux clusters.
In which case they'd presumably know (or be smart enough to learn or be deep-pocketed enough to buy the knowledge from folks like Joe) enough to configure a "sensible" cluster as opposed to a silly cluster -- one with the cluster nodes e.g. behind a firewall, on a 192.168 private internal network but otherwise flat within the organization, etc.

They would also know or would learn quickly when considering the issue that many of the linux cluster configurations they might consider basically boot a single, stripped cluster image giving them a SINGLE SYSTEM to secure. The nodes aren't really individual security risks unless you're running a NOW type cluster (a possibility that this list tends to be a little bit blind to). If one DOES consider a NOW-type cluster then a whole RAFT of security issues exist for WinXX, but they are ones you have to handle anyway. There are fewer issues for linux -- see below -- but...

b) A WinXX NOW cluster is a possibility that VERY DEFINITELY exists and is potentially profitable to a WinXX shop, BTW.

To help out your diabolical advocacy, consider the following. A mythical organization has 1000 mythical WinXX desktops running email clients, screen savers, Office tools, and a browser. These systems are already installed, managed (well or badly), secured (well or badly), patched (ditto), and are effectively idle nearly all of the time even when somebody is sitting at their console and typing furiously. For most users, each successive boost in CPU speed just increases the already astronomical number of NoP cycles the system spends per cycle of actual work done processing a keystroke or mouse click. This organization therefore very likely has the equivalent of 0.95 x 1000 = 950 CPUs' worth of free cycles already available or doing thumb-twiddling crap like making WinXX logos fly around in 3d.
These cycles "could" be doing useful work, but WinXX is if anything anti-engineered for this sort of process -- it is weak on backgrounded tasks in general, scheduling, VM (especially VM that doesn't leak), and network-driven task execution. However, it may be >>good enough<< at multitasking, and network-driven task execution is fundamentally a pretty straightforward problem to solve, especially if you set your sights on low-hanging fruit.

That is, MS "could" sell a "cluster tool" that is basically nothing but an integrated, policy-driven job distribution tool so that a user on any (authenticated, permitted) one of these 1000 systems on a standard LAN can submit a job stream and have it farmed out to the "free" cluster of idle desktops according to institutional policy. A nice little cluster management tool would let top level managers set that policy and give them that warm fuzzy feeling of control.

Given Windows' security track record, of course, I rather expect that most systems managers would be a pretty tough sell on this, at least right at the moment. It's one thing for a single corporate system to get a virus. It's another for the entire corporate LAN to get a virus without any of the tedium or delay of having to rely on social engineering for transmission. Building a sandbox whereby submitted Tasks of Evil don't turn an entire corporation into Hell would be a bit of a challenge.

I also don't know how well WinXX would function on nodes with a full time CPU-sucking background task running -- historically this has proven difficult even for a number of Unix schedulers and VM managers (mid 90's Sloaris, anyone?) and my direct experience of this on the one gaming system I run Windows on (where games are, in a sense, HPC applications BTW) is that this will be a really serious problem for the current generation Windows kernels as well.
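For illustration only, the kind of policy-driven job distribution described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not anything MS ships: the names Desktop, Policy, eligible and farm_out are all hypothetical, "idle" is just a flag standing in for something like "the screensaver is running", and the genuinely hard parts (authentication, sandboxing, remote execution) are left out entirely.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Desktop:
    """A hypothetical desktop node in the NOW (network of workstations)."""
    name: str
    idle: bool                      # stand-in for "screensaver is active"
    authorized: bool = True         # node is permitted to run farmed jobs
    queue: list = field(default_factory=list)


@dataclass
class Policy:
    """Hypothetical institutional policy set by management."""
    max_jobs_per_node: int = 1
    require_idle: bool = True


def eligible(node: Desktop, policy: Policy) -> bool:
    """Return True if policy allows another job on this node."""
    if not node.authorized:
        return False
    if policy.require_idle and not node.idle:
        return False
    return len(node.queue) < policy.max_jobs_per_node


def farm_out(jobs, nodes, policy):
    """Assign independent (embarrassingly parallel) jobs to eligible nodes.

    Returns the jobs that could not be placed under the current policy.
    """
    pending = deque(jobs)
    placed_any = True
    while pending and placed_any:
        placed_any = False
        for node in nodes:
            if not pending:
                break
            if eligible(node, policy):
                node.queue.append(pending.popleft())
                placed_any = True
    return list(pending)


if __name__ == "__main__":
    nodes = [Desktop("desk-a", idle=True),
             Desktop("desk-b", idle=False),
             Desktop("desk-c", idle=True)]
    leftover = farm_out(["job1", "job2", "job3"], nodes, Policy())
    for n in nodes:
        print(n.name, n.queue)
    print("unplaced:", leftover)
```

The point of keeping the policy in its own object is the "warm fuzzy feeling of control": management changes Policy, not the dispatcher.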
I have never been impressed with WinXX's ability to multitask, but it wouldn't have to multitask WELL as long as they were able to tune it so that desktop application performance didn't suffer. This could be done at the REALLY coarse-grained task level and still win -- as in run the BG application instead of a CPU-sucking screensaver or OUT of the screensaver manager and using the same exact controls.

This >>would<< really be amazingly simple to code, and with an integrated front end making a WinXX NOW cluster that can do a "mosix"-like embarrassingly parallel job redistribution at a cost of (say) $100/node/year for the client and job management daemon, it would even make economic sense for at least some shops. MS makes $100K a year. The organization recovers the equivalent of a 950 node cluster for roughly 10% of the cost of a dedicated-function cluster of the same size, far less than that if infrastructure requirements and scaling are taken into account.

IF MS makes things work so you can recover 95% of the CPU and not impact desktop performance, this is a huge win, and gives MS a bit of leverage and experience to make their tool (or a brand new parallel programming suite tightly coupled to their programming tools) work for e.g. MPI apps or other real parallel apps. Even 80% recovery of CPU would be a solid win -- the fact that linux permits more like 98% recovery without impacting desktop usage (given sufficient node memory) is irrelevant.

c) You were really unfair to linux on the security side. Windows managers all KNOW that linux is secure and windows is not -- not absolutely of course in either direction, but sufficiently that I'm pretty safe making the absolute statement anyway. Windows managers tend (if anything) to be jealous of linux managers on this very issue. This (and scaling) is one of the major reasons that many places have linux servers, whatever they run on the desktops.
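As a back-of-the-envelope check on the numbers in (b), here is the arithmetic spelled out, assuming 1000 desktops, a $100/node/year fee, and the 80%, 95% and 98% recovery figures quoted there. The function names are made up for this sketch, and the "dedicated cluster" figure is only what the "roughly 10%" estimate implies, not an independent price.

```python
DESKTOPS = 1000            # desktops in the mythical organization
FEE_PER_NODE_YEAR = 100    # dollars per node per year (the "say $100" above)


def recovered_nodes(desktops: int, recovery_percent: int) -> int:
    """Equivalent number of dedicated nodes recovered from idle desktops."""
    return desktops * recovery_percent // 100


def annual_fee(desktops: int, fee_per_node: int) -> int:
    """Total yearly licence cost for the job-distribution tool."""
    return desktops * fee_per_node


def implied_dedicated_cost(fee_total: int, percent_of_dedicated: int = 10) -> int:
    """Dedicated-cluster cost implied if the fee is that percentage of it."""
    return fee_total * 100 // percent_of_dedicated


if __name__ == "__main__":
    fee = annual_fee(DESKTOPS, FEE_PER_NODE_YEAR)
    print("annual fee to MS: $", fee)
    print("implied dedicated cluster cost: $", implied_dedicated_cost(fee))
    for pct in (80, 95, 98):
        print(pct, "% recovery ->", recovered_nodes(DESKTOPS, pct), "node equivalents")
```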
At Duke our campus IT security person is just happy as a clam about linux because linux at Duke installs itself in an auto-updating pull mode in which yum resyncs to the campus repository(s) every night. Linux boxes on campus therefore get security updated even if the owner knows "nothing" about security, and toplevel management has to control and defend a single set of toplevel servers to keep it that way.

NOBODY is happy about WinXX from a security point of view. Updating isn't done nightly and transparently, where (in linux) most users never are even aware that their system has been updated and patched or that the application they run today isn't the same as the one that they ran yesterday because a bug they hadn't ever encountered is now fixed. Updating Windows is done rarely, after testing, and with great trepidation because it can do anything from breaking nothing to breaking everything to breaking SOME things. Nightmarish is a reasonable term for it.

It is also trivial to install linux so that it is "identical" desktop to desktop across an organization. This can be done with winsux, but it often ISN'T done because it isn't quite as simple. However, this is really a competence issue so let's just assume that everybody is competent so that it is.

The point is still that linux right out of the metaphorical box is far more secure than WinXX is after investing quite a bit of effort. Linux competently installed on top of e.g. kickstart files from a well-maintained yum-driven repo that mirrors the security updates streams for the distro in question is very, very secure AT THE DESKTOP, and still more secure (depending on cluster architecture) at the cluster level.

d) It is also important to be fair on the management scaling side. Linux scales at the theoretical limit of management scalability.
One (single) person can manage the install/update repo for an organization, and yes, a COMPETENT organization will restrict all users to use the one (or one of the) supported distribution(s), just like they wouldn't let users run win95, win32, winXP, winme, winnt all at the same time on different desktops (unless there were cost-beneficial reasons to do so).

Given this person, at the departmental LAN or cluster level the number of systems a person can care for is almost completely independent of the software. It is limited by the frequency with which the hardware breaks plus the number of requests for e.g. training or user-level software support, per system per user per day.

If all hardware is tier 2 or better -- 3 year onsite service, competent design, reliable choices -- one person can care for from 100s to as many as 1000 linux systems from the hardware point of view on a 24-48 hour service basis (where you don't need "overnight call" or coverage). Furthermore, this service can be done just as easily by WinXX trained staff as linux trained staff. Hardware is hardware; the only issue is having ONE person in your organization who sets policy as to what hardware you get on the linux side to avoid potential device driver issues.

User support issues vary wildly per organization and are difficult to categorize in any simple way. A single user can (as sysadmins on list can well attest) suck down inordinate amounts of support REGARDLESS of the operating system they use, and you might be supporting dozens of these incompetent, personality-disordered, life-sucking weasels who call you up in the middle of the night and blame you personally if their home ISP is for some reason slow or they found a website that dumps code that freezes their browser and ultimately their interface (where none of your OTHER 300 users has ever had a problem).
Or you can have hundreds of highly competent users that never need to be taught that to print a document you click these little buttons and look for the printer down the hall and be sure not to pick one from three buildings over that HAPPENS to appear on the list due to the miracle of printer sharing over the network and that promiscuously accept print jobs from anybody.

However, >>working<< at an institution with a wide range indeed of mixed Win/Lin LAN configurations, I know of no reason to believe that WinXX user level support is likely to be cheaper, ever, once you've hired the minimum 1-2 linux people required for a minimum buy-in to linux (one for small, two for large). There is a nontrivial startup cost, sure, but from what I've seen HERE, at least, if anything linux support costs scale better than winsux support costs across the board.

It is cheaper at the server level (by far). It is cheaper to install (and not just because of free software -- it is cheaper in HUMAN terms to install). It is identical to support at the hardware level, EXCEPT for device selection -- you have to be more careful to validate any given hardware arrangement for linux, but once validated it tends to be identical.

It is anecdotally somewhat better to support linux at the user level, certainly in homogeneous environments (all lin vs all win) but still largely true in a mixed environment. We have a relatively small number of WinXX boxes but manage to get support requests from their users at almost the same rate we get them from linux users, possibly because of their relative competence (lin tends to be used by e.g. faculty and students, win by secretarial staff). However, we also have win-only labs that have a crisis a week, it seems like.
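The startup-cost-versus-scaling trade-off described above can be written down as a simple break-even calculation. This is only a sketch: the linear cost model and the function names are assumptions made for illustration, and the dollar figures in the usage example are placeholders, not measurements from Duke or anywhere else.

```python
def total_cost(seats, per_seat_cost, buy_in=0):
    """Total yearly cost under an assumed linear model: fixed buy-in plus per-seat cost."""
    return buy_in + seats * per_seat_cost


def break_even_seats(buy_in, win_per_seat, lin_per_seat):
    """Smallest seat count at which linux is strictly cheaper than Windows.

    buy_in is the fixed cost of the 1-2 linux sysadmins; the per-seat
    figures are the marginal support costs. Returns None if linux seats
    are not cheaper at the margin, in which case it never breaks even.
    """
    saving_per_seat = win_per_seat - lin_per_seat
    if saving_per_seat <= 0:
        return None
    return int(buy_in // saving_per_seat) + 1


if __name__ == "__main__":
    # Purely hypothetical placeholder numbers, in dollars per year.
    buy_in, win_seat, lin_seat = 150_000, 400, 250
    n = break_even_seats(buy_in, win_seat, lin_seat)
    print("linux is cheaper from", n, "seats upward")
    print("linux total:", total_cost(n, lin_seat, buy_in))
    print("windows total:", total_cost(n, win_seat))
```

Whatever the real numbers are for a given shop, the shape of the argument is the same: a fixed buy-in, then a lower marginal cost per seat.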
This is the point I was making last week -- ONCE AN ORGANIZATION PAYS THIS BUY IN COST (1-2 competent linux sysadmins) the marginal cost per additional linux seat, be it desktop or cluster node, is strictly less than that of an additional winsux seat, with the sole exception of interoperability costs -- integrating OOffice desktops with MS Office desktops. This actually (as Joe has pointed out) "works" pretty well these days for most things, but there are enough things for which it doesn't work that it can create problems or additional work or some restrictions on usage.

This gets back to competence and cost/benefit again, as one can ARGUE that using MS Office at all is a fundamentally incompetent thing to do in any institution that wishes to archive the documentation produced by its office suite tools so that they are recoverable ten years from now. Word's .doc format is not actually terribly standard or portable, as anyone who has tried to reopen an old archived Word document has doubtless already learned the hard way.

Document management is an issue that many organizations INCLUDING ones that are otherwise competently run handle very, very poorly, literally gambling that whatever document format they are saving into archives today will be recoverable in a decade. The ability to actually file those documents in a crossreferenced, keyword string searchable format is similarly lacking. The fact that the documents tend to be scattered all over an organization's mounted filespace is another problem.

Windows is far from homogeneous here and notorious for its lack of backwards compatibility, as it is ultimately this that "forces" an organization to update WinXX including Office across the institution. You have just as much difficulty with Tommy in accounting using Win98 and an old version of Office and Sally in management using WinXP Pro with a sparkling modern version of Office. Sally writes a memo to Tommy and Tommy cannot read it or respond.
Open Office would do (if anything) BETTER. The difference is that updating Tommy's desk to the latest greatest WinXX and Office will cost (very likely) $100s in software, hours of sysadmin time, and a bit of training. Updating Tommy's desktop to e.g. gnome and open office would take $0 in software, ten minutes of sysadmin time (long enough to initiate a pxe-driven boot), and somewhat longer in training.

Again, this is competence -- your argument is that homogeneity is cheaper than heterogeneity, and I learned that the hard way back in the mid-80's so I can hardly disagree now. However, inhomogeneity can have benefits as well, so to competently determine the correct degree of INhomogeneity an institution should seek requires a cost-benefit analysis. Is it cheaper in the long run for the institution to invest in the two people required to get linux started so that it CAN update Tommy to linux AND pay the training costs to get Tommy up to speed on linux-based replacements for his standard tools?

Not a simple question to answer, and no SINGLE answer will be universal. However, it is undeniable that it is a lot easier to make the move if the organization already has a couple or three linux jocks on its IT staff, perhaps to run servers, perhaps to run dedicated function linux clusters, perhaps because engineering insists on using linux regardless of what accounting wants to run. This reduces the MARGINAL cost of linux still further and provides that dangerous pathway towards a phase transition.

The "phase transition" approach is one that paradoxically works best in tightly controlled topdown fascist management schemes. In order to achieve clearly visible CBA wins and achieve corporate goals, a corporation bites the bullet and installs some linux systems -- probably a linux cluster of one sort (HA) or another (HPC), maybe a few desktops in technical departments. They hire 5 linux superheroes to run all of this.
A year later they notice that those superheroes are mostly playing video games because yum is doing all of the nightly maintenance, the hardware and software profiles of the systems they manage rarely change, the software is stable and works well, and once their users got weaned from winsux and retrained in linux, they seemed happy enough. The IT person then asks "could we become a linux-only shop"? Next thing you see, it's the "Burlington Coat Factory" story -- 5000 linux POS systems, 2000 servers, and linux desktops everywhere possible, at the cost of millions up front plus millions more in operational efficiencies.

This is the MS nightmare -- so far there haven't been THAT many total conversions, but EVERY total conversion is a template for ten more. Up until the last two years, MS was reluctantly conceding parts of the server marketplace, arguing correctly that linux was growing more at the expense of Unix than of MS. Over the last couple of years, though, the linux desktop has significantly improved and (as Joe has noted) linux has proven capable of integrating more or less seamlessly into a win/lin mixed environment.

The remaining strikes against linux continue to be hardware (which tends to be a lot more differentiated at the desktop level and which linux does poorly for features like e.g. dvd support, printer support, camera support that are likely to be randomly requested by various groups or individuals) and midlevel business applications. THIS, not the desktop, is what I personally have seen as the last bastion of resistance in at least the company I sit on the board of. They converted to linux servers and are happy as clams.
They converted to linux POS and linux desktops and are happy enough -- their employees needed at most a few extra days of training, as if you can use Explorer you can use Mozilla or galeon or firefox or netscape, if you can use MS Office you can use Open Office for just about anything anybody is likely to need to do in most organizations (a memo or letter being pretty easy in ANY WP).

The killer is middleware -- accounting applications, office (non-Office) applications, personnel management applications, database applications, integrated applications. There are CHOICES out there for Windows -- many of them quite expensive, of course, but they are there. There aren't a lot of choices there for linux. Either there is an open source effort or there is nothing. If there IS an OS effort, either it works and is supported pretty well and can be implemented without a lot of hackery or (for most organizations) there is nothing.

This is a bigger issue for small and midsized corps than it is for large ones -- the large ones have the opportunity cost systems programmer time required to do the hackery/glue to make the OS solutions work, the smaller ones need shrinkwrap solutions. Either one can use consultants to make up the difference, but there is a cost here as well.

If/when linux solves the device driver issue, one major barrier to linux-only or linux growth at the direct expense of winsux in mixed environments goes away. If/when consultancies like Joe's branch out into the corporate middleware market and/or start to market software (maybe even CLOSED source software) for linux at the corporate level, another one goes away. Then we'll see what happens.

In the meantime, look out for a task-distribution, mosix-like addition to Windows. Turn your LAN into a NOW, at only $100/seat and with no impact to your existing utilization! That's been one of the major advantages of Unix from way back in the late 80's and early 90's when I was routinely doing this across Unix LANs.
One of the major DISadvantages of MS-based systems is that they have NEVER been able to do this. Adding this feature to Windows won't even be a serious programming challenge, and they can probably arrange for it to happen using tools and libraries that they fully own and control so that applications written to use it (as opposed to run EP from the interface) are non-portable.

That should make things "interesting", don't you think?

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
