[Beowulf] MS HPC... Oh dear...
Robert G. Brown
rgb at phy.duke.edu
Mon Jun 12 14:40:15 PDT 2006
On Mon, 12 Jun 2006, Vincent Diepeveen wrote:
> We will confront you with your statement in a few years from now.
Go for it;-)
Note to all others: The following is a patented rgb rant with no
otherwise meaningful content. Hey, it has been a while...:-)
Anyway, feel free to hit the "d" key now and skip it.
> If microsoft doesn't price their server/cluster stuff too expensive then in X
> years from
> now they'll dominate the highend market. Microsoft always has just taken
> markets by
> storming in giving away copies of their software for near free initially.
> Competition is hardly possible against that from software viewpoint.
Um, Vincent, you ARE aware that Linux is FREE software -- as in FREE
free, free as in air, beer, birds -- aren't you? As in I haven't paid a
company for linux in years and years, and only bought the occasional
box-set copy of Red Hat back when I did as a voluntary contribution?
To turn your own observation around, "Competition is hardly possible
against that from software viewpoint"... especially since Linux "HPC"
isn't being given away free "initially" -- it is guaranteed to be free
free by the GPL and other open licenses used throughout pretty much
forever. This isn't a case where Microsoft can, um, "undersell" its
competitors short of giving them money along with their product -- which
I fully expect them to do, by the way, especially initially -- and any
effort they expend in this direction is just money pissed away except in
very limited and specialized commercial markets.
You also missed another of my points. It has been possible to write
parallel software that runs on Windows boxes since maybe 1993 (can't
recall the exact date that somebody did the Windows port of PVM, but I
vaguely recall seeing Windows ifdefs in the source about then). There
have been plenty of groups with many Windows-based desktops available,
sitting nearly idle 90% of the day, pretty much forever. These systems
have always been "free" in the sense that they are already there and
paid for and are sitting idle.
So why has development of parallel software to RUN on these "free"
Windows systems just plain never happened? Because the development
PROCESS was very, very, very EXPENSIVE, that's why. It would have been
easier under DOS, with DOS's primitive network stack -- at least under
DOS it was relatively easy to launch a process with its own "terminal"
resources, so the OS would know what to do with stdin/stdout/stderr.
Compilers alone were (and are) expensive. It was (and remains) a total
PITA to access a node from a single seat -- Windows WANTS you to do
everything at a system console. Only if the result were worth a lot of
money would it ever have been worth it, and even then it would have cost
much LESS money under Linux (as it already did under other flavors of
Unix) once it came into existence.
So please understand, the cost of Windows, especially vs Linux, is NOT
an advantage of Windows in this discussion except MAYBE in Windows-only
shops with a high marginal cost to "start" using Linux where before they
weren't. Elsewhere sure, MS may well give it away to try to get market
share. It won't matter. They'd have to PAY people to use it instead of
Linux to overcome the difficulties people will encounter when they try
programming it. The only groups of people who will be interested (I
think) are commercial developers seeking to make a shrink-wrapped
product for people who want a turnkey cluster, and people who are happy
paying through the nose for just such a cluster. People who are content
with being locked into a totally non-portable schema for their future
parallel computing needs, at that. At the moment, at least, this is a
fairly small chunk of all cluster usage.
Also everybody needs to realize that Microsoft didn't decide "yesterday"
that the cluster market was important. They've been trying to crack it
for years. Think of the Cornell site -- a model that hasn't exactly
proliferated, but not for lack of effort. Think of the occasions over
the last umpty years when a MS employee has come on list and tried to
co-opt it and get the list to recognize Windows clusters as "beowulfs".
Heck, back in the days of NT they offered to give >>me<< NT licenses for
whole clusters of Dell computers we'd gotten as part of an Intel
equipment grant and assign us our very own Microsoft-paid software
engineer to be our very own slavey to facilitate the porting of code and
all if only we'd consider running Windows on our clusters instead of
Linux (where we had to WORK to get linux to run at all, mind you). They
made noises about giving us access to OS source code and everything. We
wanted to get work done instead, and declined. The SMP systems, running
linux (mostly 2.0.x!) throughout, were finally retired years later with
a record of maintaining a duty cycle in the high 9's over their entire
lifetime -- basically never crashing except due to hardware failure,
once we got their adaptec drivers stabilized.
It is also well worth remembering that in larger institutions, running
linux servers is ALREADY well-known (and has been so known for years) to
be cost effective relative to WinXX servers for so many, many reasons.
In fact, a lot of places run linux servers and e.g. samba to service
their windows clients. Look at the cost scaling of Windows server
licenses to Windows clients some time -- the number of clients they say
you can support before you need to buy another server. A single Linux
server can handle many times more Windows clients than a Windows server
-- for free. Look at security. Look at ease of maintenance, especially
remote access and maintenance.
I honestly don't think they're going to find a lot of people who go
"Gee, at LAST, now we can do cluster computing and not have to support
linux any more". It will be more an issue of either "Gee, we already run
linux servers and clusters, why in the hell would we use this unless you
PAY us to port to it and use it (which is pretty much what they did at
Cornell)" or "Gee, we're already paying a ton of money for our
twenty-five copies of Windows Server to run our 125 clients, we'd simply LOVE
to pay you two tons more to get 1024 Windows Cluster licenses running
from umpty head nodes, as long as you don't make us learn that nasty old
Linux..." The latter argument being put forth, of course, by the
well-entrenched admin staff consisting of MCSEs, just as once upon a
time not so long ago it was put forth by IBM mainframers and COBOL
programmers and DECnet administrators and...
So sure, Microsoft will doubtless define whatever they accomplish here
as "success". They've been trying to crack the cluster marketplace for
something like eight or nine years now, at least -- they started as soon
as the Top 500 started to be dominated by cluster after cluster, none of
them using Microsoft products of any sort on them. Without success --
the cluster market has not been terribly tolerant of cost-inefficiency
and indeed is one of the most viciously cut-throat marketplaces on the
planet in many ways, and Windows makes Microsoft a huge profit margin
for a REASON, and that reason ain't its end-user cost efficiency... or
its high quality and features. It's because they achieved a monopoly
the old-fashioned way -- by driving their relatively few competitors out
of business while biting the very hand that made them what they were at
the time (IBM's). Mostly. Enough. Even then they only succeeded
because Sun Microsystems was stupid and didn't drop the cost of their
perfectly usable x86 Unix to $50/seat on Intel hardware, bribe into
existence some mission critical software, and get there fustest.
So, perhaps they've finally identified that ideal rich-but-stupid
segment of potential cluster customers that can make them high-margin
money; perhaps they've decided that they have to get into the market
even if it is a dead loss forever or lose market share elsewhere, who
knows? We'll see how long it takes for them to buy themselves a top 10
cluster somewhere, like Apple did a few years back. Did the Apple
cluster materially affect the dominance of linux/x86? It did not. Will
Microsoft's playing exactly the same game make any significant
difference? I honestly doubt it. And who is going to help them? IBM
still smarts from being screwed over OS/2 and is just itching to get
oh-so-polite revenge. They pay lip service to Microsoft where
necessary, but inside IBM they tend to LIKE linux. Apple may run
Office, but they despise Microsoft. Hardware vendors may well be
arm-twisted into fronting them on HPC as they have desktop Windows in
the past, but only if there is a huge market demand, not the other way
around (to create such a demand).
Note that in any of these cases they are/will be going for that
shrink-wrap market -- Apple and MS more competing with each other than
with standard Linux clusters in a typical research or industrial
> From my multiprocessor product i'm not releasing a linux multiprocessor
> version, to
> give 1 obvious example.
> Porting the GUI to linux is simply too much work, even though we are open-gl
> at the
> moment and porting should be theoretically possible with just X weeks work.
> Microsoft dominates because all GUI's are running under windows in a way
> that users can work with it.
That's your choice. It's probably a wise one -- all users can work with
a GUI on ANY system (that's the whole idea:-), but Linux users are
notoriously unwilling to pay people for software in the first place.
They'd be more likely to look at what you've got and clone it. However,
I do think that true wisdom is writing a GUI that is cross-platform
portable by design, and not locking yourself into or out of any
particular market, if possible. If you're using Open GL it sounds like
it should be possible. Dunno. I personally am fond enough of Glade and
Gtk, which is purported to go the other way, but my needs are simple.
Yours may not be.
Perhaps the BEST idea is to make your application (computer chess, no?)
have a socket-based API -- maybe XML based -- and make the front end
entirely separate. That way people would probably write your game
interface for you -- right into the existing Gnuchess program, most
likely -- or build it as a PHP or java or Gtk app on top of your API.
This really divorces the choice of end-user platform from the actual
compute engine/cluster, and lets you focus energy where it makes sense,
probably in the latter once you have ANY sort of simple GUI running that
can talk to the engine.
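
To make the engine/front-end split concrete, here is a minimal sketch in
Python. Everything in it is an assumption made up for illustration: the
one-XML-element-per-line wire format, the "bestmove" command, and the names
choose_move, start_engine and ask_bestmove. The "engine" is a stub that
always answers e2e4, standing in for a real search running on the cluster.

```python
import socket
import socketserver
import threading
import xml.etree.ElementTree as ET


def choose_move(position: str) -> str:
    """Stub for the real search (hypothetical); always answers e2e4."""
    return "e2e4"


class EngineHandler(socketserver.StreamRequestHandler):
    """Reads one XML request per line and writes one XML reply per line."""

    def handle(self):
        for raw in self.rfile:
            line = raw.strip()
            if not line:
                continue
            reply = ET.Element("reply")
            try:
                request = ET.fromstring(line)
                if request.get("cmd") == "bestmove":
                    position = request.get("position", "startpos")
                    reply.set("move", choose_move(position))
                else:
                    reply.set("error", "unknown command")
            except ET.ParseError:
                reply.set("error", "bad xml")
            self.wfile.write(ET.tostring(reply) + b"\n")


def start_engine(host="127.0.0.1", port=0):
    """Start the engine in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), EngineHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask_bestmove(address, position="startpos"):
    """What any front end (GUI, web page, script) would do to get a move."""
    request = ET.Element("request", cmd="bestmove", position=position)
    with socket.create_connection(address) as sock:
        sock.sendall(ET.tostring(request) + b"\n")
        with sock.makefile("rb") as stream:
            line = stream.readline()
    return ET.fromstring(line).get("move")
```

A front end on any platform then only needs start_engine() running
somewhere reachable and a call like ask_bestmove(server.server_address);
nothing in the client cares what OS or hardware the engine itself runs on.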
> That said, their huge advantage is dissappearing a bit, as lately i'm under
> the impression they
> no longer have the best programmers onboard now that their stock/shares don't
> yearly double.
There are two, maybe three advantages that Windows continues to enjoy
over Linux at the desktop. One is a truly enormous desktop market share
as a starting point, coupled with a robber-baron mentality in the
software marketplace in general that would have made Cornelius
Vanderbilt blush with shame -- or regret that he didn't think of it
first. For many, many years, MS has shot down any possible threat and
then clubbed the corpse until it stopped twitching, "friend" or foe
alike, wherever the law and billions of dollars in high-margin net
profit permitted. With devastating effect. Where is Borland today?
Lotus? Corel? Netscape? IBM and OS/2? And the list goes on, and many
of the products that remain that don't actually SAY Microsoft on the
cover are sold by companies that MS owns a chunk -- sometimes a
controlling chunk -- of (as was recently observed on list regarding
rendering software). It's an utterly old-fashioned monopoly.
This includes cutting deals with hardware resellers that basically make
it suicide not to distribute Windows exclusively and without any real
user choice in the matter (as in "no operating system installed" not
even appearing as a menu option or consumer choice, let alone next to
"install Fedora Core 5" for $25). Buying a controlling interest in any
company that has a popular linux version of its software -- and putting
an end to the linux version on the spot. Helping companies do the
required integration engineering to ensure that their product (hard or
soft) runs under Windows out of the box, and ideally on nothing else
ever. Cloning any really valuable software tool, rebranding it, playing
games with the OS interface that create a perception among customers
that the competing products are unstable, and sowing FUD until they have
a comfortable 70% of the market or so.
As Mel Brooks once noted, "It's good to be King". Not so good to be a
The second is device drivers and hardware devices. Here they are
"accidentally" aided by hardware manufacturers' persistence in viewing
device drivers for their own devices as some sort of IP, and in treating
the very concept of the open ABI as anathema, lest they make it even
easier for their product to be cloned by Taiwanese silicon foundries and
released for 1/3 the cost. Which they inevitably are anyway.
Linus, on the other hand, has absolutely insisted that the linux kernel
will not EVER be made friendly and tolerant of binary insertions. The
combination creates a very definite, very annoying lag between when
hardware first appears (supported by Windows out of the box, of course)
and when Linux can first use it. Linux users have had to get used to
the idea that they just cannot ever count on bleeding edge hardware or
nifty electronic toys working on their systems without either waiting
for a year or investing a lot of effort.
This negatively affects the rate at which Linux has penetrated the
desktop tremendously, perhaps more than any other thing. It irritates
ME and I know what I'm doing and can play the let's hack the drivers
game in a life-threatening pinch -- if I buy (say) a brand new Toshiba
laptop or AMD-64 box, there is a very distinct possibility that one or
more of its components or even the motherboard's basic chipset will not
be recognized by Linux (been hit by both recently) at least not unless
I'm using a bleeding edge version of the kernel or a distro and do a
fair bit of googling and maybe some development version building
(something not everybody can do). It's one of the things that makes
e.g. NDIS so very interesting -- if Linux ever DOES get to where a
"universal device driver" functions that can use any given native
Windows binary driver -- even at some small cost in efficiency -- it's
going to remove one of Windows' most persistent advantages in the
The third is the supported application space. For most people "Office"
is a mission-critical application. Forget about whether or not .doc or
.xls formatted documents are Evil in their basic conception and design
-- the fact is that .doc's are sent all over the place because of the
first point above and linuxvolken need to be able to open them, read
them, write them, mail them back -- unbroken. Similarly, there are many
applications that are written to "use" Explorer as a fundamental part of
installation or operation, there are games that use Windows-based
graphics drivers, there are applications that do "cool things" like
letting you index your photo collection, all available for Windows
(generally for money) but not always for Linux.
Of these three, the first continues to erode. Slowly, to be sure, but
surely (note that I'm talking strictly about the desktop, as MS's server
share has gone up in recent years at the expense of other Unices, at
about half the rate of linux's server share increase). Note also that
there is lots of evidence that the market share published for MS vs
Linux in BOTH the desktop and the server market is significantly inflated
by the sampling methods used -- it typically counts all the systems in
the world sold with Windows pre-installed, for example, and fails to
count most of the copies of absolutely free linux that are installed
right in on top of those same systems. It tends to count COMMERCIAL
linux sales, that is, which of course ignores the fact that most linux
use by far is by people who do not pay for it. Do they count all the
systems installed from the mirror servers at Duke, for example? I don't
think so. How could they?
This would be thousands of systems ON campus, all "invisible" to current
surveying techniques, and tens or even hundreds of thousands of systems
off campus, and Duke is just a single primary mirror out of many, and
then there are secondary mirrors, tertiary mirrors....
Not even repo server logs can help you figure out just how many -- the
software is distributed directly from online repos in a mirror TREE, and
nodes branch out to actual systems at all sorts of levels in the tree.
As in I have around ten linux systems in my HOUSE and a complete mirror
of both FC4 and FC5 (x64 and i386 both) to support them from a single
rsync of a mirror of a mirror of the FC toplevel repo, and I live
relatively close to the TOP of the tree as Duke has a toplevel mirror.
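
A toy model of why the logs don't help, sketched in Python. The tree shape
and every number in it are invented for illustration (only the "around ten
systems in a house" leaf echoes the text), and the Mirror class and both
functions are hypothetical names. The assumption is that a mirror's own
logs show its direct clients plus one entry per downstream mirror, and
nothing below that.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Mirror:
    name: str
    direct_clients: int = 0  # end systems that install/update from this mirror
    children: list[Mirror] = field(default_factory=list)  # downstream mirrors


def total_installs(mirror: Mirror) -> int:
    """Walk the whole tree and count every end system under this mirror."""
    return mirror.direct_clients + sum(total_installs(c) for c in mirror.children)


def visible_from_logs(mirror: Mirror) -> int:
    """What this mirror's logs alone can show: direct clients plus one
    rsync client per child mirror, with the child's own clients hidden."""
    return mirror.direct_clients + len(mirror.children)


# Invented example tree: toplevel -> campus -> house, plus one regional mirror.
house = Mirror("house", direct_clients=10)
campus = Mirror("campus", direct_clients=2000, children=[house])
regional = Mirror("regional", direct_clients=300)
toplevel = Mirror("toplevel", direct_clients=50, children=[campus, regional])

print(total_installs(toplevel))     # 2360 systems actually installed
print(visible_from_logs(toplevel))  # 52 entries visible at the top
```

With these made-up numbers the top of the tree sees 52 clients while 2360
systems hang below it, and the gap only widens as the tree gets deeper.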
Then there is the rest of the world, where your choice is pretty much to
steal Windows (commonly enough done, sure, even in the US or Europe) or
get Linux legally, for free. Only one comes with a huge base of free
support, with compilers, with web servers, with the ability to run
client/server networks securely. I wouldn't be surprised if the
worldwide Linux "market" share (measured in installed linux desktops vs
installed WinXX and Mac desktops) is three or four times what e.g. IDC
acknowledges, and it is rapidly growing as linux distros come into
existence that FOCUS on the desktop and appear to be very popular.
The third (application space) has made tremendous strides. Open Office
has all but eliminated the Office gap -- and is one of several choices
available, as usual. Cedega and Wine have lowered the gaming gap, with
some users actually reporting better game performance under linux
emulation than under native windows! And a glance at e.g. Fedora Core
extras gives you an idea of what has happened to the application space
in general -- it is literally exploding with new, cool, GUI based
applications. There is more stuff available in extras for linux for
free than there is in Best Buy for Windows for several thousand dollars.
With yumex one can now SHOP the linux repo chain for those applications
as never before. Software that either doesn't exist period for WinXX or
that exists but costs hundreds of dollars for WinXX is a few mouse
clicks and short download away. Yum may end up being the ultimate
"Windows Killer" application -- in addition to fully automating software
maintenance and providing security updates literally overnight (there
have been linux exploits where the gap from publication to automated
installation of patched updates EVERYWHERE IN THE TREE is as little as
24 hours -- not a lot of room for crackers to get traction in there, is
there:-) yum now permits a truly vast range of available linux software
to be laid out and browsed in the bazaar of the possible, sampled freely
by the end user, all without spending a penny.
I honestly think that the desktop software gap is pretty much closed,
and is if anything leaning inexorably over towards linux. After all,
once a really great GPL application is released for Linux, it tends to
stay "forever" and only improve. There are only so many applications
most people are likely to use or need. When EACH person's application
space is covered, the marginal cost of the linux-windows move (in either
direction) is dominated by the REAL cost of Windows vs Linux per se, a
price war Windows can literally never win at least once the hardware
device issue is resolved. Numerous surveys have shown that the number
of Linux developers continues to rise and overtake the number of Windows
developers, something that is of course really hard to explain if the
surveys concerning "market share" or the perception of real computer
people were anywhere NEAR correct. Developers are voting with their
feet, or in this case their fingers.
At this point one place where the software gap persists (and is VERY
DESTRUCTIVE to Linus's plan of dominating the universe) is in business
middleware. This is the one place on the planet where people want,
need, absolutely insist on shrink-wrapped solutions, and Linux has not
proven itself capable of filling that need with shrink wrapped software.
There are solutions, sure, but they tend to be GPL projects, underfunded
and understaffed. Hobbyware, as it were. It just isn't "fun" to build,
design, maintain business middleware, and nobody has realized that it is
perfectly possible to build AND SELL commercial software in this rather
huge market without much risk that OS developers will come along and eat
your lunch anytime soon. Indeed, what MOST companies want to buy here
is a direct support line and confidence as much as software anyway. I
personally think it is a tremendous business opportunity waiting for
somebody to realize it and sell turnkey Linux-based business middleware
that can talk with equal ease to clients on Lin or Win desktops.
Integrated accounting, payroll, POS, inventory, HR -- dull as molasses
to code and maintain, but absolutely essential to a myriad of small to
midsize businesses that OTHERWISE have a strong interest in running
Linux top to bottom to minimize the overall costs of IT.
The hardware/device problem persists, alas. Linux printer support and
graphics device support have improved tremendously (and the associated
time gap has shrunk accordingly) as a number of linux vendors have
correctly realized that this is really the last place where Windows
holds a significant advantage and have focused resources here. Network support
is also improving tremendously, with chipset initiatives, vendor
support, and NDIS promising to close the time gap while a native driver
is developed and handle edge cases. NetworkManager promises to make linux
networking at long last user friendly and automagical, even in complex
environments. USB devices have fortunately tended to be standards
compliant, and Microsoft hasn't managed to monkey with the standards in
such a way that gives them a meaningful advantage here, and USB support
is now pretty good and even automagical. Multimedia (CD's, DVD's) tend
to just work, although DVD playing (for example) tends to be suboptimal
unless you get the driver thing perfectly worked out for your hardware.
Still, motherboard/chipsets suffer from a lag, especially on
motherboards that add some "differentiating" chips or features (or are
just plain broken relative to spec). Cameras are much better but still
iffy, especially CHEAP cameras. And so on -- ditto with the software that
manufacturers tend to release with their hardware. So it has gotten
better, but this gap is still open and very annoying indeed -- more than
enough to keep linux out of the hands of the truly luddite or
computer-challenged who can muddle through with Windows, mostly.
Windows has also closed some gaps of its own in the meantime, becoming
much more stable and much more aggressively updated, although it is
still a virus/trojan/spyware bugfarm if installed without expensive
add-on software watching day and night.
This is probably why Microsoft is trying to attack the HPC market more
than any other reason. It gets them a few headlines that allow them to
convince nervous investors that today is not yet the day that the corner
has been turned and their market share PLUNGES. If and when it becomes
the perception of hardware vendors that (say) 10-20% of the desktop
market share belongs to Linux, it will no longer be so easy not to
provide adequate linux support BEFORE releasing new products, no longer
be so easy not to invest the tiny bit of money needed to get YOUR
hardware's RPMs onto e.g. Livna and tracked by kernel revision number as
needed. The hardware driver gap will rapidly close the rest of the way.
Commercial software developers will at last have to take desktop linux
seriously, and work out some way of selling their products so that
they'll run UNDER an open source environment (where again, yum will Be
Their Friend if they only figure this out).
Now I personally think that by the time you account correctly for all
the linux systems in China, in India, in Brazil, when you start to count
linux systems in University environments correctly even though they are
installed by students directly from a mirrored repo either beside or on
top of an older Windows installation, when you account correctly for all
those "forced" Windows pre-installs being thrown away, that Linux might
well have a global desktop market share of 5-10% already. This isn't
reflected in sales figures, of course -- most of the copies extant were
installed for free and with no direct reference whatsoever to the
originating company (if any). It is all but impossible to measure just
how many systems there are installed in this way without walking the
entire tree -- perhaps checking the yum update repo logs all the way to
the bottom might give you an accurate count -- but then there is debian.
If my surmise is correct, then yeah, duh, Microsoft HAS to do ANYTHING
IT HAS TO to keep it quiet. If you think MS employees have underwater
options now, imagine what they'd be a day after that news hit the
street. However, it is running on empty. Software developers have
learned the hard way that if they ever invent NEW software that is worth
a ton of money, they'd better be prepared to give up 70% of the market
share to Microsoft on demand, although Microsoft has backed off on that
in recent years just a bit and let some smaller markets live unmolested,
maybe as a result of the antitrust suits they "lost" (really, won by
virtue of the wimpiness of the settlements -- a half-billion dollars was
nothing compared to the years of delay and gain in market share).
> From roughly 28$ to about 22$ right now.
> So earning yourself some options to buy microsoft shares will probably not be
> so popular
> among microsoft employees unlike the past.
> Some things of them are becoming real nerd products, such as their server
> edition 2003
> is an impossibility to be used by normal users.
> Basically i see a few big issues that are all USB related that can crash/lock
> up windows
> completely, apart from that their reliability has been greatly improved. It
> can run without
> crashing for a few weeks now.
> So if they put out a 'cluster product' now, you can laugh, but basically it
> means money for companies
> who produce software to run under that OS, so they WILL produce software for
> the cluster expert edition,
> as microsoft WILL sell ten+ million of copies of their server editions.
Ten million copies? Vincent, if you added up all the top500 systems,
and every system in the top 500 had an AVERAGE of 1000 nodes, it would
still only be 500,000 systems, or a twentieth of that. Even given the
huge IBM clusters and Spain's cluster, I doubt there are a million nodes
in all the top 500 put together, and downhill from that you hit
university clusters and abject poverty. That is, the vast bulk of nodes
in both the top500 and everything else are in environments that would
never, ever pay MS for software on this scale.
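
For anyone who wants to check the arithmetic, a few lines of Python will
do. The figures are the ones quoted above (ten million copies, 500 systems,
a deliberately generous average of 1000 nodes each); the variable names are
just illustrative.

```python
claimed_copies = 10_000_000   # "ten+ million of copies" from the quoted text
top500_systems = 500          # entries on the Top 500 list
assumed_avg_nodes = 1000      # generous average assumed for the argument

total_nodes = top500_systems * assumed_avg_nodes
fraction_of_claim = total_nodes / claimed_copies

print(total_nodes)        # 500000
print(fraction_of_claim)  # 0.05, i.e. one twentieth of ten million
```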
They aren't running GUI applications, and if their code is all MPI and
runs just fine on Centos 4.0 or Scientific Linux or Warewulf+whatever,
why in the world would they change? Just e.g. getting cernlib to build
on a system is a huge piece of work that has already foiled porting many
of these applications to anything but SPECIFIC versions of SPECIFIC
linux distributions -- porting costs are a major obstacle now as ever.
These folks KNOW that it isn't cheaper to run a Windows network than a
linux network, ever. They KNOW that you cannot possibly install an
operating system in bulk quantities more easily and cheaply than is
possible with Linux right now, which is free (per seat/node) and at the
direct expense of a minute or so of labor per system, if that. They KNOW
that their MCSEs are overpaid and clueless about clustering (unless they
are bright and have been re-educating themselves on the Linux clusters).
There might or might not be ten million cluster nodes in the world, but
MS would be very, very lucky to one day, if they work very very hard and
discount their HPC product to basically "nothing" and/or focus on
commercial markets as previously noted, end up on 10% of them. Nothing
being precisely the marginal cost of most of the products they are
Price DOES matter, in the long run. In the cluster world, Solaris has
pretty much precisely the same advantages and disadvantages as Linux
does -- except for cost. Indeed, arguably most of the original
development of open source PVM and MPIs occurred under SunOS, as it
dominated the workstation market at that time. Great support from true
experts, great market reputation, decent hardware performance wise, and
a huge head start. Now count the number of Solaris/Sparc clusters in
the world compared to Linux/Intel/AMD. Oooo, guess people DO care about
that factor of two or more in price/performance... really really care.
Ultimately, I agree with the earlier assessment (from Jim?), that the
entire HPC market is no more than a pimple on MS's behind as far as
likely contribution to MS's bottom line is concerned, and suspect an
ulterior motive in their entering the market -- like trying to at long
last put a finger in a gradually widening crack in the leaky dike that
supports their product all nice and dry and monopolistic and dominant
inside its protected ring of FUD.
But it's an ocean out there, exerting a quiet and inexorable pressure on
that wall. The wall is crumbling, bit by bit, and every bit that washes
away is very, very unlikely to ever come back. One day, possibly soon,
there will come a storm....
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu