[Beowulf] Why I want a microsoft cluster...

Sun Nov 27 10:20:43 PST 2005

> >> why?  because low-end clusters are mostly a mistake.
> 
> I disagree on this for a number of reasons.  It may make sense for some 
> university and centralized computing facility folks for larger machines. 

but my point is mainly that most researchers' workloads are bursty;
burstiness argues for shared resource pools.

>   However, one of the things I have seen relative to the large scale 
> clusters out there is most are designed for particular workloads which, 

but that's silly - you observe that there exist some bad examples,
and generalize to all shared clusters?

> for, simply do not make sense.  Specifically, few informatics folk need 
> low latency interconnect.  They need large number of processors, big 
> memory systems, and large local IO facilities, not to mention very high 
> speed data scatter across their cluster.   This has been recently one of 

sure, no problem - who ever said that a shared cluster must be homogeneous?

> >> at a university like mine, for instance, nearly _everyone_ realizes 
> >> that it's
> >> insane for each researcher to buy/use/maintain his own little $50-500K
> >> cluster.  I see three clear reasons for this:
> 
> Again, it is not, if the large cluster was designed for different 
> purposes.  More on that in a moment.

I agree, it would be stupid to design a single, shared, homogeneous cluster
for use by many disparate types of researchers.  so just drop your
preconception that clusters need to be homogeneous!

> >>         - the maintenance cost of a cluster is very sub-linear.
> >>         - most workloads are bursty.
> 
> Hmmm... the clusters we have put into universities are quite loaded. 

but that doesn't mean that an individual's workload is not bursty.

> >> the first two factors encourage larger clusters; the latter means that 
> >> bursts
> >> can be overlapped in a shared resource very nicely.
> 
> I cannot comment on all bursty natures.  I have seen at some places a 
> bursty load from a resource that was overspecified for the need.  There 
> simply is not enough work to fill the cycles.

I'm failing to express the point clearly: all researchers have an individual
demand which varies from constant to impulse-function-like.  people with 
constant demands are often performing the same crunching on data which 
comes from some source in production use.  for instance, suppose they have 
an NMR instrument which gives them 2 measurements/day, and each requires 
100 cpu hours.  for those people, a non-shared cluster appliance makes 
excellent sense.
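
to put a number on that constant-demand case, here's a quick back-of-envelope
sketch in python.  the 2 measurements/day and 100 cpu hours come from the
example above; the helper name cpus_needed is made up for illustration.

```python
# steady number of CPUs needed to keep up with a constant stream of jobs.
# cpus_needed is a hypothetical helper, not something from the thread.
def cpus_needed(jobs_per_day, cpu_hours_per_job):
    # cpu-hours generated per day, spread over the 24 hours available
    return jobs_per_day * cpu_hours_per_job / 24.0

nmr_cpus = cpus_needed(jobs_per_day=2, cpu_hours_per_job=100)
print(f"{nmr_cpus:.1f} CPUs busy around the clock")
```

that works out to roughly 8.3 CPUs kept busy continuously, which is why a
small dedicated box is a reasonable fit when demand really is flat.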

in my experience, this is an unusual case.  in the HPC consortium I work for,
researchers have extremely bursty activity.  some of this is the
non-equilibrium nature of the university environment (teaching, rush
revisions for that paper, phases of a thesis, etc).  but mostly it's just 
that people tend to think of something, do it, then have to ponder the
results a bit before they can generate new demand.

> cluster, while they may like it, it is likely that the machine may have 
> been a bit too large for the need.  This is really hard to generalize 

ah!  but you missed the point of having a _shared_ cluster - that bursts
can interleave.  not only can you now keep the cluster busy, but a researcher
can burst to much greater demand than he could justify on his own.
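
a toy sketch of the interleaving argument, in python.  everything here is an
assumption for illustration, not measured data: 20 researchers, each wanting
64 cpus in a random 10% of time slots and nothing otherwise.  it compares
private clusters sized for each person's own peak against one pool sized for
the peak of the summed demand.

```python
import random

def bursty_demand(rng, slots, burst_prob, burst_size):
    """CPUs wanted per time slot: burst_size with probability burst_prob, else 0."""
    return [burst_size if rng.random() < burst_prob else 0 for _ in range(slots)]

def compare(researchers=20, slots=1000, burst_prob=0.1, burst_size=64, seed=0):
    # all parameters are hypothetical; change them to match your own site
    rng = random.Random(seed)
    demands = [bursty_demand(rng, slots, burst_prob, burst_size)
               for _ in range(researchers)]
    total_work = sum(sum(d) for d in demands)

    # private model: each researcher buys enough CPUs for their own peak
    private_cpus = sum(max(d) for d in demands)

    # shared model: one pool sized for the peak of everyone's summed demand
    aggregate = [sum(d[t] for d in demands) for t in range(slots)]
    shared_cpus = max(aggregate)

    return {
        "private_cpus": private_cpus,
        "shared_cpus": shared_cpus,
        "private_util": total_work / (private_cpus * slots),
        "shared_util": total_work / (shared_cpus * slots),
    }

result = compare()
for key, value in result.items():
    print(key, round(value, 3))
```

since the peak of a sum can never exceed the sum of the peaks, the shared pool
never needs more cpus than the private clusters combined, and its utilization
is never lower.  how big the gap is depends entirely on how bursty and how
uncorrelated the demand is.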

yes, you could lock each researcher onto their own piddly little cluster,
and smooth away their bursts using queueing so that the cluster would come 
close to constant load.  this would be ridiculous!

> about, as most of our customers have computing needs that grow linearly 
> or superlinearly in time.

great; no one said a shared cluster must remain fixed in size.

> > towards the model of the "mainframe supercomputer" in its special 
> > machine room shrine tended by white garbed acolytes making sure the 
> > precious bodily fluids continue circulating.
> 
> Yes!!!  Precisely.    But worse than this, it usually causes university 

hey, I'm all for decentralized computing _if_ it makes sense.  there's no 
question network+desktops are better than mainframe+terminals.  but that 
doesn't mean that all centralization is bad.

> But worse than this, it usually causes university 
> administrators to demand to know why department X is going off on their 
> own to get their own machine versus using the anointed/blessed hardware 
> in the data center.  As someone who had to help explain in the past why 

take a look at the fine print on your grant - the university is obliged to 
ensure that grant money is spent effectively.  that clearly does mean that 
departments/groups should not each go off and buy their own little toy.
make that "underutilized, poorly configured, mal-administered toy".

> Department X may be getting its own machine due to the central resource 
> being inappropriate for the task, or the wait times being unacceptably 
> long, or the inability to control the software load that you need on it.

then your central IT people are incompetent - fire them, since their purpose
is to serve you.  it's just silly to claim that no centralized facility can 
be responsive and appropriately configured.

> folks manage the resource, and allow cycles to be used by others when 
> the owners of the machine are not using so much of it.   Moreover they 

cycle scavenging is better than nothing, but only a way to mitigate a bad
solution.  and it really only helps people who have least-common-denominator
kinds of jobs (basically, small-model MC).

> Forcing all users to use the same resource usually winds up with 
> uniformly unhappy users (apart from the machine designers who built it 
> for a particular purpose).

this is overly pessimistic.

> > There is, I maintain, a real market for smallish clusters intended to be 
> > operated by and under the control of a single person.  In one sense, I'm 

unless the person has constant demand, I disagree.  OK, so let's start with
your personal-super.  now, is there some reason why its location matters?
perhaps you're using it for really high-end parallel-rendering.  OK, great.
you almost certainly aren't.  suppose you were happy with a single gigabit
connection to the cluster - is there any reason not to locate it in a central
machineroom?  chances are excellent that power and cooling will be more 
effective than your lab/office.  now, suppose you had exclusive/dedicated
access to the same-sized chunk of a scaled up version of the cluster.
identical components, just your own little partition.  would you be happy?
one last question: do you mind if someone else uses your chunk of the 
cluster when you aren't?  BINGO - you have just achieved what I'm talking 
about: a shared resource pool that is co-located to make things like 
better networking available, minimize infrastructure costs, etc.  there's
absolutely no reason such a shared cluster can't incorporate all your choices
of software, the hardware tuned to your application, etc.

>   Second, the large market is shrinking.  Again, this is not a slap 
> against Jim and Mark.  The real HPC cluster market is moving down scale, 
> and the larger ones are growing more slowly or shrinking.  This is going 
> to apply some rather Darwinian forces to the market (has been for a 
> while).

I believe this is mostly an artifact of the kind of lumping-together that 
market-research companies do.  it's clear that the previous model for
supercomputers (cray/vector/etc) is disappearing.  and it's clear that 
commodity-cluster HPC is going to replace it.  since the research is lumping
together a decreasing number of $1e7 machines with an increasing number of 
machines built from $1e3 blocks, wow, the average is decreasing.  that doesn't
actually tell us much.
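
a small worked example of how the lumped average misleads, in python.  the
machine counts and the $5e4 price for a small commodity cluster are invented
for illustration; only the $1e7 figure comes from the text above.

```python
def mean_price(n_big, n_small, big_price=1e7, small_price=5e4):
    """Average price per machine when two very different segments are lumped."""
    return (n_big * big_price + n_small * small_price) / (n_big + n_small)

# hypothetical "before": 10 big machines, 100 small clusters
before = mean_price(n_big=10, n_small=100)
# hypothetical "after": slightly fewer big machines, 4x as many small clusters
after = mean_price(n_big=8, n_small=400)

print(f"mean price before: ${before:,.0f}")
print(f"mean price after:  ${after:,.0f}")
```

in this made-up case the mean drops to about a quarter of its earlier value
even though total spending barely moves ($1.05e8 vs $1.0e8).  the average is
mostly tracking the mix of segments, not the health of either one.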

> It's real.  It's just not growing in dollar volume.  Some of us wish it 

largely because components are getting cheaper, I believe.  again, the 
lumped statistic is hiding multiple things happening: no doubt there is 
some increase in the number of personal clusters (say, <= 32 cores),
simply because they're so damn easy to do.  it's also clear that traditional
shared clusters are increasing in size.  and that the even older tradition
of vector/SMP machines is decreasing rapidly.  averaging such a bumpy
distribution together does _not_ provide insight...

> would, but budgets get trimmed, and hard choices are made.  You have to 
> go through fewer levels of management to justify spending 50k$ than you 
> do 500k$, or 5M$.

right - natural and often justifiable laziness leads to random personal
clusters.  that doesn't mean that large, shared clusters are bad or hard.

> Yup.  A 100k$ cluster appliance won't sell well according to IDC.  If 
> your appliance (board/card/box/...) sits in the 25-50k$ region and 
> delivers 10-100x the processing power of your desktop, then you should 
> see them sell well.  This is where the market growth is.  Companies 

personal clusters are the new workstation.  no real surprise there.
you always provide your people with appropriately-sized productivity tools.
the question is: how much waste due to fragmentation can you tolerate, 
given the rather extreme benefits to pooling cluster resources?

> windows upon the users.  The demo at SC was ok, but you can do stuff 
> like that today with web services systems.  The issue is that there are 
> no real standards associated with that.   Not w3c stuff, but higher 

the worst part of "web services" is that it's an empty standard, like XML.
they're both merely lexical and syntactic, not semantic.  just because your 
soap signatures match doesn't mean that the service does what you want.

> This was a mistake on the part of these people.  Microsoft has 
> effectively unlimited resources, and significant patience.  My take on 

no thanks.  there are fundamental and valuable differences between the 
MSFT way and the unix way.  MS's success in the server market is mostly
just a consequence of the attack of the killer micros, not any special
excellence from Redmond.