[Beowulf] The True Cost of HPC Cluster Ownership

Joe Landman landman at scalableinformatics.com
Tue Aug 11 20:54:36 PDT 2009

Rahul Nabar wrote:
> On Tue, Aug 11, 2009 at 11:16 AM, Joe
> Landman<landman at scalableinformatics.com> wrote:
>> There is a cost to going cheap.  This cost is time, and loss of
>> productivity.  If your time (your students' time) is free, and you don't need
>> to pay for consequences (loss of grants, loss of revenue, loss of
>> productivity, ...) in delayed delivery of results from computing or
> (1) Why always consider it a "loss" of your student's time? I was one

Time is a zero-sum game irrespective of how much coffee you consume.  If 
you wind up spending large fractions of your time on computing, you 
spend less time on research.  Students as cheap/free labor means they 
aren't getting their research work done (unless their research is on how 
to build/maintain the cluster).

> such "student" and think there is enormous learning potential here. Of

Yes, there is much to learn.  Even some meta-learning, such as when not 
to spend time on things.

> course, my systems never did match the uptime / performance of a
> "turnkey" solution but the skills learnt in setting one up are rarely
> gained otherwise. At a university research is one goal; but learning
> is definitely another.

Hmm.... usually the process of research and the process of learning go 
hand in hand.  I agree that people *should* get a grounding in all 
aspects of their research, and should get their hands dirty to a degree. 
But you shouldn't have them spend the time they should be devoting to 
research focused exclusively on managing resources (unless you are 
trying to teach them time and resource management, which is a 
very important skill for scientists).

> (2) A key problem that I don't know how to work around for turnkey
> solutions: "How do I phrase the contract and performance guarantee so
> that I get the vendor to do all the things that I want?"

It starts out with you defining the goals you wish to achieve, and then 
working through the path to achieve them.  Decide which portion you wish 
to do, and find a partner to help you do what you don't want to do. 
A good vendor *will* partner with you to solve real problems.

> Many of us run codes that are not very high volume nor very
> standardized. Everybody wants to tweak and do something new.
> Especially in research. In such a scenario I don't want the vendor
> to "just give me boxes with an OS" but also get my code installed,

We like to get our hands on the code early, with test cases, so we 
understand how it will behave on the hardware before our customer gets 
it ... precisely so we can help answer questions, and solve problems.

Most vendors just want to deliver the boxes.

> compiled, running and optimized. Plus schedulers and some such. Not
> just install them but set up fairshares that reflect user situations.

:)  I sometimes joke that you know you have your scheduler set up right 
when everyone hates you.

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
