[Beowulf] Why one might want a bunch o' processors under your desk.


Vincent Diepeveen diep at xs4all.nl
Tue May 10 05:42:39 PDT 2005


At 05:34 PM 5/9/2005 -0700, Jim Lux wrote:
>At 01:40 PM 5/9/2005, Vincent Diepeveen wrote:
>>At 05:49 PM 5/6/2005 -0700, Jim Lux wrote:
>> >Today I was running a lot of antenna models, using a method of moments
>> >code called NEC4 (in FORTRAN).
>> >Just to describe the computational task for context:
>> >
>> >The antenna I am modeling is 9 patches, in a square grid, the middle
>> >one of which is excited.
>> >
>> >
>> >
>> >What I DON'T want to do is rewrite (or even recompile) the antenna
>> >modeling code. It works, it's been validated, it's been optimized (to a
>> >certain extent), and besides, my job is to use the code, not to rewrite
>> >it for parallel computing.
>>
>>You know, i can get very sad reading that.
>>
>>I worked real hard for 1.5 years (for several months i worked 7 days a
>>week, from 9 AM to 11 PM or even later) to get a hard-to-parallelize
>>algorithm to work on a 512 processor SGI Origin 3800, without being able to
>>test on the machine.
>>
>>If you can get system time on a 1024 processor machine, for how many CPU
>>hours is it? That means that the organisation in question is spending
>>tens of thousands of dollars of system time on you, and probably even more
>>on the salaries of the organisations guarding the machine.
>>
>>You aren't even prepared to do the hard work to let the program run more
>>efficiently within the system time given?
>>
>> >And yes, there are approximations, better modeling codes, etc.
>> >available.  But again, I'd like to avoid having to track them down,
>> >validate them, and so forth. I want to run my tried and true (but slow)
>> >code, faster.
>> >
>> >I suspect that I am not alone.  There are probably hundreds of people who
>> >have similar kinds of problems, and would be well served by a desktop or
>> >personal supercomputer.
>> >
>> >Flame On!!
>>
>>If you are not prepared to modify the software,
>>then basically i'm missing the point of the problem presented.
>>
>>Any way to run it more efficiently involves re-programming the software.
>>
>>Matrix type stuff is very well possible to parallelize.
>
>Actually, this describes the basic problem in the high performance
>computing area very well. The people who have jobs that "need" HPC don't
>have the skills or time or resources to modify their code to use some 
>particular computational resource.
>
>So you have a resource (a very high performance computational system) that 
>goes begging looking for work, because there's some other "non-free" 
>resource needed to effectively use it (that is, skilled software people). I 
>should point out that JPL's 1024 processor Dell Xeon cluster is actually
>heavily used, as are the Cray and the SGI machines, so my comments are of a 
>general nature.
>
>And, yes, the organization IS paying hundreds of thousands of dollars to 
>provide a shared resource, just as it pays for the buildings, the library, 
>and so forth.  And, none of these resources are "free", even if they come 
>as part of the institutional overhead.
>
>But, at some point, you have to decide whether to allocate your resources 
>to developing software, or working on your particular problem, for which 
>the software is merely a tool.  You do a cost benefit analysis: do I spend 
>a work month of time on parallelizing some code, so that the remaining 4
>months' worth of work takes only 2 months? Or, do I just soldier on with the
>old slow code, and adapt my working style to making overnight runs?
>
>Then, there's also the situation that even if you DID have the money, you 
>might not have the people resources. It's very difficult to "buy" a few 
>weeks' time of a skilled developer. If they're skilled, they're probably 
>busy and fully subscribed. If I have to wait a month for them to fit me 
>into their schedule, I might as well have been running the old slow code, 
>and be partway to my end point.
>
>And then there's the granularity of purchase problem.  If the 10 skilled
>developers are already fully occupied, my little one work month increment 
>of work would require hiring a whole additional person, which my little 
>research task could not afford.
>
>Add to this the fact that for most codes, it would probably take many many 
>work months to significantly improve and modify them. It's a full time job 
>in itself. And that's assuming that you have sufficient visibility into the 
>code to do it.  What if you're stuck with a tool that is ONLY available as 
>a compiled program (and such things are not particularly 
>uncommon)?  Imagine trying to modify OpenOffice to use Base 9, instead of
>Base 10.  Sure, the source is available, and the actual change might be 
>quite simple, once you knew where to change it.  The problem is that it 
>would probably take you a year to find the 4 or 5 essential routines, and 
>to make sure that everything still worked after you were done.
>
>
>So... the trick is to find a way to make cluster (or super) computing 
>usable in a transparent fashion?  This is one reason why people buy 
>mainframes, after all.  You can run the same old code, faster. It's the 
>original concept that Cray had.  Run your unchanged FORTRAN program, a LOT 
>faster.  It's the original concept behind a system I worked on back in the 
>80s, where the idea was to build an 80286 emulator out of fast ECL, so that
>IBM PC software could be run lots faster.  Not particularly clever, but 
>still, elegant in a kind of perverse way.
>
>If the reconfiguration extends to maybe an hour or two of setting up 
>(because that's essentially what it takes to install a new software 
>package), you'll find that people are willing to do it.  But if it takes 
>weeks and weeks, you'll not get many takers.
>
>It's not laziness, nor a lack of desire, just a lack of appropriate
>resources.

Honestly, if you ask me, the only reason it happens is because the
government pays the bill and not you.



