[Beowulf] Scientific computing's future: Can any coding language top a 1950s behemoth?

Prentice Bisbal prentice.bisbal at rutgers.edu
Sat May 17 09:42:18 PDT 2014


On 05/17/2014 10:34 AM, Lux, Jim (337C) wrote:
>
>
>
>
> On 5/13/14, 4:48 PM, "Ellis H. Wilson III" <ellis at cse.psu.edu> wrote:
>
>>
>> Wrapping this back into the original issue (next-gen HPC languages), I
>> think the core issue is non-programmers programming.  <begin
>> generalization>  They don't really want to program.  They're doing it as
>> a means to an end.  They'd be more than happy to write equations in lieu
>> of routines, as the article alludes to. <end generalization>
>
> Actually, I think this is the core thing.  For most people, they are
> interested in doing their job, not programming, whether they are just
> typing a book report or doing a full scale simulation of the earth's
> atmosphere.  The programming is a means to an end.
>
This is true, but it's the wrong attitude, and I think it's a result 
of both the educational system and the extreme that the 'publish or 
perish' academic paradigm has evolved into. It's also a lesson in false 
economies.

Why the educational system? Well, the older scientists I've worked with 
know the minutiae of all aspects of computing - programming languages, 
processor designs, even assembly. Since this is pretty common with 
scientists of a certain age, I assume that in years past learning this 
computer science went hand in hand with learning their science 
(physics, etc.). The younger scientists I work with all seem to have 
lousy programming skills and little respect for the details of 
computing; they prefer to work in MATLAB (or similar), and if they run 
into a problem, they show little interest or know-how in getting to the 
root of it.

I wouldn't call the above statements generalizations as much as trends 
I've noticed over the past 10-15 years. There could be other causes, but 
I think the main reason for this is that schools no longer put enough 
emphasis on understanding computers as a tool of research. Now that we 
have things like MATLAB and Mathematica, why waste precious 
credit-hours teaching students how to program?

I know this to be true based on my curriculum as a Chemical Engineering 
student 20 years ago compared to the curriculum at the same school 
today. When I was a student, all engineers had to take Fortran for 
Engineering their freshman year. My numerical methods class, which was a 
Chemical Engineering class taught by a Chem Eng professor, taught us how 
to write matrix operations and ODE integrators line by line. I'm pretty
sure they don't teach Fortran to engineers any more, and I know that the 
numerical methods class is based on MATLAB now. I wouldn't be surprised 
if that numerical methods class is completely eliminated in the next few 
years.
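
To give a concrete (and contrived) example of what I mean by 
line-by-line: something like a forward Euler integrator for 
dy/dt = -k*y, written out by hand instead of handed off to a canned 
solver. The names and numbers here are just illustrative:

! Forward Euler for dy/dt = -k*y, y(0) = 1, integrated out to t = 1.
! Written out step by step, no canned solver.
program euler_demo
  implicit none
  integer, parameter :: nsteps = 1000
  real(8), parameter :: k = 2.0d0, dt = 1.0d-3
  real(8) :: y
  integer :: i

  y = 1.0d0                    ! initial condition
  do i = 1, nsteps
     y = y + dt * (-k * y)     ! y_{n+1} = y_n + dt * f(t_n, y_n)
  end do
  print *, 'Euler y(1) =', y, '  exact =', exp(-k)
end program euler_demo

Trivial, but once you've coded that yourself, you understand step size 
and truncation error in a way that calling ode45 never teaches you.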

I think this is wrong. If you are a carpenter, your main goal is to build 
a house, but I'm sure you are still taught how to use every tool in your 
toolbox safely and effectively in vocational school. Who would hire a 
carpenter who was never taught how to cut crown moulding properly with a 
mitre saw? Hiring a scientist or engineer who doesn't know how to use 
computers effectively is the same thing.

I find that in the current academic research environment, the SOP in 
many places is to crank out crappy code, get results as quickly as 
possible, publish them, and move on to the next paper, all with the goal 
of cranking out as many papers as possible in order to get as many 
grants as possible. I equate this to a restaurant that tries to increase 
revenue by rushing customers through their meals so it can turn over as 
many tables as possible in an evening, with no concern for the dining 
experience.

I say this is a false economy because the real goal is time to solution, 
not time to code completion. I've seen many situations where people 
write code quickly, and then their simulation runs for a month. Another 
person comes along, makes some minor changes, and the simulation now 
completes in less than a day. Now multiply that time difference by 
multiple simulations... If they spent a little more time learning the 
art of coding, or just some 'best practices', and a few more days 
coding, they could save themselves weeks or months of waiting for 
results, or even years over an entire career. But few 
ever see it that way.
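
The 'minor changes' are often embarrassingly small. A made-up but 
representative example: Fortran stores arrays column-major, so simply 
swapping two loop indices can turn cache thrashing into contiguous 
access:

program loop_order
  implicit none
  integer, parameter :: n = 4000
  real(8), allocatable :: a(:,:)
  real(8) :: s
  integer :: i, j

  allocate(a(n,n))
  call random_number(a)

  ! Slow: the inner loop walks along a row, so consecutive accesses
  ! are n*8 bytes apart in memory (Fortran is column-major).
  s = 0.0d0
  do i = 1, n
     do j = 1, n
        s = s + a(i,j)
     end do
  end do

  ! Fast: swap the loops so the inner index is the first one.
  ! Same arithmetic, same answer, stride-1 access -- typically
  ! several times faster for large n.
  s = 0.0d0
  do j = 1, n
     do i = 1, n
        s = s + a(i,j)
     end do
  end do
  print *, 'sum =', s
end program loop_order

Nobody's simulation is literally a matrix sum, of course, but a 
month-long run with the loops in the wrong order is exactly the kind of 
thing a few days of review by someone who knows the craft will catch.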


>
>
>
>
>> Therefore,
>> maybe, instead of continuing to attempt to create the "perfect language"
>> that fits their needs,
>
> The challenge is that there are so many problem domains that what you
> really need is a custom language tailored to each of them.  And isn't that
> what we have with large subroutine libraries and what not? Someone who is
> stringing together calls to library routines is basically programming in a
> domain specific language (with a strange hybrid of the syntax and
> semantics of the underlying implementation language, rather than something
> that is domain relevant).
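
Agreed, and anyone who has strung together LAPACK calls knows the 
feeling. A sketch (assuming a system with LAPACK linked in, e.g. 
-llapack): solving Ax = b is one call, but the 'words' of the language 
(dgesv, lda, ipiv, info) are LAPACK's conventions, not anything domain 
relevant:

program solve_demo
  implicit none
  integer, parameter :: n = 3
  real(8) :: a(n,n), b(n)
  integer :: ipiv(n), info

  ! A small symmetric system A x = b (values are arbitrary).
  a = reshape([2.0d0, 1.0d0, 0.0d0, &
               1.0d0, 3.0d0, 1.0d0, &
               0.0d0, 1.0d0, 2.0d0], [n, n])
  b = [1.0d0, 2.0d0, 3.0d0]

  ! One LAPACK call factors A and solves in place, overwriting b
  ! with the solution x.
  call dgesv(n, 1, a, n, ipiv, b, n, info)
  if (info /= 0) stop 'dgesv failed'
  print *, 'x =', b
end program solve_demo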
>
> Or, what you see is domain specific pre and post processors for the
> underlying numerical computations.  For the Numerical Electromagnetics
> Code there's dozens of preprocessors and post processors ranging from
> Excel spreadsheets and macros to dedicated graphics editors and
> sophisticated plotting programs (since the underlying code is really
> looking for 80 column input records and generates 132 column output
> files).  But those pre-post processors are sort of narrow, and don't
> really rise to a "programming language" in that they have a fairly
> simplistic architectural model.  They provide some basic iteration and
> computation syntax (e.g. one can systematically change the length of an
> element of the model and get a summary of the output), but it's not like
> you can actually do "programming."  You couldn't write a customized
> optimizer using the pre, post processor capabilities.
>
> The same is true in structural analysis and in computational flow and, I'm
> sure, although I have no experience in it, with computational chemistry.
> Anyone who is doing lots of this kind of thing has the basic validated
> simulation codes and a huge toolbox of modeling and analysis stuff.  Maybe
> it's programs that take a solid model and automatically generate the FEM
> grid. Maybe it's a program or routine that takes the raw analysis output
> and generates output in a particularly useful format (domain specific).
>
> People optimizing race cars do not literally re-invent the wheel model
> each time they do a new simulation and analysis.
>
>
>> maybe the better solution is to teach them the
>> tenets of proper programming so they can grasp the process and instruct
>> them on ways to write very clean and elegant design documents.  Sure, in
>> some cases that may take as long just to get the design doc done as it
>> would for them to just code it, but in the long run if said code gets
>> wrapped into a larger project (or grows into one) it will result in far
>> less maintenance and complexity than having 10 physicists and 10 CS
>> folks both playing with the code simultaneously.
> Never going to happen: a lot of scientific computation is done by
> incremental development without a clear picture of where the end point is.
>   You write some Matlab code to analyze some raw data you collected.  Hmm,
> that looks interesting, so you graft on another step of processing.  That
> looks better, but, hey, this aspect is interesting now, so you write some
> more code to do the processing needed.

This is very true.
>
> Rarely does someone start out with a clean sheet and say "I'm going to
> write a numerical simulation of the weather", because that would be a
> herculean (and expensive) task.  Particularly in what I'll call scientific
> computation, the government funded development process is characterized by
> receiving relatively small amounts of budget over many, many years.  If
> you go to the NSF site for instance, and look at the several dozen awards
> for climate and large scale dynamics, you'll see that they're pretty much
> all in the "few $100k" range.  Those PIs receiving the funds are
> interested more in the science than in the software engineering (it is the
> National *Science* Foundation, after all).
>
> It is possible that there is significant commercial development of these
> sorts of models (almost certainly the case in the biotech field) and I
> would imagine that they DO actually use better design processes.
>
> And for something like NASTRAN, where there is a clearly identified large
> scale need, it could get funded with a larger chunk of change, and
> hopefully use decent software engineering.
>
>
>
> James Lux, P.E.
> Task Manager, FINDER - Finding Individuals for Disaster and Emergency
> Response
> Co-Principal Investigator, SCaN Testbed (née CoNNeCT) Project
> Jet Propulsion Laboratory
> 4800 Oak Grove Drive, MS 161-213
> Pasadena CA 91109
> +1(818)354-2075
> +1(818)395-2714 (cell)
>   
>
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


-- 
Prentice



