[Beowulf] [EXTERNAL] Re: ***UNCHECKED*** Re: Spark, Julia, OpenMPI etc. - all in one place

Lux, Jim (US 7140) james.p.lux at jpl.nasa.gov
Wed Oct 14 12:01:38 PDT 2020



From: Beowulf <beowulf-bounces at beowulf.org> on behalf of Oddo Da <oddodaoddo at gmail.com>
Date: Wednesday, October 14, 2020 at 6:05 AM
To: Michael Di Domenico <mdidomenico4 at gmail.com>
Cc: "beowulf at beowulf.org" <beowulf at beowulf.org>
Subject: [EXTERNAL] Re: [Beowulf] ***UNCHECKED*** Re: Spark, Julia, OpenMPI etc. - all in one place

On Wed, Oct 14, 2020 at 8:42 AM Michael Di Domenico <mdidomenico4 at gmail.com> wrote:
I believe your "lack of progress" statement is really just a
misunderstanding of what MPI represents.  To me, MPI is like the
flathead screw: they've been around a long time and there are
certainly a ton of alternate head designs on the market.  However,
wooden boat builders still use them because they just work, they're
easy to make, and they're easy to fix when it comes time to repair.

I see MPI as a low-level solution to the problem, at the abstraction level where you need to spell everything out. It is like the comparison between C and languages like Scala or Haskell or Julia. I am asking why there is no progress on the latter front in this setting: we have the message-passing level of abstraction, so why are we not interested in using it to build tooling at a higher level, where we can hide the how and focus on the what?
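
For concreteness, here is a minimal sketch in plain C with MPI (my own illustration of the point, not production code) of what "spelling everything out" means: even a distributed dot product makes the programmer manage ranks, local slices, and the explicit reduction, whereas a higher-level tool could express the same thing as a single reduce over a distributed array.

/*
 * Sketch: a distributed dot product written directly against MPI.
 * The programmer handles data distribution and the reduction by hand.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank owns a local slice of the two (here synthetic) vectors. */
    const int local_n = 1000000;
    double *x = malloc(local_n * sizeof(double));
    double *y = malloc(local_n * sizeof(double));
    for (int i = 0; i < local_n; i++) { x[i] = 1.0; y[i] = 2.0; }

    /* Local partial result ... */
    double local_dot = 0.0;
    for (int i = 0; i < local_n; i++)
        local_dot += x[i] * y[i];

    /* ... combined explicitly across all ranks. */
    double global_dot = 0.0;
    MPI_Allreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot = %g\n", global_dot);

    free(x);
    free(y);
    MPI_Finalize();
    return 0;
}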

You're equating a 20-year-old book with lack of progress, and frankly I
think that's a flawed statement.

20 years ago, when I was an undergrad, I took a 200-level course in data structures and algorithms, and it was taught in a programming language called Eiffel. The professor opened by saying, "Eiffel is gaining in popularity; there are many new books being written about Eiffel but none about C. Do you know why?" I raised my hand and said, "Because C is an old language, nothing has changed, and everything that was to be said about it was already said in many previous books." At least in that domain we could discuss these things - new languages, paradigms, and tooling abound. In the world of HPC, not so much.

---

Perhaps, for most problems, there’s no “burning need” for generalized layers on top of MPI – for a lot of problems, there are libraries that use MPI underneath to compute “large things” – so if your problem involves, say, inverting giant matrices, you just call the “invert matrix” function.  You neither know nor care that MPI is there, and there’s not much to be gained by having a new language with syntax for matrix inversion, as opposed to a function call.

My impression is that a whole lot of HPC computing is “not complex” in the algorithmic sense – it’s complex only because the work has to be spread across multiple nodes to get an answer in a reasonable time.  Large matrix-based models as used for weather, for instance – the “per voxel” code is pretty straightforward; it just needs to interact with all of its neighbors, deal with scale changes in multi-grid implementations, and somehow spread the work out efficiently.  Once someone has done the “parallelization” work, nobody working with the code needs to think about it again – there’s no new language construct (or whole new language) that makes their job easier; to the contrary, it would make their job harder.
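
To make the “per voxel” point concrete, here is a rough sketch (assuming a toy 1-D grid and a trivial averaging update, not any real weather code) of the pattern that gets written once: the domain is split across ranks, ghost cells are exchanged with the two neighbors each step, and the per-cell update itself stays simple.

/*
 * Sketch: 1-D domain decomposition with halo exchange.  Once this
 * boilerplate exists, the per-cell loop never mentions MPI again.
 */
#include <mpi.h>
#include <stdlib.h>

#define LOCAL_N 1024   /* interior cells owned by this rank */

static void exchange_halos(double *u, int rank, int size)
{
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* u[0] and u[LOCAL_N+1] are ghost cells; u[1..LOCAL_N] is owned data. */
    MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                 &u[LOCAL_N + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[LOCAL_N], 1, MPI_DOUBLE, right, 1,
                 &u[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *u     = calloc(LOCAL_N + 2, sizeof(double));
    double *u_new = calloc(LOCAL_N + 2, sizeof(double));

    for (int step = 0; step < 100; step++) {
        exchange_halos(u, rank, size);

        /* The per-cell update is straightforward; MPI only shows up above. */
        for (int i = 1; i <= LOCAL_N; i++)
            u_new[i] = 0.5 * (u[i - 1] + u[i + 1]);

        double *tmp = u; u = u_new; u_new = tmp;
    }

    free(u);
    free(u_new);
    MPI_Finalize();
    return 0;
}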

If your job is to go in and make some change to a big weather model, you’re modifying an existing, validated code base.  There’s not a lot of call for creating such a modeling code from scratch.

For instance, in my own area, numerical electromagnetics, a lot of people use NEC, still in Fortran – now getting on toward 40 years after the first EM codes were written. Is there a *good* reason to change? Nope. The code works well, and I can run thousands upon thousands of iterations, easily spread across many nodes to reduce “end-user perceived run time”.  I’ve not tried a gigantic model (say, a million wires), which might break the back end.  I suspect, though, that the BLAS/LAPACK libraries can handle inverting a million-by-million matrix, and there may even be an MPI implementation to spread it (just checked: yep, ScaLAPACK, by our long-time list friend Jack Dongarra and others).
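
That “thousands of iterations spread across many nodes” case barely needs MPI at all. As a hedged sketch, with run_one_case() standing in for a single NEC-style simulation (a hypothetical placeholder, not a real interface), the whole thing is a round-robin split by rank:

/*
 * Sketch: embarrassingly parallel parameter sweep.  Cases are dealt
 * out round-robin by rank; no communication is needed until results
 * are (optionally) gathered at the end.
 */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical placeholder for one independent simulation run. */
static void run_one_case(int case_id)
{
    printf("running case %d\n", case_id);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n_cases = 10000;

    /* Each rank takes every size-th case. */
    for (int i = rank; i < n_cases; i += size)
        run_one_case(i);

    MPI_Finalize();
    return 0;
}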


Where I’ve seen new software written in the area is in things like transforming 3D models into grids appropriate for a computational backend.  And there, new languages might help.  But gridding isn’t computationally intensive like the actual simulation is – yes it’s complex, yes it takes a while to run, but there’s no need to ingest a giant STEP file from Solidworks and spread the gridding across multiple nodes.
