[Beowulf] Re: Spark, Julia, OpenMPI etc. - all in one place

Michael Di Domenico mdidomenico4 at gmail.com
Wed Oct 14 10:22:47 PDT 2020


On Wed, Oct 14, 2020 at 11:53 AM Oddo Da <oddodaoddo at gmail.com> wrote:
> I did not use Spark or Scala as measures of greatness but they are evolution, at least people are trying ;). Not all evolution is in the positive direction, of course. But I do think that the world of software engineering has moved/changed for the better since the 1990s. Yes, we built software just fine in the 1990s and we built it fine in the 1960s but that is like saying we drove cars just fine in the 1930s, why do we need new cars?

I don't see it that way.  to me things like hadoop/Spark/etc were
designed to solve a specific problem other paradigms couldn't (or
rather shouldn't).  it's not evolutionary, it's something new.  your
analogy of cars is nice, but i believe it's more reflective of MPI
progressing from MPI1 -> MPI2 -> MPI3 than how the entire industry has
moved.  cars have gotten better since the 1930s, but they still have
four wheels and a motor.

i'll agree that in some respects software engineering has gotten
better in the last 20yrs, but it's subjective.  there are a lot of
things that have gotten better and there are a lot of things that are
much worse.  but i'm not sure you can apply that statement to HPC.
HPC code doesn't churn like business code or even more volatile cloud
code.  HPC code is usually written to solve something specific and
gets incremental updates over time.  usually that something specific
hasn't changed in the last 20yrs (think physics/chemistry).  the
models we use to describe or solve the problems likely have, but the
underlying code is probably basically the same, with tweaks along the
way to fit the new model.

re-writing the code in a more modern language because the language
allows you to use a more natural way to describe the problem doesn't
make it less complex, it just hoists the problem onto someone else (ie
the code under the code).  and you're throwing away 20yrs of coding
knowledge and stability.  technical debt is probably the biggest
barrier to new language adoption.  the ML/AI frameworks weren't born
in the MPI world, but are adopting it because no one wants to or sees
a reason to re-write that code.  there's nothing that stops something
like pytorch from hooking the low level RDMA libraries and sending
messages around the network, but MPI/charm/upc/shmem already do it,
so why reinvent the wheel?

> I stated in the original post that I am coming back into the field and that it feels like not much has changed since 20 years ago. I asked to be schooled or corrected. I suppose it would be alright for you to tell me specifically what has changed for the better (or has been a paradigm shift) and how widely have these changes been adopted. For example, I consider things like chapel evolutionary changes in the right direction even if it may not be adopted widely - at least someone is trying ;). What you seem to be saying is that we don't need anything new, C and MPI are all that is necessary and we are happy with it?

i'm not sure i'd go as far as to imply that.  there are definitely
things that could be fixed with C and MPI that many people are unhappy
with.  but there's already a huge investment in the platform as it
sits and good change I believe must be done slowly.  chapel is a good
example that supports your statements and i would also consider this
an evolution to MPI.  but it goes back to technical debt.  to re-write
something in chapel is non-trivial and may not be worth the time.
writing something new and choosing chapel is really left up to the
developer.  i have some chapel users here and there, but they're a
minority.  and since chapel is largely only found on cray machines,
its exposure is low.

i'm not sure the philosophical debate you're looking for is one that
can take place.  it's like vim vs emacs or sysv init vs systemd.  everything
exists and it usually boils down to personal choice.  i run a fairly
large hpc center and "user written" C/MPI code really only represents
<20% of my workload.  but that's subjective.  i'd bet if the beowulf
list did a poll you'd find heavy slants based on user base.  if you
feel the industry hasn't moved, maybe that's just where you are
working, what you're doing, or who you're working with rather than a
representation of the hpc industry.

i still think you're trying to compare two things that shouldn't be
compared.  MPI isn't a programming language, it's a library.  if you
want to debate programming language evolution, that's a totally
separate discussion from one that includes MPI/Spark/etc.

