[Beowulf] SC13 wrapup, please post your own

Sat Nov 23 12:01:26 PST 2013

On Nov 23, 2013, at 1:40PM, Joe Landman <landman at scalableinformatics.com> wrote:

> That is, we as a community have much to offer the growing big data 
> community.  

I think this is completely true, and somewhat urgent.  The two communities have a lot to teach each other.

The big data community remains incredibly naive about a lot of performance/scalability issues - and of course they are; they’ve only been at this a few years.  Traditional HPC has a *lot* of hard-won knowledge and experience to offer.

But conversely, where we’ve been naive is about the importance of easily deployable, scalable, easy-to-develop-for software frameworks, even if they initially come at a substantial cost in single-processor performance.  If we choose not to learn the lessons of the rapid growth of tools like Hadoop, we are in trouble as a community.

We’ve talked for years about how hardware is advancing more rapidly than software, but haven’t done much about it; now someone has, and it’s not us.  As a result, people are already trying to fit very HPCy sorts of problems into Hadoopy sorts of frameworks (cf. all the BSP stuff in Pregel or Hama) because it’s so much easier to get things working, and so much easier to find developers to maintain them.  When it comes to choosing a direction for a new project, 100x the number of developers will always win over single-processor performance, or even scaling, because you can then direct enormous amounts of resources to fixing performance issues in the underlying frameworks.

    Jonathan

-- 
Jonathan Dursi, <ljdursi at scinet.utoronto.ca>
SciNet HPC Consortium, Compute Canada
http://www.SciNetHPC.ca
http://www.ComputeCanada.ca