[Beowulf] Beowulf Cluster VS Hadoop/Spark

John Hanks griznog at gmail.com
Thu Dec 29 23:47:58 PST 2016


This often gets presented as an either/or proposition and it's really not.
We happily use SLURM to schedule the setup, run and teardown of spark
clusters. At the end of the day it's all software, even the kernel and OS.
The big secret of HPC is that in a job scheduler we have an amazingly
powerful tool to manage resources. Once you are scheduling spark clusters,
hadoop clusters, VMs, containers, long-running web services, ... as jobs,
you begin to feel sorry for those poor "cloud" people trapped in buzzword
land.

But, directly to your question: what we are learning as we dive deeper into
spark (interest in hadoop here seems to be minimal and fading) is that it
is just as hard to tune as MPI, maybe harder, and the people who want to
use it tend to have a far looser grasp of how to tune it than those using
MPI. In the short term I think it is beneficial as a sysadmin to spend some
time learning the inner squishy bits to compensate for that. A simple
wordcount example or search can show that wc and grep can often outperform
spark, and it takes some experience to understand when a particular
approach is the better one for a given problem. (Where better is measured
by efficiency, not by the number of cool new technical toys employed :)
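
For anyone who wants to try that kind of comparison, a single-process
baseline might look something like the sketch below. It is only an
illustration in Python, not wc or grep themselves and not anything we run
in production; the names count_words, count_matches and timed are made up
for this example. Time it against the equivalent spark job on the same
input and see which one finishes first.

```python
import re
import time
from collections import Counter


def count_words(lines):
    """Tally whitespace-separated words; the total is roughly `wc -w`."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def count_matches(lines, pattern):
    """Count lines containing a regex match, roughly `grep -c`."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Both functions accept any iterable of lines, so an open file handle can be
passed straight in, e.g. timed(count_words, open(path)) where path is
whatever data set you are testing with.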

jbh

On Fri, Dec 30, 2016, 9:32 AM Jonathan Aquilina <jaquilina at eagleeyet.net>
wrote:

> Hi All,
>
> Seeing the new activity about new clusters for 2017, this sparked a
> thought in my mind here. Beowulf Cluster vs hadoop/spark
>
> In this day and age, given that there is the technology with hadoop and
> spark to crunch large data sets, why build a cluster of PCs instead of
> using something like hadoop/spark?
>
>
> Happy New Year
>
> Jonathan Aquilina
>
> Owner EagleEyeT
-- 
‘[A] talent for following the ways of yesterday, is not sufficient to
improve the world of today.’
 - King Wu-Ling, ruler of the Zhao state in northern China, 307 BC