[Beowulf] Clustering vs Hadoop/spark
jaquilina at eagleeyet.net
Tue Nov 24 08:19:52 UTC 2020
Re-added the list.
I think where I'm confused is this: isn't that what Hadoop/Spark does, distributing the data for computation and then aggregating the results back into a single data set?
Correct me if I am wrong here.
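
For illustration, the distribute-then-aggregate pattern described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how Hadoop or Spark are implemented: threads stand in for cluster nodes, and the names split, count_words, merge and word_count are made up for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split(records, n_parts):
    """Distribute step: partition the input into n_parts chunks."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """Map step: each worker computes a partial result on its own chunk."""
    return Counter(word for line in chunk for word in line.split())


def merge(a, b):
    """Reduce step: combine two partial results into one."""
    return a + b


def word_count(records, n_parts=4):
    chunks = split(records, n_parts)
    # Assumption: threads stand in for cluster nodes. A real framework would
    # ship each chunk (or the code) to a different machine instead.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_words, chunks))
    # Aggregate the partial results back into a single data set.
    return reduce(merge, partials, Counter())
```

Calling word_count on a list of text lines returns one Counter holding the combined counts, which is the "single data set" at the end of the pipeline.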
Another thing I can't seem to understand is how a Java-based platform manages to get such good performance when crunching large data sets for big data analytics.
From: Benjamin Redling <benjamin.rampe at uni-jena.de>
Sent: 24 November 2020 09:03
To: Jonathan Aquilina <jaquilina at eagleeyet.net>
Subject: Re: [Beowulf] Clustering vs Hadoop/spark
On 24/11/2020 06.22, Jonathan Aquilina via Beowulf wrote:
> I am just wondering what advantages does setting up of a cluster have
> in relation to big data analytics vs using something like Hadoop/spark?
Can you distribute any application without programming against a framework?
We distribute a lot of data parallel tasks with the source code unchanged via SLURM.
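
One way such a setup might look, as a hedged sketch rather than the actual JULIELab workflow: a small wrapper reads the SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT environment variables that SLURM sets for job-array tasks, picks its share of the input files, and runs the unchanged program on each. The program name ./analyze, the file names, and the helper functions are hypothetical, and task IDs are assumed to start at 0.

```python
import os
import subprocess


def shard_for_task(items, task_id, task_count):
    """Return the subset of items that this array task should process."""
    return items[task_id::task_count]


def commands_for_task(inputs, program, env=None):
    """Build one command line per input file assigned to this task.

    Assumption: the job was submitted as an array whose IDs start at 0,
    e.g. `sbatch --array=0-3 ...`. Outside SLURM it falls back to a single
    task that handles every input.
    """
    env = os.environ if env is None else env
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    task_count = int(env.get("SLURM_ARRAY_TASK_COUNT", "1"))
    # `program` is the existing, unmodified executable (hypothetical name).
    return [[program, path] for path in shard_for_task(inputs, task_id, task_count)]


def run_task(inputs, program):
    """Run the unchanged program once per assigned input file."""
    for cmd in commands_for_task(inputs, program):
        subprocess.run(cmd, check=True)
```

The program itself needs no knowledge of the cluster; only the wrapper looks at the scheduler's environment, which is the contrast with programming against a framework.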
FSU Jena | JULIELab.de/Staff/Redling
☎ +49 3641 9 44323