<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">

<html><body style='font-size: 10pt; font-family: Verdana,Geneva,sans-serif'>

<p>Hey Douglas,</p>

<p>Thanks for the information. What has me curious is whether it can be used, for example, in applications that don't involve large amounts of data.</p>

<p>If you or anyone else has resources such as ebooks or useful websites to read up on, it would be great if you could send them. The reason is that where I work we deal with lots of live telemetry (positioning etc.), and we are going to be moving our system away from Windows to open source technologies such as AngularJS for the website of our platform, as well as MongoDB and Node.js. We will be implementing Hadoop from Amazon to take advantage of Amazon's Elastic MapReduce.</p>

<div>

<pre>---<br />Regards,

Jonathan Aquilina

Founder Eagle Eye T</pre>

</div>

<p>On 2015-02-07 17:33, Douglas Eadline wrote:</p>

<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px">

<pre>Jonathan

I understand your confusion. Hadoop and Big Data reached

"overused but not well understood" status years ago.

First, Hadoop started out as a MapReduce engine. This all

changed with Hadoop V2 and YARN (Yet Another Resource Negotiator).

Hadoop V2 can be considered a platform for running applications that need

parallel access to large amounts of unstructured data (i.e. raw data not

in a traditional database). It can also be used with its own database, HBase,

which is based on Google's Bigtable.

The idea is this: a "Hadoop" cluster has a large amount of storage

using HDFS (or possibly another parallel filesystem). This is often referred

to as the "Data Lake." Raw data is dumped in the lake. There is no

ETL (Extract, Transform, and Load) step. Various Hadoop YARN frameworks use

this data. YARN provides a very dynamic resource allocation model and the

ability to provide data locality to your application (i.e. the traditional

MapReduce idea of "move the computation to the data").

Thus in a Hadoop V2 cluster you can have MapReduce applications (which

support many of the popular apps like Pig and Hive). It also supports

Spark, Storm, Giraph, and even MPI (not the most efficient, but it works).

There are many other applications being ported to YARN.
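
To make the MapReduce idea concrete, here is a rough single-machine
sketch in Python. It is not Hadoop code and does not use any Hadoop API.
The record layout, sample data, and function names are all made up for
illustration (a telemetry feed is assumed only as an example). It shows
the three steps: map each raw record to a key/value pair, group the
values by key, then reduce each group to one result.

```python
# Minimal, single-process sketch of the MapReduce pattern (not Hadoop itself).
# The record layout and all names below are hypothetical.
from collections import defaultdict

# Raw telemetry lines as they might sit in storage: "vehicle,timestamp,lat,lon"
RAW_LINES = [
    "truck-1,100,35.90,14.51",
    "truck-2,101,35.88,14.40",
    "truck-1,105,35.91,14.52",
    "truck-2,103,35.89,14.41",
]


def map_record(line):
    """Map step: parse one raw line into a (key, value) pair."""
    vehicle, timestamp, lat, lon = line.split(",")
    return vehicle, (int(timestamp), float(lat), float(lon))


def shuffle(pairs):
    """Shuffle step: group all values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_latest(values):
    """Reduce step: keep the most recent (timestamp, lat, lon) for one key."""
    return max(values, key=lambda v: v[0])


def latest_positions(lines):
    """Run map, shuffle, and reduce in sequence over the raw lines."""
    grouped = shuffle(map_record(line) for line in lines)
    return {vehicle: reduce_latest(vals) for vehicle, vals in grouped.items()}


if __name__ == "__main__":
    for vehicle, fix in sorted(latest_positions(RAW_LINES).items()):
        print(vehicle, fix)
```

On a real cluster the map and reduce functions would run in parallel on
the nodes that hold the data blocks, and the framework would handle the
grouping step between them.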

Second, Big Data is usually defined by Volume, Velocity, and Variety.

The definition seems to be whatever a vendor wants it to be, however.

It reminds me of products that suddenly became "grid ready" in years past.

Again, such designations mean as much as "now works with binary data."

Finally, if you are interested in Hadoop YARN you can check out the book

"Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with

Apache Hadoop 2" (I helped write it). There are also many online resources.

The first chapter of the book has the history of Hadoop as written by

one of the developers. It is quite interesting to read and helps dispel

many of the Hadoop myths. You can read this chapter for free here:

<a href="http://ptgmedia.pearsoncmg.com/images/9780321934505/samplepages/0321934504.pdf">http://ptgmedia.pearsoncmg.com/images/9780321934505/samplepages/0321934504.pdf</a>

That is enough Hadoop for Saturday morning. Oh, and Hadoop clusters

are not going to supplant your HPC cluster.

--

Doug</pre>

<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px">Can someone explain to me what exactly the purpose of hadoop is and what we mean when we say big data? Is this for data storage and retrieval? Number crunching?<br />--<br />Regards,<br />Jonathan Aquilina<br />Founder Eagle Eye T<br />Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br />To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf">http://www.beowulf.org/mailman/listinfo/beowulf</a></blockquote>


</blockquote>

</body></html>