[Beowulf] Hadoop
Gerry Creager gerry.creager at tamu.edu
Sat Dec 27 07:59:54 PST 2008
Jeff,

I'm an old guy and don't mind top-posts!

Thanks for the insight!
gerry

Jeff Layton wrote:
> Sorry for top-posting (I hate these on-line email tools...)
>
> Did the person requesting Hadoop ever say why they wanted it? For
> example, do they have code written in MapReduce or do they think that
> Hadoop will give them faster throughput than something else?
>
> Hadoop is a project that really has 2 parts to it - an open-source
> MapReduce implementation, and a file system. From people I've talked to,
> the MapReduce part is used far more than the file system. But I've
> talked to some of the developers of the file system and there are some
> people who use the file system.
>
> In general the file system is basically a virtual file system a la PVFS,
> GlusterFS or any object based storage (Panasas, Lustre). However, it
> understands the idea of locality - that is, where useful storage is in
> relation to the compute part of the problem. The idea being that you can
> reduce the time to transmit the data because the storage is closer. But,
> in general, the improvement you get is due to the network topology, not
> necessarily the file system itself. That's because, in general,
> MapReduce systems have network topologies with bottlenecks all over the
> place because they don't really need a full bisection bandwidth
> network everywhere. So for example they may have good bandwidth to a
> switch within the rack, but outside the rack, the bandwidth is not so
> hot. But again, these are generalizations, and the details are always in
> the implementation.
>
> HadoopFS (for lack of a better phrase on my part) is really designed for
> MapReduce codes - transactional codes. So if the person's code(s) fit
> this model, then it might be an interesting experiment to try.
> Otherwise, there are much better file systems for HPC :)
>
> BTW - I saw Karen's post about using Java with HadoopFS. Be sure to pay
> attention to that since getting a good 64-bit Java implementation for
> Linux is not always easy. There are a few out there (Sun has an early
> access program to a 64-bit Java) but the reports I've heard are that
> it's still early.
>
> Hope this helps.
>
> Jeff
>
>
> ------------------------------------------------------------------------
> *From:* Gerry Creager <gerry.creager at tamu.edu>
> *To:* Beowulf Mailing List <beowulf at beowulf.org>
> *Sent:* Friday, December 26, 2008 6:16:04 PM
> *Subject:* [Beowulf] Hadoop
>
> The subject line says it all: Hadoop: Anyone got any experience with it
> on clusters (OK, so Google does, but that really wasn't the question,
> was it?).
>
> We've a user who has requested its installation on one of our clusters,
> a high-throughput system. I'm a bit concerned that it's not gonna be
> real compatible with, say, Torque/Maui and Gluster, unless we were to
> install Xen across the whole cluster and instantiate it within Xen VMs.
>
> However, before I push all MY fears out into the discussion I'd prefer
> to see if anyone else has experience and can shed light on compatibility.
>
> Thanks, Gerry
> --
> Gerry Creager -- gerry.creager at tamu.edu
> Texas Mesonet -- AATLT, Texas A&M University
> Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
> Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

--
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
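
To make the locality point in Jeff's message a bit more concrete, here is a rough sketch of how a scheduler might prefer storage that is "closer" to the compute: first a node that already holds the data, then a node in the same rack, and only then a node across the (presumably slower) inter-rack links. This is an illustration only, not Hadoop's actual scheduler; the function `place_task`, the node and rack names, and the block IDs are all made up for the example.

```python
from typing import Dict, List, Tuple

def place_task(block: str,
               replicas: Dict[str, List[str]],
               rack_of: Dict[str, str],
               free_nodes: List[str]) -> Tuple[str, str]:
    """Pick a free node for the task that reads `block`, preferring closer data.

    Hypothetical helper: returns (node, locality_level).
    """
    if not free_nodes:
        raise ValueError("no free nodes available")

    holders = replicas[block]

    # 1. Node-local: a free node that already stores a replica of the block.
    for node in free_nodes:
        if node in holders:
            return node, "node-local"

    # 2. Rack-local: a free node sharing a rack (and its switch) with a replica.
    holder_racks = {rack_of[h] for h in holders}
    for node in free_nodes:
        if rack_of[node] in holder_racks:
            return node, "rack-local"

    # 3. Off-rack: the data must cross the inter-rack links.
    return free_nodes[0], "off-rack"


# Made-up topology: two racks with two nodes each.
rack_of = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2"}
# Made-up replica map: which nodes hold each block.
replicas = {"blk_a": ["n1", "n3"], "blk_b": ["n4"]}
```

With this toy topology, `place_task("blk_b", replicas, rack_of, ["n1", "n3"])` picks `n3`, because it shares rack `r2` with the only replica. The gain from such a choice depends on how oversubscribed the inter-rack links are, which is the topology point made in the message.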
