[Beowulf] search engine
Robert G. Brown rgb at phy.duke.edu
Tue Jan 4 10:14:42 PST 2005
On Tue, 4 Jan 2005, Noel Tanmoy Das wrote:

> how can i build a search engine (e.g. something like google) in a
> beowulf cluster? help wanted.

Wrong cluster type. This is called a "high availability" type cluster, although it certainly shares a lot of features with beowulf or HPC clusters.

There are several answers possible here. One is to contact google and buy/rent their engine. It is a very, very good one and for a professional enterprise project that requires an internal/private search engine well worth the cost. A second one (if all you want to do is let people search for stuff you have up on a big website) is to use google for free -- it is fairly trivial to add a google box to any web page.

If you want to WRITE an open-source search engine to e.g. COMPETE with google -- well, using google with something like "search engine open source" as the string turns up a list of free and open source tools at e.g. http://www.searchtools.com/tools/tools-opensource.html. I'd look over these projects, pick the best one that has the most active group working on it, and join the project rather than starting your own from scratch.

It is very likely that one or more of the projects listed on this page already run on a cluster of some sort, as building and searching a very, very large database is a task with lots of natural parallelism. It is also very nontrivial -- I couldn't begin to tell you exactly how it all works as I don't know.

To me google is just plain black magic -- it seems to crossreference EVERYTHING on the web all the way down to fairly deep embedded text (at a guess, well over a petabyte of distributed data) and still returns hits on most searches in a matter of seconds, no matter what the search string and no matter when you use it. It's like a tiny piece of the mind of God... or if you prefer a less blasphemous metaphor derived from "The Lucifer Principle", it is the memory function of the extended neural network that forms the superorganism known as "The Web", where we, and the websites we contribute and maintain, are the neurons themselves. If the human race has a developing collective intelligence, this is a core piece of it.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
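
As a rough illustration of the "natural parallelism" mentioned in the message, here is a minimal sketch of a sharded inverted index with scatter-gather search. It is not how google or any of the listed projects actually work; the function names (build_index, search_shard, parallel_search) and the toy corpus are hypothetical, and threads on one machine stand in for cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_index(docs):
    """Build an inverted index (word -> set of doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_shard(index, query):
    """Return ids of documents in this shard containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


def parallel_search(shards, query):
    """Scatter the query to all shards, then gather and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = pool.map(lambda idx: search_shard(idx, query), shards)
        return sorted(set().union(*partial))


# Hypothetical toy corpus, split round-robin across two "nodes".
corpus = {
    1: "beowulf cluster search engine",
    2: "open source search tools",
    3: "high availability cluster",
    4: "parallel search on a cluster",
}
doc_ids = sorted(corpus)
shard_docs = [{i: corpus[i] for i in doc_ids[n::2]} for n in range(2)]
shards = [build_index(d) for d in shard_docs]

print(parallel_search(shards, "search cluster"))
```

Each shard indexes and searches only its own documents, so both steps can run independently on separate nodes; only the final merge of the per-shard hit sets needs to be centralized.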
