Beowulf: A theoretical approach
Nacho Ruiz iorfr00 at student.vxu.se
Thu Jun 22 00:20:33 PDT 2000
Hi,

I'm doing my final year project and I'm writing about the Beowulf project and Beowulf clusters. I've been reading several documents about Beowulf clusters, but I would like to ask all of you some questions about them.

As I've seen, the main objective behind any Beowulf cluster is the price/performance ratio, especially when compared to supercomputers. But as network hardware and commodity systems become faster and faster (getting closer to GHz and Gigabit speeds), could you think of competing directly with supercomputers?

As I see it, the Beowulf cluster idea could be based on distributed computing and parallel computing: you add more CPUs to get more speedup, but as you can't have all the CPUs in the same machine, you use several machines. So the Beowulf cluster could fit in between distributed computing and supercomputers (vector computers, parallel computers, etc.). You have advantages from both sides: parallel programming and high scalability; but you also have several drawbacks, mainly interconnection problems. Do you think that with 10 Gb connections (OC-192 bandwidth), SMP on a chip (Power 4) and massive primary and secondary memory devices at low cost, you could have a chance to beat most of the traditional supercomputers? Or is that not your "goal"?

And about the evolution of Beowulf clusters: do you all follow some kind of guideline, or has the project divided into several flavors and objectives? Are the objectives of the beginning the same as today, or do you now plan to have something like a "super SMP computer" in a distributed way (with good communication times)?

I've seen that a lot of you are focusing on the GPID and whole machine idea. Do you think that is reachable? What are the main objectives vs. the MPI/PVM message passing idea? And what about shared memory (at the HD level or the RAM level), do you take advantage of having this amount of resources?

Is this idea trying to reach the objective of making parallel programs "independent" of the programmer? I mean that, instead of having to program with a parallel machine in mind, you could program in a "normal" way and the compiler would divide/distribute the code over the cluster. Is this reachable or just a dream? Is somebody working on this?

And what about the administration of a cluster? Keeping all the machines of the cluster under control, so you know which are available to receive work, is a difficult but necessary task. It is not as easy as in an SMP machine, where you know or assume that all the CPUs inside are working; in a cluster you can't do that, as the CPU might work but the HD, NIC or memory may fail. How much computational time do you spend on this task? Is somebody working on a better way to manage this?

I know that some time ago HP had a machine with several faulty processors working and achieving high computational speeds without any error. They used some kind of "control algorithm" that manages to use only the good CPUs. Do you have something like this, or is there no point? Does it make sense?

That's all for now, thanks to all of you. If you know of some sources where I can get more information, please let me know.

Nacho Ruiz.
