Beowulfs can compete with Supercomputers [was Beowulf: A theorical approach]
Christoph Best c.best at fz-juelich.de
Fri Jun 23 06:50:03 PDT 2000
- Previous message: Beowulfs can compete with Supercomputers [was Beowulf: A theorical approach]
- Next message: Beowulf: A theorical approach
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

just my two cents (of a Euro) on the "Beowulf vs. Supercomputer" discussion:

I found that the comparison is more often "building/buying your own group or departmental cluster" vs. "writing applications for supercomputer time at a nationwide computer center". Even our little 12-processor cluster provides about 100,000 processor hours a year, roughly what you would get for a smaller project at a supercomputing center. The 128-processor ALiCE cluster of Wuppertal University may be a factor of 5-10 smaller than a big Cray, but there are usually many more than 10 research institutions sharing a supercomputing center. [BTW, the Wuppertal cluster was chosen over established mid-range supercomputers in a competition based on price/performance for selected application benchmarks.] Add to this the organizational overhead and inconvenience of a supercomputing center.

So unless you really need O(1024) processors, many projects should be better off on a cluster. And if you really need that amount of computer time for a prolonged period, you probably would not be able to pay for the supercomputer.

Some subfields, like ours (Lattice Field Theory) or astrophysics, have for quite some time resorted to building their own supercomputers, sometimes combining the Beowulf idea of off-the-shelf components with custom interconnects. The closest may be QCDSP from Columbia University, which is built from Texas Instruments digital signal processors on custom printed-circuit boards and delivers about 400 GFlops in its largest installation (they are aiming at 10 TFlops for their next project). Others are QCD-PACS in Japan (based on a modified HP chip) and APE in Italy/Germany (custom-designed processors for a single-instruction, multiple-data machine).

Also, when writing an application that needs O(100) GFlops-years, many physics groups are happy to tailor their programs to the machine and write message-passing codes (as long as graduate students come cheap), so SMP is not really missed.
Cray's top-of-the-line T3E is actually a message-passing machine, so many programs are already written for message passing.

Finally, we found that processor speed is increasing so quickly that even our application, once considered network-hungry, does not exhaust Myrinet. Myrinet gives you maybe 100 MB/s data transfer, but the memory transfer rate may also be only 300-500 MB/s per processor; a 10 GB/s network would run much faster than a current memory bus.

Actually, the best argument, if any, against Beowulves that I found was sheer size and power consumption, mainly because the average node contains much more circuitry than needed. If someone came up with a small board containing an Alpha processor, cache and main memory, and a Myrinet or similar connection... But this, of course, would not be much different from a T3E.

-Chris

--
Christoph Best c.best at computer.org
John von Neumann Institute for Computing/DESY
http://www.oche.de/~cbest
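
The back-of-the-envelope figures in the message above can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, full-year 100% utilization is an assumption (real clusters have downtime), and the bandwidth numbers are the rough values quoted in the message, not measurements.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumes no leap year and no downtime


def processor_hours_per_year(processors, utilization=1.0):
    """Processor-hours delivered in one year (hypothetical helper).

    `utilization` is an assumed fraction of the year the machine is busy.
    """
    return processors * HOURS_PER_YEAR * utilization


def network_to_memory_ratio(network_mb_s, memory_mb_s):
    """Ratio of network bandwidth to per-processor memory bandwidth."""
    return network_mb_s / memory_mb_s


# 12-processor cluster: close to the ~100,000 processor hours quoted.
small_cluster_hours = processor_hours_per_year(12)

# 128-processor cluster (the size quoted for ALiCE).
alice_hours = processor_hours_per_year(128)

# Myrinet at ~100 MB/s against memory at 300-500 MB/s per processor.
myrinet_vs_slow_memory = network_to_memory_ratio(100, 300)
myrinet_vs_fast_memory = network_to_memory_ratio(100, 500)

# A 10 GB/s (10,000 MB/s) network against a 500 MB/s memory bus.
fast_network_vs_memory = network_to_memory_ratio(10_000, 500)

print(small_cluster_hours, alice_hours)
print(myrinet_vs_slow_memory, myrinet_vs_fast_memory, fast_network_vs_memory)
```

With these inputs the network already reaches roughly a fifth to a third of the memory bandwidth, which is consistent with the observation that the application does not exhaust Myrinet.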
More information about the Beowulf mailing list
