[Beowulf] Has anyone actually seen/used a cell system?
Geoff Jacobs gdjacobs at gmail.com
Sun Oct 1 14:29:10 PDT 2006
Mark Hahn wrote:
>> They have a paper that explains it well and has some
>> interesting benchmarks.
>>
>> http://sc06.supercomputing.org/schedule/pdf/pap225.pdf
>
> this is quite interesting. I wish they had done benchmarks with doubles,
> especially since they alluded to, for instance, the n-body calculation
> really needing at least careful consideration of precision/resolution.
> (now that I think of it, using 23 bits of mantissa on a 256^3 FFT sounds
> numerically dubious too.)
>
> interesting that for a 2.4GHz Cell, they get at most 10 FP Gflops per SPE.
> does anyone have SGEMM numbers for a 3GHz Intel Core2? I'll guess that
> efficiency of libgoto with 2 threads would be >= 80%, so flops would be
> .8*2*8*3 =~ 40 Gflops, or half a Cell chip. makes it hard to argue for
> wide use of Cell, I think...

Unfortunately, the reality is a little crappier. Sciencemark 2.0 SGEMM
sees 11 Gflops on an E6700. DGEMM sees 5-6 Gflops.

http://www.pcper.com/article.php?aid=265&type=expert&pid=3

This is an order of magnitude less performance than the SGEMM predictions
in the LBL paper. Unfortunately, the LBL numbers are only predictions.

http://www.lbl.gov/Science-Articles/Archive/sabl/2006/Jul/CellProcessorPotential.pdf#search=%22sgemm%20cell%22

The linked article _is_ an evaluation of performance on an actual Cell
chip. Unfortunately, it's a lower-clocked pre-production example running
an experimental pseudo-compiler. I'm interested in seeing SGEMM using
Cell-specific intrinsics. Such a benchmark should represent the maximum
practical performance peak.

Note: even if the Sequoia numbers are approximately the same as SPE
intrinsics, Cell is still 7x faster than Core2.

-- 
Geoffrey D. Jacobs

Go to the Chinese Restaurant,
Order the Special
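
For anyone who wants to check the back-of-envelope numbers in this message, here is a minimal Python sketch. It reads the factors in ".8*2*8*3" as efficiency, cores, single-precision flops per cycle per core, and clock in GHz, which is an interpretation and not something stated in the thread. The count of 8 SPEs is likewise an assumption, inferred from "40 Gflops, or half a Cell chip" at 10 Gflops per SPE. The helper name `estimate_gflops` is hypothetical.

```python
def estimate_gflops(cores, flops_per_cycle, ghz, efficiency=1.0):
    """Rough throughput estimate: efficiency * cores * flops/cycle * GHz."""
    return efficiency * cores * flops_per_cycle * ghz

# Mark's guess for a 3 GHz Core2 with 2 threads at 80% efficiency.
# The meaning of each factor is an assumed reading of ".8*2*8*3".
core2_estimate = estimate_gflops(cores=2, flops_per_cycle=8, ghz=3.0,
                                 efficiency=0.8)

# Figures quoted in the thread.
core2_measured_sgemm = 11.0   # Sciencemark 2.0 SGEMM on an E6700
gflops_per_spe = 10.0         # "at most 10 FP Gflops per SPE"
assumed_spes = 8              # assumption: implied by "half a Cell chip"

cell_estimate = gflops_per_spe * assumed_spes
cell_vs_core2 = cell_estimate / core2_measured_sgemm

print(f"Core2 estimate:        {core2_estimate:.1f} Gflops")
print(f"Core2 measured SGEMM:  {core2_measured_sgemm:.1f} Gflops")
print(f"Cell estimate:         {cell_estimate:.1f} Gflops")
print(f"Cell / measured Core2: {cell_vs_core2:.1f}x")
```

Under these assumptions the ratio comes out a little above 7, which is consistent with the "7x faster than Core2" note, and the measured SGEMM figure is well under a third of the 40 Gflops guess.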
