[Beowulf] Has anyone actually seen/used a cell system?
Mark Hahn hahn at physics.mcmaster.ca
Sun Oct 1 20:24:13 PDT 2006

>> The same site reports that the X6800, a 2.93 GHz Core2 and sees
>> almost 12.5 SP GFLOPS using ScienceMark 2.0 (6.2 DP GFLOPS).

hmm, those numbers are pretty low - peak should be 2.93*4 (DP) or
2.93*8 (SP), and I'd expect 80% of peak, or about 19 SP Gflops/core,
for this comparison (Opterons can do 90%, at least on my machine
using HPL.)

so the paper shows 80.6 Gflops SGEMM for 8 SPEs; it's only fair to
compare this to 2 or 4 Core2 cores (37.5 and 75 Gflops!)

> indicative of per core performance on Core 2. Is it safe to say that
> Core 2 achieves <15 gflops/core at 3ghz, assuming ~15% premium with Goto
> BLAS?

peak SGEMM/core would be 3*8=24, so 15 sounds quite low.

>> It looks like a preproduction 2.4 GHz Cell is 2-6 times faster than a

do you know of something crippled in the pre-production Cell chips?
it looks like 2x is about right to me, considering that full-production
Cell appears to ship about the same time as 4x Core2.

the main question is whether that's good enough to make Cell more than
a niche product. I've talked with a number of my better users, and
they all tend to want >=10x speedup before considering non-GP
approaches (cell, fpga, gpgpu).

> I guess my biggest objection to Mark's comment was the comparison of
> SGEMM implemented in an experimental language with unproven structure
> with a theoretical calculation of Core 2 peak performance. I'd simply

I don't think there's anything too dubious about 80% of theoretical
for Core2. but I also didn't think the Sequoia stuff was such a
cheap hack as you imply (not to put words into your mouth ;)

> like to see a benchmark comparison of SGEMM (and DGEMM) using Core
> 2-optimized BLAS vs. Cell-optimized BLAS, thereby making a useful
> conclusion about how interesting Cell is for HPC.

actually, Sequoia seems precisely like the structure you need to make
Cell work, since its whole purpose is to express the rather constrained
way that memory is used in Cell.

the paper is actually pretty clear on where the Cell spends its time,
and for SGEMM, it's executing the "leaf" code, which is IBM's Cell
library. I guess the prototype might be really bad, or Sequoia might
be broken in a way not hinted at in the paper, or IBM's Cell intrinsic
library could be terrible. but the paper seems on the up-and-up,
and the scaling curves and leaf-vs-communication figures surely make
Cell look underwhelming, at least if you assume, as I do, that it has
to deliver a large speedup to be worth investing in...

regards, mark hahn.
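
for anyone who wants to redo the arithmetic above, here is a rough sketch in Python. it is only a back-of-envelope estimate, not a benchmark: the function names are made up for illustration, and the 8 SP flops/cycle per Core2 core and the 80% GEMM efficiency are the assumptions stated in this message, not measured values.

```python
# Back-of-envelope GFLOPS estimate using the figures quoted in this post.
# Assumptions (from the post, not measured here): a Core2 core issues
# 8 SP flops per cycle (4 DP), and a tuned GEMM reaches about 80% of peak.

def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak for one core: clock rate times flops per cycle."""
    return clock_ghz * flops_per_cycle


def expected_gflops(clock_ghz, flops_per_cycle, efficiency=0.80, cores=1):
    """Peak scaled by an assumed GEMM efficiency and a core count."""
    return peak_gflops(clock_ghz, flops_per_cycle) * efficiency * cores


# SGEMM on 8 SPEs, as reported in the paper discussed in the thread.
CELL_SGEMM_GFLOPS = 80.6

if __name__ == "__main__":
    for cores in (1, 2, 4):
        est = expected_gflops(2.93, 8, cores=cores)
        ratio = CELL_SGEMM_GFLOPS / est
        print(f"{cores} Core2 core(s): ~{est:.1f} SP Gflops, "
              f"Cell/Core2 = {ratio:.2f}x")
```

changing `efficiency` or `flops_per_cycle` shows how sensitive the Cell-vs-Core2 ratio is to those two assumptions.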
