Top 500 trends
Joachim Worringen joachim at ccrl-nece.de
Wed Nov 27 11:42:42 PST 2002
Mark Hahn:
> stream is a piece of source code. how the compiler/runtime actually
> implements daxpy is completely free, and certainly does not require a
> single address space. therefore, it's quite reasonable to talk about the
> Stream score for a loosely coupled cluster. stream is almost the worst
> possible kind of code to run on a cluster, though, simply because it has
> such a low work:bandwidth ratio.

The numbers don't change if you do this, because I quoted the per-CPU numbers of a fully loaded node.

> IMO, a benchmark appropriate for SMP would necessarily measure inter-CPU
> latency, somehow, and stream does not. I always ignore multiprocessor
> stream results, or else look strictly at the scaling of their per-cpu
> scores as the machine gets bigger.

I don't understand what you mean by "inter-CPU latency". MPI message latency for intra-node communication is about the same for all SMPs with a decent MPI implementation (a few us) - if that is what you mean. And again: the per-CPU numbers I quoted *are* for fully loaded nodes (8 CPUs on an SX-6 node, 2 CPUs on a Xeon node).

> a "cutting edge chicken" would be a uniprocessor P4/fsb533/dual-PC2700,
> delivering (as a guess) a little under 3 GBps/CPU.

I would rather measure than guess. I'd be surprised to see a bandwidth increase by a factor of three in less than a year's time.

> > The SX-5 had even higher memory bandwidth, but in turn, the SX-6 is has
> > become more cost- and energy-efficient.
>
> the 3 Gflop chicken would dissipate around 200W; I am guessing the SX-6
> dissipates more than 25/3*200=1.7 KW, no?

I compared the cost- and energy-efficiency of the SX-5 and the SX-6. And you shouldn't mix Gflop with GBps - 3 GBps gives you at most (!) 3/8 Gflop/s.

Please don't get me wrong: I don't say everybody should buy vector machines. But it is important to understand that (and why) certain codes run with such poor efficiency on PC clusters - while these clusters surely do a nice job for many applications and are affordable for many more people than a vector machine. I use them and develop for them as well.

Joachim

--
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
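
The "3 GBps gives at most 3/8 Gflop/s" bound in the message can be reproduced with a short calculation. This is only a sketch: it assumes the factor 8 is the size in bytes of one double-precision operand and that a memory-bound kernel needs at least one such operand from memory per floating-point operation. The message does not spell out that reasoning, and the function and constant names are made up for illustration.

```python
# Assumption (not stated in the message): each flop needs at least one
# 8-byte double-precision operand fetched from main memory.
BYTES_PER_DOUBLE = 8


def flop_ceiling_gflops(bandwidth_gbytes_per_s, bytes_per_flop=BYTES_PER_DOUBLE):
    """Upper bound on Gflop/s for a purely memory-bound kernel.

    bandwidth_gbytes_per_s: sustained memory bandwidth in GB/s.
    bytes_per_flop: bytes that must be moved per floating-point operation.
    """
    if bytes_per_flop <= 0:
        raise ValueError("bytes_per_flop must be positive")
    return bandwidth_gbytes_per_s / bytes_per_flop


if __name__ == "__main__":
    # 3 GB/s of memory bandwidth, one double per flop
    print(flop_ceiling_gflops(3.0), "Gflop/s at most")
```

Kernels that move more bytes per flop would end up with a lower ceiling under the same assumptions; the calculation says nothing about codes that reuse data from cache.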
More information about the Beowulf mailing list
