Estimating Cluster Performance

Lindahl, Greg glindahl at
Sat Oct 7 11:36:34 PDT 2000

>     I plan on using the FFTW package from
> <>  and they have compiled some benchmarks at
> <> .   That leads me
> to believe that one box can basically sustain 200 MFLOPs.


The benchmark results you've pointed to are for the previous-generation EV56,
not the EV67s that you'll be buying. The EV67 is often twice as fast as the
EV56 at the same clock. However, you're also talking about a situation where
main memory bandwidth is critical, and the DS10L sustains about 3x the memory
bandwidth of the 4100.

If you want a single-node FFTW run on similar hardware, sign up for the
Compaq "test drive" program.

> Now, I am
> making an assumption that distributing FFTs among nodes is very balanced
> (scaleable?  parallelizable?  whats the term I'm looking for here?) and
> therefore it will scale well

That depends on the details. Are you talking about dividing the data into
chunks, FFTing them separately, and not worrying about the lowest
frequencies? If so, that's fairly embarrassingly parallel, and will scale
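
To make the chunked case concrete, here is a minimal sketch in plain Python.
It is not FFTW and not tuned for speed. The names fft and chunked_fft and the
mapper parameter are made up for illustration. It assumes a power-of-two chunk
length and a sample count that divides evenly into chunks. Each chunk is
transformed without touching any other chunk, which is why the work can be
handed out to separate workers (threads here, nodes on a cluster).

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def chunked_fft(samples, chunk_len, mapper=map):
    """Split samples into chunks and transform each one independently.

    mapper is a hypothetical hook: pass pool.map to spread chunks over
    workers. No chunk needs data from any other chunk.
    """
    if len(samples) % chunk_len:
        raise ValueError("sample count must be a multiple of chunk_len")
    chunks = [samples[i:i + chunk_len]
              for i in range(0, len(samples), chunk_len)]
    return list(mapper(fft, chunks))


if __name__ == "__main__":
    # A tone with a period of 4 samples, 32 samples in total.
    samples = [1, 0, -1, 0] * 8
    with ThreadPoolExecutor(max_workers=4) as pool:
        spectra = chunked_fft(samples, 8, mapper=pool.map)
    # Each 8-sample chunk sees 2 full cycles, so the energy sits in bin 2
    # (and its mirror, bin 6) of every chunk.
    for i, spectrum in enumerate(spectra):
        print(i, [round(abs(v), 6) for v in spectrum])
```

The cost of this approach is the one mentioned above: a chunk of chunk_len
samples cannot resolve any period longer than chunk_len samples, so the
lowest frequencies in the full data set are lost.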

-- greg

More information about the Beowulf mailing list