[Beowulf] memory bandwidth scaling

Bill Broadley bill at cse.ucdavis.edu
Mon Oct 5 21:55:21 PDT 2015


On 10/01/2015 09:27 AM, Orion Poplawski wrote:
> We may be looking at getting a couple new compute nodes.  I'm leery though of
> going too high in processor core counts.  Does anyone have any general
> experiences with performance scaling up to 12 cores per processor with general
> models like CM1/WRF/RAMS on the current crop of Xeon processors?

I've been doing this kind of comparison fairly regularly. My comparison
is usually something along the lines of:

   (cluster cost / number of nodes )
---------------------------------------
( wall clock time of a production run )

That way my "price" includes all the needed infrastructure.
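To make the comparison concrete, here is a minimal sketch in Python. It
rests on an assumption: performance is read as the inverse of wall clock
time, so the figure to minimize is per-node cost multiplied by wall clock
time (the fraction above is written with wall clock time in the
denominator). The helper name and every number are hypothetical
placeholders, not measurements or quotes.

```python
# Hypothetical helper: names and numbers are illustrative placeholders only.
def cost_time_product(cluster_cost, num_nodes, wall_clock_hours):
    """Per-node cost (infrastructure included) times wall clock time.

    Assumes performance = 1 / wall clock time, so lower is better.
    """
    return (cluster_cost / num_nodes) * wall_clock_hours


# Two made-up cluster configurations built around different CPUs.
candidates = {
    "cpu_a": {"cluster_cost": 100_000.0, "num_nodes": 20, "wall_clock_hours": 10.0},
    "cpu_b": {"cluster_cost": 140_000.0, "num_nodes": 20, "wall_clock_hours": 8.0},
}

scores = {name: cost_time_product(**cfg) for name, cfg in candidates.items()}
best = min(scores, key=scores.get)  # lowest cost-time product wins

for name, score in sorted(scores.items()):
    print(f"{name}: {score:.0f}")
print("best:", best)
```

In this made-up case the faster CPU finishes the run 20% sooner but the
node costs 40% more, so the cheaper CPU comes out ahead.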

Then I pick the CPU that has the best price/perf ratio. I've been kinda
puzzled by cluster designs that end up with dramatically more expensive
CPUs.  I often end up with the E5-2620 or E5-2630.  Sometimes faster
CPUs are roughly price/performance neutral, until I put some value on
the extra total RAM I get because the cheaper nodes let me buy more of
them.

Do people really see better price/performance with E5-2680s?

One nice thing about 64GB nodes with 12 cores/24 threads is you get more
RAM per core.  For our workloads 2GB/thread (4GB per core) is somewhat
low, so when I use the E5-2620 with fewer cores I can often avoid paying
for more than 64GB of RAM.  Generally I see better scaling within a node
(I get closer to 6x the single-core performance with the E5-2620 than I
get to 8x with the E5-2630) as well as better scaling across nodes (the
codes scale better with slower nodes).  That makes sense, since there's
more memory bandwidth per core, more RAM per core, and more IB bandwidth
per core.
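The RAM-per-core arithmetic can be sketched as follows. This assumes
dual-socket nodes with two hardware threads per core, and 6 versus 8
cores per socket, which is what the 12 cores/24 threads and 6x/8x
figures suggest; the function name is hypothetical.

```python
# Hypothetical helper for the per-core and per-thread RAM arithmetic.
def ram_per_unit(node_ram_gb, sockets, cores_per_socket, threads_per_core=2):
    """Return (GB per core, GB per hardware thread) for one node."""
    cores = sockets * cores_per_socket
    threads = cores * threads_per_core
    return node_ram_gb / cores, node_ram_gb / threads


# Assumed configurations: dual-socket 64GB nodes, 6 vs 8 cores per socket.
for label, cores_per_socket in (("6 cores/socket", 6), ("8 cores/socket", 8)):
    per_core, per_thread = ram_per_unit(64, 2, cores_per_socket)
    print(f"{label}: {per_core:.2f} GB/core, {per_thread:.2f} GB/thread")
```

Under these assumptions the 8-core part lands at exactly 4GB per core
(2GB per thread) on a 64GB node, while the 6-core part gets about 5.3GB
per core from the same 64GB.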










