[Beowulf] More cores/More processors/More nodes?

Sat Sep 30 03:57:58 PDT 2006

On Thu, Sep 28, 2006 at 07:06:14PM +0100, Peter Wainwright wrote:

> What (in your opinion) is the right tradeoff between more cores,
> more processors and more individual compute nodes?

$/performance.

Once your code is written in pure MPI form, you can run it on any of
the above alternatives. Then you can simply work out the price of each
option, make a guess at its performance, and run a few benchmarks to
check your guesses.
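
As a rough illustration of the $/performance comparison, here is a
minimal Python sketch. The NodeConfig class, the configuration names,
and every price and throughput figure in it are made-up placeholders,
not quotes or measurements; substitute your own vendor prices and the
results of your own benchmark runs.

```python
from dataclasses import dataclass


@dataclass
class NodeConfig:
    name: str
    price_per_node: float   # hypothetical quote, in dollars
    jobs_per_hour: float    # throughput from your own benchmark run


def dollars_per_performance(cfg: NodeConfig) -> float:
    """Dollars paid per unit of benchmark throughput (lower is better)."""
    if cfg.jobs_per_hour <= 0:
        raise ValueError("benchmark throughput must be positive")
    return cfg.price_per_node / cfg.jobs_per_hour


def rank(configs):
    """Return the configs sorted from best to worst $/performance."""
    return sorted(configs, key=dollars_per_performance)


# Placeholder numbers only; replace with real quotes and benchmarks.
candidates = [
    NodeConfig("1 socket, 2 cores", 2000.0, 10.0),
    NodeConfig("2 sockets, 4 cores", 3000.0, 18.0),
    NodeConfig("4 sockets, 8 cores", 8000.0, 30.0),
]

for cfg in rank(candidates):
    ratio = dollars_per_performance(cfg)
    print(f"{cfg.name}: {ratio:.2f} $ per (job/hour)")
```

The ranking is only as good as the benchmark behind jobs_per_hour, so
it should come from your real application rather than a synthetic test.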

The general rules work like this:

* The more cores per node, the less performance you get per core,
  both because scaling is imperfect and because you generally have
  only 1 interconnect card per node.
* Most interconnects don't scale very well to more cores per node.
  For example, the "latency" number everyone quotes for an
  interconnect is measured with just 1 core per node; at 4 cores per
  node this number is much worse for most interconnects.
* The more cores per node, the higher the price per core tends to be,
  although this varies. You buy less interconnect, but you pay more
  for fancier processors and motherboards.
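
One part of the first rule is simple arithmetic: with a single
interconnect card per node, every extra core shares the same card. The
Python sketch below is a deliberate simplification that assumes all
cores communicate at once and split the card evenly. The function name
and the bandwidth figure are hypothetical, and it does not model how
latency degrades, which has to be measured.

```python
def per_core_bandwidth(card_bandwidth, cores_per_node, cards_per_node=1):
    """Even share of interconnect bandwidth per core.

    Assumes every core on the node is communicating at the same time
    and that the card's bandwidth is split equally among them.
    """
    if cores_per_node < 1 or cards_per_node < 1:
        raise ValueError("need at least one core and one card per node")
    return card_bandwidth * cards_per_node / cores_per_node


# Hypothetical card bandwidth in arbitrary units (for example MB/s).
CARD_BANDWIDTH = 1000.0

for cores in (1, 2, 4, 8):
    share = per_core_bandwidth(CARD_BANDWIDTH, cores)
    print(f"{cores} cores/node: {share:.1f} per core")
```

Real interconnects rarely divide this cleanly, so treat the result as
a starting guess to check against a multi-core benchmark.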

We talk about a "sweet spot"; in my opinion, that's still 2 dual-core
CPUs per node.

> However, I do not understand what happens when you have
> multi-processor/multi-core nodes in a cluster.  Do you just use MPI
> (with each thread using its own non-shared memory) or is there any
> way to do "mixed-mode" programming which takes advantage of shared
> memory within a node (like, an MPI/OpenMP hybrid?).

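The first is the easier of the two. A pure MPI program still takes
advantage of shared memory within the node, because MPI implementations
typically pass messages between processes on the same node through
shared memory.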

The hybrid model is a lot more work for the programmer and is often
slower than pure MPI. It also hurts interconnect performance, because
you usually end up with just 1 core driving the interconnect.

-- greg