[Beowulf] dual-core benefits?

Thu Sep 22 14:06:36 PDT 2005

> 1. The scalability of our program is not so good, less then 20 for 32 nodes
> (measured on a single node system). So we don't plan to go beyond 16 nodes.
> (which makes 32 processors due to dual-node usage)

is this scalability assuming a slow interconnect like gigabit?
have you considered when it would be appropriate to go to something fast?
(myrinet, infinipath and quadrics are my favorites, though the latter 
especially is always difficult to squeeze into a budget.)  it's really
excellent if you can characterize your code based on distribution of 
packet sizes, so you can trade off the latency/bandwidth properties of 
various interconnect options.  any recognizable communication patterns
(esp nearest-neighbor) can pay off as well.
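
to make the packet-size point concrete, here is a minimal sketch of the usual latency-plus-bandwidth cost model applied to a message-size histogram.  the interconnect names, latency/bandwidth figures and the histogram are all made-up placeholders (not measurements of any real product); plug in numbers from your own profiling and from vendor or benchmark data.

```python
# simple cost model: time for one message = latency + size / bandwidth

def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send a single message of the given size."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def total_comm_time(histogram, latency_s, bandwidth_bytes_per_s):
    """histogram maps message size in bytes -> number of messages sent."""
    return sum(
        count * message_time(size, latency_s, bandwidth_bytes_per_s)
        for size, count in histogram.items()
    )


# hypothetical interconnects: (latency in seconds, bandwidth in bytes/second)
interconnects = {
    "slow_high_latency": (50e-6, 100e6),
    "fast_low_latency": (5e-6, 800e6),
}

# hypothetical profile: many tiny messages plus a few large ones
histogram = {64: 100_000, 1_048_576: 100}

for name, (latency, bandwidth) in interconnects.items():
    print(name, total_comm_time(histogram, latency, bandwidth))
```

with a profile dominated by small messages, the latency term decides almost everything; with mostly large messages, bandwidth does.  the model ignores contention and overlap of communication with computation, so treat it as a first cut only.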

> 2. Memory requirement is huge; we will use 4GB memory per node for the time
> being and increase this to 16 GB later. So wee need fast CPUs and efficient
> usage of memory.

these days, that's not huge - after all, 1GB dimms are definitely 
"above the knee" (in the linear region, price-wise.)  what I find is that 
there is (continued) divergence between small and large-memory kinds of 
applications.  people who do MC-type (monte carlo) stuff continue to need only a few 
handfuls of MB, whereas memory-intensive apps would like 1000x as much.

> 3. Due to budget limitations we will first configure 8-node system with 4GB
> RAM per node and extend this to a 16-node system with 16-GB of RAM in 6
> months.

the last time I priced systems, 16G per system was starting to bend upwards
in price.  (in fact, the researchers opted for 32G quad-opterons...)

> We were thinking of AMD 250 processors, but now the benchmarks of dual-core
> CPUs (on the web site of AMD) seems encouraging, and the cost of dual-core
> AMD 275 seems to be less then twice of AMD 250. Since the memory cost of our
> system will dominate other costs, we can afford to pass to dual-core
> technology. However, the questions that arise are follows.
> 
> 1. Will it worth? And can we gain any advantages over single-core with the
> not-so-good scalability of our parallel programs? 

that's why you need to figure out why your scalability is poor.  
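
as a rough starting point, you can back out an effective serial fraction from the quoted numbers (speedup of at most 20 on 32 nodes) and see what it predicts for other node counts.  this is only a sketch: it assumes amdahl's law with a fixed serial fraction, which is an assumption on my part and not something measured for this code.  if the real limit is communication, the effective fraction will grow with node count and these predictions will be optimistic.

```python
def serial_fraction(speedup, nprocs):
    """Serial fraction implied by a measured speedup, assuming Amdahl's law."""
    return (1.0 / speedup - 1.0 / nprocs) / (1.0 - 1.0 / nprocs)


def amdahl_speedup(fraction, nprocs):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (fraction + (1.0 - fraction) / nprocs)


# 20 is the upper bound quoted above ("less than 20 for 32 nodes")
f = serial_fraction(20.0, 32)
print("implied serial fraction:", f)

for n in (8, 16, 32):
    print(n, "procs -> predicted speedup", amdahl_speedup(f, n))
```

measuring the implied fraction at several node counts is more useful than any single number: if it stays flat the code has a genuinely serial part, and if it climbs, overhead (often communication) is growing with the machine.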

on a multiprocessor system, you effectively have a pretty fast, if small,
interconnect.  if your code can take advantage of that, then going 
dual-core could well be a win.  for instance, if your code is limited
by short-message, point-to-point latency, then increasing "SMP-ness"
should help a lot, especially if you are assuming mere gigabit.

obviously, if your code scales poorly because it's bottlenecked on memory,
then dual-core is a bad idea.  (actually, if it's bottlenecked on memory
_latency_, that might not necessarily be true...)

> 2. Another question is that is dual-core technology brings any advantages
> for the efficient usage of high amount of memory that we will utilize?

DC doesn't change memory issues: AMD claims that the chips are slightly
more efficient (slightly higher aggregate streaming bandwidth), but it 
seems to be a very small factor, especially if you compare to e-rev
single-core chips.  there is a noticeable difference vs older revs, especially
with lots of memory, since older chips drop the memory speed as low as PC1600
once enough memory banks are in use (dimm sides, basically.)
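
for a sense of scale, here is the peak-bandwidth arithmetic behind the PC1600 remark.  the sketch assumes a dual-channel memory controller with 8 bytes per transfer per channel, and it splits the socket's bandwidth evenly between cores; both the even split and the use of theoretical peak (rather than sustained) numbers are simplifying assumptions.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, channels=2, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth of one socket, in MB/s."""
    return mega_transfers_per_s * bytes_per_transfer * channels


pc3200 = peak_bandwidth_mb_s(400)  # DDR400 modules
pc1600 = peak_bandwidth_mb_s(200)  # DDR200 modules

for cores in (1, 2):
    print(cores, "core(s):",
          pc3200 / cores, "MB/s per core at PC3200,",
          pc1600 / cores, "MB/s per core at PC1600")
```

so a fully populated older board running at PC1600 gives each core of a dual-core chip roughly a quarter of the peak bandwidth that a single core sees at PC3200.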

> 3. Finally there is something basic that I'm not sure: When we assign a job
> to dual-core CPU, can it divide it between the core-CPUs automatically, or
> should we think dual-core CPU the same as dual-node CPU? If the latter is
> the case, what is the advantage of this technology over dual-node?

it's not automatic - dual-core is just SMP-in-one-package.  the advantage
is mainly that DC lets you amortize the other components in the system
over two cores.  for programs which are truly limited by memory bandwidth,
you really don't want to amortize the memory, so DC is a loss in this case.
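
to illustrate "not automatic": a serial job stays on one core unless the program itself splits the work.  below is a minimal sketch using python's standard multiprocessing module; the sum-of-squares workload and the helper names (split_work, partial_sum) are made up for illustration.  for an MPI code the equivalent is usually just launching one rank per core, exactly as you would on a dual-processor node.

```python
from multiprocessing import Pool


def split_work(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous (start, stop) chunks."""
    step, extra = divmod(n_items, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        stop = start + step + (1 if worker < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum(chunk):
    """Toy workload: sum of squares over one chunk."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


if __name__ == "__main__":
    n_items = 1_000_000
    chunks = split_work(n_items, 2)      # one chunk per core
    with Pool(processes=2) as pool:      # two worker processes, started explicitly
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```

the operating system schedules the two processes onto the two cores, but nothing divides the work for you.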

bear in mind that DC chips also run at a lower clock than SC (the 275 is
2.2 GHz vs 2.4 GHz for the 250).