Andrew Piskorski wrote:
> On Mon, Feb 27, 2006 at 08:49:23PM +0000, Ricardo Reis wrote:

> Do you NEED low latency?  The 8-way box PROBABLY has better latency,
> but perhaps Infinipath HTX adapters between 2-socket nodes would give
> similar (or possibly better in some cases?)  latency - you need to
> compare actual performance numbers.

Looking at this design, it appears to be 2 hops to the furthest memory,
which gives you a latency of something like 100 ns + 150 ns * N_hops.
Moreover, some of these hops are over a lower-speed HT fabric.
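
To put rough numbers on that estimate, here is a minimal sketch of the
rule of thumb above.  The function name and defaults are just for
illustration; the 100 ns and 150 ns figures are the ballpark values
from this message, not measurements.

```python
def memory_latency_ns(n_hops, local_ns=100.0, per_hop_ns=150.0):
    """Rule-of-thumb latency: local access plus a fixed cost per HT hop.

    The default constants are rough estimates, not measured values.
    """
    if n_hops < 0:
        raise ValueError("n_hops must be non-negative")
    return local_ns + per_hop_ns * n_hops


if __name__ == "__main__":
    # Local memory, nearest neighbour, and the furthest socket (2 hops).
    for hops in range(3):
        print(f"{hops} hop(s): ~{memory_latency_ns(hops):.0f} ns")
```

By this estimate the furthest memory costs about four times as much as
local memory, before accounting for the slower HT links.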

This will be a hard machine to optimize for at a low level
(lightweight threads and some load-balancing issues), but it should be
fine at a higher level (MPI over shared memory, or some OpenMP).

> Note that that Tyan 8-socket box does not have any extra HTX slots, so
> if you wanted to cluster multiple such boxes you would have to do
> so over its PCI Express slots or built-in Gigabit Ethernet.

If you need huge memory, you can put up to 16 GB on a normal
dual-processor motherboard, or 32 GB if you are willing to pay
outrageous memory prices.  You can reach 32 GB using 2 GB DIMMs on the
DK88 board from iWill if you don't mind running at DDR/333.  You also
don't want to put that in a 1U: the memory generates quite a bit of
heat, and you need to pull that heat off.  I also have concerns about
the electrical stability of running 8 memory sockets off a single chip.
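
The capacity arithmetic is simple enough to sketch.  This assumes the
DK88 figures implied above (two processors, 8 memory sockets per
processor); check the board's own documentation before relying on it.
The function name is made up for illustration.

```python
def max_memory_gb(cpu_sockets, dimm_slots_per_cpu, dimm_size_gb):
    """Total RAM when every DIMM slot holds a module of the same size."""
    return cpu_sockets * dimm_slots_per_cpu * dimm_size_gb


if __name__ == "__main__":
    # Assumed layout: 2 CPUs x 8 DIMM slots each, filled with 2 GB DIMMs.
    print(max_memory_gb(2, 8, 2), "GB")
```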

If you want an almost reasonably priced 32 GB, you can go to a
quad-socket system.  If you go dual core on the quad, you can get 8
cores with 32 GB of RAM for reasonable prices.  You can get 64 GB of RAM
in that for less reasonable pricing.

Going the other direction, if your code simply needs more CPUs (Monte
Carlo, etc.), groups of single-socket boxes may be most cost effective.


