[Beowulf] Barcelona numbers

Mon Sep 10 22:17:52 PDT 2007

On Tue, Sep 11, 2007 at 04:32:37AM +0000, richard.walsh at comcast.net wrote:

> Yes, yes ... ;-) ... I like the stream numbers too, but I also like to know how much of
> the advertised capacity of the memory bus one can get.

It's like saying your car's peak is 300 mph because its tires are
rated for 300 mph, when the engine is only good for 200.

In fact the memory bus can never achieve its advertised peak, even
with a perfect memory controller, and memory controllers are not perfect.
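
To put a number on "how much of the advertised capacity," one can
divide a measured stream result by the theoretical bus peak. The
following is only a rough sketch: the DDR2-667 dual-channel
configuration and the measured figure are assumptions picked for
illustration, not numbers from this thread.

```python
def peak_bandwidth_gbs(transfers_per_sec, bus_bytes, channels):
    """Theoretical peak of a memory bus in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_sec * bus_bytes * channels / 1e9


def fraction_of_peak(measured_gbs, peak_gbs):
    """Share of the advertised peak that a benchmark actually reached."""
    return measured_gbs / peak_gbs


# Assumed configuration: DDR2-667, 8-byte (64-bit) channels, 2 channels.
peak = peak_bandwidth_gbs(667e6, 8, 2)

# Hypothetical placeholder, NOT a Barcelona measurement.
measured_gbs = 8.0

print(f"peak = {peak:.2f} GB/s, achieved = {fraction_of_peak(measured_gbs, peak):.0%}")
```

Substitute your own stream triad result for measured_gbs; the ratio
will always come out below 1.0 for the reasons given above.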

> > You're misremembering. Opteron latency was always a function of the
> > number of active sockets, and it is usually measured with only one
> > core active, while Bill is doing the more realistic thing of having
> > all the cores active. Run the same code on your favorite Intel if you
> > want to compare.
> 
> Granted, latency measures depend on the nearness of the memory
> referenced

No, on Opteron they don't. The *bandwidth* depends on nearness; the
*latency* pretty much depends on the last snoop coming back from the
farthest socket. That's why, even when you're accessing only local
memory, the latency gets worse as you add sockets. But the total
stream bandwidth rises as you add more sockets.
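
A toy model of that behaviour, as a sketch only: every number below is
hypothetical, the function names are mine, and the linear bandwidth
scaling assumes each socket streams only from its own local memory.

```python
def local_latency_ns(dram_ns, snoop_round_trips_ns):
    """Latency of a local load under a broadcast-snoop protocol.

    The load cannot complete until local DRAM has answered AND the last
    snoop response has come back, so the slower of the two paths wins.
    """
    slowest_snoop = max(snoop_round_trips_ns, default=0.0)
    return max(dram_ns, slowest_snoop)


def total_stream_bandwidth_gbs(per_socket_gbs, sockets):
    """Idealised aggregate bandwidth with every socket streaming locally."""
    return per_socket_gbs * sockets


# Hypothetical snoop round-trip times (ns) to the other sockets.
configs = {
    1: [],                   # single socket: nobody to snoop
    2: [70.0],               # one neighbour
    4: [70.0, 70.0, 90.0],   # farthest socket is two hops away
}

for sockets, snoops in configs.items():
    lat = local_latency_ns(60.0, snoops)          # 60 ns DRAM is assumed
    bw = total_stream_bandwidth_gbs(8.0, sockets)  # 8 GB/s/socket is assumed
    print(f"{sockets} socket(s): local latency {lat:.0f} ns, stream {bw:.0f} GB/s")
```

In this model local latency is set by the farthest socket even though
the data itself is local, while aggregate stream bandwidth still grows
with the socket count.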

On systems with directory-based SMP protocols, things are different.
That's probably what you're used to seeing -- SGI Origin, for example.

-- greg