[Beowulf] dual-core benefits?
ashley at quadrics.com
Fri Sep 23 08:03:49 PDT 2005
On Fri, 2005-09-23 at 09:43 -0400, Mark Hahn wrote:
> this is pretty rough, of course: if you had the right patterns,
> you could do better or worse. and if you use collectives, you'll
> always be limited at least by inter-node performance.
This isn't strictly true, actually. For broadcast it's entirely possible
for the memory bandwidth inside a node to be the bottleneck. Assume a
quad-CPU node: to broadcast to all four CPUs inside the node, they all
need to copy the data to their own area. This means four simultaneous
memcpy()s are happening, and since each copy is a read plus a write,
that's eight memory operations. It doesn't take much for the network
bandwidth to be more than 1/8th of the memory bandwidth. This doesn't
apply to Ethernet, though.
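
To put rough numbers on the 1/8th figure, here is a minimal sketch of the arithmetic. The function names and the bandwidth values are made up for illustration, not measurements from any real node or interconnect, and the model only counts the memcpy() traffic described above (one read plus one write per CPU).

```python
def memory_bound_broadcast_rate(mem_bw, cpus_per_node):
    """Rate at which a node can absorb a broadcast if every CPU
    memcpy()s the payload into its own area.

    Each copy costs one read and one write, so the payload crosses
    the memory bus 2 * cpus_per_node times.
    """
    memory_ops = 2 * cpus_per_node
    return mem_bw / memory_ops


def bottleneck(net_bw, mem_bw, cpus_per_node):
    """Return which resource limits the broadcast under this model."""
    limit = memory_bound_broadcast_rate(mem_bw, cpus_per_node)
    return "memory" if net_bw > limit else "network"


# Hypothetical figures in GB/s, chosen only to illustrate the ratio.
MEM_BW = 4.0      # assumed aggregate memory bandwidth of the node
FAST_NET = 0.9    # assumed high-speed interconnect
SLOW_NET = 0.1    # assumed Ethernet-class link

print(memory_bound_broadcast_rate(MEM_BW, 4))  # 4.0 / 8
print(bottleneck(FAST_NET, MEM_BW, 4))
print(bottleneck(SLOW_NET, MEM_BW, 4))
```

With these assumed values the quad-CPU node can take in broadcast data at no more than one eighth of its memory bandwidth, so the faster link ends up memory-bound while the slower one stays network-bound.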