[Beowulf] Shared memory
diep at xs4all.nl
Wed Jun 22 18:49:29 PDT 2005
For Diep running on a cluster I'm obviously taking advantage of the fact
that a node can have more than 1 processor.
It makes a huge difference to the total speedup.
Let's first define the 2 different things:
scaling: "whether all cpu's are searching chess positions
and not losing too much time on communication with other nodes"
speedup: "how many times less TIME is needed to complete a search on n
processors than on 1 processor".
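
As a small illustration of the speedup definition, here is a minimal
sketch that computes speedup as single-cpu time divided by n-cpu time.
The timings are made up for illustration, and the efficiency helper
(speedup divided by the number of cpu's) is an addition that is not a
term used in this post.

```python
def speedup(time_one_cpu, time_n_cpus):
    """How many times faster the search finishes on n cpu's than on 1."""
    return time_one_cpu / time_n_cpus


def efficiency(time_one_cpu, time_n_cpus, n_cpus):
    """Fraction of the ideal n-fold speedup that is actually achieved."""
    return speedup(time_one_cpu, time_n_cpus) / n_cpus


# Hypothetical timings in seconds, not measurements.
t1 = 120.0
for tn in (60.0, 40.0):
    print(speedup(t1, tn), efficiency(t1, tn, 4))
```

With these made-up numbers, a search that drops from 120 s to 60 s on 4
cpu's has speedup 2.0, and one that drops to 40 s has speedup 3.0.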
Now n processors in a fixed cluster can be for example:
4 nodes single cpu k7
or 2 nodes dual k7.
Assuming the same network, let's compare the speedup of the 2:
For example, if you have 2 nodes, each node a dual k7,
and each process is treated independently, as if it were a 4 node
single cpu k7 cluster,
the speedup is perhaps 2.0 (scaling near 100%).
If we treat it like a network of 2 nodes, and start executing DIEP as a
dual processor program at each node, then we speak of 2 layer
parallelism of course.
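
To make the 2 layer idea concrete, here is a rough sketch of the
structure only, not DIEP's actual algorithm. The outer layer hands each
node one coarse chunk of work (on a real cluster that would be a message
over the network), and the inner layer lets the cpu's of a node pull
fine-grained work from a queue in shared memory. The names search,
run_node and run_cluster are hypothetical, the nodes are simulated one
after another in a single process, and Python threads only show the
shape of the inner layer, not real dual-cpu performance.

```python
import queue
import threading


def search(position):
    # Hypothetical stand-in for searching one chess position.
    return position * position


def run_node(chunk, cpus_per_node):
    """Inner layer: threads on one node share a work queue and results."""
    work = queue.Queue()
    for position in chunk:
        work.put(position)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                position = work.get_nowait()
            except queue.Empty:
                return  # no work left on this node
            value = search(position)
            with lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(cpus_per_node)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


def run_cluster(positions, nodes, cpus_per_node):
    """Outer layer: one coarse chunk per node, combined at the end."""
    chunks = [positions[i::nodes] for i in range(nodes)]
    return sum(run_node(chunk, cpus_per_node) for chunk in chunks)


print(run_cluster(list(range(8)), nodes=2, cpus_per_node=2))
```

Calling run_cluster with nodes=4 and cpus_per_node=1 gives the same
result through the "4 independent processes" layout, so the two
configurations differ only in where the splitting happens.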
The normal parallel search that Diep uses relies on shared memory and
assumes that messages can be received very quickly. It splits at far
smaller search depths, and a single split can be done really quickly.
That fast splitting is 1 major advantage of shared memory at a LOCAL node;
again, this can NOT be done over a network, as a network's LATENCY is too
bad. (One way ping pong latency to read a few bytes at a local processor
simply comes from the L2 cache of this or the other cpu, which is like 13
cycles in case of Opterons, 28-40 cycles or so in case it is a P4 Xeon.)
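
To get a feel for the gap, here is a back-of-the-envelope sketch that
puts both latencies in the same unit. Only the 13 cycle Opteron figure
comes from the text above; the 2.0 GHz clock and the 5 microsecond
network latency are assumptions picked purely for illustration, so
substitute figures for your own hardware and interconnect.

```python
def cycles_to_ns(cycles, clock_ghz):
    """Convert cpu cycles to nanoseconds at the given clock rate."""
    return cycles / clock_ghz


def ns_to_cycles(nanoseconds, clock_ghz):
    """Convert nanoseconds to cpu cycles at the given clock rate."""
    return nanoseconds * clock_ghz


CLOCK_GHZ = 2.0            # assumed clock rate, not from the post
NETWORK_LATENCY_US = 5.0   # assumed one way network latency, not from the post
L2_CYCLES_OPTERON = 13     # figure quoted in the post

local_ns = cycles_to_ns(L2_CYCLES_OPTERON, CLOCK_GHZ)
network_cycles = ns_to_cycles(NETWORK_LATENCY_US * 1000.0, CLOCK_GHZ)
print(local_ns, network_cycles, network_cycles / L2_CYCLES_OPTERON)
```

Under these assumptions a single network message costs several hundred
times as many cycles as a local L2 read, which is why splitting at very
small search depths only pays off inside a node.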
So the actual speedup I get with Diep at a single node is simply 2.0,
because the algorithm used there to split is superior, using the fast
memory.
In short, the speedup you get with the 2 layer approach is then 3.0 out
of 4.
That's a HUGE difference.
2.0 out of 4 versus 3.0 out of 4 is a MAJOR difference.
That's like a wheelchair versus a formula 1 car (provided we've got
At 08:46 AM 6/22/2005 +0100, Mark Westwood wrote:
>We too have a Beowulf cluster with nodes having 2 CPUs. The memory in
>each node is shared between the 2 CPUs. We use Grid Engine to place
>processes onto processors at run time. Some of our codes require a lot
>of memory for each process, and will be placed one process on a node,
>some of our codes require less memory and will be placed two processes
>to a node. There are different queues defined for the two types of job.
>Our MPI codes are 'processor / node blind' in the sense that there are
>no special features of the code arising directly from the possibility
>that, at run time, two processes executing the program might be sharing
>a node. Users determine, prior to a job's execution, how much memory
>each process in a parallel computation will require, and submit the job
>to the right queue.
>My thinking is that mixed-mode programming, in which a code uses MPI for
>inter-node communication and shared-memory programming (eg OpenMP) for
>intra-node communication, is not worth the effort when you only have 2
>CPUs in each node. In fact, my thinking is that it's not even worth the
>time experimenting to gather evidence. I guess that makes me prejudiced
>against mixed-mode programming on a typical COTS cluster. Now, if you
>were to offer me, say, 8 or 16 CPUs per node, I might think again. Or
>indeed if I am shot down in flames by contributors to this list who have
>done the experiments ...
>One reference which might be of interest:
>Jeff zhang wrote:
>> Each node of my Beowulf cluster has two CPUs. The memory is shared
>> between the two CPUs. How is MPI handling the memory in this
>> situation? What is the most efficient way to program under this
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>The Technology Centre
>Offshore Technology Park
>+44 (0)870 429 6586
>As usual in such postings, the views expressed in this email are
>personal and are not intended to represent the views of OHM Ltd.