[Beowulf] multi-threading vs. MPI

Geoff Jacobs gdjacobs at gmail.com
Sat Dec 8 09:49:41 PST 2007


Donald Shillady wrote:
> This is a very interesting discussion to me.  I have started to purchase
> components for an 8 core microWulf based on the Calvin College microWulf
> constructed by Prof. Joel Adams and his student except I will use
> slightly faster cores with an AMD X2 5400+ in the Master node (dual
> core) and three AMD X2 4000+ dual core processors enclosed in
> inexpensive boxes.  The Master node has an MSI K9N SLI Platinum
> motherboard which has two Gigabit ports so perhaps the initial
> configuration with three satellite dual core CPU can be extended to a
> second set of boxes later.  All these AM2-socket CPU are dual core and
> apparently Prof. Adams was able to address them in the microWulf as
> individual cores but there is, I believe, some hyperthreading between
> the dual cores so what is the story about how the dual cores can be
> addressed individually but still have hyperthreading between the dual
> cores?  I am an experienced programmer for von Neumann architecture and a
> total novice on parallel systems but as I build the microWulf I wonder
> if MPI will decouple the hyperthreading or is it not there?  From what
> little I have learned so far the microWulf switch depends on the
> relatively slow Gigabit Ethernet so there is probably time within each
> dual core CPU for hyperthreading to occur if indeed provision is
> provided for hyperthreading in the AMD X2 dual cores.  Sorry to ask such
> a dumb question but I am trying to learn.
>  
> Don Shillady
> Emeritus Professor of Chemistry, VCU
> Ashland Va (working at home)
> 

Until now I have always programmed with a flat model using MPI. That
was on computers with two single-core CPUs, but it applies equally to
dual-core CPUs.

Here is how the processes tended to map to physical computers, although
the mapping varied with the MPI configuration. I also had a lot of fun
abusing the process group file in MPICH1.x.

Computer 1    CPU 1    rank(1)
              CPU 2    rank(2)
Computer 2    CPU 1    rank(3)
              CPU 2    rank(4)
...

and so forth. Using threads, you could potentially do this:

Computer 1    CPU 1    rank(1)    thread(1)
              CPU 2               thread(2)
Computer 2    CPU 1    rank(2)    thread(1)
              CPU 2               thread(2)
...

MPI now explicitly manages only one rank per computer, half as many as
in the first example, but each process spawned by MPI creates two
threads, which the operating system on each computer load-balances
across its CPUs.
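
As a rough sketch of the two layouts above, here is the same sum split
first as four flat ranks and then as two ranks with two threads each.
This is not MPI code: run_rank() is a hypothetical stand-in for one MPI
process, the ranks run one after another inside a single Python
program, and all function names are made up for illustration. Ranks
are numbered from 0 here, as MPI itself numbers them.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(data, parts, index):
    """Return the index-th of `parts` nearly equal contiguous slices."""
    size, extra = divmod(len(data), parts)
    start = index * size + min(index, extra)
    stop = start + size + (1 if index < extra else 0)
    return data[start:stop]


def run_rank(rank, num_ranks, threads_per_rank, data):
    """Hypothetical stand-in for one MPI process.

    The rank takes its own slice of the data, then splits that slice
    again across its threads and adds up their partial sums.
    """
    mine = chunk(data, num_ranks, rank)
    with ThreadPoolExecutor(max_workers=threads_per_rank) as pool:
        partials = pool.map(
            lambda t: sum(chunk(mine, threads_per_rank, t)),
            range(threads_per_rank),
        )
        return sum(partials)


def total(num_ranks, threads_per_rank, data):
    # Under real MPI each rank would be a separate process, possibly on
    # another computer, and this outer sum would be a reduction.
    return sum(
        run_rank(rank, num_ranks, threads_per_rank, data)
        for rank in range(num_ranks)
    )


data = list(range(1, 101))

# Flat model: 2 computers x 2 CPUs = 4 ranks, one worker each.
flat = total(num_ranks=4, threads_per_rank=1, data=data)

# Threaded model: 1 rank per computer, 2 threads per rank.
hybrid = total(num_ranks=2, threads_per_rank=2, data=data)

print(flat, hybrid)  # prints "5050 5050"
```

Both decompositions give the same answer; only the place where the
work is divided changes. In standard CPython the threads will not
speed up arithmetic like this, so read the sketch for its structure
rather than its performance.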

<snip>

-- 
Geoffrey D. Jacobs

To have no errors
  would be life without meaning
  No struggle, no joy


